Hive doesn't have the level of SQL support that HAWQ provides, especially around sub-selects. SparkSQL only supports a subset of HiveQL, so the difference there is even bigger.
Sent from my iPhone
> On Nov 13, 2015, at 9:39 AM, Biswas, Supriya <[email protected]>
> wrote:
>
> Hello All –
>
> Hive 0.14 supports ACID and also supports transactions. Spark supports Hive
> queries (HQL).
>
> Did anyone compare HAWQ with spark SQL or Hive HQL on Spark?
>
> Thanks.
>
> Supriyo Biswas
> Architect – CPS Service Delivery
> The Nielsen Company
> Office (516) 682-6021/NETS 249-6021
> Cell (516) 353-6795
> www.nielsen.com
>
> From: Atri Sharma [mailto:[email protected]]
> Sent: Friday, November 13, 2015 3:53 AM
> To: [email protected]
> Subject: Re: what is Hawq?
>
> Greenplum is open sourced.
>
> The main difference is between the two engines is that HAWQ is more for
> Hadoop based systems whereas Greenplum is more towards regular FS. This is a
> very high level difference between the two, the differences are more
> detailed. But a single line difference between the two is the one I wrote.
>
> On 13 Nov 2015 14:20, "Adaryl "Bob" Wakefield, MBA"
> <[email protected]> wrote:
> Is Greenplum free? I heard they open sourced it but I haven’t found anything
> but a community edition.
>
> Adaryl "Bob" Wakefield, MBA
> Principal
> Mass Street Analytics, LLC
> 913.938.6685
> www.linkedin.com/in/bobwakefieldmba
> Twitter: @BobLovesData
>
> From: dortmont
> Sent: Friday, November 13, 2015 2:42 AM
> To: [email protected]
> Subject: Re: what is Hawq?
>
> I see the advantage of HAWQ compared to other Hadoop SQL engines. It looks
> like the most mature solution on Hadoop thanks to the postgresql based engine.
>
> But why wouldn't I use Greenplum instead of HAWQ? It has even better
> performance and it supports updates.
>
> Cheers
>
> 2015-11-13 7:45 GMT+01:00 Atri Sharma <[email protected]>:
> +1 for transactions.
>
> I think a major plus point is that HAWQ supports transactions, and this
> enables a lot of critical workloads to be done on HAWQ.
>
> On 13 Nov 2015 12:13, "Lei Chang" <[email protected]> wrote:
>
> Like what Bob said, HAWQ is a complete database and Drill is just a query
> engine.
>
> And HAWQ has also a lot of other benefits over Drill, for example:
>
> 1. SQL completeness: HAWQ is the best for the sql-on-hadoop engines, can run
> all TPCDS queries without any changes. And support almost all third party
> tools, such as Tableau et al.
> 2. Performance: proved the best in the hadoop world
> 3. Scalability: high scalable via high speed UDP based interconnect.
> 4. Transactions: as I know, drill does not support transactions. it is a
> nightmare for end users to keep consistency.
> 5. Advanced resource management: HAWQ has the most advanced resource
> management. It natively supports YARN and easy to use hierarchical resource
> queues. Resources can be managed and enforced on query and operator level.
>
> Cheers
> Lei
>
> On Fri, Nov 13, 2015 at 9:34 AM, Adaryl "Bob" Wakefield, MBA
> <[email protected]> wrote:
> There are a lot of tools that do a lot of things. Believe me it’s a full time
> job keeping track of what is going on in the apache world. As I understand
> it, Drill is just a query engine while Hawq is an actual database...some what
> anyway.
>
> Adaryl "Bob" Wakefield, MBA
> Principal
> Mass Street Analytics, LLC
> 913.938.6685
> www.linkedin.com/in/bobwakefieldmba
> Twitter: @BobLovesData
>
> From: Will Wagner
> Sent: Thursday, November 12, 2015 7:42 AM
> To: [email protected]
> Subject: Re: what is Hawq?
>
> Hi Lie,
>
> Great answer.
>
> I have a follow up question.
> Everything HAWQ is capable of doing is already covered by Apache Drill. Why
> do we need another tool?
>
> Thank you,
> Will W
>
> On Nov 12, 2015 12:25 AM, "Lei Chang" <[email protected]> wrote:
>
> Hi Bob,
>
> Apache HAWQ is a Hadoop native SQL query engine that combines the key
> technological advantages of MPP database with the scalability and convenience
> of Hadoop. HAWQ reads data from and writes data to HDFS natively. HAWQ
> delivers industry-leading performance and linear scalability. It provides
> users the tools to confidently and successfully interact with petabyte range
> data sets. HAWQ provides users with a complete, standards compliant SQL
> interface. More specifically, HAWQ has the following features:
> · On-premise or cloud deployment
> · Robust ANSI SQL compliance: SQL-92, SQL-99, SQL-2003, OLAP extension
> · Extremely high performance. many times faster than other Hadoop SQL
> engine.
> · World-class parallel optimizer
> · Full transaction capability and consistency guarantee: ACID
> · Dynamic data flow engine through high speed UDP based interconnect
> · Elastic execution engine based on virtual segment & data locality
> · Support multiple level partitioning and List/Range based
> partitioned tables.
> · Multiple compression method support: snappy, gzip, quicklz, RLE
> · Multi-language user defined function support: python, perl, java,
> c/c++, R
> · Advanced machine learning and data mining functionalities through
> MADLib
> · Dynamic node expansion: in seconds
> · Most advanced three level resource management: Integrate with YARN
> and hierarchical resource queues.
> · Easy access of all HDFS data and external system data (for example,
> HBase)
> · Hadoop Native: from storage (HDFS), resource management (YARN) to
> deployment (Ambari).
> · Authentication & Granular authorization: Kerberos, SSL and role
> based access
> · Advanced C/C++ access library to HDFS and YARN: libhdfs3 & libYARN
> · Support most third party tools: Tableau, SAS et al.
> · Standard connectivity: JDBC/ODBC
>
> And the link here can give you more information around hawq:
> https://cwiki.apache.org/confluence/display/HAWQ/About+HAWQ
>
> And please also see the answers inline to your specific questions:
>
> On Thu, Nov 12, 2015 at 4:09 PM, Adaryl "Bob" Wakefield, MBA
> <[email protected]> wrote:
> Silly question right? Thing is I’ve read a bit and watched some YouTube
> videos and I’m still not quite sure what I can and can’t do with Hawq. Is it
> a true database or is it like Hive where I need to use HCatalog?
>
> It is a true database, you can think it is like a parallel postgres but with
> much more functionalities and it works natively in hadoop world. HCatalog is
> not necessary. But you can read data registered in HCatalog with the new
> feature "hcatalog integration".
>
> Can I write data intensive applications against it using ODBC? Does it
> enforce referential integrity? Does it have stored procedures?
>
> ODBC: yes, both JDBC/ODBC are supported
> referential integrity: currently not supported.
> Stored procedures: yes.
>
> B.
>
> Please let us know if you have any other questions.
>
> Cheers
> Lei
