Greenplum is open source. The main difference between the two engines is that HAWQ is aimed at Hadoop-based systems, whereas Greenplum is aimed at regular file systems. That is only a very high-level summary; the detailed differences go further, but that is the one-line version.

On 13 Nov 2015 14:20, "Adaryl "Bob" Wakefield, MBA" < [email protected]> wrote:
> Is Greenplum free? I heard they open sourced it but I haven’t found
> anything but a community edition.
>
> Adaryl "Bob" Wakefield, MBA
> Principal
> Mass Street Analytics, LLC
> 913.938.6685
> www.linkedin.com/in/bobwakefieldmba
> Twitter: @BobLovesData
>
> *From:* dortmont <[email protected]>
> *Sent:* Friday, November 13, 2015 2:42 AM
> *To:* [email protected]
> *Subject:* Re: what is Hawq?
>
> I see the advantage of HAWQ compared to other Hadoop SQL engines. It looks
> like the most mature solution on Hadoop thanks to the postgresql based
> engine.
>
> But why wouldn't I use Greenplum instead of HAWQ? It has even better
> performance and it supports updates.
>
> Cheers
>
> 2015-11-13 7:45 GMT+01:00 Atri Sharma <[email protected]>:
>
>> +1 for transactions.
>>
>> I think a major plus point is that HAWQ supports transactions, and this
>> enables a lot of critical workloads to be done on HAWQ.
>> On 13 Nov 2015 12:13, "Lei Chang" <[email protected]> wrote:
>>
>>>
>>> Like what Bob said, HAWQ is a complete database and Drill is just a
>>> query engine.
>>>
>>> And HAWQ has also a lot of other benefits over Drill, for example:
>>>
>>> 1. SQL completeness: HAWQ is the best for the sql-on-hadoop engines, can
>>>    run all TPCDS queries without any changes. And support almost all third
>>>    party tools, such as Tableau et al.
>>> 2. Performance: proved the best in the hadoop world
>>> 3. Scalability: high scalable via high speed UDP based interconnect.
>>> 4. Transactions: as I know, drill does not support transactions. it is a
>>>    nightmare for end users to keep consistency.
>>> 5. Advanced resource management: HAWQ has the most advanced resource
>>>    management. It natively supports YARN and easy to use hierarchical resource
>>>    queues. Resources can be managed and enforced on query and operator level.
>>>
>>> Cheers
>>> Lei
>>>
>>>
>>> On Fri, Nov 13, 2015 at 9:34 AM, Adaryl "Bob" Wakefield, MBA <
>>> [email protected]> wrote:
>>>
>>>> There are a lot of tools that do a lot of things. Believe me it’s a
>>>> full time job keeping track of what is going on in the apache world. As I
>>>> understand it, Drill is just a query engine while Hawq is an actual
>>>> database...some what anyway.
>>>>
>>>> Adaryl "Bob" Wakefield, MBA
>>>> Principal
>>>> Mass Street Analytics, LLC
>>>> 913.938.6685
>>>> www.linkedin.com/in/bobwakefieldmba
>>>> Twitter: @BobLovesData
>>>>
>>>> *From:* Will Wagner <[email protected]>
>>>> *Sent:* Thursday, November 12, 2015 7:42 AM
>>>> *To:* [email protected]
>>>> *Subject:* Re: what is Hawq?
>>>>
>>>>
>>>> Hi Lie,
>>>>
>>>> Great answer.
>>>>
>>>> I have a follow up question.
>>>> Everything HAWQ is capable of doing is already covered by Apache
>>>> Drill. Why do we need another tool?
>>>>
>>>> Thank you,
>>>> Will W
>>>> On Nov 12, 2015 12:25 AM, "Lei Chang" <[email protected]> wrote:
>>>>
>>>>>
>>>>> Hi Bob,
>>>>>
>>>>>
>>>>> Apache HAWQ is a Hadoop native SQL query engine that combines the key
>>>>> technological advantages of MPP database with the scalability and
>>>>> convenience of Hadoop. HAWQ reads data from and writes data to HDFS
>>>>> natively. HAWQ delivers industry-leading performance and linear
>>>>> scalability. It provides users the tools to confidently and successfully
>>>>> interact with petabyte range data sets. HAWQ provides users with a
>>>>> complete, standards compliant SQL interface. More specifically, HAWQ has
>>>>> the following features:
>>>>>
>>>>> - On-premise or cloud deployment
>>>>> - Robust ANSI SQL compliance: SQL-92, SQL-99, SQL-2003, OLAP
>>>>>   extension
>>>>> - Extremely high performance. many times faster than other Hadoop
>>>>>   SQL engine.
>>>>> - World-class parallel optimizer
>>>>> - Full transaction capability and consistency guarantee: ACID
>>>>> - Dynamic data flow engine through high speed UDP based
>>>>>   interconnect
>>>>> - Elastic execution engine based on virtual segment & data
>>>>>   locality
>>>>> - Support multiple level partitioning and List/Range based
>>>>>   partitioned tables.
>>>>> - Multiple compression method support: snappy, gzip, quicklz, RLE
>>>>> - Multi-language user defined function support: python, perl,
>>>>>   java, c/c++, R
>>>>> - Advanced machine learning and data mining functionalities
>>>>>   through MADLib
>>>>> - Dynamic node expansion: in seconds
>>>>> - Most advanced three level resource management: Integrate with
>>>>>   YARN and hierarchical resource queues.
>>>>> - Easy access of all HDFS data and external system data (for
>>>>>   example, HBase)
>>>>> - Hadoop Native: from storage (HDFS), resource management (YARN)
>>>>>   to deployment (Ambari).
>>>>> - Authentication & Granular authorization: Kerberos, SSL and role
>>>>>   based access
>>>>> - Advanced C/C++ access library to HDFS and YARN: libhdfs3 &
>>>>>   libYARN
>>>>> - Support most third party tools: Tableau, SAS et al.
>>>>> - Standard connectivity: JDBC/ODBC
>>>>>
>>>>>
>>>>> And the link here can give you more information around hawq:
>>>>> https://cwiki.apache.org/confluence/display/HAWQ/About+HAWQ
>>>>>
>>>>>
>>>>> And please also see the answers inline to your specific questions:
>>>>>
>>>>> On Thu, Nov 12, 2015 at 4:09 PM, Adaryl "Bob" Wakefield, MBA <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> Silly question right? Thing is I’ve read a bit and watched some
>>>>>> YouTube videos and I’m still not quite sure what I can and can’t do with
>>>>>> Hawq. Is it a true database or is it like Hive where I need to use
>>>>>> HCatalog?
>>>>>>
>>>>>
>>>>> It is a true database, you can think it is like a parallel postgres
>>>>> but with much more functionalities and it works natively in hadoop world.
>>>>> HCatalog is not necessary. But you can read data registered in HCatalog
>>>>> with the new feature "hcatalog integration".
>>>>>
>>>>>
>>>>>> Can I write data intensive applications against it using ODBC? Does
>>>>>> it enforce referential integrity? Does it have stored procedures?
>>>>>>
>>>>>
>>>>> ODBC: yes, both JDBC/ODBC are supported
>>>>> referential integrity: currently not supported.
>>>>> Stored procedures: yes.
>>>>>
>>>>>
>>>>>> B.
>>>>>>
>>>>>
>>>>>
>>>>> Please let us know if you have any other questions.
>>>>>
>>>>> Cheers
>>>>> Lei
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
