Greenplum is open sourced.

The main difference between the two engines is that HAWQ targets Hadoop-based
systems (HDFS), whereas Greenplum targets a regular file system. That is a
very high-level summary and the detailed differences run deeper, but as a
one-line distinction between the two, that is it.
On 13 Nov 2015 14:20, "Adaryl "Bob" Wakefield, MBA" <
[email protected]> wrote:

> Is Greenplum free? I heard they open sourced it but I haven’t found
> anything but a community edition.
>
> Adaryl "Bob" Wakefield, MBA
> Principal
> Mass Street Analytics, LLC
> 913.938.6685
> www.linkedin.com/in/bobwakefieldmba
> Twitter: @BobLovesData
>
> *From:* dortmont <[email protected]>
> *Sent:* Friday, November 13, 2015 2:42 AM
> *To:* [email protected]
> *Subject:* Re: what is Hawq?
>
> I see the advantage of HAWQ compared to other Hadoop SQL engines. It looks
> like the most mature solution on Hadoop thanks to the postgresql based
> engine.
>
> But why wouldn't I use Greenplum instead of HAWQ? It has even better
> performance and it supports updates.
>
> Cheers
>
> 2015-11-13 7:45 GMT+01:00 Atri Sharma <[email protected]>:
>
>> +1 for transactions.
>>
>> I think a major plus point is that HAWQ supports transactions, which
>> enables a lot of critical workloads to be run on HAWQ.
>> On 13 Nov 2015 12:13, "Lei Chang" <[email protected]> wrote:
>>
>>>
>>> Like what Bob said, HAWQ is a complete database and Drill is just a
>>> query engine.
>>>
>>> And HAWQ has also a lot of other benefits over Drill, for example:
>>>
>>> 1. SQL completeness: HAWQ is the most complete of the SQL-on-Hadoop
>>> engines; it can run all TPC-DS queries without any changes, and it
>>> supports almost all third-party tools, such as Tableau et al.
>>> 2. Performance: proven the best in the Hadoop world.
>>> 3. Scalability: highly scalable via a high-speed UDP-based interconnect.
>>> 4. Transactions: as far as I know, Drill does not support transactions,
>>> and it is a nightmare for end users to keep data consistent without them.
>>> 5. Advanced resource management: HAWQ has the most advanced resource
>>> management. It natively supports YARN and easy-to-use hierarchical
>>> resource queues. Resources can be managed and enforced at the query and
>>> operator level.
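>>> For example, a hierarchical resource queue can be declared with DDL
>>> along these lines (a sketch; the queue name, role name, and limit
>>> values are made up):
>>>
>>> ```sql
>>> -- Child queue of the root queue, capped at half the cluster's
>>> -- memory and cores, with at most 20 concurrent statements.
>>> CREATE RESOURCE QUEUE report_queue WITH (
>>>     PARENT = 'pg_root',
>>>     ACTIVE_STATEMENTS = 20,
>>>     MEMORY_LIMIT_CLUSTER = 50%,
>>>     CORE_LIMIT_CLUSTER = 50%
>>> );
>>>
>>> -- Queries from roles bound to the queue are then governed by its limits.
>>> CREATE ROLE report_user LOGIN RESOURCE QUEUE report_queue;
>>> ```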
>>>
>>> Cheers
>>> Lei
>>>
>>>
>>> On Fri, Nov 13, 2015 at 9:34 AM, Adaryl "Bob" Wakefield, MBA <
>>> [email protected]> wrote:
>>>
>>>> There are a lot of tools that do a lot of things. Believe me it’s a
>>>> full time job keeping track of what is going on in the apache world. As I
>>>> understand it, Drill is just a query engine while Hawq is an actual
>>>> database...somewhat, anyway.
>>>>
>>>> Adaryl "Bob" Wakefield, MBA
>>>> Principal
>>>> Mass Street Analytics, LLC
>>>> 913.938.6685
>>>> www.linkedin.com/in/bobwakefieldmba
>>>> Twitter: @BobLovesData
>>>>
>>>> *From:* Will Wagner <[email protected]>
>>>> *Sent:* Thursday, November 12, 2015 7:42 AM
>>>> *To:* [email protected]
>>>> *Subject:* Re: what is Hawq?
>>>>
>>>>
>>>> Hi Lei,
>>>>
>>>> Great answer.
>>>>
>>>> I have a follow up question.
>>>> Everything HAWQ is capable of doing is already covered by Apache
>>>> Drill.  Why do we need another tool?
>>>>
>>>> Thank you,
>>>> Will W
>>>> On Nov 12, 2015 12:25 AM, "Lei Chang" <[email protected]> wrote:
>>>>
>>>>>
>>>>> Hi Bob,
>>>>>
>>>>>
>>>>> Apache HAWQ is a Hadoop-native SQL query engine that combines the key
>>>>> technological advantages of an MPP database with the scalability and
>>>>> convenience of Hadoop. HAWQ reads data from and writes data to HDFS
>>>>> natively. HAWQ delivers industry-leading performance and linear
>>>>> scalability. It provides users the tools to confidently and successfully
>>>>> interact with petabyte range data sets. HAWQ provides users with a
>>>>> complete, standards compliant SQL interface. More specifically, HAWQ has
>>>>> the following features:
>>>>>
>>>>>    - On-premise or cloud deployment
>>>>>    - Robust ANSI SQL compliance: SQL-92, SQL-99, SQL-2003, OLAP
>>>>>    extension
>>>>>    - Extremely high performance: many times faster than other Hadoop
>>>>>    SQL engines
>>>>>    - World-class parallel optimizer
>>>>>    - Full transaction capability and consistency guarantee: ACID
>>>>>    - Dynamic data flow engine through high speed UDP based
>>>>>    interconnect
>>>>>    - Elastic execution engine based on virtual segment & data
>>>>>    locality
>>>>>    - Multi-level partitioning and list/range-based partitioned
>>>>>    tables
>>>>>    - Multiple compression method support: snappy, gzip, quicklz, RLE
>>>>>    - Multi-language user-defined function support: Python, Perl,
>>>>>    Java, C/C++, R
>>>>>    - Advanced machine learning and data mining functionalities
>>>>>    through MADLib
>>>>>    - Dynamic node expansion: in seconds
>>>>>    - Most advanced three-level resource management: integrates with
>>>>>    YARN and hierarchical resource queues
>>>>>    - Easy access to all HDFS data and external system data (for
>>>>>    example, HBase)
>>>>>    - Hadoop Native: from storage (HDFS), resource management (YARN)
>>>>>    to deployment (Ambari).
>>>>>    - Authentication & Granular authorization: Kerberos, SSL and role
>>>>>    based access
>>>>>    - Advanced C/C++ access library to HDFS and YARN: libhdfs3 &
>>>>>    libYARN
>>>>>    - Supports most third-party tools: Tableau, SAS et al.
>>>>>    - Standard connectivity: JDBC/ODBC
>>>>>
>>>>>
>>>>> And this link can give you more information about HAWQ:
>>>>> https://cwiki.apache.org/confluence/display/HAWQ/About+HAWQ
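>>>>> To make the partitioning and compression bullets concrete, a table
>>>>> could be declared along these lines (a sketch in the Greenplum-style
>>>>> DDL that HAWQ inherits; the table and partition names are made up):
>>>>>
>>>>> ```sql
>>>>> -- Snappy-compressed append-only table, distributed by id,
>>>>> -- range-partitioned by month with a list subpartition per region.
>>>>> CREATE TABLE sales (
>>>>>     id        bigint,
>>>>>     region    text,
>>>>>     sale_date date,
>>>>>     amount    numeric
>>>>> )
>>>>> WITH (appendonly = true, compresstype = snappy)
>>>>> DISTRIBUTED BY (id)
>>>>> PARTITION BY RANGE (sale_date)
>>>>>   SUBPARTITION BY LIST (region)
>>>>>   SUBPARTITION TEMPLATE (
>>>>>       SUBPARTITION emea VALUES ('emea'),
>>>>>       SUBPARTITION apac VALUES ('apac'),
>>>>>       DEFAULT SUBPARTITION other_regions
>>>>>   )
>>>>>   (START (date '2015-01-01') END (date '2016-01-01')
>>>>>    EVERY (INTERVAL '1 month'));
>>>>> ```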
>>>>>
>>>>>
>>>>> And please also see the answers inline to your specific questions:
>>>>>
>>>>> On Thu, Nov 12, 2015 at 4:09 PM, Adaryl "Bob" Wakefield, MBA <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> Silly question right? Thing is I’ve read a bit and watched some
>>>>>> YouTube videos and I’m still not quite sure what I can and can’t do with
>>>>>> Hawq. Is it a true database or is it like Hive where I need to use
>>>>>> HCatalog?
>>>>>>
>>>>>
>>>>> It is a true database: you can think of it as a parallel PostgreSQL,
>>>>> but with much more functionality, and it works natively in the Hadoop
>>>>> world. HCatalog is not necessary, but you can read data registered in
>>>>> HCatalog with the new "HCatalog integration" feature.
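>>>>> For example, an HCatalog-registered table should be queryable directly
>>>>> (a sketch; the database and table names here are made up):
>>>>>
>>>>> ```sql
>>>>> -- Read a Hive/HCatalog table in place, without copying it into HAWQ;
>>>>> -- "default" is the HCatalog database, "web_logs" a hypothetical table.
>>>>> SELECT user_id, count(*) AS hits
>>>>> FROM hcatalog.default.web_logs
>>>>> GROUP BY user_id;
>>>>> ```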
>>>>>
>>>>>
>>>>>> Can I write data intensive applications against it using ODBC? Does
>>>>>> it enforce referential integrity? Does it have stored procedures?
>>>>>>
>>>>>
>>>>> ODBC: yes, both JDBC and ODBC are supported.
>>>>> Referential integrity: currently not supported.
>>>>> Stored procedures: yes.
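>>>>> As a sketch of the stored-procedure support (HAWQ inherits
>>>>> PostgreSQL's procedural languages, so procedures are written as
>>>>> functions; this example is made up):
>>>>>
>>>>> ```sql
>>>>> -- A simple PL/pgSQL function, callable from any JDBC/ODBC client
>>>>> -- with: SELECT add_tax(100.00);
>>>>> CREATE FUNCTION add_tax(amount numeric) RETURNS numeric AS $$
>>>>> BEGIN
>>>>>     RETURN amount * 1.08;  -- hypothetical flat 8% tax
>>>>> END;
>>>>> $$ LANGUAGE plpgsql;
>>>>> ```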
>>>>>
>>>>>
>>>>>> B.
>>>>>>
>>>>>
>>>>>
>>>>> Please let us know if you have any other questions.
>>>>>
>>>>> Cheers
>>>>> Lei
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
