Re: what is Hawq?

Dan Baskette Fri, 13 Nov 2015 07:00:12 -0800

Hive doesn't have the level of SQL support that HAWQ provides, especially around 
sub-selects. SparkSQL supports only a subset of HiveQL, so the difference 
there is even bigger.
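
For anyone unsure what kind of query falls under "sub-selects" here, a 
correlated scalar sub-select is one example. The sketch below runs one against 
an in-memory SQLite database, used purely as a stand-in so the example runs 
anywhere. It is not HAWQ, Hive or SparkSQL, and it does not show which engine 
or version accepts the query. The table, column and function names are made up 
for illustration.

```python
import sqlite3

# Hypothetical sample data: (customer, region, amount).
ROWS = [
    ("alice", "east", 100),
    ("bob", "east", 300),
    ("carol", "west", 50),
    ("dave", "west", 70),
    ("erin", "west", 30),
]


def above_regional_average(rows):
    """Return customers whose order amount exceeds their own region's average."""
    # SQLite is only a stand-in engine here (assumption for illustration).
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (customer TEXT, region TEXT, amount INTEGER)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)

    # The inner SELECT refers to o.region from the outer query, which is
    # what makes this a correlated sub-select.
    query = """
        SELECT o.customer
        FROM orders AS o
        WHERE o.amount > (
            SELECT AVG(i.amount)
            FROM orders AS i
            WHERE i.region = o.region
        )
        ORDER BY o.customer
    """
    result = [row[0] for row in conn.execute(query)]
    conn.close()
    return result


if __name__ == "__main__":
    print(above_regional_average(ROWS))
```

The inner query is re-evaluated per outer row, so an engine has to support 
references from the sub-select back to the enclosing query in order to run it.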



> On Nov 13, 2015, at 9:39 AM, Biswas, Supriya <[email protected]> 
> wrote:
> 
> Hello All –
>  
> Hive 0.14 supports ACID and also supports transactions. Spark supports Hive 
> queries (HQL).
>  
> Has anyone compared HAWQ with Spark SQL or Hive HQL on Spark?
>  
> Thanks.
>  
> Supriyo Biswas
> Architect – CPS Service Delivery
> The Nielsen Company
> Office (516) 682-6021/NETS 249-6021
> Cell     (516) 353-6795
> www.nielsen.com
>  
> From: Atri Sharma [mailto:[email protected]] 
> Sent: Friday, November 13, 2015 3:53 AM
> To: [email protected]
> Subject: Re: what is Hawq?
>  
> Greenplum is open sourced.
> 
> The main difference between the two engines is that HAWQ is more for 
> Hadoop based systems whereas Greenplum is more towards a regular FS. This is 
> a very high level view; the actual differences are more detailed. But if I 
> had to give a single line difference between the two, that is the one.
> 
> On 13 Nov 2015 14:20, "Adaryl "Bob" Wakefield, MBA" 
> <[email protected]> wrote:
> Is Greenplum free? I heard they open sourced it but I haven’t found anything 
> but a community edition.
>  
> Adaryl "Bob" Wakefield, MBA
> Principal
> Mass Street Analytics, LLC
> 913.938.6685
> www.linkedin.com/in/bobwakefieldmba
> Twitter: @BobLovesData
>  
> From: dortmont
> Sent: Friday, November 13, 2015 2:42 AM
> To: [email protected]
> Subject: Re: what is Hawq?
>  
> I see the advantage of HAWQ compared to other Hadoop SQL engines. It looks 
> like the most mature solution on Hadoop thanks to the PostgreSQL-based engine.
>  
> But why wouldn't I use Greenplum instead of HAWQ? It has even better 
> performance and it supports updates.
> 
> Cheers
>  
> 2015-11-13 7:45 GMT+01:00 Atri Sharma <[email protected]>:
> +1 for transactions.
> 
> I think a major plus point is that HAWQ supports transactions,  and this 
> enables a lot of critical workloads to be done on HAWQ.
> 
> On 13 Nov 2015 12:13, "Lei Chang" <[email protected]> wrote:
>  
> Like what Bob said, HAWQ is a complete database and Drill is just a query 
> engine.
>  
> HAWQ also has a lot of other benefits over Drill, for example:
>  
> 1. SQL completeness: HAWQ is the most complete of the SQL-on-Hadoop engines. 
> It can run all TPC-DS queries without any changes, and it supports almost all 
> third party tools, such as Tableau et al.
> 2. Performance: proven to be the best in the Hadoop world.
> 3. Scalability: highly scalable via a high speed UDP based interconnect.
> 4. Transactions: as far as I know, Drill does not support transactions. 
> Without them it is a nightmare for end users to keep data consistent.
> 5. Advanced resource management: HAWQ has the most advanced resource 
> management. It natively supports YARN and easy-to-use hierarchical resource 
> queues. Resources can be managed and enforced at the query and operator level.
>  
> Cheers
> Lei
>  
>  
> On Fri, Nov 13, 2015 at 9:34 AM, Adaryl "Bob" Wakefield, MBA 
> <[email protected]> wrote:
> There are a lot of tools that do a lot of things. Believe me, it’s a full time 
> job keeping track of what is going on in the Apache world. As I understand 
> it, Drill is just a query engine while Hawq is an actual database... somewhat, 
> anyway.
>  
> Adaryl "Bob" Wakefield, MBA
> Principal
> Mass Street Analytics, LLC
> 913.938.6685
> www.linkedin.com/in/bobwakefieldmba
> Twitter: @BobLovesData
>  
> From: Will Wagner
> Sent: Thursday, November 12, 2015 7:42 AM
> To: [email protected]
> Subject: Re: what is Hawq?
>  
> Hi Lei,
> 
> Great answer.
> 
> I have a follow up question. 
> Everything HAWQ is capable of doing is already covered by Apache Drill.  Why 
> do we need another tool?
> 
> Thank you, 
> Will W
> 
> On Nov 12, 2015 12:25 AM, "Lei Chang" <[email protected]> wrote:
>  
> Hi Bob,
>  
> Apache HAWQ is a Hadoop native SQL query engine that combines the key 
> technological advantages of an MPP database with the scalability and convenience 
> of Hadoop. HAWQ reads data from and writes data to HDFS natively. HAWQ 
> delivers industry-leading performance and linear scalability. It provides 
> users the tools to confidently and successfully interact with petabyte range 
> data sets. HAWQ provides users with a complete, standards compliant SQL 
> interface. More specifically, HAWQ has the following features:
> ·         On-premise or cloud deployment
> ·         Robust ANSI SQL compliance: SQL-92, SQL-99, SQL-2003, OLAP extension
> ·         Extremely high performance: many times faster than other Hadoop SQL 
> engines.
> ·         World-class parallel optimizer
> ·         Full transaction capability and consistency guarantee: ACID
> ·         Dynamic data flow engine through high speed UDP based interconnect
> ·         Elastic execution engine based on virtual segment & data locality
> ·         Support for multi-level partitioning and list/range based 
> partitioned tables.
> ·         Multiple compression method support: snappy, gzip, quicklz, RLE
> ·         Multi-language user defined function support: python, perl, java, 
> c/c++, R
> ·         Advanced machine learning and data mining functionality through 
> MADlib
> ·         Dynamic node expansion: in seconds
> ·         Most advanced three level resource management: Integrate with YARN 
> and hierarchical resource queues.
> ·         Easy access of all HDFS data and external system data (for example, 
> HBase)
> ·         Hadoop Native: from storage (HDFS), resource management (YARN) to 
> deployment (Ambari).
> ·         Authentication & Granular authorization: Kerberos, SSL and role 
> based access
> ·         Advanced C/C++ access library to HDFS and YARN: libhdfs3 & libYARN
> ·         Support for most third party tools: Tableau, SAS et al.
> ·         Standard connectivity: JDBC/ODBC
>  
> And the link here can give you more information around hawq: 
> https://cwiki.apache.org/confluence/display/HAWQ/About+HAWQ 
>  
>  
> And please also see the answers inline to your specific questions:
>  
> On Thu, Nov 12, 2015 at 4:09 PM, Adaryl "Bob" Wakefield, MBA 
> <[email protected]> wrote:
> Silly question right? Thing is I’ve read a bit and watched some YouTube 
> videos and I’m still not quite sure what I can and can’t do with Hawq. Is it 
> a true database or is it like Hive where I need to use HCatalog?
>  
> It is a true database. You can think of it as a parallel Postgres, but with 
> much more functionality, and it works natively in the Hadoop world. HCatalog 
> is not necessary, but you can read data registered in HCatalog with the new 
> feature "hcatalog integration".
>  
> Can I write data intensive applications against it using ODBC? Does it 
> enforce referential integrity? Does it have stored procedures?
>  
> ODBC: yes, both JDBC/ODBC are supported
> referential integrity: currently not supported.
> Stored procedures: yes.
>  
> B.
>  
>  
> Please let us know if you have any other questions.
>  
> Cheers
> Lei
>  
>  
>  
>
