+1 for transactions.

I think a major plus point is that HAWQ supports transactions, and this
enables a lot of critical workloads to be done on HAWQ.

On 13 Nov 2015 12:13, "Lei Chang" <[email protected]> wrote:
>
> Like Bob said, HAWQ is a complete database and Drill is just a query
> engine.
>
> HAWQ also has a lot of other benefits over Drill, for example:
>
> 1. SQL completeness: HAWQ is the best among the SQL-on-Hadoop engines. It
> can run all TPC-DS queries without any changes, and it supports almost all
> third-party tools, such as Tableau.
> 2. Performance: proven to be the best in the Hadoop world.
> 3. Scalability: highly scalable via a high-speed UDP-based interconnect.
> 4. Transactions: as far as I know, Drill does not support transactions.
> Without them it is a nightmare for end users to keep data consistent.
> 5. Advanced resource management: HAWQ has the most advanced resource
> management. It natively supports YARN and easy-to-use hierarchical resource
> queues. Resources can be managed and enforced at the query and operator
> level.
>
> Cheers
> Lei
>
>
> On Fri, Nov 13, 2015 at 9:34 AM, Adaryl "Bob" Wakefield, MBA <
> [email protected]> wrote:
>
>> There are a lot of tools that do a lot of things. Believe me, it’s a
>> full-time job keeping track of what is going on in the Apache world. As I
>> understand it, Drill is just a query engine while Hawq is an actual
>> database...somewhat, anyway.
>>
>> Adaryl "Bob" Wakefield, MBA
>> Principal
>> Mass Street Analytics, LLC
>> 913.938.6685
>> www.linkedin.com/in/bobwakefieldmba
>> Twitter: @BobLovesData
>>
>> *From:* Will Wagner <[email protected]>
>> *Sent:* Thursday, November 12, 2015 7:42 AM
>> *To:* [email protected]
>> *Subject:* Re: what is Hawq?
>>
>>
>> Hi Lei,
>>
>> Great answer.
>>
>> I have a follow-up question.
>> Everything HAWQ is capable of doing is already covered by Apache Drill.
>> Why do we need another tool?
>>
>> Thank you,
>> Will W
>> On Nov 12, 2015 12:25 AM, "Lei Chang" <[email protected]> wrote:
>>
>>>
>>> Hi Bob,
>>>
>>>
>>> Apache HAWQ is a Hadoop-native SQL query engine that combines the key
>>> technological advantages of an MPP database with the scalability and
>>> convenience of Hadoop. HAWQ reads data from and writes data to HDFS
>>> natively. HAWQ delivers industry-leading performance and linear
>>> scalability. It provides users the tools to confidently and successfully
>>> interact with petabyte-range data sets. HAWQ provides users with a
>>> complete, standards-compliant SQL interface. More specifically, HAWQ has
>>> the following features:
>>>
>>> - On-premises or cloud deployment
>>> - Robust ANSI SQL compliance: SQL-92, SQL-99, SQL-2003, OLAP
>>> extension
>>> - Extremely high performance: many times faster than other Hadoop
>>> SQL engines
>>> - World-class parallel optimizer
>>> - Full transaction capability and consistency guarantee: ACID
>>> - Dynamic data flow engine through a high-speed UDP-based interconnect
>>> - Elastic execution engine based on virtual segments & data locality
>>> - Support for multi-level partitioning and List/Range-based
>>> partitioned tables
>>> - Multiple compression methods: snappy, gzip, quicklz, RLE
>>> - Multi-language user-defined function support: Python, Perl, Java,
>>> C/C++, R
>>> - Advanced machine learning and data mining functionality through
>>> MADlib
>>> - Dynamic node expansion: in seconds
>>> - Most advanced three-level resource management: integrates with YARN
>>> and hierarchical resource queues
>>> - Easy access to all HDFS data and external system data (for
>>> example, HBase)
>>> - Hadoop native: from storage (HDFS) and resource management (YARN) to
>>> deployment (Ambari)
>>> - Authentication & granular authorization: Kerberos, SSL and role-based
>>> access
>>> - Advanced C/C++ access libraries for HDFS and YARN: libhdfs3 & libYARN
>>> - Support for most third-party tools: Tableau, SAS et al.
>>> - Standard connectivity: JDBC/ODBC
>>>
>>>
>>> The link here can give you more information about HAWQ:
>>> https://cwiki.apache.org/confluence/display/HAWQ/About+HAWQ
>>>
>>>
>>> Please also see the answers inline to your specific questions:
>>>
>>> On Thu, Nov 12, 2015 at 4:09 PM, Adaryl "Bob" Wakefield, MBA <
>>> [email protected]> wrote:
>>>
>>>> Silly question, right? Thing is, I’ve read a bit and watched some YouTube
>>>> videos and I’m still not quite sure what I can and can’t do with Hawq. Is
>>>> it a true database or is it like Hive where I need to use HCatalog?
>>>>
>>>
>>> It is a true database. You can think of it as a parallel Postgres with
>>> much more functionality that works natively in the Hadoop world.
>>> HCatalog is not necessary, but you can read data registered in HCatalog
>>> with the new "hcatalog integration" feature.
>>>
>>>
>>>> Can I write data-intensive applications against it using ODBC? Does it
>>>> enforce referential integrity? Does it have stored procedures?
>>>>
>>>
>>> ODBC: yes, both JDBC and ODBC are supported.
>>> Referential integrity: currently not supported.
>>> Stored procedures: yes.
>>>
>>>
>>>> B.
>>>>
>>>
>>>
>>> Please let us know if you have any other questions.
>>>
>>> Cheers
>>> Lei
>>>
>>>
>>>
>>
>
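
A footnote on the transactions point discussed in this thread: the sketch below shows the all-or-nothing behaviour that transactions give an application. It is only an illustration and rests on several assumptions. It uses Python's built-in sqlite3 module as a stand-in so that it runs without a cluster, and the accounts table and the transfer/balances helpers are hypothetical names, not anything taken from HAWQ. Against HAWQ the same pattern would presumably go through the JDBC/ODBC connectivity mentioned above.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from `src` to `dst` as a single atomic unit.

    Returns True if the transfer committed, False if it was rolled back.
    """
    try:
        # The connection context manager commits when the block succeeds
        # and rolls back when an exception escapes from it.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)
            ).fetchone()
            if balance < 0:
                # Abort: the debit above must not become visible.
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
        return True
    except ValueError:
        return False


def balances(conn):
    """Return the current balances as a {name: balance} dict."""
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))


# Hypothetical demo data: an in-memory database with two accounts.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("a", 100), ("b", 0)])
conn.commit()
```

Calling transfer(conn, "a", "b", 30) commits both updates together, while a transfer that would overdraw the source account leaves both balances untouched. Without transactions, the application itself would have to detect and undo the half-applied debit.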
