Hi Bob,

In HAWQ, we use MVCC for transactions, so writes do not disturb reads. More information about the implementation can be found in the paper we published:

[1] Lei Chang et al.: HAWQ: A Massively Parallel Processing SQL Engine in Hadoop. SIGMOD Conference 2014: 1223-1234. <https://github.com/changleicn/publications/raw/master/hawq-sigmod-2014.pdf>
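To make that concrete, here is a minimal sketch of the behavior, assuming a PostgreSQL-compatible Python driver (psycopg2 here, since HAWQ speaks the PostgreSQL wire protocol) and a hypothetical table named "events":

# A minimal sketch of "writes do not disturb reads" under MVCC: a writer
# loads rows while a reader queries concurrently, and the read never
# blocks on the write. Assumes a PostgreSQL-compatible Python driver
# (psycopg2) and a hypothetical table "events"; host and database names
# are placeholders.
import psycopg2

reader = psycopg2.connect(host="hawq-master", dbname="analytics")
writer = psycopg2.connect(host="hawq-master", dbname="analytics")

with writer.cursor() as wc, reader.cursor() as rc:
    # The writer inserts a batch of rows but has not committed yet.
    wc.execute(
        "INSERT INTO events SELECT i, now() FROM generate_series(1, 1000) i"
    )

    # The reader proceeds immediately: instead of waiting on a lock, it
    # sees a consistent snapshot that excludes the uncommitted rows.
    rc.execute("SELECT count(*) FROM events")
    print("rows visible to reader:", rc.fetchone()[0])

    writer.commit()  # the new rows become visible to later snapshots
    reader.commit()

Snapshot timing details depend on the isolation level, but the key property is that analytics queries are never queued behind the load.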
And I think HAWQ is a good fit for your real-time analytics use case.

Cheers
Lei

On Fri, Nov 13, 2015 at 9:41 AM, Adaryl "Bob" Wakefield, MBA <[email protected]> wrote:

> So what I've been looking for is a low-cost, high-performance distributed
> relational database. I've looked at in-memory databases, but all of those
> seem to be optimized for a transactional use case. I work in a world where
> I want to deliver real-time analytics. I want to be able to hammer the
> warehouse with writes while not disturbing reads. There is one buzz term I
> didn't see in here: multi-version concurrency control.
>
> In the early years of my career, I would design databases without
> enforcing referential integrity, leaving that up to the application.
> Having worked for years and seen what people do to databases, I would be
> concerned about implementing something where a check on users has been
> removed.
>
> Adaryl "Bob" Wakefield, MBA
> Principal
> Mass Street Analytics, LLC
> 913.938.6685
> www.linkedin.com/in/bobwakefieldmba
> Twitter: @BobLovesData
>
> From: Lei Chang <[email protected]>
> Sent: Thursday, November 12, 2015 2:25 AM
> To: [email protected]
> Subject: Re: what is Hawq?
>
> Hi Bob,
>
> Apache HAWQ is a Hadoop-native SQL query engine that combines the key
> technological advantages of an MPP database with the scalability and
> convenience of Hadoop. HAWQ reads data from and writes data to HDFS
> natively. HAWQ delivers industry-leading performance and linear
> scalability, and it gives users the tools to confidently and successfully
> interact with petabyte-range data sets. HAWQ provides a complete,
> standards-compliant SQL interface. More specifically, HAWQ has the
> following features:
>
> - On-premise or cloud deployment
> - Robust ANSI SQL compliance: SQL-92, SQL-99, SQL-2003, OLAP extensions
> - Extremely high performance: many times faster than other Hadoop SQL engines
> - World-class parallel optimizer
> - Full transaction capability and consistency guarantees: ACID
> - Dynamic data flow engine through a high-speed UDP-based interconnect
> - Elastic execution engine based on virtual segments and data locality
> - Support for multi-level partitioning and list/range-partitioned tables
>   (see the partitioning sketch at the end of this message)
> - Multiple compression methods: snappy, gzip, quicklz, RLE
> - Multi-language user-defined function support: Python, Perl, Java, C/C++, R
> - Advanced machine learning and data mining functionality through MADlib
> - Dynamic node expansion: in seconds
> - Advanced three-level resource management: integrates with YARN and
>   hierarchical resource queues
> - Easy access to all HDFS data and external system data (for example, HBase)
> - Hadoop native: from storage (HDFS) and resource management (YARN) to
>   deployment (Ambari)
> - Authentication and granular authorization: Kerberos, SSL, and role-based access
> - Advanced C/C++ access libraries for HDFS and YARN: libhdfs3 and libYARN
> - Support for most third-party tools: Tableau, SAS, et al.
> - Standard connectivity: JDBC/ODBC
>
> And this link can give you more information about HAWQ:
> https://cwiki.apache.org/confluence/display/HAWQ/About+HAWQ
>
> And please also see the answers inline to your specific questions:
>
> On Thu, Nov 12, 2015 at 4:09 PM, Adaryl "Bob" Wakefield, MBA <[email protected]> wrote:
>
>> Silly question, right? Thing is, I've read a bit and watched some YouTube
>> videos and I'm still not quite sure what I can and can't do with HAWQ. Is
>> it a true database, or is it like Hive, where I need to use HCatalog?
>
> It is a true database: you can think of it as a parallel PostgreSQL, but
> with many more capabilities, and it works natively in the Hadoop world.
> HCatalog is not necessary, but you can read data registered in HCatalog
> through the new "HCatalog integration" feature.
>
>> Can I write data-intensive applications against it using ODBC? Does it
>> enforce referential integrity? Does it have stored procedures?
>
> ODBC: yes, both JDBC and ODBC are supported (a minimal connection sketch
> follows at the end of this message).
> Referential integrity: currently not supported.
> Stored procedures: yes.
>
>> B.
>
> Please let us know if you have any other questions.
>
> Cheers
> Lei
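For the ODBC question above, here is a minimal connection sketch. It assumes pyodbc plus a PostgreSQL-compatible ODBC driver registered under a hypothetical DSN named "hawq"; the user, password, and "sales" table are placeholders as well:

# A minimal ODBC sketch: connect to HAWQ through a DSN and run a
# parameterized query. Assumes pyodbc plus a PostgreSQL-compatible ODBC
# driver configured under the hypothetical DSN name "hawq"; credentials
# and the "sales" table are placeholders.
import pyodbc

conn = pyodbc.connect("DSN=hawq;UID=gpadmin;PWD=changeme")
cursor = conn.cursor()

# Parameters are passed through the driver with "?" markers.
cursor.execute(
    "SELECT product_id, sum(amount) AS total "
    "FROM sales WHERE sale_date >= ? GROUP BY product_id",
    "2015-01-01",
)
for product_id, total in cursor.fetchall():
    print(product_id, total)

conn.close()

The same query works unchanged over JDBC or a plain PostgreSQL driver, since all of them go through HAWQ's standard SQL interface.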

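And since partitioning came up in the feature list, here is a rough sketch of a monthly range-partitioned table using HAWQ's Greenplum-derived DDL. The table, columns, and storage options are illustrative, and exact clause support may vary by release:

# A rough sketch of a monthly range-partitioned, compressed table,
# using HAWQ's Greenplum-derived DDL over a PostgreSQL-compatible
# driver. The table, columns, and storage options are illustrative;
# exact clause support may vary by HAWQ release.
import psycopg2

conn = psycopg2.connect(host="hawq-master", dbname="analytics")
with conn.cursor() as cur:
    cur.execute("""
        CREATE TABLE sales (
            sale_id   bigint,
            sale_date date,
            amount    double precision
        )
        WITH (appendonly=true, orientation=parquet, compresstype=snappy)
        DISTRIBUTED BY (sale_id)
        PARTITION BY RANGE (sale_date)
        (
            START (date '2015-01-01') INCLUSIVE
            END   (date '2016-01-01') EXCLUSIVE
            EVERY (INTERVAL '1 month')
        )
    """)
conn.commit()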