Hi Bob,

In HAWQ, we use MVCC for transactions, so writes do not disturb reads. More information about the implementation can be found in the paper we published:

[1] Lei Chang et al.: HAWQ: A Massively Parallel Processing SQL Engine in Hadoop. SIGMOD Conference 2014: 1223-1234. <https://github.com/changleicn/publications/raw/master/hawq-sigmod-2014.pdf>
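To make that concrete, here is a minimal sketch of the behavior, assuming a PostgreSQL-compatible Python driver (psycopg2 here, since HAWQ speaks the PostgreSQL wire protocol) and a hypothetical table named "events":

# A minimal sketch of "writes do not disturb reads" under MVCC: a writer
# loads rows while a reader queries concurrently, and the read never
# blocks on the write. Assumes a PostgreSQL-compatible Python driver
# (psycopg2) and a hypothetical table "events"; host and database names
# are placeholders.
import psycopg2

reader = psycopg2.connect(host="hawq-master", dbname="analytics")
writer = psycopg2.connect(host="hawq-master", dbname="analytics")

with writer.cursor() as wc, reader.cursor() as rc:
    # The writer inserts a batch of rows but has not committed yet.
    wc.execute(
        "INSERT INTO events SELECT i, now() FROM generate_series(1, 1000) i"
    )

    # The reader proceeds immediately: instead of waiting on a lock, it
    # sees a consistent snapshot that excludes the uncommitted rows.
    rc.execute("SELECT count(*) FROM events")
    print("rows visible to reader:", rc.fetchone()[0])

    writer.commit()  # the new rows become visible to later snapshots
    reader.commit()

Snapshot timing details depend on the isolation level, but the key property is that analytics queries are never queued behind the load.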
And I think HAWQ is a good fit for your real-time analytics use case.

Cheers
Lei

On Fri, Nov 13, 2015 at 9:41 AM, Adaryl "Bob" Wakefield, MBA <[email protected]> wrote:

> So what I've been looking for is a low-cost, high-performance distributed
> relational database. I've looked at in-memory databases, but all of those
> seem to be optimized for a transactional use case. I work in a world where
> I want to deliver real-time analytics. I want to be able to hammer the
> warehouse with writes while not disturbing reads. There is one buzz term I
> didn't see in here: multi-version concurrency control.
>
> In the early years of my career, I would design databases without
> enforcing referential integrity, leaving that up to the application.
> Having worked for years and seen what people do to databases, I would be
> concerned about implementing something where a check on users has been
> removed.
>
> Adaryl "Bob" Wakefield, MBA
> Principal
> Mass Street Analytics, LLC
> 913.938.6685
> www.linkedin.com/in/bobwakefieldmba
> Twitter: @BobLovesData
>
> From: Lei Chang <[email protected]>
> Sent: Thursday, November 12, 2015 2:25 AM
> To: [email protected]
> Subject: Re: what is Hawq?
>
> Hi Bob,
>
> Apache HAWQ is a Hadoop-native SQL query engine that combines the key
> technological advantages of an MPP database with the scalability and
> convenience of Hadoop. HAWQ reads data from and writes data to HDFS
> natively. HAWQ delivers industry-leading performance and linear
> scalability, and it gives users the tools to confidently and successfully
> interact with petabyte-range data sets. HAWQ provides a complete,
> standards-compliant SQL interface. More specifically, HAWQ has the
> following features:
>
> - On-premise or cloud deployment
> - Robust ANSI SQL compliance: SQL-92, SQL-99, SQL-2003, OLAP extensions
> - Extremely high performance: many times faster than other Hadoop SQL engines
> - World-class parallel optimizer
> - Full transaction capability and consistency guarantees: ACID
> - Dynamic data flow engine through a high-speed UDP-based interconnect
> - Elastic execution engine based on virtual segments and data locality
> - Support for multi-level partitioning and list/range-partitioned tables
>   (see the partitioning sketch at the end of this message)
> - Multiple compression methods: snappy, gzip, quicklz, RLE
> - Multi-language user-defined function support: Python, Perl, Java, C/C++, R
> - Advanced machine learning and data mining functionality through MADlib
> - Dynamic node expansion: in seconds
> - Advanced three-level resource management: integrates with YARN and
>   hierarchical resource queues
> - Easy access to all HDFS data and external system data (for example, HBase)
> - Hadoop native: from storage (HDFS) and resource management (YARN) to
>   deployment (Ambari)
> - Authentication and granular authorization: Kerberos, SSL, and role-based access
> - Advanced C/C++ access libraries for HDFS and YARN: libhdfs3 and libYARN
> - Support for most third-party tools: Tableau, SAS, et al.
> - Standard connectivity: JDBC/ODBC
>
> And this link can give you more information about HAWQ:
> https://cwiki.apache.org/confluence/display/HAWQ/About+HAWQ
>
> And please also see the answers inline to your specific questions:
>
> On Thu, Nov 12, 2015 at 4:09 PM, Adaryl "Bob" Wakefield, MBA <[email protected]> wrote:
>
>> Silly question, right? Thing is, I've read a bit and watched some YouTube
>> videos and I'm still not quite sure what I can and can't do with HAWQ. Is
>> it a true database, or is it like Hive, where I need to use HCatalog?
>
> It is a true database: you can think of it as a parallel PostgreSQL, but
> with many more capabilities, and it works natively in the Hadoop world.
> HCatalog is not necessary, but you can read data registered in HCatalog
> through the new "HCatalog integration" feature.
>
>> Can I write data-intensive applications against it using ODBC? Does it
>> enforce referential integrity? Does it have stored procedures?
>
> ODBC: yes, both JDBC and ODBC are supported (a minimal connection sketch
> follows at the end of this message).
> Referential integrity: currently not supported.
> Stored procedures: yes.
>
>> B.
>
> Please let us know if you have any other questions.
>
> Cheers
> Lei
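For the ODBC question above, here is a minimal connection sketch. It assumes pyodbc plus a PostgreSQL-compatible ODBC driver registered under a hypothetical DSN named "hawq"; the user, password, and "sales" table are placeholders as well:

# A minimal ODBC sketch: connect to HAWQ through a DSN and run a
# parameterized query. Assumes pyodbc plus a PostgreSQL-compatible ODBC
# driver configured under the hypothetical DSN name "hawq"; credentials
# and the "sales" table are placeholders.
import pyodbc

conn = pyodbc.connect("DSN=hawq;UID=gpadmin;PWD=changeme")
cursor = conn.cursor()

# Parameters are passed through the driver with "?" markers.
cursor.execute(
    "SELECT product_id, sum(amount) AS total "
    "FROM sales WHERE sale_date >= ? GROUP BY product_id",
    "2015-01-01",
)
for product_id, total in cursor.fetchall():
    print(product_id, total)

conn.close()

The same query works unchanged over JDBC or a plain PostgreSQL driver, since all of them go through HAWQ's standard SQL interface.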

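And since partitioning came up in the feature list, here is a rough sketch of a monthly range-partitioned table using HAWQ's Greenplum-derived DDL. The table, columns, and storage options are illustrative, and exact clause support may vary by release:

# A rough sketch of a monthly range-partitioned, compressed table,
# using HAWQ's Greenplum-derived DDL over a PostgreSQL-compatible
# driver. The table, columns, and storage options are illustrative;
# exact clause support may vary by HAWQ release.
import psycopg2

conn = psycopg2.connect(host="hawq-master", dbname="analytics")
with conn.cursor() as cur:
    cur.execute("""
        CREATE TABLE sales (
            sale_id   bigint,
            sale_date date,
            amount    double precision
        )
        WITH (appendonly=true, orientation=parquet, compresstype=snappy)
        DISTRIBUTED BY (sale_id)
        PARTITION BY RANGE (sale_date)
        (
            START (date '2015-01-01') INCLUSIVE
            END   (date '2016-01-01') EXCLUSIVE
            EVERY (INTERVAL '1 month')
        )
    """)
conn.commit()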