On Fri, Nov 13, 2015 at 07:39PM, Bob Marshall wrote:
> I stand corrected. But I had a question:
>
> In Pivotal Hadoop HDFS, we added truncate to support transaction. The
Not to be picky, but truncate was added to the standard HDFS starting from 2.7
(HDFS-3107). Perhaps it has been backported by Pivotal later on? :)

> signature of the truncate is as follows.
>
>     void truncate(Path src, long length) throws IOException;
>
> The truncate() function truncates the file to a size which is less than or
> equal to the file length. If the size of the file is smaller than the
> target length, an IOException is thrown. This is different from POSIX
> truncate semantics. The rationale is that HDFS does not support
> overwriting at any position.
>
> Does this mean I need to run a modified HDFS to run HAWQ?
>
> Robert L Marshall
> Senior Consultant | Avalon Consulting, LLC
> <http://www.avalonconsult.com/>c: (210) 853-7041
> LinkedIn <http://www.linkedin.com/company/avalon-consulting-llc> | Google+
> <http://www.google.com/+AvalonConsultingLLC> | Twitter
> <https://twitter.com/avalonconsult>
>
> On Fri, Nov 13, 2015 at 7:16 PM, Dan Baskette <[email protected]> wrote:
>
> > But HAWQ does manage its own storage on HDFS. You can leverage the native
> > HAWQ format or Parquet. Its PXF functions allow the querying of files in
> > other formats. So, by your (and my) definition it is indeed a database.
> >
> > On Nov 13, 2015, at 7:08 PM, Bob Marshall <[email protected]> wrote:
> >
> > Chhavi Joshi is right on the money. A database is both a query execution
> > tool and a data storage backend. HAWQ is executing against native Hadoop
> > storage, i.e. HBase, HDFS, etc.
> >
> > On Fri, Nov 13, 2015 at 10:41 AM, Chhavi Joshi <[email protected]> wrote:
> >
> >> If you have HAWQ Greenplum integration you can create external tables
> >> in Greenplum, as in Hive.
> >>
> >> To load data into the tables you just need to put the file into HDFS
> >> (the same as with external tables in Hive).
> >>
> >> I still believe HAWQ is only a SQL query engine, not a database.
> >>
> >> Chhavi
> >>
> >> *From:* Atri Sharma [mailto:[email protected]]
> >> *Sent:* Friday, November 13, 2015 3:53 AM
> >> *To:* [email protected]
> >> *Subject:* Re: what is Hawq?
> >>
> >> Greenplum is open sourced.
> >>
> >> The main difference between the two engines is that HAWQ is more for
> >> Hadoop based systems whereas Greenplum is more towards a regular FS. This
> >> is a very high level difference between the two; the differences are more
> >> detailed. But a single line difference between the two is the one I wrote.
> >>
> >> On 13 Nov 2015 14:20, "Adaryl "Bob" Wakefield, MBA" <[email protected]> wrote:
> >>
> >> Is Greenplum free? I heard they open sourced it but I haven’t found
> >> anything but a community edition.
> >>
> >> Adaryl "Bob" Wakefield, MBA
> >> Principal
> >> Mass Street Analytics, LLC
> >> 913.938.6685
> >> www.linkedin.com/in/bobwakefieldmba
> >> Twitter: @BobLovesData
> >>
> >> *From:* dortmont <[email protected]>
> >> *Sent:* Friday, November 13, 2015 2:42 AM
> >> *To:* [email protected]
> >> *Subject:* Re: what is Hawq?
> >>
> >> I see the advantage of HAWQ compared to other Hadoop SQL engines. It
> >> looks like the most mature solution on Hadoop thanks to the PostgreSQL
> >> based engine.
> >>
> >> But why wouldn't I use Greenplum instead of HAWQ? It has even better
> >> performance and it supports updates.
> >>
> >> Cheers
> >>
> >> 2015-11-13 7:45 GMT+01:00 Atri Sharma <[email protected]>:
> >>
> >> +1 for transactions.
> >>
> >> I think a major plus point is that HAWQ supports transactions, and this
> >> enables a lot of critical workloads to be done on HAWQ.
> >>
> >> On 13 Nov 2015 12:13, "Lei Chang" <[email protected]> wrote:
> >>
> >> Like what Bob said, HAWQ is a complete database and Drill is just a query
> >> engine.
> >>
> >> And HAWQ also has a lot of other benefits over Drill, for example:
> >>
> >> 1. SQL completeness: HAWQ is the best of the SQL-on-Hadoop engines; it
> >> can run all TPC-DS queries without any changes, and it supports almost
> >> all third party tools, such as Tableau et al.
> >>
> >> 2. Performance: proven the best in the Hadoop world.
> >>
> >> 3. Scalability: highly scalable via a high speed UDP based interconnect.
> >>
> >> 4. Transactions: as far as I know, Drill does not support transactions.
> >> It is a nightmare for end users to keep consistency.
> >>
> >> 5. Advanced resource management: HAWQ has the most advanced resource
> >> management. It natively supports YARN and easy to use hierarchical
> >> resource queues. Resources can be managed and enforced at the query and
> >> operator level.
> >>
> >> Cheers
> >>
> >> Lei
> >>
> >> On Fri, Nov 13, 2015 at 9:34 AM, Adaryl "Bob" Wakefield, MBA <[email protected]> wrote:
> >>
> >> There are a lot of tools that do a lot of things. Believe me it’s a full
> >> time job keeping track of what is going on in the Apache world. As I
> >> understand it, Drill is just a query engine while Hawq is an actual
> >> database... somewhat, anyway.
> >>
> >> *From:* Will Wagner <[email protected]>
> >> *Sent:* Thursday, November 12, 2015 7:42 AM
> >> *To:* [email protected]
> >> *Subject:* Re: what is Hawq?
> >>
> >> Hi Lei,
> >>
> >> Great answer.
> >>
> >> I have a follow up question.
> >> Everything HAWQ is capable of doing is already covered by Apache Drill.
> >> Why do we need another tool?
> >>
> >> Thank you,
> >> Will W
> >>
> >> On Nov 12, 2015 12:25 AM, "Lei Chang" <[email protected]> wrote:
> >>
> >> Hi Bob,
> >>
> >> Apache HAWQ is a Hadoop native SQL query engine that combines the key
> >> technological advantages of an MPP database with the scalability and
> >> convenience of Hadoop. HAWQ reads data from and writes data to HDFS
> >> natively. HAWQ delivers industry-leading performance and linear
> >> scalability. It provides users the tools to confidently and successfully
> >> interact with petabyte range data sets. HAWQ provides users with a
> >> complete, standards compliant SQL interface. More specifically, HAWQ has
> >> the following features:
> >>
> >> · On-premise or cloud deployment
> >>
> >> · Robust ANSI SQL compliance: SQL-92, SQL-99, SQL-2003, OLAP
> >> extension
> >>
> >> · Extremely high performance: many times faster than other
> >> Hadoop SQL engines.
> >>
> >> · World-class parallel optimizer
> >>
> >> · Full transaction capability and consistency guarantee: ACID
> >>
> >> · Dynamic data flow engine through high speed UDP based
> >> interconnect
> >>
> >> · Elastic execution engine based on virtual segment & data
> >> locality
> >>
> >> · Support for multiple level partitioning and List/Range based
> >> partitioned tables.
> >>
> >> · Multiple compression method support: snappy, gzip, quicklz,
> >> RLE
> >>
> >> · Multi-language user defined function support: python, perl,
> >> java, c/c++, R
> >>
> >> · Advanced machine learning and data mining functionalities
> >> through MADlib
> >>
> >> · Dynamic node expansion: in seconds
> >>
> >> · Most advanced three level resource management: Integrate with
> >> YARN and hierarchical resource queues.
> >>
> >> · Easy access of all HDFS data and external system data (for
> >> example, HBase)
> >>
> >> · Hadoop Native: from storage (HDFS), resource management (YARN)
> >> to deployment (Ambari).
> >>
> >> · Authentication & Granular authorization: Kerberos, SSL and
> >> role based access
> >>
> >> · Advanced C/C++ access library to HDFS and YARN: libhdfs3 &
> >> libYARN
> >>
> >> · Support for most third party tools: Tableau, SAS et al.
> >>
> >> · Standard connectivity: JDBC/ODBC
> >>
> >> And the link here can give you more information about HAWQ:
> >> https://cwiki.apache.org/confluence/display/HAWQ/About+HAWQ
> >>
> >> And please also see the answers inline to your specific questions:
> >>
> >> On Thu, Nov 12, 2015 at 4:09 PM, Adaryl "Bob" Wakefield, MBA <[email protected]> wrote:
> >>
> >> Silly question right? Thing is I’ve read a bit and watched some YouTube
> >> videos and I’m still not quite sure what I can and can’t do with Hawq. Is
> >> it a true database or is it like Hive where I need to use HCatalog?
> >>
> >> It is a true database; you can think of it as a parallel Postgres, but
> >> with much more functionality, and it works natively in the Hadoop world.
> >> HCatalog is not necessary. But you can read data registered in HCatalog
> >> with the new feature "hcatalog integration".
> >>
> >> Can I write data intensive applications against it using ODBC? Does it
> >> enforce referential integrity? Does it have stored procedures?
> >>
> >> ODBC: yes, both JDBC and ODBC are supported.
> >>
> >> Referential integrity: currently not supported.
> >>
> >> Stored procedures: yes.
> >>
> >> B.
> >>
> >> Please let us know if you have any other questions.
> >>
> >> Cheers
> >>
> >> Lei
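
For reference, the shrink-only truncate semantics quoted near the top of this
message can be sketched roughly as below. This is only an illustration under
stated assumptions: ShrinkOnlyFile and TruncateDemo are hypothetical names, the
"file" is just an in-memory byte array, and nothing here is actual HDFS or
Pivotal code. The rejection of negative lengths is also my assumption; the
quoted description does not mention it.

```java
import java.io.IOException;
import java.util.Arrays;

/** Hypothetical in-memory stand-in for an append-only file; not an HDFS class. */
final class ShrinkOnlyFile {
    private byte[] data;

    ShrinkOnlyFile(byte[] initial) {
        this.data = Arrays.copyOf(initial, initial.length);
    }

    long length() {
        return data.length;
    }

    /**
     * Shrinks the file to newLength bytes.
     * Unlike POSIX truncate, this never extends the file: a target longer
     * than the current length is rejected with an IOException.
     */
    void truncate(long newLength) throws IOException {
        if (newLength < 0) {
            // Assumption: negative targets are treated as errors as well.
            throw new IOException("negative length: " + newLength);
        }
        if (newLength > data.length) {
            throw new IOException("cannot truncate to " + newLength
                    + " bytes; file has only " + data.length);
        }
        // Safe cast: newLength is bounded by data.length at this point.
        data = Arrays.copyOf(data, (int) newLength);
    }
}

final class TruncateDemo {
    public static void main(String[] args) throws IOException {
        ShrinkOnlyFile file = new ShrinkOnlyFile(new byte[100]);
        file.truncate(40);                  // shrinking succeeds
        System.out.println(file.length());
        try {
            file.truncate(64);              // growing is rejected
        } catch (IOException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The point of the sketch is the asymmetry: a POSIX truncate() to a larger size
would zero-extend the file, whereas the quoted semantics treat that case as an
error, which is consistent with a store that does not allow overwriting at
arbitrary positions.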
