No, truncate was added to Apache Hadoop (HDFS-3107): https://issues.apache.org/jira/plugins/servlet/mobile#issue/hdfs-3107
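For reference, here is a minimal sketch of the upstream call (a hedged example assuming Hadoop 2.7+, where HDFS-3107 landed; the file path and target length are illustrative):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class TruncateExample {
        public static void main(String[] args) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());
            Path file = new Path("/tmp/example.dat"); // illustrative path

            // Upstream truncate only shrinks a file; asking for a length
            // greater than the current file length is an error, much like
            // the Pivotal semantics quoted below.
            long newLength = 1024L;

            // Note the upstream signature returns boolean rather than void:
            // true means the truncate completed immediately, false means
            // the last block is under recovery and the caller should wait.
            boolean done = fs.truncate(file, newLength);
            if (!done) {
                System.out.println("Truncate pending block recovery on " + file);
            }
        }
    }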
Sent from my iPhone

> On Nov 13, 2015, at 7:39 PM, Bob Marshall <[email protected]> wrote:
>
> I stand corrected. But I have a question:
>
> In Pivotal Hadoop HDFS, we added truncate to support transactions. The
> signature of truncate is as follows:
>
>     void truncate(Path src, long length) throws IOException;
>
> The truncate() function truncates the file to a size less than or equal
> to the file length. If the size of the file is smaller than the target
> length, an IOException is thrown. This differs from POSIX truncate
> semantics; the rationale is that HDFS does not support overwriting at
> arbitrary positions.
>
> Does this mean I need to run a modified HDFS to run HAWQ?
>
> Robert L Marshall
> Senior Consultant | Avalon Consulting, LLC
> c: (210) 853-7041
> LinkedIn | Google+ | Twitter
>
>> On Fri, Nov 13, 2015 at 7:16 PM, Dan Baskette <[email protected]> wrote:
>>
>> But HAWQ does manage its own storage on HDFS. You can use the native
>> HAWQ format or Parquet, and its PXF framework allows querying files in
>> other formats. So, by your (and my) definition it is indeed a database.
>>
>> Sent from my iPhone
>>
>>> On Nov 13, 2015, at 7:08 PM, Bob Marshall <[email protected]> wrote:
>>>
>>> Chhavi Joshi is right on the money. A database is both a query
>>> execution tool and a data storage backend. HAWQ is executing against
>>> native Hadoop storage, i.e. HBase, HDFS, etc.
>>>
>>> Robert L Marshall
>>> Senior Consultant | Avalon Consulting, LLC
>>> c: (210) 853-7041
>>> LinkedIn | Google+ | Twitter
>>>
>>>> On Fri, Nov 13, 2015 at 10:41 AM, Chhavi Joshi
>>>> <[email protected]> wrote:
>>>>
>>>> If you have the HAWQ/Greenplum integration, you can create external
>>>> tables in Greenplum the way you do in Hive: to load data into a
>>>> table, you just put the file into HDFS (same as external tables in
>>>> Hive).
>>>>
>>>> I still believe HAWQ is only a SQL query engine, not a database.
>>>>
>>>> Chhavi
>>>>
>>>> From: Atri Sharma [mailto:[email protected]]
>>>> Sent: Friday, November 13, 2015 3:53 AM
>>>> To: [email protected]
>>>> Subject: Re: what is Hawq?
>>>>
>>>> Greenplum is open sourced.
>>>>
>>>> The main difference between the two engines is that HAWQ targets
>>>> Hadoop-based systems whereas Greenplum targets a regular filesystem.
>>>> That is a very high-level, one-line distinction; the real differences
>>>> are more detailed.
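To make Chhavi's external-table workflow concrete, here is a hedged JDBC sketch: define a HAWQ external table over files already sitting in HDFS, so that "loading" is just dropping files under the referenced path. The host, port, credentials, columns, and the PXF location/profile are illustrative and vary by HAWQ/PXF version:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    public class ExternalTableExample {
        public static void main(String[] args) throws Exception {
            // HAWQ speaks the PostgreSQL wire protocol, so the stock
            // PostgreSQL JDBC driver is typically used. Connection
            // details here are made up for illustration.
            Connection conn = DriverManager.getConnection(
                    "jdbc:postgresql://hawq-master:5432/mydb",
                    "gpadmin", "secret");
            try (Statement st = conn.createStatement()) {
                // Hive-style pattern: the table definition points at HDFS
                // data via PXF; no explicit load step is needed.
                st.execute(
                    "CREATE EXTERNAL TABLE sales_ext (id int, amount float8) "
                    + "LOCATION ('pxf://hawq-master:51200/data/sales"
                    + "?Profile=HdfsTextSimple') "
                    + "FORMAT 'TEXT' (DELIMITER ',')");
            } finally {
                conn.close();
            }
        }
    }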
>>>> On 13 Nov 2015 14:20, "Adaryl "Bob" Wakefield, MBA"
>>>> <[email protected]> wrote:
>>>>
>>>> Is Greenplum free? I heard they open sourced it, but I haven't found
>>>> anything but a community edition.
>>>>
>>>> Adaryl "Bob" Wakefield, MBA
>>>> Principal
>>>> Mass Street Analytics, LLC
>>>> 913.938.6685
>>>> www.linkedin.com/in/bobwakefieldmba
>>>> Twitter: @BobLovesData
>>>>
>>>> From: dortmont
>>>> Sent: Friday, November 13, 2015 2:42 AM
>>>> To: [email protected]
>>>> Subject: Re: what is Hawq?
>>>>
>>>> I see the advantage of HAWQ compared to other Hadoop SQL engines: it
>>>> looks like the most mature solution on Hadoop, thanks to the
>>>> PostgreSQL-based engine.
>>>>
>>>> But why wouldn't I use Greenplum instead of HAWQ? It has even better
>>>> performance and it supports updates.
>>>>
>>>> Cheers
>>>>
>>>> 2015-11-13 7:45 GMT+01:00 Atri Sharma <[email protected]>:
>>>>
>>>> +1 for transactions.
>>>>
>>>> I think a major plus point is that HAWQ supports transactions, and
>>>> this enables a lot of critical workloads to be run on HAWQ.
>>>>
>>>> On 13 Nov 2015 12:13, "Lei Chang" <[email protected]> wrote:
>>>>
>>>> Like what Bob said, HAWQ is a complete database and Drill is just a
>>>> query engine.
>>>>
>>>> And HAWQ also has a lot of other benefits over Drill, for example:
>>>>
>>>> 1. SQL completeness: HAWQ is the best of the SQL-on-Hadoop engines;
>>>>    it can run all TPC-DS queries without any changes and supports
>>>>    almost all third-party tools, such as Tableau et al.
>>>> 2. Performance: proven the best in the Hadoop world.
>>>> 3. Scalability: highly scalable via a high-speed UDP-based
>>>>    interconnect.
>>>> 4. Transactions: as far as I know, Drill does not support
>>>>    transactions, and it is a nightmare for end users to maintain
>>>>    consistency without them.
>>>> 5. Advanced resource management: HAWQ has the most advanced resource
>>>>    management. It natively supports YARN and easy-to-use hierarchical
>>>>    resource queues, and resources can be managed and enforced at the
>>>>    query and operator level.
>>>>
>>>> Cheers
>>>> Lei
>>>>
>>>> On Fri, Nov 13, 2015 at 9:34 AM, Adaryl "Bob" Wakefield, MBA
>>>> <[email protected]> wrote:
>>>>
>>>> There are a lot of tools that do a lot of things. Believe me, it's a
>>>> full-time job keeping track of what is going on in the Apache world.
>>>> As I understand it, Drill is just a query engine while HAWQ is an
>>>> actual database... somewhat, anyway.
>>>>
>>>> Adaryl "Bob" Wakefield, MBA
>>>> Principal
>>>> Mass Street Analytics, LLC
>>>> 913.938.6685
>>>> www.linkedin.com/in/bobwakefieldmba
>>>> Twitter: @BobLovesData
>>>>
>>>> From: Will Wagner
>>>> Sent: Thursday, November 12, 2015 7:42 AM
>>>> To: [email protected]
>>>> Subject: Re: what is Hawq?
>>>>
>>>> Hi Lei,
>>>>
>>>> Great answer.
>>>>
>>>> I have a follow-up question. Everything HAWQ is capable of doing is
>>>> already covered by Apache Drill. Why do we need another tool?
>>>>
>>>> Thank you,
>>>> Will W
>>>>
>>>> On Nov 12, 2015 12:25 AM, "Lei Chang" <[email protected]> wrote:
>>>>
>>>> Hi Bob,
>>>>
>>>> Apache HAWQ is a Hadoop-native SQL query engine that combines the key
>>>> technological advantages of an MPP database with the scalability and
>>>> convenience of Hadoop. HAWQ reads data from and writes data to HDFS
>>>> natively.
>>>> HAWQ delivers industry-leading performance and linear scalability,
>>>> and it provides users the tools to confidently and successfully
>>>> interact with petabyte-range data sets through a complete,
>>>> standards-compliant SQL interface. More specifically, HAWQ has the
>>>> following features:
>>>>
>>>> · On-premise or cloud deployment
>>>> · Robust ANSI SQL compliance: SQL-92, SQL-99, SQL-2003, OLAP extension
>>>> · Extremely high performance: many times faster than other Hadoop SQL
>>>>   engines
>>>> · World-class parallel optimizer
>>>> · Full transaction capability and consistency guarantees: ACID
>>>> · Dynamic data flow engine through a high-speed UDP-based interconnect
>>>> · Elastic execution engine based on virtual segments and data locality
>>>> · Support for multi-level partitioning and list/range-based
>>>>   partitioned tables
>>>> · Multiple compression methods: snappy, gzip, quicklz, RLE
>>>> · Multi-language user-defined function support: Python, Perl, Java,
>>>>   C/C++, R
>>>> · Advanced machine learning and data mining functionality through
>>>>   MADlib
>>>> · Dynamic node expansion: in seconds
>>>> · The most advanced three-level resource management: integrates with
>>>>   YARN and hierarchical resource queues
>>>> · Easy access to all HDFS data and external system data (for example,
>>>>   HBase)
>>>> · Hadoop native: from storage (HDFS) and resource management (YARN)
>>>>   to deployment (Ambari)
>>>> · Authentication and granular authorization: Kerberos, SSL, and
>>>>   role-based access
>>>> · Advanced C/C++ access libraries for HDFS and YARN: libhdfs3 and
>>>>   libYARN
>>>> · Support for most third-party tools: Tableau, SAS, et al.
>>>> · Standard connectivity: JDBC/ODBC
>>>>
>>>> This link can give you more information about HAWQ:
>>>> https://cwiki.apache.org/confluence/display/HAWQ/About+HAWQ
>>>>
>>>> Please also see the answers inline to your specific questions:
>>>>
>>>> On Thu, Nov 12, 2015 at 4:09 PM, Adaryl "Bob" Wakefield, MBA
>>>> <[email protected]> wrote:
>>>>
>>>> Silly question, right? Thing is, I've read a bit and watched some
>>>> YouTube videos and I'm still not quite sure what I can and can't do
>>>> with HAWQ. Is it a true database, or is it like Hive, where I need to
>>>> use HCatalog?
>>>>
>>>> It is a true database: you can think of it as a parallel Postgres,
>>>> but with much more functionality, and it works natively in the Hadoop
>>>> world. HCatalog is not necessary, but you can read data registered in
>>>> HCatalog with the new "HCatalog integration" feature.
>>>>
>>>> Can I write data-intensive applications against it using ODBC? Does
>>>> it enforce referential integrity? Does it have stored procedures?
>>>>
>>>> ODBC: yes, both JDBC and ODBC are supported.
>>>> Referential integrity: currently not supported.
>>>> Stored procedures: yes.
>>>>
>>>> B.
>>>>
>>>> Please let us know if you have any other questions.
>>>> Cheers
>>>> Lei
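Finally, a minimal sketch of the JDBC connectivity and ACID transaction support discussed throughout the thread (connection details and the table are illustrative; again the standard PostgreSQL JDBC driver is assumed):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;

    public class HawqTransactionExample {
        public static void main(String[] args) throws Exception {
            Connection conn = DriverManager.getConnection(
                    "jdbc:postgresql://hawq-master:5432/mydb",
                    "gpadmin", "secret");
            conn.setAutoCommit(false); // group statements into one transaction
            try (PreparedStatement ins = conn.prepareStatement(
                    "INSERT INTO sales (id, amount) VALUES (?, ?)")) {
                ins.setInt(1, 42);
                ins.setDouble(2, 99.5);
                ins.executeUpdate();
                conn.commit();   // all-or-nothing: the ACID guarantee cited above
            } catch (SQLException e) {
                conn.rollback(); // keep the table consistent on failure
                throw e;
            } finally {
                conn.close();
            }
        }
    }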
