Re: Parquet vs. other Open Source Columnar Formats
Hello,

Be aware that Avro and Protobuf are general serialization formats, not columnar ones such as Parquet or ORC. They are good for RPC or row-wise streaming, whereas the latter two are designed for analytics.

Uwe

> On 09.05.2019 at 20:33, David Mollitor wrote:
>
> I'm sure there are many different opinions on the matter, but with regard to
> Avro, I would say it is becoming more and more of a niche player.
>
> Many folks are choosing to go with Google Protobufs for RPC and Parquet/ORC
> for analytic workloads.
>
>> On Thu, May 9, 2019 at 2:30 PM Brian Bowman wrote:
>>
>> All,
>>
>> Is it fair to say that Parquet is fast becoming the dominant open source
>> columnar storage format? How do those of you with long-term Hadoop
>> experience see this? For example, is Parquet overtaking ORC and Avro?
>>
>> Thanks,
>>
>> Brian
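To make the row-wise versus columnar distinction concrete, here is a minimal sketch in plain Python. It does not use Avro, Protobuf, Parquet, or ORC; the records, field names, and helper functions are hypothetical and only illustrate the two layouts.

```python
# Hypothetical records; the field names are illustrative only.
FIELDS = ("user", "clicks", "country")
records = [
    {"user": "a", "clicks": 3, "country": "DE"},
    {"user": "b", "clicks": 5, "country": "US"},
    {"user": "c", "clicks": 2, "country": "DE"},
]


def to_row_layout(rows):
    """Row-wise layout: all fields of one record sit next to each other."""
    return [tuple(r[name] for name in FIELDS) for r in rows]


def to_column_layout(rows):
    """Columnar layout: all values of one field sit next to each other."""
    return {name: [r[name] for r in rows] for name in FIELDS}


row_layout = to_row_layout(records)
column_layout = to_column_layout(records)

# An aggregate over one field only needs that column in the columnar layout,
# while the row layout forces a walk over every complete record.
total_from_columns = sum(column_layout["clicks"])
total_from_rows = sum(row[FIELDS.index("clicks")] for row in row_layout)
```

Both totals are the same; the difference is how much unrelated data has to be touched to compute them, which is why row layouts suit record-at-a-time use such as RPC and column layouts suit scans over a few fields.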
Re: Parquet vs. other Open Source Columnar Formats
I'm sure there are many different opinions on the matter, but with regard to Avro, I would say it is becoming more and more of a niche player.

Many folks are choosing to go with Google Protobufs for RPC and Parquet/ORC for analytic workloads.

On Thu, May 9, 2019 at 2:30 PM Brian Bowman wrote:

> All,
>
> Is it fair to say that Parquet is fast becoming the dominant open source
> columnar storage format? How do those of you with long-term Hadoop
> experience see this? For example, is Parquet overtaking ORC and Avro?
>
> Thanks,
>
> Brian
Parquet vs. other Open Source Columnar Formats
All,

Is it fair to say that Parquet is fast becoming the dominant open source columnar storage format? How do those of you with long-term Hadoop experience see this? For example, is Parquet overtaking ORC and Avro?

Thanks,

Brian
[jira] [Commented] (PARQUET-1572) Clarify the definition of timestamp types
[ https://issues.apache.org/jira/browse/PARQUET-1572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16836470#comment-16836470 ]

ASF GitHub Bot commented on PARQUET-1572:
-----------------------------------------

zivanfi commented on pull request #130: PARQUET-1572: Clarify the definition of timestamp types
URL: https://github.com/apache/parquet-format/pull/130

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at: us...@infra.apache.org

> Clarify the definition of timestamp types
> -----------------------------------------
>
> Key: PARQUET-1572
> URL: https://issues.apache.org/jira/browse/PARQUET-1572
> Project: Parquet
> Issue Type: Task
> Components: parquet-format
> Reporter: Zoltan Ivanfi
> Assignee: Zoltan Ivanfi
> Priority: Major
> Labels: pull-request-available
>
> The current definition only makes sense for the isUtcAdjusted=true case.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (PARQUET-1572) Clarify the definition of timestamp types
[ https://issues.apache.org/jira/browse/PARQUET-1572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated PARQUET-1572:
-----------------------------------
Labels: pull-request-available (was: )

> Clarify the definition of timestamp types
> -----------------------------------------
>
> Key: PARQUET-1572
> URL: https://issues.apache.org/jira/browse/PARQUET-1572
> Project: Parquet
> Issue Type: Task
> Components: parquet-format
> Reporter: Zoltan Ivanfi
> Assignee: Zoltan Ivanfi
> Priority: Major
> Labels: pull-request-available
>
> The current definition only makes sense for the isUtcAdjusted=true case.
[jira] [Created] (PARQUET-1572) Clarify the definition of timestamp types
Zoltan Ivanfi created PARQUET-1572:
-----------------------------------

Summary: Clarify the definition of timestamp types
Key: PARQUET-1572
URL: https://issues.apache.org/jira/browse/PARQUET-1572
Project: Parquet
Issue Type: Task
Components: parquet-format
Reporter: Zoltan Ivanfi
Assignee: Zoltan Ivanfi

The current definition only makes sense for the isUtcAdjusted=true case.
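The distinction behind the isUtcAdjusted flag named in this issue can be illustrated with a small sketch in plain Python. It assumes one plausible reading: with the flag set to true, the stored integer counts from the Unix epoch in UTC and identifies an instant; with false, it counts from a zone-less 1970-01-01 00:00:00 and only describes wall-clock fields. The helper names, the sample value, and the microsecond unit are assumptions for illustration, not the wording of the specification.

```python
from datetime import datetime, timedelta, timezone

# Assumed epochs: one anchored to UTC, one without any time zone attached.
EPOCH_UTC = datetime(1970, 1, 1, tzinfo=timezone.utc)
EPOCH_LOCAL = datetime(1970, 1, 1)  # naive datetime, no zone information


def as_instant(micros):
    """Interpret the value as an instant (the isUtcAdjusted=true reading)."""
    return EPOCH_UTC + timedelta(microseconds=micros)


def as_local(micros):
    """Interpret the value as zone-less wall-clock fields (the =false reading)."""
    return EPOCH_LOCAL + timedelta(microseconds=micros)


# Hypothetical stored value, in microseconds.
stored = 1_557_426_780 * 1_000_000

instant = as_instant(stored)   # a single point on the global timeline
wall_clock = as_local(stored)  # year/month/day/hour fields with no zone

# The same instant can be rendered in another zone; the wall-clock value cannot,
# because it was never tied to a zone in the first place.
instant_plus_two = instant.astimezone(timezone(timedelta(hours=2)))
```

The same integer yields identical calendar fields under both readings; what differs is whether those fields may be converted between time zones, which is why a definition phrased only in terms of an instant does not cover the false case.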
[jira] [Commented] (PARQUET-1555) Bump snappy-java to 1.1.7.3
[ https://issues.apache.org/jira/browse/PARQUET-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16836280#comment-16836280 ]

ASF GitHub Bot commented on PARQUET-1555:
-----------------------------------------

zivanfi commented on pull request #632: PARQUET-1555: Bump snappy-java to 1.1.7.3
URL: https://github.com/apache/parquet-mr/pull/632

> Bump snappy-java to 1.1.7.3
> ---------------------------
>
> Key: PARQUET-1555
> URL: https://issues.apache.org/jira/browse/PARQUET-1555
> Project: Parquet
> Issue Type: Bug
> Components: parquet-mr
> Affects Versions: 1.10.0
> Reporter: Fokko Driesprong
> Assignee: Fokko Driesprong
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.11.0
>
> Just to make sure that it compiles well against the latest 1.1.7.3 for Java9
> compatibility.
[jira] [Commented] (PARQUET-1557) Replace deprecated Apache Avro methods
[ https://issues.apache.org/jira/browse/PARQUET-1557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16836114#comment-16836114 ]

ASF GitHub Bot commented on PARQUET-1557:
-----------------------------------------

zivanfi commented on pull request #636: PARQUET-1557 Replace deprecated Avro methods
URL: https://github.com/apache/parquet-mr/pull/636

> Replace deprecated Apache Avro methods
> --------------------------------------
>
> Key: PARQUET-1557
> URL: https://issues.apache.org/jira/browse/PARQUET-1557
> Project: Parquet
> Issue Type: Improvement
> Components: parquet-mr
> Reporter: Fokko Driesprong
> Assignee: Fokko Driesprong
> Priority: Major
> Labels: pull-request-available