[jira] [Commented] (PARQUET-1886) CompressionCodec Provider-aware Compression Codec Lookup for parquet-mr

2020-07-20 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17161636#comment-17161636 ] ASF GitHub Bot commented on PARQUET-1886: - XinDongIntel opened a new pull request #803: URL:

[jira] [Created] (PARQUET-1886) CompressionCodec Provider-aware Compression Codec Lookup for parquet-mr

2020-07-20 Thread XinDong (Jira)
XinDong created PARQUET-1886: Summary: CompressionCodec Provider-aware Compression Codec Lookup for parquet-mr Key: PARQUET-1886 URL: https://issues.apache.org/jira/browse/PARQUET-1886 Project: Parquet

[jira] [Commented] (PARQUET-1830) Vectorized API to support Column Index in Apache Spark

2020-07-20 Thread Xinli Shang (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17161573#comment-17161573 ] Xinli Shang commented on PARQUET-1830: -- [~FelixKJose] Do we have a Spark task created for

[jira] [Commented] (PARQUET-1739) Make Spark SQL support Column indexes

2020-07-20 Thread Xinli Shang (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17161566#comment-17161566 ] Xinli Shang commented on PARQUET-1739: -- [~yumwang], can you share whether the implementation is done in

[GitHub] [parquet-mr] dossett commented on pull request #702: PARQUET-1684: dont store default protobuf values as null for proto3

2020-07-20 Thread GitBox
dossett commented on pull request #702: URL: https://github.com/apache/parquet-mr/pull/702#issuecomment-661286742 cc @gszadovszky it looks like you are driving 1.11.1 (apologies if that is not the case) This is an automated

[jira] [Commented] (PARQUET-1684) [parquet-protobuf] default protobuf field values are stored as nulls

2020-07-20 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17161481#comment-17161481 ] ASF GitHub Bot commented on PARQUET-1684: - dossett commented on pull request #702: URL:

[GitHub] [parquet-mr] dossett commented on pull request #702: PARQUET-1684: dont store default protobuf values as null for proto3

2020-07-20 Thread GitBox
dossett commented on pull request #702: URL: https://github.com/apache/parquet-mr/pull/702#issuecomment-661282325 Can it be considered for 1.11.1? I see a release candidate is out. This is an automated message from the

[jira] [Commented] (PARQUET-1885) [parquet-protobuf] Pass descriptor to ProtoWriteSupport constructor

2020-07-20 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17161473#comment-17161473 ] ASF GitHub Bot commented on PARQUET-1885: - mauliksoneji opened a new pull request #802: URL:

[GitHub] [parquet-mr] mauliksoneji opened a new pull request #802: PARQUET-1885: Pass descriptor to ProtoWriteSupport constructor

2020-07-20 Thread GitBox
mauliksoneji opened a new pull request #802: URL: https://github.com/apache/parquet-mr/pull/802 addresses https://issues.apache.org/jira/browse/PARQUET-1885 This is an automated message from the Apache Git Service. To

[jira] [Updated] (PARQUET-1885) [parquet-protobuf] Pass descriptor to ProtoWriteSupport constructor

2020-07-20 Thread Maulik Soneji (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maulik Soneji updated PARQUET-1885: --- Affects Version/s: 1.11.0 1.10.1 > [parquet-protobuf] Pass

[jira] [Created] (PARQUET-1885) [parquet-protobuf] Pass descriptor to ProtoWriteSupport constructor

2020-07-20 Thread Maulik Soneji (Jira)
Maulik Soneji created PARQUET-1885: -- Summary: [parquet-protobuf] Pass descriptor to ProtoWriteSupport constructor Key: PARQUET-1885 URL: https://issues.apache.org/jira/browse/PARQUET-1885 Project:

[jira] [Updated] (PARQUET-1885) [parquet-protobuf] Pass descriptor to ProtoWriteSupport constructor

2020-07-20 Thread Maulik Soneji (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maulik Soneji updated PARQUET-1885: --- Component/s: parquet-mr > [parquet-protobuf] Pass descriptor to ProtoWriteSupport

Re: How to incrementally store timeseries in Parquet files for efficient retrieval?

2020-07-20 Thread Tim Armstrong
The usual solution is to partition the data based on the criteria you want to filter by. E.g. for Hive tables, you would partition by date and have a separate directory per date. If you have a relatively modern version of Parquet, stats and page indices will allow the reader to filter out files
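
The partition-by-date layout described in this reply can be sketched roughly as follows. This is a minimal illustration, not code from the thread: the table root `timeseries` and the helper names `partition_dir` and `prune_partitions` are hypothetical, and the `date=YYYY-MM-DD` directory naming is assumed as the usual Hive-style convention. It only shows how records map to per-date directories and how a reader can skip whole directories for a date-range query; writing the Parquet files themselves is left out.

```python
from datetime import date, datetime
from pathlib import PurePosixPath


def partition_dir(root: str, ts: datetime) -> PurePosixPath:
    """Return the Hive-style directory for the day a record falls on.

    The 'date=' key name is an assumption made for this sketch.
    """
    return PurePosixPath(root) / f"date={ts.date().isoformat()}"


def prune_partitions(dirs, start: date, end: date):
    """Keep only directories whose date=YYYY-MM-DD key lies in [start, end]."""
    kept = []
    for d in dirs:
        key, _, value = PurePosixPath(d).name.partition("=")
        if key != "date":
            continue  # skip anything that is not a date partition directory
        if start <= date.fromisoformat(value) <= end:
            kept.append(str(d))
    return kept


if __name__ == "__main__":
    root = "timeseries"  # hypothetical table root
    # New data for a day is appended as new files under that day's directory.
    print(partition_dir(root, datetime(2020, 7, 20, 13, 5)))

    existing = [
        "timeseries/date=2020-07-18",
        "timeseries/date=2020-07-19",
        "timeseries/date=2020-07-20",
    ]
    # A query for 19-20 July never has to open files under the 18 July directory.
    print(prune_partitions(existing, date(2020, 7, 19), date(2020, 7, 20)))
```

Directory pruning of this kind happens before any file is opened; the statistics and page indices mentioned in the reply are a separate, finer-grained filter applied inside the files that remain.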

[jira] [Commented] (PARQUET-14) Pig and Hive cannot read repeated groups written with parquet-protobuf

2020-07-20 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-14?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17161135#comment-17161135 ] ASF GitHub Bot commented on PARQUET-14: --- NathanHowell closed pull request #14: URL:

[GitHub] [parquet-mr] NathanHowell closed pull request #14: PARQUET-14: Support lifted coercions of file schemas into compatible read schemas.

2020-07-20 Thread GitBox
NathanHowell closed pull request #14: URL: https://github.com/apache/parquet-mr/pull/14 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to