[jira] [Created] (PARQUET-460) Parquet files concat tool

2016-01-23 Thread flykobe cheng (JIRA)
flykobe cheng created PARQUET-460:
Summary: Parquet files concat tool
Key: PARQUET-460
URL: https://issues.apache.org/jira/browse/PARQUET-460
Project: Parquet
Issue Type: Improvement

parquet-format parquet.thrift struct ColumnMetaData problem

2016-01-23 Thread Tenghuan He
Hi everyone, In parquet.thrift the definition of struct ColumnMetaData 1. The field "path_in_schema" is a string list, should not there be only one path in the schema for a specified column? And in parquet-hadoop the corresponding class "ColumnChunkMetaData" there is the field

[jira] [Commented] (PARQUET-459) Improve handling of null values

2016-01-23 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15113835#comment-15113835 ] Wes McKinney commented on PARQUET-459: Do you have a patch for PARQUET-428 somewhere? Re:

[jira] [Comment Edited] (PARQUET-459) Improve handling of null values

2016-01-23 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15113835#comment-15113835 ] Wes McKinney edited comment on PARQUET-459 at 1/23/16 4:50 PM: Do you have

[jira] [Commented] (PARQUET-459) Improve handling of null values

2016-01-23 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15114022#comment-15114022 ] Wes McKinney commented on PARQUET-459: The value decoders are already internally buffering arrays

Re: parquet-format parquet.thrift struct ColumnMetaData problem

2016-01-23 Thread Nong Li
Inline.
On Sat, Jan 23, 2016 at 8:48 AM, Tenghuan He wrote:
> Hi everyone,
>
> In parquet.thrift the definition of struct ColumnMetaData
>
> 1. The field "path_in_schema" is a string list, should not there be only
>    one path in the schema for a specified

[jira] [Commented] (PARQUET-453) Refactor parquet_reader.cc into a ParquetFileReader::DebugPrint method

2016-01-23 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15114000#comment-15114000 ] Wes McKinney commented on PARQUET-453: This is done as part of

[jira] [Commented] (PARQUET-451) Add a RowGroup reader interface class

2016-01-23 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15113999#comment-15113999 ] Wes McKinney commented on PARQUET-451: This is done in

Re: Parquet for very wide table

2016-01-23 Thread Nong Li
I expect this to be difficult. This is roughly 3 orders of magnitude more than even a typical wide table use case. Answers inline.
On Thu, Jan 21, 2016 at 2:10 PM, Krishna wrote:
> We are considering using Parquet for storing a matrix that is dense and
> very, very wide

[jira] [Commented] (PARQUET-459) Improve handling of null values

2016-01-23 Thread Deepak Majeti (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15114012#comment-15114012 ] Deepak Majeti commented on PARQUET-459: [~wesmckinn] I made a pull request for PARQUET-428 here