[jira] [Comment Edited] (PARQUET-319) Define the parquet bloom filter statistics in parquet format

2015-06-26 Thread Ferdinand Xu (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600583#comment-14600583 ] Ferdinand Xu edited comment on PARQUET-319 at 6/26/15 7:02 AM:

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2015-06-26 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14602532#comment-14602532 ] Ryan Blue commented on PARQUET-41: Thanks for working on this, [~Ferd], it's great to be

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2015-06-26 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14603401#comment-14603401 ] Ryan Blue commented on PARQUET-41: I just posted a Google spreadsheet to back up the num

[jira] [Comment Edited] (PARQUET-41) Add bloom filters to parquet statistics

2015-06-26 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14602532#comment-14602532 ] Ryan Blue edited comment on PARQUET-41 at 6/26/15 7:53 PM: Than

Question regarding the use of TaskAttemptContext on ParquetOutputFormat

2015-06-26 Thread Sergio Pena
Hi, I see that the ParquetRecordWriterWrapper constructor gets/initializes a TaskAttemptID object that is passed to the getRecordWriter(TaskAttemptContext taskAttemptContext, Path file) method of ParquetOutputFormat. But this method only gets the Configuration and CompressionCodecName objects to

Re: Question regarding the use of TaskAttemptContext on ParquetOutputFormat

2015-06-26 Thread Ryan Blue
I thought the wrapper was translating from the mapred API used by Hive to the mapreduce API that Parquet implements. If there is a better way to do this that is less expensive, I think that would be a good change. rb On 06/26/2015 04:01 PM, Sergio Pena wrote: Hi, I see ParquetRecordWriterWra