[jira] [Updated] (PARQUET-791) Predicate pushing down on missing columns should work on UserDefinedPredicate too

2016-12-05 Thread Liang-Chi Hsieh (JIRA)

 [ 
https://issues.apache.org/jira/browse/PARQUET-791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liang-Chi Hsieh updated PARQUET-791:

Description: This is related to PARQUET-389. PARQUET-389 fixes the 
predicate pushing down on missing columns. But it doesn't fix it for 
UserDefinedPredicate.  (was: This is related to PARQUET-389. PARQUET-389 fixes 
the predicate pushing down on missing columns. But it doesn't fix it for )

> Predicate pushing down on missing columns should work on UserDefinedPredicate 
> too
> -
>
> Key: PARQUET-791
> URL: https://issues.apache.org/jira/browse/PARQUET-791
> Project: Parquet
>  Issue Type: Bug
>  Components: parquet-mr
>Reporter: Liang-Chi Hsieh
>
> This is related to PARQUET-389. PARQUET-389 fixes the predicate pushing down 
> on missing columns. But it doesn't fix it for UserDefinedPredicate.
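
A minimal sketch of the behavior requested here, under stated assumptions: a column that is absent from a file is taken to read as null for every row, so a user-defined predicate's answer for null decides whether the whole row group can be skipped. The class name MissingColumnFilterSketch and the method canDrop are hypothetical and are not parquet-mr's API; java.util.function.Predicate stands in for UserDefinedPredicate.

```java
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical stand-in for a row-group filter; not parquet-mr's actual classes.
final class MissingColumnFilterSketch {
    private MissingColumnFilterSketch() {}

    /**
     * Returns true when a row group can be skipped for a user-defined predicate.
     * Assumption: a column absent from the file reads as null for every row.
     */
    static <T> boolean canDrop(Set<String> columnsInFile, String column, Predicate<T> keep) {
        if (!columnsInFile.contains(column)) {
            // Every value is null, so the predicate's answer for null decides the whole row group.
            return !keep.test(null);
        }
        // Column present: without statistics this sketch stays conservative and keeps the row group.
        return false;
    }
}
```

With this shape, a predicate that rejects null lets the reader skip a row group whose column is missing, while a predicate that accepts null forces the row group to be read.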



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (PARQUET-791) Predicate pushing down on missing columns should work on UserDefinedPredicate too

2016-12-05 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created PARQUET-791:
---

 Summary: Predicate pushing down on missing columns should work on 
UserDefinedPredicate too
 Key: PARQUET-791
 URL: https://issues.apache.org/jira/browse/PARQUET-791
 Project: Parquet
  Issue Type: Bug
  Components: parquet-mr
Reporter: Liang-Chi Hsieh








[jira] [Commented] (PARQUET-783) H2SeekableInputStream does not close its underlying FSDataInputStream, leading to connection leaks

2016-12-05 Thread Michael Allman (JIRA)

[ 
https://issues.apache.org/jira/browse/PARQUET-783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15724105#comment-15724105
 ] 

Michael Allman commented on PARQUET-783:


Any chance we can get this in 1.9.1? This bug makes parquet 1.9 unusable for us.

> H2SeekableInputStream does not close its underlying FSDataInputStream, 
> leading to connection leaks
> --
>
> Key: PARQUET-783
> URL: https://issues.apache.org/jira/browse/PARQUET-783
> Project: Parquet
>  Issue Type: Bug
>  Components: parquet-mr
>Affects Versions: 1.9.0
>Reporter: Michael Allman
>Assignee: Michael Allman
>Priority: Critical
> Fix For: 1.10.0
>
>
> {{ParquetFileReader}} opens a {{SeekableInputStream}} to read a footer. In 
> the process, it opens a new {{FSDataInputStream}} and wraps it. However, 
> {{H2SeekableInputStream}} does not override the {{close}} method. Therefore, 
> when {{ParquetFileReader}} closes it, the underlying {{FSDataInputStream}} is 
> not closed. As a result, these stale connections can exhaust a cluster's data 
> nodes' connection resources and lead to mysterious HDFS read failures in HDFS 
> clients, e.g.
> {noformat}
> org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: 
> BP-905337612-172.16.70.103-1444328960665:blk_1720536852_646811517
> {noformat}
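
The leak described above can be illustrated with a small sketch that uses only java.io. The classes DelegatingStreamSketch and CloseTrackingStream are hypothetical and are not the real H2SeekableInputStream or FSDataInputStream; the point is only that a wrapper has to override close() and forward it, because the default InputStream.close() does nothing.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical wrapper illustrating the leak; not the real H2SeekableInputStream.
class DelegatingStreamSketch extends InputStream {
    private final InputStream wrapped;

    DelegatingStreamSketch(InputStream wrapped) {
        this.wrapped = wrapped;
    }

    @Override
    public int read() throws IOException {
        return wrapped.read();
    }

    // Without this override, InputStream.close() is a no-op and the wrapped stream stays open.
    @Override
    public void close() throws IOException {
        wrapped.close();
    }
}

// Hypothetical helper that records whether close() reached the underlying stream.
class CloseTrackingStream extends InputStream {
    boolean closed = false;

    @Override
    public int read() {
        return -1; // behaves as an empty stream
    }

    @Override
    public void close() {
        closed = true;
    }
}
```

Closing a DelegatingStreamSketch that wraps a CloseTrackingStream sets the helper's closed flag; removing the close() override leaves the flag false, which mirrors the connection leak reported in this issue.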





[jira] [Updated] (PARQUET-783) H2SeekableInputStream does not close its underlying FSDataInputStream, leading to connection leaks

2016-12-05 Thread Michael Allman (JIRA)

 [ 
https://issues.apache.org/jira/browse/PARQUET-783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Allman updated PARQUET-783:
---
Affects Version/s: (was: 1.9.1)

> H2SeekableInputStream does not close its underlying FSDataInputStream, 
> leading to connection leaks
> --
>
> Key: PARQUET-783
> URL: https://issues.apache.org/jira/browse/PARQUET-783
> Project: Parquet
>  Issue Type: Bug
>  Components: parquet-mr
>Affects Versions: 1.9.0
>Reporter: Michael Allman
>Assignee: Michael Allman
>Priority: Critical
> Fix For: 1.10.0
>
>
> {{ParquetFileReader}} opens a {{SeekableInputStream}} to read a footer. In 
> the process, it opens a new {{FSDataInputStream}} and wraps it. However, 
> {{H2SeekableInputStream}} does not override the {{close}} method. Therefore, 
> when {{ParquetFileReader}} closes it, the underlying {{FSDataInputStream}} is 
> not closed. As a result, these stale connections can exhaust a cluster's data 
> nodes' connection resources and lead to mysterious HDFS read failures in HDFS 
> clients, e.g.
> {noformat}
> org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: 
> BP-905337612-172.16.70.103-1444328960665:blk_1720536852_646811517
> {noformat}





[jira] [Resolved] (PARQUET-786) parquet-tools README incorrectly has 'java jar' instead of 'java -jar'

2016-12-05 Thread Julien Le Dem (JIRA)

 [ 
https://issues.apache.org/jira/browse/PARQUET-786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julien Le Dem resolved PARQUET-786.
---
   Resolution: Fixed
Fix Version/s: 1.10.0

Merged in:
https://github.com/apache/parquet-mr/pull/386
https://github.com/apache/parquet-mr/commit/7987a544cce59537467621114b400f670c71d722

> parquet-tools README incorrectly has 'java jar' instead of 'java -jar'
> --
>
> Key: PARQUET-786
> URL: https://issues.apache.org/jira/browse/PARQUET-786
> Project: Parquet
>  Issue Type: Bug
>Reporter: Mark Nelson
>Assignee: Mark Nelson
> Fix For: 1.10.0
>
>






[jira] [Updated] (PARQUET-786) parquet-tools README incorrectly has 'java jar' instead of 'java -jar'

2016-12-05 Thread Julien Le Dem (JIRA)

 [ 
https://issues.apache.org/jira/browse/PARQUET-786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julien Le Dem updated PARQUET-786:
--
Assignee: Mark Nelson

> parquet-tools README incorrectly has 'java jar' instead of 'java -jar'
> --
>
> Key: PARQUET-786
> URL: https://issues.apache.org/jira/browse/PARQUET-786
> Project: Parquet
>  Issue Type: Bug
>Reporter: Mark Nelson
>Assignee: Mark Nelson
>






[jira] [Commented] (PARQUET-790) Close Parquet github account to avoid confusion

2016-12-05 Thread Wes McKinney (JIRA)

[ 
https://issues.apache.org/jira/browse/PARQUET-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15723677#comment-15723677
 ] 

Wes McKinney commented on PARQUET-790:
--

You could rename the organization to something clearer, such as "legacy-parquet".

> Close Parquet github account to avoid confusion
> ---
>
> Key: PARQUET-790
> URL: https://issues.apache.org/jira/browse/PARQUET-790
> Project: Parquet
>  Issue Type: Wish
>Reporter: Julien Le Dem
>
> The old github repo has significant history (github issues and PRs) that we'd 
> like to maintain.
> But at the same time it is confusing and people mistake it for the main repo.
> We need a solution for this.





[jira] [Updated] (PARQUET-789) [C++] Catch and translate ParquetException in parquet::arrow::FileReader::{ReadFlatColumn, ReadFlatTable}}

2016-12-05 Thread Wes McKinney (JIRA)

 [ 
https://issues.apache.org/jira/browse/PARQUET-789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated PARQUET-789:
-
Assignee: Wes McKinney

PR: https://github.com/apache/parquet-cpp/pull/201

> [C++] Catch and translate ParquetException in 
> parquet::arrow::FileReader::{ReadFlatColumn, ReadFlatTable}}
> --
>
> Key: PARQUET-789
> URL: https://issues.apache.org/jira/browse/PARQUET-789
> Project: Parquet
>  Issue Type: Bug
>  Components: parquet-cpp
>Reporter: Wes McKinney
>Assignee: Wes McKinney
>
> Exceptions are mostly uncaught in the implementations of these methods





[jira] [Created] (PARQUET-789) [C++] Catch ParquetException in parquet::arrow::FileReader::{ReadFlatColumn, ReadFlatTable}}

2016-12-05 Thread Wes McKinney (JIRA)
Wes McKinney created PARQUET-789:


 Summary: [C++] Catch ParquetException in 
parquet::arrow::FileReader::{ReadFlatColumn, ReadFlatTable}}
 Key: PARQUET-789
 URL: https://issues.apache.org/jira/browse/PARQUET-789
 Project: Parquet
  Issue Type: Bug
  Components: parquet-cpp
Reporter: Wes McKinney


Exceptions are mostly uncaught in the implementations of these methods





[jira] [Updated] (PARQUET-789) [C++] Catch and translate ParquetException in parquet::arrow::FileReader::{ReadFlatColumn, ReadFlatTable}}

2016-12-05 Thread Wes McKinney (JIRA)

 [ 
https://issues.apache.org/jira/browse/PARQUET-789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated PARQUET-789:
-
Summary: [C++] Catch and translate ParquetException in 
parquet::arrow::FileReader::{ReadFlatColumn, ReadFlatTable}}  (was: [C++] Catch 
ParquetException in parquet::arrow::FileReader::{ReadFlatColumn, 
ReadFlatTable}})

> [C++] Catch and translate ParquetException in 
> parquet::arrow::FileReader::{ReadFlatColumn, ReadFlatTable}}
> --
>
> Key: PARQUET-789
> URL: https://issues.apache.org/jira/browse/PARQUET-789
> Project: Parquet
>  Issue Type: Bug
>  Components: parquet-cpp
>Reporter: Wes McKinney
>
> Exceptions are mostly uncaught in the implementations of these methods





[jira] [Commented] (PARQUET-783) H2SeekableInputStream does not close its underlying FSDataInputStream, leading to connection leaks

2016-12-05 Thread Michael Allman (JIRA)

[ 
https://issues.apache.org/jira/browse/PARQUET-783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15722771#comment-15722771
 ] 

Michael Allman commented on PARQUET-783:


Hi [~gszadovszky]. Thanks for the advice. I don't seem to be able to assign 
this ticket to myself. Maybe you can do that for me?

> H2SeekableInputStream does not close its underlying FSDataInputStream, 
> leading to connection leaks
> --
>
> Key: PARQUET-783
> URL: https://issues.apache.org/jira/browse/PARQUET-783
> Project: Parquet
>  Issue Type: Bug
>  Components: parquet-mr
>Affects Versions: 1.9.0, 1.9.1
>Reporter: Michael Allman
>Priority: Critical
>
> {{ParquetFileReader}} opens a {{SeekableInputStream}} to read a footer. In 
> the process, it opens a new {{FSDataInputStream}} and wraps it. However, 
> {{H2SeekableInputStream}} does not override the {{close}} method. Therefore, 
> when {{ParquetFileReader}} closes it, the underlying {{FSDataInputStream}} is 
> not closed. As a result, these stale connections can exhaust a cluster's data 
> nodes' connection resources and lead to mysterious HDFS read failures in HDFS 
> clients, e.g.
> {noformat}
> org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: 
> BP-905337612-172.16.70.103-1444328960665:blk_1720536852_646811517
> {noformat}





[jira] [Resolved] (PARQUET-788) [C++] Reference Impala / Apache Impala (incubating) in LICENSE

2016-12-05 Thread Wes McKinney (JIRA)

 [ 
https://issues.apache.org/jira/browse/PARQUET-788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney resolved PARQUET-788.
--
Resolution: Fixed

Issue resolved by pull request 200
[https://github.com/apache/parquet-cpp/pull/200]

> [C++] Reference Impala / Apache Impala (incubating) in LICENSE
> --
>
> Key: PARQUET-788
> URL: https://issues.apache.org/jira/browse/PARQUET-788
> Project: Parquet
>  Issue Type: New Feature
>  Components: parquet-cpp
>Reporter: Wes McKinney
>Assignee: Wes McKinney
> Fix For: cpp-0.1.0
>
>






[jira] [Commented] (PARQUET-783) H2SeekableInputStream does not close its underlying FSDataInputStream, leading to connection leaks

2016-12-05 Thread Gabor Szadovszky (JIRA)

[ 
https://issues.apache.org/jira/browse/PARQUET-783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15721918#comment-15721918
 ] 

Gabor Szadovszky commented on PARQUET-783:
--

Hi [~michael],
Since you have linked the PR, you should press the "Submit Patch" button so that 
the status of this JIRA is highlighted for the committers.
I would also suggest assigning the JIRA to yourself.
Thanks a lot.

> H2SeekableInputStream does not close its underlying FSDataInputStream, 
> leading to connection leaks
> --
>
> Key: PARQUET-783
> URL: https://issues.apache.org/jira/browse/PARQUET-783
> Project: Parquet
>  Issue Type: Bug
>  Components: parquet-mr
>Affects Versions: 1.9.0, 1.9.1
>Reporter: Michael Allman
>Priority: Critical
>
> {{ParquetFileReader}} opens a {{SeekableInputStream}} to read a footer. In 
> the process, it opens a new {{FSDataInputStream}} and wraps it. However, 
> {{H2SeekableInputStream}} does not override the {{close}} method. Therefore, 
> when {{ParquetFileReader}} closes it, the underlying {{FSDataInputStream}} is 
> not closed. As a result, these stale connections can exhaust a cluster's data 
> nodes' connection resources and lead to mysterious HDFS read failures in HDFS 
> clients, e.g.
> {noformat}
> org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: 
> BP-905337612-172.16.70.103-1444328960665:blk_1720536852_646811517
> {noformat}


