[jira] [Updated] (PARQUET-791) Predicate pushing down on missing columns should work on UserDefinedPredicate too
[ https://issues.apache.org/jira/browse/PARQUET-791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Liang-Chi Hsieh updated PARQUET-791:
    Description: This is related to PARQUET-389. PARQUET-389 fixes the predicate pushing down on missing columns. But it doesn't fix it for UserDefinedPredicate.  (was: This is related to PARQUET-389. PARQUET-389 fixes the predicate pushing down on missing columns. But it doesn't fix it for )

> Predicate pushing down on missing columns should work on UserDefinedPredicate too
>
>     Key: PARQUET-791
>     URL: https://issues.apache.org/jira/browse/PARQUET-791
>     Project: Parquet
>     Issue Type: Bug
>     Components: parquet-mr
>     Reporter: Liang-Chi Hsieh
>
> This is related to PARQUET-389. PARQUET-389 fixes the predicate pushing down
> on missing columns. But it doesn't fix it for UserDefinedPredicate.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (PARQUET-791) Predicate pushing down on missing columns should work on UserDefinedPredicate too
Liang-Chi Hsieh created PARQUET-791:

    Summary: Predicate pushing down on missing columns should work on UserDefinedPredicate too
    Key: PARQUET-791
    URL: https://issues.apache.org/jira/browse/PARQUET-791
    Project: Parquet
    Issue Type: Bug
    Components: parquet-mr
    Reporter: Liang-Chi Hsieh
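Editor's sketch for PARQUET-791. The issue text does not spell out the intended behaviour, so the following is only an illustration under one assumption: a column that is missing from a file reads as all nulls, so a row group can be skipped only when the user-defined predicate rejects null. The names UserPredicate, MissingColumnSketch and canDrop are hypothetical and are not parquet-mr APIs.

```java
// Hypothetical stand-in for a user-defined predicate; not the parquet-mr class.
interface UserPredicate<T> {
    boolean keep(T value);
}

class MissingColumnSketch {
    // Returns true when the row group can be skipped for this predicate.
    // Assumption: a missing column behaves as if every value were null.
    static <T> boolean canDrop(boolean columnPresent, UserPredicate<T> predicate) {
        if (!columnPresent) {
            // Every value is null, so skip only if the predicate rejects null.
            return !predicate.keep(null);
        }
        // Column exists: a real filter would consult statistics here.
        // This sketch stays conservative and never drops.
        return false;
    }

    public static void main(String[] args) {
        UserPredicate<Integer> rejectsNull = v -> v != null;
        UserPredicate<Integer> acceptsNull = v -> v == null;
        System.out.println("rejects null, column missing: drop = " + canDrop(false, rejectsNull));
        System.out.println("accepts null, column missing: drop = " + canDrop(false, acceptsNull));
    }
}
```

The point of the sketch is only that the predicate itself has to be asked about null; which result the real filter returns is for the actual patch to decide.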
[jira] [Commented] (PARQUET-783) H2SeekableInputStream does not close its underlying FSDataInputStream, leading to connection leaks
[ https://issues.apache.org/jira/browse/PARQUET-783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15724105#comment-15724105 ]

Michael Allman commented on PARQUET-783:

Any chance we can get this in 1.9.1? This bug makes parquet 1.9 unusable for us.

> H2SeekableInputStream does not close its underlying FSDataInputStream, leading to connection leaks
>
>     Key: PARQUET-783
>     URL: https://issues.apache.org/jira/browse/PARQUET-783
>     Project: Parquet
>     Issue Type: Bug
>     Components: parquet-mr
>     Affects Versions: 1.9.0
>     Reporter: Michael Allman
>     Assignee: Michael Allman
>     Priority: Critical
>     Fix For: 1.10.0
>
> {{ParquetFileReader}} opens a {{SeekableInputStream}} to read a footer. In
> the process, it opens a new {{FSDataInputStream}} and wraps it. However,
> {{H2SeekableInputStream}} does not override the {{close}} method. Therefore,
> when {{ParquetFileReader}} closes it, the underlying {{FSDataInputStream}} is
> not closed. As a result, these stale connections can exhaust a cluster's data
> nodes' connection resources and lead to mysterious HDFS read failures in HDFS
> clients, e.g.
> {noformat}
> org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block:
> BP-905337612-172.16.70.103-1444328960665:blk_1720536852_646811517
> {noformat}
[jira] [Updated] (PARQUET-783) H2SeekableInputStream does not close its underlying FSDataInputStream, leading to connection leaks
[ https://issues.apache.org/jira/browse/PARQUET-783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Allman updated PARQUET-783:
    Affects Version/s:  (was: 1.9.1)

> H2SeekableInputStream does not close its underlying FSDataInputStream, leading to connection leaks
>
>     Key: PARQUET-783
>     URL: https://issues.apache.org/jira/browse/PARQUET-783
>     Project: Parquet
>     Issue Type: Bug
>     Components: parquet-mr
>     Affects Versions: 1.9.0
>     Reporter: Michael Allman
>     Assignee: Michael Allman
>     Priority: Critical
>     Fix For: 1.10.0
>
> {{ParquetFileReader}} opens a {{SeekableInputStream}} to read a footer. In
> the process, it opens a new {{FSDataInputStream}} and wraps it. However,
> {{H2SeekableInputStream}} does not override the {{close}} method. Therefore,
> when {{ParquetFileReader}} closes it, the underlying {{FSDataInputStream}} is
> not closed. As a result, these stale connections can exhaust a cluster's data
> nodes' connection resources and lead to mysterious HDFS read failures in HDFS
> clients, e.g.
> {noformat}
> org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block:
> BP-905337612-172.16.70.103-1444328960665:blk_1720536852_646811517
> {noformat}
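Editor's sketch for PARQUET-783. The description above says the wrapper does not override close, so closing it leaves the wrapped stream open. The mechanism can be reproduced with plain java.io classes, since InputStream.close() does nothing by default. This is a minimal illustration, not the parquet-mr code or its fix: TrackingStream, LeakyWrapper, ClosingWrapper and CloseDelegationSketch are hypothetical names, and TrackingStream merely stands in for the underlying FSDataInputStream.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical stand-in for the underlying stream; records whether close() ran.
class TrackingStream extends InputStream {
    boolean closed = false;
    @Override public int read() { return -1; }
    @Override public void close() { closed = true; }
}

// Wraps a stream but inherits InputStream.close(), which is a no-op,
// so the wrapped stream is never closed.
class LeakyWrapper extends InputStream {
    private final InputStream in;
    LeakyWrapper(InputStream in) { this.in = in; }
    @Override public int read() throws IOException { return in.read(); }
}

// Same wrapper, but close() is forwarded to the wrapped stream.
class ClosingWrapper extends InputStream {
    private final InputStream in;
    ClosingWrapper(InputStream in) { this.in = in; }
    @Override public int read() throws IOException { return in.read(); }
    @Override public void close() throws IOException { in.close(); }
}

class CloseDelegationSketch {
    // Closes a wrapper and reports whether the wrapped stream was closed too.
    static boolean closesUnderlying(boolean delegate) {
        TrackingStream underlying = new TrackingStream();
        InputStream wrapper = delegate ? new ClosingWrapper(underlying)
                                       : new LeakyWrapper(underlying);
        try {
            wrapper.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return underlying.closed;
    }

    public static void main(String[] args) {
        System.out.println("without close override: " + closesUnderlying(false));
        System.out.println("with close override:    " + closesUnderlying(true));
    }
}
```

Forwarding close() in the wrapper is the smallest change that releases the underlying resource; whether the actual patch does exactly this is not stated in the messages here.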
[jira] [Resolved] (PARQUET-786) parquet-tools README incorrectly has 'java jar' instead of 'java -jar'
[ https://issues.apache.org/jira/browse/PARQUET-786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Julien Le Dem resolved PARQUET-786.
    Resolution: Fixed
    Fix Version/s: 1.10.0

Merged in:
https://github.com/apache/parquet-mr/pull/386
https://github.com/apache/parquet-mr/commit/7987a544cce59537467621114b400f670c71d722

> parquet-tools README incorrectly has 'java jar' instead of 'java -jar'
>
>     Key: PARQUET-786
>     URL: https://issues.apache.org/jira/browse/PARQUET-786
>     Project: Parquet
>     Issue Type: Bug
>     Reporter: Mark Nelson
>     Assignee: Mark Nelson
>     Fix For: 1.10.0
[jira] [Updated] (PARQUET-786) parquet-tools README incorrectly has 'java jar' instead of 'java -jar'
[ https://issues.apache.org/jira/browse/PARQUET-786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Julien Le Dem updated PARQUET-786:
    Assignee: Mark Nelson

> parquet-tools README incorrectly has 'java jar' instead of 'java -jar'
>
>     Key: PARQUET-786
>     URL: https://issues.apache.org/jira/browse/PARQUET-786
>     Project: Parquet
>     Issue Type: Bug
>     Reporter: Mark Nelson
>     Assignee: Mark Nelson
[jira] [Commented] (PARQUET-790) Close Parquet github account to avoid confusion
[ https://issues.apache.org/jira/browse/PARQUET-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15723677#comment-15723677 ]

Wes McKinney commented on PARQUET-790:

You could rename the organization to something more clear like "legacy-parquet"

> Close Parquet github account to avoid confusion
>
>     Key: PARQUET-790
>     URL: https://issues.apache.org/jira/browse/PARQUET-790
>     Project: Parquet
>     Issue Type: Wish
>     Reporter: Julien Le Dem
>
> The old github repo has significant history (github issues and PRs) that we'd
> like to maintain.
> But at the same time it is confusing and people mistake it for the main repo.
> We need a solution for this.
[jira] [Updated] (PARQUET-789) [C++] Catch and translate ParquetException in parquet::arrow::FileReader::{ReadFlatColumn, ReadFlatTable}}
[ https://issues.apache.org/jira/browse/PARQUET-789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wes McKinney updated PARQUET-789:
    Assignee: Wes McKinney

PR: https://github.com/apache/parquet-cpp/pull/201

> [C++] Catch and translate ParquetException in parquet::arrow::FileReader::{ReadFlatColumn, ReadFlatTable}}
>
>     Key: PARQUET-789
>     URL: https://issues.apache.org/jira/browse/PARQUET-789
>     Project: Parquet
>     Issue Type: Bug
>     Components: parquet-cpp
>     Reporter: Wes McKinney
>     Assignee: Wes McKinney
>
> Exceptions are mostly uncaught in the implementations of these methods
[jira] [Created] (PARQUET-789) [C++] Catch ParquetException in parquet::arrow::FileReader::{ReadFlatColumn, ReadFlatTable}}
Wes McKinney created PARQUET-789:

    Summary: [C++] Catch ParquetException in parquet::arrow::FileReader::{ReadFlatColumn, ReadFlatTable}}
    Key: PARQUET-789
    URL: https://issues.apache.org/jira/browse/PARQUET-789
    Project: Parquet
    Issue Type: Bug
    Components: parquet-cpp
    Reporter: Wes McKinney

Exceptions are mostly uncaught in the implementations of these methods
[jira] [Updated] (PARQUET-789) [C++] Catch and translate ParquetException in parquet::arrow::FileReader::{ReadFlatColumn, ReadFlatTable}}
[ https://issues.apache.org/jira/browse/PARQUET-789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wes McKinney updated PARQUET-789:
    Summary: [C++] Catch and translate ParquetException in parquet::arrow::FileReader::{ReadFlatColumn, ReadFlatTable}}  (was: [C++] Catch ParquetException in parquet::arrow::FileReader::{ReadFlatColumn, ReadFlatTable}})

> [C++] Catch and translate ParquetException in parquet::arrow::FileReader::{ReadFlatColumn, ReadFlatTable}}
>
>     Key: PARQUET-789
>     URL: https://issues.apache.org/jira/browse/PARQUET-789
>     Project: Parquet
>     Issue Type: Bug
>     Components: parquet-cpp
>     Reporter: Wes McKinney
>
> Exceptions are mostly uncaught in the implementations of these methods
[jira] [Commented] (PARQUET-783) H2SeekableInputStream does not close its underlying FSDataInputStream, leading to connection leaks
[ https://issues.apache.org/jira/browse/PARQUET-783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15722771#comment-15722771 ]

Michael Allman commented on PARQUET-783:

Hi [~gszadovszky]. Thanks for the advice. I don't seem to be able to assign this ticket to myself. Maybe you can do that for me?

> H2SeekableInputStream does not close its underlying FSDataInputStream, leading to connection leaks
>
>     Key: PARQUET-783
>     URL: https://issues.apache.org/jira/browse/PARQUET-783
>     Project: Parquet
>     Issue Type: Bug
>     Components: parquet-mr
>     Affects Versions: 1.9.0, 1.9.1
>     Reporter: Michael Allman
>     Priority: Critical
>
> {{ParquetFileReader}} opens a {{SeekableInputStream}} to read a footer. In
> the process, it opens a new {{FSDataInputStream}} and wraps it. However,
> {{H2SeekableInputStream}} does not override the {{close}} method. Therefore,
> when {{ParquetFileReader}} closes it, the underlying {{FSDataInputStream}} is
> not closed. As a result, these stale connections can exhaust a cluster's data
> nodes' connection resources and lead to mysterious HDFS read failures in HDFS
> clients, e.g.
> {noformat}
> org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block:
> BP-905337612-172.16.70.103-1444328960665:blk_1720536852_646811517
> {noformat}
[jira] [Resolved] (PARQUET-788) [C++] Reference Impala / Apache Impala (incubating) in LICENSE
[ https://issues.apache.org/jira/browse/PARQUET-788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wes McKinney resolved PARQUET-788.
    Resolution: Fixed

Issue resolved by pull request 200
[https://github.com/apache/parquet-cpp/pull/200]

> [C++] Reference Impala / Apache Impala (incubating) in LICENSE
>
>     Key: PARQUET-788
>     URL: https://issues.apache.org/jira/browse/PARQUET-788
>     Project: Parquet
>     Issue Type: New Feature
>     Components: parquet-cpp
>     Reporter: Wes McKinney
>     Assignee: Wes McKinney
>     Fix For: cpp-0.1.0
[jira] [Commented] (PARQUET-783) H2SeekableInputStream does not close its underlying FSDataInputStream, leading to connection leaks
[ https://issues.apache.org/jira/browse/PARQUET-783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15721918#comment-15721918 ]

Gabor Szadovszky commented on PARQUET-783:

Hi [~michael],

As you have linked the PR, you should press the "Submit Patch" button so that the status of this JIRA is highlighted for the committers. I would also suggest assigning the JIRA to yourself.

Thanks a lot.

> H2SeekableInputStream does not close its underlying FSDataInputStream, leading to connection leaks
>
>     Key: PARQUET-783
>     URL: https://issues.apache.org/jira/browse/PARQUET-783
>     Project: Parquet
>     Issue Type: Bug
>     Components: parquet-mr
>     Affects Versions: 1.9.0, 1.9.1
>     Reporter: Michael Allman
>     Priority: Critical
>
> {{ParquetFileReader}} opens a {{SeekableInputStream}} to read a footer. In
> the process, it opens a new {{FSDataInputStream}} and wraps it. However,
> {{H2SeekableInputStream}} does not override the {{close}} method. Therefore,
> when {{ParquetFileReader}} closes it, the underlying {{FSDataInputStream}} is
> not closed. As a result, these stale connections can exhaust a cluster's data
> nodes' connection resources and lead to mysterious HDFS read failures in HDFS
> clients, e.g.
> {noformat}
> org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block:
> BP-905337612-172.16.70.103-1444328960665:blk_1720536852_646811517
> {noformat}