[jira] [Comment Edited] (PARQUET-2160) Close decompression stream to free off-heap memory in time

2022-08-05 Thread Yujiang Zhong (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575528#comment-17575528 ] Yujiang Zhong edited comment on PARQUET-2160 at 8/5/22 5:59 AM:

[jira] [Comment Edited] (PARQUET-2160) Close decompression stream to free off-heap memory in time

2022-08-05 Thread Yujiang Zhong (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575528#comment-17575528 ] Yujiang Zhong edited comment on PARQUET-2160 at 8/5/22 6:00 AM:

[jira] [Commented] (PARQUET-2160) Close decompression stream to free off-heap memory in time

2022-08-05 Thread Yujiang Zhong (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575802#comment-17575802 ] Yujiang Zhong commented on PARQUET-2160: {quote}Hmm it does need to allocate extra heap memory

[jira] [Comment Edited] (PARQUET-2160) Close decompression stream to free off-heap memory in time

2022-08-05 Thread Adam Binford (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575830#comment-17575830 ] Adam Binford edited comment on PARQUET-2160 at 8/5/22 1:11 PM: ---

[jira] [Commented] (PARQUET-2160) Close decompression stream to free off-heap memory in time

2022-08-05 Thread Adam Binford (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575830#comment-17575830 ] Adam Binford commented on PARQUET-2160: --- ??Which parquet version you're using? There are some fix

[jira] [Comment Edited] (PARQUET-2160) Close decompression stream to free off-heap memory in time

2022-08-05 Thread Adam Binford (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575830#comment-17575830 ] Adam Binford edited comment on PARQUET-2160 at 8/5/22 1:05 PM: ---

[jira] [Comment Edited] (PARQUET-2160) Close decompression stream to free off-heap memory in time

2022-08-05 Thread Adam Binford (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575830#comment-17575830 ] Adam Binford edited comment on PARQUET-2160 at 8/5/22 1:08 PM: ---

Re: Fail to read back written large parquet file

2022-08-05 Thread Steve Loughran
That has to be an integer wraparound: something is using a signed int for the position, so when it goes above 2 GB it goes negative, and a seek(negative value) is rejected. Fix: find the variable and make it a long. On Thu, 4 Aug 2022 at 11:09, Jozef Vilcek wrote: > I came across a case where a
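
To illustrate the wraparound described in this message, here is a minimal Java sketch. It is not the Parquet or Hadoop code; the class OffsetWraparound and the helpers advanceInt and advanceLong are hypothetical names made up for this example. It only shows that a position kept in a signed 32-bit int turns negative once it passes 2 GiB, while the same arithmetic on a long does not.

```java
// Hypothetical illustration only: none of these names come from parquet-mr or Hadoop.
final class OffsetWraparound {

    // Buggy pattern: the position is tracked in a signed 32-bit int,
    // so the addition silently wraps once it exceeds Integer.MAX_VALUE.
    static int advanceInt(int position, int bytes) {
        return position + bytes;
    }

    // Suggested fix from the message: keep the position in a long.
    static long advanceLong(long position, int bytes) {
        return position + bytes;
    }

    public static void main(String[] args) {
        int chunk = 1 << 30; // 1 GiB per step

        // Two 1 GiB steps land exactly on 2 GiB, one past Integer.MAX_VALUE.
        int wrapped = advanceInt(advanceInt(0, chunk), chunk);
        long correct = advanceLong(advanceLong(0L, chunk), chunk);

        System.out.println("int position:  " + wrapped);  // prints -2147483648
        System.out.println("long position: " + correct);  // prints 2147483648
    }
}
```

A negative value like the one printed for the int case is what a seek call would then reject, which matches the symptom described above. Which variable actually overflows in the reported case is not stated in this snippet.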

[jira] [Comment Edited] (PARQUET-2160) Close decompression stream to free off-heap memory in time

2022-08-05 Thread Adam Binford (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575830#comment-17575830 ] Adam Binford edited comment on PARQUET-2160 at 8/5/22 1:07 PM: ---

[jira] [Commented] (PARQUET-2160) Close decompression stream to free off-heap memory in time

2022-08-05 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575952#comment-17575952 ] Chao Sun commented on PARQUET-2160: --- {quote} ... only it happens after the decompress call, may I ask

[jira] [Commented] (PARQUET-2160) Close decompression stream to free off-heap memory in time

2022-08-05 Thread Yujiang Zhong (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575918#comment-17575918 ] Yujiang Zhong commented on PARQUET-2160: {quote}The reason I can generate so much off heap

[jira] [Commented] (PARQUET-2149) Implement async IO for Parquet file reader

2022-08-05 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17576056#comment-17576056 ] ASF GitHub Bot commented on PARQUET-2149: - parthchandra commented on code in PR #968: URL:

Re: Fail to read back written large parquet file

2022-08-05 Thread Chao Sun
It seems the file was corrupted during the write. There's a similar issue, https://issues.apache.org/jira/browse/PARQUET-2164, that we found recently. On Fri, Aug 5, 2022 at 3:40 AM Steve Loughran wrote: > > that has to be an integer wraparound...something is using a signed int for > position, so when it goes

[GitHub] [parquet-mr] parthchandra commented on a diff in pull request #968: PARQUET-2149: Async IO implementation for ParquetFileReader

2022-08-05 Thread GitBox
parthchandra commented on code in PR #968: URL: https://github.com/apache/parquet-mr/pull/968#discussion_r927128065 ## parquet-hadoop/src/main/java/org/apache/parquet/HadoopReadOptions.java: ## @@ -61,9 +65,10 @@ private HadoopReadOptions(boolean useSignedStringMinMax,