[ https://issues.apache.org/jira/browse/DRILL-8416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17706953#comment-17706953 ]
ASF GitHub Bot commented on DRILL-8416:
---------------------------------------

jnturton opened a new pull request, #2784:
URL: https://github.com/apache/drill/pull/2784

   # [DRILL-8416](https://issues.apache.org/jira/browse/DRILL-8416): Memory leak when the async Parquet reader skips empty pages

   ## Description

   A regression introduced by the Parquet reader clean-up released in Drill 1.20 means that buffers used for the (non-empty) compressed data of _empty_ dictionary or data pages are not freed when those pages are skipped. Because empty pages are uncommon in real data, this bug went undetected for a long time.

   ## Documentation
   N/A

   ## Testing
   New unit test.


> Memory leak when the async Parquet reader skips empty pages
> -----------------------------------------------------------
>
>                 Key: DRILL-8416
>                 URL: https://issues.apache.org/jira/browse/DRILL-8416
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Parquet
>    Affects Versions: 1.21.0
>            Reporter: Matthias Rosenthaler
>            Assignee: James Turton
>            Priority: Major
>             Fix For: 1.21.1
>
>         Attachments: example.parquet, meta_steps.parquet
>
>
> If I try to query the following Parquet file, which is stored on a Hadoop
> file system, with
> {code:java}
> SELECT * FROM
> `hdfs.data`.`./v2/meta_steps/me-2023-03-20-13-15-30-inv230021-kontrollsystemf39st9qrx20-03-2/meta_steps.parquet`{code}
> I get the following error:
> {code:java}
> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR:
> IllegalStateException: Memory was leaked by query. Memory leaked: (64)
> Allocator(op:0:0:1:ParquetRowGroupScan) 1000000/64/34688/10000000000
> (res/actual/peak/limit){code}
> Everything works fine with Drill version 1.19.
> If I select only columns without NULL values, the query also works in 1.21.0:
> {code:java}
> SELECT `name`,`type` FROM
> `hdfs.data`.`./v2/meta_steps/me-2023-03-20-13-15-30-inv230021-kontrollsystemf39st9qrx20-03-2/meta_steps.parquet`{code}
> I generated a new example.parquet with pyarrow 8.0.0 containing a float
> column with NULL values, and the same error occurred.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
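
For illustration only, here is a minimal Java sketch of the failure mode described in the pull request: a reader that skips an empty page without releasing the buffer holding its compressed bytes leaves that memory outstanding in the allocator. `TrackingAllocator`, `Buffer`, `Page`, `readPages` and `leakedBytes` are hypothetical stand-ins written for this example, not Drill's classes or the code changed in the pull request, and the buffer sizes are arbitrary.

```java
import java.util.ArrayList;
import java.util.List;

public class EmptyPageSkipSketch {

  /** Hypothetical allocator that only counts outstanding bytes. */
  static final class TrackingAllocator {
    private long allocated;

    Buffer buffer(int size) {
      allocated += size;
      return new Buffer(this, size);
    }

    long getAllocatedBytes() {
      return allocated;
    }
  }

  /** Hypothetical buffer that returns its bytes to the allocator on release. */
  static final class Buffer {
    private final TrackingAllocator owner;
    private final int size;
    private boolean released;

    Buffer(TrackingAllocator owner, int size) {
      this.owner = owner;
      this.size = size;
    }

    void release() {
      if (!released) {
        released = true;
        owner.allocated -= size;
      }
    }
  }

  /** A page whose compressed bytes are non-empty even if it holds no values. */
  static final class Page {
    final Buffer compressed;
    final int valueCount;

    Page(Buffer compressed, int valueCount) {
      this.compressed = compressed;
      this.valueCount = valueCount;
    }
  }

  /** Reads all pages; empty pages are skipped. Returns the number of values read. */
  static int readPages(List<Page> pages, boolean releaseSkipped) {
    int values = 0;
    for (Page page : pages) {
      if (page.valueCount == 0) {
        // Assumed shape of the bug: skipping without releasing keeps the bytes allocated.
        if (releaseSkipped) {
          page.compressed.release();
        }
        continue;
      }
      values += page.valueCount;
      page.compressed.release();
    }
    return values;
  }

  /** Bytes still outstanding after reading one normal page and one empty page. */
  static long leakedBytes(boolean releaseSkipped) {
    TrackingAllocator allocator = new TrackingAllocator();
    List<Page> pages = new ArrayList<>();
    pages.add(new Page(allocator.buffer(128), 10)); // ordinary data page
    pages.add(new Page(allocator.buffer(64), 0));   // empty page, non-empty buffer
    readPages(pages, releaseSkipped);
    return allocator.getAllocatedBytes();
  }

  public static void main(String[] args) {
    System.out.println("skip without release: " + leakedBytes(false) + " bytes outstanding");
    System.out.println("skip with release:    " + leakedBytes(true) + " bytes outstanding");
  }
}
```

In this sketch the only difference between the two runs is whether the skipped page's buffer is released. A real allocator that checks for outstanding memory on close would report the first case as a leak.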