Tim Armstrong created IMPALA-6383:
-------------------------------------

             Summary: Memory from previous row groups can accumulate in Parquet scanner
                 Key: IMPALA-6383
                 URL: https://issues.apache.org/jira/browse/IMPALA-6383
             Project: IMPALA
          Issue Type: Bug
          Components: Backend
    Affects Versions: Impala 2.10.0, Impala 2.11.0, Impala 2.12.0
            Reporter: Tim Armstrong
            Assignee: Tim Armstrong

I ran across this bug while porting scanners to the new buffer pool. Before that, the only symptom of the failures was excessive memory consumption, but with reservations they become easy-to-detect hard failures.

The problem is in HdfsParquetScanner::NextRowGroup(), which calls InitColumns() on the column readers, which starts scans, which in turn allocate memory. If the row group is then skipped because of dictionary predicates or an error, the scans aren't cancelled and the I/O buffers aren't released, so memory from previous row groups accumulates.
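
The accumulation pattern can be illustrated with a small, self-contained toy model. This is only a sketch and not Impala's code: MemTracker, ToyColumnReader, ScanRowGroups and the release_on_skip flag are hypothetical names made up for illustration, and the real scanner, I/O manager and buffer pool are considerably more involved. The model assumes each column reader allocates one fixed-size buffer when its scan starts.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a memory tracker: counts bytes held by scans.
struct MemTracker {
  std::size_t bytes_in_use = 0;
};

// Hypothetical column reader: starting a scan allocates an I/O buffer
// that stays accounted for until it is explicitly released.
class ToyColumnReader {
 public:
  ToyColumnReader(MemTracker* tracker, std::size_t buffer_bytes)
      : tracker_(tracker), buffer_bytes_(buffer_bytes) {}

  void StartScan() {
    assert(!scan_active_);
    tracker_->bytes_in_use += buffer_bytes_;
    scan_active_ = true;
  }

  void CancelScanAndReleaseBuffers() {
    if (!scan_active_) return;
    tracker_->bytes_in_use -= buffer_bytes_;
    scan_active_ = false;
  }

 private:
  MemTracker* tracker_;
  std::size_t buffer_bytes_;
  bool scan_active_ = false;
};

// Walks the row groups in order. skip[i] is true when row group i is
// rejected after its column scans were already started. Returns the peak
// number of bytes held at any point.
std::size_t ScanRowGroups(const std::vector<bool>& skip, int num_columns,
                          std::size_t buffer_bytes, bool release_on_skip,
                          MemTracker* tracker) {
  std::size_t peak = 0;
  for (bool skipped : skip) {
    std::vector<ToyColumnReader> readers(
        num_columns, ToyColumnReader(tracker, buffer_bytes));
    for (auto& reader : readers) reader.StartScan();
    peak = std::max(peak, tracker->bytes_in_use);

    if (skipped) {
      // Without this cleanup, the buffers of the skipped row group stay
      // allocated while the next row group starts its own scans.
      if (release_on_skip) {
        for (auto& reader : readers) reader.CancelScanAndReleaseBuffers();
      }
      continue;
    }

    // Normal path: the row group is processed, then its buffers are freed.
    for (auto& reader : readers) reader.CancelScanAndReleaseBuffers();
  }
  return peak;
}
```

In this model, calling ScanRowGroups with release_on_skip set to false makes the peak grow with the number of skipped row groups, while setting it to true keeps the peak at the footprint of a single row group. With a fixed reservation, the first case is what would turn into a hard failure rather than silent over-consumption.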