Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/17071 )

Change subject: IMPALA-10501: Hit DCHECK in parquet-column-readers.cc: 
def_levels_.CacheRemaining() <= num_buffered_values_
......................................................................

IMPALA-10501: Hit DCHECK in parquet-column-readers.cc: 
def_levels_.CacheRemaining() <= num_buffered_values_

We had a DCHECK in ScalarColumnReader::MaterializeValueBatch() that
checked that 'num_buffered_values_' is greater than or equal to the
number of cached values in the Parquet definition level decoder.

In SkipTopLevelRows() we used decoder.ReadLevel(), which could load
the cache of the decoder with more values than the actual value
count. This happens because literal runs are stored in groups of 8,
i.e. there might be padding zeros at the end.
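
To illustrate the padding arithmetic, here is a minimal sketch. The
helper names PaddedLiteralRunSize() and PaddingValues() are
hypothetical and are not part of Impala; the only assumption taken
from the text above is that literal runs are stored in groups of 8.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not Impala code): number of values physically
// stored for a literal run that logically holds 'num_values' values.
// Literal runs are stored in groups of 8, so we round up to a
// multiple of 8.
constexpr int64_t PaddedLiteralRunSize(int64_t num_values) {
  return (num_values + 7) / 8 * 8;
}

// Hypothetical helper: number of trailing padding zeros that a decoder
// could pick up if it caches the whole stored run.
constexpr int64_t PaddingValues(int64_t num_values) {
  return PaddedLiteralRunSize(num_values) - num_values;
}

// A page with 5 logical values is stored as one group of 8, so a
// decoder that caches the whole group ends up holding 3 extra values.
static_assert(PaddedLiteralRunSize(5) == 8, "5 values occupy one group");
static_assert(PaddingValues(5) == 3, "3 padding zeros follow");
```

With 5 buffered values and 8 cached values, a check of the form
"cached <= buffered" fails, which is the shape of the failure
described above.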

Instead, we can fill the cache of the decoder with
CacheNextBatch(num_vals). In this case we won't load more values
than the actual value count.
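
The difference between the two caching strategies can be sketched with
a toy model. ToyLevelDecoder, CacheAllEncoded() and InvariantHolds()
are hypothetical names used only for illustration; this is not
Impala's level decoder, and only the bounded CacheNextBatch(num_vals)
idea is taken from the description above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Toy model only, NOT Impala's decoder. 'encoded' stands for an already
// unpacked literal run, including any trailing padding zeros.
class ToyLevelDecoder {
 public:
  explicit ToyLevelDecoder(std::vector<uint8_t> encoded)
      : encoded_(std::move(encoded)) {}

  // Unbounded caching: loads everything that is physically stored,
  // padding included.
  void CacheAllEncoded() { CacheNextBatch(encoded_.size()); }

  // Bounded caching: loads at most 'num_vals' values. If the caller
  // passes the number of values actually left, padding is never cached.
  void CacheNextBatch(std::size_t num_vals) {
    const std::size_t n = std::min(num_vals, encoded_.size() - pos_);
    const auto first = encoded_.begin() + static_cast<std::ptrdiff_t>(pos_);
    cache_.insert(cache_.end(), first, first + static_cast<std::ptrdiff_t>(n));
    pos_ += n;
  }

  std::size_t CacheRemaining() const { return cache_.size(); }

 private:
  std::vector<uint8_t> encoded_;
  std::vector<uint8_t> cache_;
  std::size_t pos_ = 0;
};

// The invariant from the DCHECK, expressed on the toy model.
inline bool InvariantHolds(const ToyLevelDecoder& decoder,
                           std::size_t num_buffered_values) {
  return decoder.CacheRemaining() <= num_buffered_values;
}
```

For a run with 5 real levels followed by 3 padding zeros,
CacheAllEncoded() leaves 8 cached values and breaks the invariant for
5 buffered values, while CacheNextBatch(5) leaves exactly 5.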

Testing
 * Before this patch, TestParquetStats::test_page_index was flaky
   because of this issue.
 * I tested the solution on a hacked Impala that randomly generated
   skip ranges.

Change-Id: Ic071473e7b315300fd5e163225d3e39735f09c4f
Reviewed-on: http://gerrit.cloudera.org:8080/17071
Reviewed-by: Zoltan Borok-Nagy <borokna...@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
---
M be/src/exec/parquet/parquet-column-readers.cc
M be/src/exec/parquet/parquet-level-decoder.h
2 files changed, 18 insertions(+), 3 deletions(-)

Approvals:
  Zoltan Borok-Nagy: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/17071
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ic071473e7b315300fd5e163225d3e39735f09c4f
Gerrit-Change-Number: 17071
Gerrit-PatchSet: 5
Gerrit-Owner: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
