Yongzhi Chen has uploaded this change for review. ( http://gerrit.cloudera.org:8080/16019 )


Change subject: IMPALA-9809: A query with multi-aggregation functions on 
particular dataset crashes impala daemon
......................................................................

IMPALA-9809: A query with multi-aggregation functions on particular
dataset crashes impala daemon

In streaming-aggregation-node.cc, when replicate_input_ is true
and num_aggs > 1, AddBatchStreaming is called more than once, and
every call uses the same out_batch.
If a row is not cached, its value is saved in out_batch and
out_batch's row count is increased.
The row count is not reset to 0 on the next iteration of the while
loop, so not every tuple in out_batch is guaranteed to be non-null.
(For example, in the rows added when agg_idx = 1, only tuple 1 is
non-null; in the rows added when agg_idx = 2, only tuple 2 is
non-null.) But in grouping-aggregator-ir.cc, the serialization code
for a given agg_idx starts from the very beginning of out_batch, so
it has a good chance of hitting a null tuple.

Fix the issue by serializing only the tuples added by the
current function call.
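
The idea behind the fix can be illustrated with a minimal, standalone
sketch. This is not the actual Impala code: Tuple, Row, OutBatch and
AddBatchStreamingSketch are hypothetical stand-ins, and "serializing" is
reduced to summing an int. It only shows why the loop should start at the
row count recorded on entry rather than at row 0.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the row batch types; names are illustrative only.
struct Tuple { int value; };

struct Row {
  // One slot per aggregator. Only the slot of the aggregator that added
  // the row is non-null.
  std::vector<const Tuple*> tuples;
};

struct OutBatch {
  std::vector<Row> rows;
  int num_rows() const { return static_cast<int>(rows.size()); }
};

// Appends one pass-through row per input tuple for aggregator agg_idx, then
// "serializes" (here: sums) only the rows appended by this call.
// The caller must keep `input` alive while `out` is in use.
int AddBatchStreamingSketch(OutBatch* out, int agg_idx, int num_aggs,
                            const std::vector<Tuple>& input) {
  // Remember where this call's rows begin in the shared batch.
  const int first_new_row = out->num_rows();
  for (const Tuple& t : input) {
    Row row;
    row.tuples.assign(static_cast<std::size_t>(num_aggs), nullptr);
    row.tuples[agg_idx] = &t;
    out->rows.push_back(row);
  }
  int sum = 0;
  // Starting at 0 instead of first_new_row would dereference a null slot
  // for rows that an earlier call added on behalf of another aggregator.
  for (int i = first_new_row; i < out->num_rows(); ++i) {
    sum += out->rows[i].tuples[agg_idx]->value;
  }
  return sum;
}
```

Calling the function twice on the same OutBatch with agg_idx 0 and then 1
leaves four rows in the batch, and the second call touches only the two rows
it added.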

Tests:
Manual tests
Unit tests

Change-Id: I06d73171cdc40bdbb15960573030ac7fc94a7e16
---
M be/src/exec/grouping-aggregator-ir.cc
A testdata/data/local_parquet_tbl/_SUCCESS
A testdata/data/local_parquet_tbl/part-00000-fafc2cd0-f5c8-4fbb-ac3f-717447d67af8-c000.snappy.parquet
A testdata/data/local_parquet_tbl/part-00001-fafc2cd0-f5c8-4fbb-ac3f-717447d67af8-c000.snappy.parquet
A testdata/data/local_parquet_tbl/part-00002-fafc2cd0-f5c8-4fbb-ac3f-717447d67af8-c000.snappy.parquet
A testdata/data/local_parquet_tbl/part-00003-fafc2cd0-f5c8-4fbb-ac3f-717447d67af8-c000.snappy.parquet
A testdata/data/local_parquet_tbl/part-00004-fafc2cd0-f5c8-4fbb-ac3f-717447d67af8-c000.snappy.parquet
A testdata/data/local_parquet_tbl/part-00005-fafc2cd0-f5c8-4fbb-ac3f-717447d67af8-c000.snappy.parquet
A testdata/data/local_parquet_tbl/part-00006-fafc2cd0-f5c8-4fbb-ac3f-717447d67af8-c000.snappy.parquet
A testdata/data/local_parquet_tbl/part-00007-fafc2cd0-f5c8-4fbb-ac3f-717447d67af8-c000.snappy.parquet
A testdata/data/local_parquet_tbl/part-00008-fafc2cd0-f5c8-4fbb-ac3f-717447d67af8-c000.snappy.parquet
A testdata/data/local_parquet_tbl/part-00009-fafc2cd0-f5c8-4fbb-ac3f-717447d67af8-c000.snappy.parquet
A testdata/data/local_parquet_tbl/part-00010-fafc2cd0-f5c8-4fbb-ac3f-717447d67af8-c000.snappy.parquet
A testdata/data/local_parquet_tbl/part-00011-fafc2cd0-f5c8-4fbb-ac3f-717447d67af8-c000.snappy.parquet
A testdata/workloads/functional-query/queries/QueryTest/min-multiple-distinct-aggs.test
M tests/query_test/test_aggregation.py
16 files changed, 30 insertions(+), 3 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/16019/1
--
To view, visit http://gerrit.cloudera.org:8080/16019
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I06d73171cdc40bdbb15960573030ac7fc94a7e16
Gerrit-Change-Number: 16019
Gerrit-PatchSet: 1
Gerrit-Owner: Yongzhi Chen <yc...@cloudera.com>
