Daniel Becker has uploaded this change for review.
( http://gerrit.cloudera.org:8080/20997 )


Change subject: IMPALA-12783: Nested struct with varlen data crashes
......................................................................

IMPALA-12783: Nested struct with varlen data crashes

If a struct ("main") is within an array and contains two child structs
("s1" and "s2") which both contain strings (or other varlen data),
Impala crashes when this struct is re-materialised (for example in a
sort with limit) if codegen is enabled.

To reproduce:

In Hive:
 create table nested (arr ARRAY<STRUCT<s1: STRUCT<str1: STRING>, s2:
   STRUCT<str2: STRING>>>) stored as parquet;
 insert into nested values (array( named_struct("s1",
   named_struct("str1", "A string that is long"), "s2",
   named_struct("str2", "Another string that is long") )));

In Impala:
 select 1, arr from nested order by 1 limit 1;

This is because in the codegen'd code, when checking if the strings
("str1" and "str2" in the example) are NULL, we incorrectly calculate
the offset of their null indicator bytes from the memory address of
their containing struct, not from the beginning of the "master tuple",
which in this case is the item tuple of the array.

Note that the null indicators of struct members are always at the end of
the tuple containing the struct (recursively), i.e. the master tuple.

This change corrects the behaviour, passing the master tuple to
functions that need it.
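
As an illustration of the offset problem described above, here is a minimal
C++ sketch. It is not Impala's code: the tuple layout, the constants and the
names (NullIndicator, IsNull, BuggyNullByteIndex) are all hypothetical and
only model the idea that null indicator offsets are relative to the master
tuple, not to the nested struct.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical layout of one array item tuple (NOT Impala's real layout):
//   bytes  0..15  slot for main.s1.str1
//   bytes 16..31  slot for main.s2.str2
//   byte  32      null indicator byte shared by the whole tuple
constexpr std::size_t kS1Offset = 0;
constexpr std::size_t kS2Offset = 16;
constexpr std::size_t kNullByteOffset = 32;
constexpr std::size_t kTupleSize = 33;

// Location of a null bit. byte_offset is relative to the start of the
// master tuple, never to the start of a nested struct.
struct NullIndicator {
  std::size_t byte_offset;
  std::uint8_t bit_mask;
};

constexpr NullIndicator kStr1Null{kNullByteOffset, 0x01};
constexpr NullIndicator kStr2Null{kNullByteOffset, 0x02};

// Correct lookup: the base pointer is the master tuple.
bool IsNull(const std::uint8_t* master_tuple, const NullIndicator& ni) {
  return (master_tuple[ni.byte_offset] & ni.bit_mask) != 0;
}

// Index (relative to the master tuple) of the byte that would be read if
// the containing struct's address were mistakenly used as the base.
std::size_t BuggyNullByteIndex(std::size_t struct_offset,
                               const NullIndicator& ni) {
  return struct_offset + ni.byte_offset;
}
```

In this sketch "s1" starts at offset 0, so the wrong base happens to coincide
with the right one. For "s2" the wrong base yields index 16 + 32 = 48, which
lies past the end of the 33-byte tuple.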

Testing:
 - extended the column 'arr_contains_nested_struct' in table
   'collection_struct_mix' to include two nested structs with string
   members. Updated existing queries, which now cover the problem.

Change-Id: Ide2b63f8b18633f38fbe939a17db923606ccb101
---
M be/src/runtime/descriptors.cc
M be/src/runtime/descriptors.h
M testdata/datasets/functional/functional_schema_template.sql
M testdata/workloads/functional-query/queries/QueryTest/mixed-collections-and-structs.test
M testdata/workloads/functional-query/queries/QueryTest/sort-complex.test
M testdata/workloads/functional-query/queries/QueryTest/top-n-complex.test
6 files changed, 85 insertions(+), 75 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/97/20997/1
--
To view, visit http://gerrit.cloudera.org:8080/20997
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ide2b63f8b18633f38fbe939a17db923606ccb101
Gerrit-Change-Number: 20997
Gerrit-PatchSet: 1
Gerrit-Owner: Daniel Becker <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Noemi Pap-Takacs <[email protected]>
