Michael Ho created IMPALA-6258:
----------------------------------
Summary: Uninitialized tuple pointers in row batch for empty rows
Key: IMPALA-6258
URL: https://issues.apache.org/jira/browse/IMPALA-6258
Project: IMPALA
Issue Type: Bug
Components: Backend
Affects Versions: Impala 2.11.0
Reporter: Michael Ho
During [code review|https://gerrit.cloudera.org/#/c/8623/] of IMPALA-6187, we
noticed that the tuple pointers in generated row batches may be left
uninitialized when a tuple has byte size 0. It is unclear whether there are edge
cases in which the code dereferences these uninitialized tuple pointers.
[~tarmstrong] came up with the following example, in which the scan of t3
produces zero-byte tuples (tuple-ids=2, row-size=0B) that feed the nullable side
of an outer join:
{noformat}
SELECT /* +straight_join */ COUNT(t1.id)
FROM functional.alltypessmall t1
LEFT OUTER JOIN (
SELECT /* +straight_join */ IFNULL(t2.int_col, 1) AS c
FROM functional.alltypessmall t2
LEFT OUTER JOIN functional.alltypestiny t3 ON t2.id < 1000
) v ON t1.int_col = v.c;
{noformat}
The relevant part of the plan is:
{noformat}
04:HASH JOIN [LEFT OUTER JOIN, PARTITIONED]
|  hash predicates: t1.int_col = if(TupleIsNull(1, 2), NULL, ifnull(t2.int_col, 1))
|  fk/pk conjuncts: assumed fk/pk
|  mem-estimate=1.94MB mem-reservation=1.94MB spill-buffer=64.00KB
|  tuple-ids=0,1N,2N row-size=16B cardinality=100
|
|--08:EXCHANGE [HASH(if(TupleIsNull(1, 2), NULL, ifnull(t2.int_col, 1)))]
|  |  mem-estimate=0B mem-reservation=0B
|  |  tuple-ids=1,2N row-size=8B cardinality=100
|  |
|  F01:PLAN FRAGMENT [RANDOM] hosts=3 instances=3
|  Per-Host Resources: mem-estimate=32.00MB mem-reservation=0B
|  03:NESTED LOOP JOIN [LEFT OUTER JOIN, BROADCAST]
|  |  join predicates: t2.id < 1000
|  |  mem-estimate=0B mem-reservation=0B
|  |  tuple-ids=1,2N row-size=8B cardinality=100
|  |
|  |--06:EXCHANGE [BROADCAST]
|  |  |  mem-estimate=0B mem-reservation=0B
|  |  |  tuple-ids=2 row-size=0B cardinality=8
|  |  |
|  |  F02:PLAN FRAGMENT [RANDOM] hosts=3 instances=3
|  |  Per-Host Resources: mem-estimate=32.00MB mem-reservation=0B
|  |  02:SCAN HDFS [functional.alltypestiny t3, RANDOM]
|  |     partitions=4/4 files=4 size=460B
|  |     stats-rows=8 extrapolated-rows=disabled
|  |     table stats: rows=8 size=unavailable
|  |     column stats: all
|  |     mem-estimate=32.00MB mem-reservation=0B
|  |     tuple-ids=2 row-size=0B cardinality=8
|  |
|  01:SCAN HDFS [functional.alltypessmall t2, RANDOM]
|     partitions=4/4 files=4 size=6.32KB
|     stats-rows=100 extrapolated-rows=disabled
|     table stats: rows=100 size=unavailable
|     column stats: all
|     mem-estimate=32.00MB mem-reservation=0B
|     tuple-ids=1 row-size=8B cardinality=100
{noformat}
We need to investigate whether these uninitialized pointers can cause problems.
If they can, we should fix them by pointing each empty tuple at a dummy
non-NULL address.
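
A minimal sketch of the dummy-pointer idea, for discussion only. The names below (Tuple, EmptyTuplePtr, TuplePtrFor, and this simplified TupleIsNull) are hypothetical stand-ins and are not Impala's actual classes or functions. The sketch assumes that a NULL tuple pointer is what marks "no match" on the nullable side of an outer join, so a zero-byte tuple needs some stable non-NULL address to stay distinguishable from that case.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a backend tuple type; not Impala's real class.
struct Tuple {};

// One shared byte whose address serves as the pointer for every zero-byte
// tuple. It is never read or written through the Tuple* handed out below.
static uint8_t kEmptyTupleStorage = 0;

inline Tuple* EmptyTuplePtr() {
  return reinterpret_cast<Tuple*>(&kEmptyTupleStorage);
}

// Returns the pointer to store in a row's tuple slot. 'buffer' is where the
// data of a non-empty tuple would live; it is ignored when byte_size == 0.
inline Tuple* TuplePtrFor(size_t byte_size, uint8_t* buffer) {
  if (byte_size == 0) return EmptyTuplePtr();
  return reinterpret_cast<Tuple*>(buffer);
}

// Simplified null check (assumption): nullptr means "no matching tuple".
inline bool TupleIsNull(const Tuple* t) { return t == nullptr; }
```

With this approach every slot in a row is written explicitly, so a check such as TupleIsNull(1, 2) in the plan above would see either a deliberate NULL or a valid non-NULL address, never leftover memory.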