[ https://issues.apache.org/jira/browse/IMPALA-11280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17554286#comment-17554286 ]
ASF subversion and git services commented on IMPALA-11280:
----------------------------------------------------------
Commit 2744f46fbd921dafe9b63f4a0011b2237ee07c5f in impala's branch
refs/heads/master from Gabor Kaszab
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=2744f46fb ]
IMPALA-11280: Join node incorrectly picks up unnest(array) predicates
Predicates on unnested arrays are expected to be evaluated by either the
SCAN node or the UNNEST node. If only one array is unnested, the SCAN
node evaluates them; otherwise the UNNEST node does. However, if a JOIN
node is involved and it is constructed before the UNNEST node is
created, the JOIN node incorrectly picks up the predicates on the
unnested arrays as well. This patch fixes that behaviour.
Tests:
- Added E2E tests to cover result correctness.
- Added planner tests to verify that the desired node picks up the
predicates for unnested arrays.
Change-Id: I89fed4eef220ca513b259f0e2649cdfbe43c797a
Reviewed-on: http://gerrit.cloudera.org:8080/18614
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
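
The placement rule described in the commit message above can be sketched as
a small toy model. This is not Impala planner code: the function names and
the string node labels are hypothetical, chosen only to restate the rule
(one unnested array: SCAN; several: UNNEST; never JOIN).

```python
# Toy model of the predicate placement rule from the commit message.
# Hypothetical names; this does not mirror Impala's actual planner classes.

def node_for_unnest_predicate(num_unnested_arrays: int) -> str:
    """Return the plan node expected to evaluate predicates on unnested arrays."""
    if num_unnested_arrays < 1:
        raise ValueError("at least one array must be unnested")
    # A single unnested array is handled by the SCAN node; with several
    # arrays (zipping unnest) the UNNEST node is responsible instead.
    return "SCAN" if num_unnested_arrays == 1 else "UNNEST"


def may_evaluate_unnest_predicate(node_kind: str, num_unnested_arrays: int) -> bool:
    """True only for the single node that should pick up the predicate."""
    return node_kind == node_for_unnest_predicate(num_unnested_arrays)


if __name__ == "__main__":
    # The repro in the issue unnests two arrays, so only UNNEST qualifies.
    for kind in ("SCAN", "JOIN", "UNNEST"):
        print(kind, may_evaluate_unnest_predicate(kind, 2))
```

Under this model a JOIN node never qualifies, which is the behaviour the
patch is meant to restore.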
> Zipping unnest hits DCHECK when querying from a view that has an IN operator
> ----------------------------------------------------------------------------
>
> Key: IMPALA-11280
> URL: https://issues.apache.org/jira/browse/IMPALA-11280
> Project: IMPALA
> Issue Type: Bug
> Components: Frontend
> Affects Versions: Impala 4.1.0
> Reporter: Gabor Kaszab
> Assignee: Gabor Kaszab
> Priority: Major
> Labels: complextype
>
> *Repro steps:*
> 1) Create a view that returns arrays and has an IN operator in the WHERE
> clause:
> {code:java}
> drop view if exists unnest_bug_view;
> create view unnest_bug_view as (
>   select id, arr1, arr2
>   from functional_parquet.complextypes_arrays
>   where id % 2 = 1 and id in (select id from functional_parquet.alltypestiny)
> ); {code}
> 2) Unnest the arrays and filter by the unnested values in an outer SELECT:
> {code:java}
> select
>   id,
>   unnested_arr1,
>   unnested_arr2
> from
>   (select
>      id,
>      unnest(arr1) as unnested_arr1,
>      unnest(arr2) as unnested_arr2
>    from unnest_bug_view) a
> where a.unnested_arr1 < 5; {code}
> This hits a DCHECK in RowDescriptor::GetTupleIdx():
>
> {code:java}
> descriptors.cc:467] 5643fd6cdd5cece3:77942ead00000000] Check failed: id <
> tuple_idx_map_.size() (3 vs. 2) RowDescriptor: Tuple(id=0 size=29
> slots=[Slot(id=2 type=INT col_path=[0] offset=24 null=(offset=28 mask=4)
> slot_idx=2 field_idx=2), Slot(id=3 type=ARRAY col_path=[1]
> children_tuple_id=3 offset=0 null=(offset=28 mask=1) slot_idx=0 field_idx=0),
> Slot(id=5 type=ARRAY col_path=[2] children_tuple_id=4 offset=12
> null=(offset=28 mask=2) slot_idx=1 field_idx=1)] tuple_path=[])
> Tuple(id=1 size=5 slots=[Slot(id=0 type=INT col_path=[2] offset=0
> null=(offset=4 mask=1) slot_idx=0 field_idx=0)] tuple_path=[])
> *** Check failure stack trace: ***
> @ 0x36fe72c google::LogMessage::Fail()
> @ 0x36fffdc google::LogMessage::SendToLog()
> @ 0x36fe08a google::LogMessage::Flush()
> @ 0x3701c48 google::LogMessageFatal::~LogMessageFatal()
> @ 0x12e47ab impala::RowDescriptor::GetTupleIdx()
> @ 0x1b378f5 impala::SlotRef::Init()
> @ 0x1b25fea impala::ScalarExpr::Init()
> @ 0x1b665b2 impala::ScalarFnCall::Init()
> @ 0x1b2c44e impala::ScalarExpr::Create()
> @ 0x1b2c5df impala::ScalarExpr::Create()
> @ 0x1b2c6a0 impala::ScalarExpr::Create()
> @ 0x19ad286 impala::PartitionedHashJoinPlanNode::Init()
> @ 0x18b5d8d impala::PlanNode::CreateTreeHelper()
> @ 0x18b5cd9 impala::PlanNode::CreateTreeHelper()
> @ 0x18b5e48 impala::PlanNode::CreateTree()
> @ 0x12f4ca7 impala::FragmentState::Init()
> @ 0x12f839c impala::FragmentState::CreateFragmentStateMap()
> @ 0x126cedb impala::QueryState::StartFInstances()
> @ 0x125c4df impala::QueryExecMgr::ExecuteQueryHelper()
> {code}
>
>
> Some notes about the repro:
> - The inner SELECT on its own (without the filter on the unnested value)
> runs fine.
> - Unnesting only one array runs fine.
> - Removing the IN clause from the view’s DDL makes the query run fine.
>
> {*}Update{*}:
> I managed to reproduce this without creating an actual view, which might
> make the tuple/slot IDs simpler to follow during the investigation.
> {code:java}
> select id, unnested_arr1, unnested_arr2 from (
>   select id, unnest(arr1) as unnested_arr1, unnest(arr2) as unnested_arr2
>   from functional_parquet.complextypes_arrays
>   where id in (select id from functional_parquet.alltypestiny)) a
> where a.unnested_arr1 < 5 {code}