[ 
https://issues.apache.org/jira/browse/IMPALA-11280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17554286#comment-17554286
 ] 

ASF subversion and git services commented on IMPALA-11280:
----------------------------------------------------------

Commit 2744f46fbd921dafe9b63f4a0011b2237ee07c5f in impala's branch 
refs/heads/master from Gabor Kaszab
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=2744f46fb ]

IMPALA-11280: Join node incorrectly picks up unnest(array) predicates

Predicates on unnested arrays are expected to be evaluated by either
the SCAN node or the UNNEST node. If only one array is unnested, the
SCAN node evaluates them; otherwise the UNNEST node does. However, if
the plan contains a JOIN node that is constructed before the UNNEST
node is created, the JOIN node incorrectly picks up the predicates on
the unnested arrays as well. This patch fixes this behaviour.
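
The assignment rule described above can be sketched as follows. This is an
illustrative model only, not Impala's planner code: the names PlanNodeKind
and expected_evaluator are hypothetical and exist only to restate the rule
from the commit message.

```python
from enum import Enum


class PlanNodeKind(Enum):
    """Hypothetical labels for the plan nodes named in the commit message."""
    SCAN = "SCAN"
    UNNEST = "UNNEST"
    JOIN = "JOIN"


def expected_evaluator(num_unnested_arrays: int) -> PlanNodeKind:
    """Return the node expected to evaluate a predicate on an unnested array.

    Assumption: this only mirrors the rule stated in the commit message;
    the real planner logic lives in Impala's frontend.
    """
    if num_unnested_arrays < 1:
        raise ValueError("the rule applies only when at least one array is unnested")
    # One unnested array: the SCAN node evaluates the predicate.
    # Several unnested arrays (zipping unnest): the UNNEST node evaluates it.
    if num_unnested_arrays == 1:
        return PlanNodeKind.SCAN
    return PlanNodeKind.UNNEST


def is_valid_assignment(node: PlanNodeKind, num_unnested_arrays: int) -> bool:
    """A JOIN node is never the expected evaluator, whatever the array count."""
    return node is expected_evaluator(num_unnested_arrays)
```

In this model, is_valid_assignment(PlanNodeKind.JOIN, 2) is false, which
corresponds to the case the patch addresses.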

Tests:
  - Added E2E tests to cover result correctness.
  - Added planner tests to verify that the desired node picks up the
    predicates for unnested arrays.

Change-Id: I89fed4eef220ca513b259f0e2649cdfbe43c797a
Reviewed-on: http://gerrit.cloudera.org:8080/18614
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>


> Zipping unnest hits DCHECK when querying from a view that has an IN operator
> ----------------------------------------------------------------------------
>
>                 Key: IMPALA-11280
>                 URL: https://issues.apache.org/jira/browse/IMPALA-11280
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>    Affects Versions: Impala 4.1.0
>            Reporter: Gabor Kaszab
>            Assignee: Gabor Kaszab
>            Priority: Major
>              Labels: complextype
>
> *Repro steps:*
> 1) Create a view that returns arrays and has an IN operator in the WHERE 
> clause:
> {code:sql}
> drop view if exists unnest_bug_view;
> create view unnest_bug_view as (
>   select id, arr1, arr2
>   from functional_parquet.complextypes_arrays
>   where id % 2 = 1 and id in (select id from functional_parquet.alltypestiny)
> ); {code}
> 2) Unnest the arrays and filter by the unnested values in an outer SELECT:
> {code:sql}
> select
>   id,
>   unnested_arr1,
>   unnested_arr2
> from
>   (select
>      id,
>      unnest(arr1) as unnested_arr1,
>      unnest(arr2) as unnested_arr2
>    from unnest_bug_view) a
> where a.unnested_arr1 < 5; {code}
> This hits a DCHECK in RowDescriptor::GetTupleIdx():
>
> {code:java}
> descriptors.cc:467] 5643fd6cdd5cece3:77942ead00000000] Check failed: id < 
> tuple_idx_map_.size() (3 vs. 2) RowDescriptor: Tuple(id=0 size=29 
> slots=[Slot(id=2 type=INT col_path=[0] offset=24 null=(offset=28 mask=4) 
> slot_idx=2 field_idx=2), Slot(id=3 type=ARRAY col_path=[1] 
> children_tuple_id=3 offset=0 null=(offset=28 mask=1) slot_idx=0 field_idx=0), 
> Slot(id=5 type=ARRAY col_path=[2] children_tuple_id=4 offset=12 
> null=(offset=28 mask=2) slot_idx=1 field_idx=1)] tuple_path=[])
> Tuple(id=1 size=5 slots=[Slot(id=0 type=INT col_path=[2] offset=0 
> null=(offset=4 mask=1) slot_idx=0 field_idx=0)] tuple_path=[])
> *** Check failure stack trace: ***
>     @          0x36fe72c  google::LogMessage::Fail()
>     @          0x36fffdc  google::LogMessage::SendToLog()
>     @          0x36fe08a  google::LogMessage::Flush()
>     @          0x3701c48  google::LogMessageFatal::~LogMessageFatal()
>     @          0x12e47ab  impala::RowDescriptor::GetTupleIdx()
>     @          0x1b378f5  impala::SlotRef::Init()
>     @          0x1b25fea  impala::ScalarExpr::Init()
>     @          0x1b665b2  impala::ScalarFnCall::Init()
>     @          0x1b2c44e  impala::ScalarExpr::Create()
>     @          0x1b2c5df  impala::ScalarExpr::Create()
>     @          0x1b2c6a0  impala::ScalarExpr::Create()
>     @          0x19ad286  impala::PartitionedHashJoinPlanNode::Init()
>     @          0x18b5d8d  impala::PlanNode::CreateTreeHelper()
>     @          0x18b5cd9  impala::PlanNode::CreateTreeHelper()
>     @          0x18b5e48  impala::PlanNode::CreateTree()
>     @          0x12f4ca7  impala::FragmentState::Init()
>     @          0x12f839c  impala::FragmentState::CreateFragmentStateMap()
>     @          0x126cedb  impala::QueryState::StartFInstances()
>     @          0x125c4df  impala::QueryExecMgr::ExecuteQueryHelper()
> {code}
>
> Some notes about the repro:
>  - The inner select on its own (without the filter on the unnested value) 
> runs fine.
>  - If I unnest only one array, the query runs fine.
>  - If I remove the IN clause from the view’s DDL, the query runs fine.
>  
> {*}Update{*}:
> I managed to reproduce the issue without creating an actual view. This might 
> reduce the complexity of the tuple/slot IDs during the investigation.
> {code:sql}
> select id, unnested_arr1, unnested_arr2 from (
> select id, unnest(arr1) as unnested_arr1, unnest(arr2) as unnested_arr2
>   from functional_parquet.complextypes_arrays
>   where id in (select id from functional_parquet.alltypestiny)) a
> where a.unnested_arr1 < 5 {code}
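
The failed check in the trace above (3 vs. 2) can be modelled with a small
sketch. It assumes the row descriptor keeps a lookup table indexed by tuple
ID whose size is the largest tuple ID in the row plus one; that layout and
the helper names below are assumptions for illustration, not Impala's code.
The join's input row in the log contains tuples 0 and 1, while the array
slot for arr1 reports children_tuple_id=3, so a predicate on the unnested
value would plausibly refer to tuple 3.

```python
INVALID_IDX = -1  # placeholder for tuple IDs that are not part of the row


def build_tuple_idx_map(tuple_ids):
    """Map tuple ID -> position in the row (hypothetical model of the lookup)."""
    idx_map = [INVALID_IDX] * (max(tuple_ids) + 1)
    for position, tuple_id in enumerate(tuple_ids):
        idx_map[tuple_id] = position
    return idx_map


def get_tuple_idx(idx_map, tuple_id):
    """Look up a tuple's position, failing the way the logged check does."""
    if not tuple_id < len(idx_map):
        raise AssertionError(
            f"Check failed: id < tuple_idx_map_.size() ({tuple_id} vs. {len(idx_map)})"
        )
    return idx_map[tuple_id]


# The row descriptor in the log lists Tuple(id=0) and Tuple(id=1).
join_input_map = build_tuple_idx_map([0, 1])
```

Calling get_tuple_idx(join_input_map, 3) raises in this model, matching the
situation where the JOIN node is handed a predicate on a tuple that only
exists after unnesting.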



