[ 
https://issues.apache.org/jira/browse/IMPALA-13272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17871310#comment-17871310
 ] 

ASF subversion and git services commented on IMPALA-13272:
----------------------------------------------------------

Commit c4230cff2c25b205d20b78553d4d28a799f26ed2 in impala's branch 
refs/heads/master from Daniel Becker
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=c4230cff2 ]

IMPALA-13272: Analytic function of collections can lead to crash

The following query hits a DCHECK in debug builds and may cause more
subtle issues in RELEASE builds:

  select
    row_no
  from (
    select
      arr.small,
      row_number() over (order by arr.inner_struct1.str) as row_no
    from functional_parquet.collection_struct_mix t,
        t.arr_contains_nested_struct arr
  ) res;

The problem is in AnalyticPlanner.createSortInfo(). Because the query
unnests an array, there are two tuples from which we try to add slot
descriptors to the sorting tuple: the array item tuple (which we need)
and the main tuple (which we do not). The main tuple contains the slot
desc for the array. That slot is marked as materialised, so we add it to
the sorting tuple, but its child 'small' is not materialised, which
leads to the error.

This change solves the problem by only adding slot descs to the sorting
tuple if they are fully materialised, i.e. they and all their children
recursively are also materialised.
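
For illustration only, the "fully materialised" check can be modelled as
a recursive walk over a slot and its children. The sketch below is a
simplified Python model, not Impala's actual Java planner code: SlotDesc,
is_fully_materialized() and slots_for_sort_tuple() are hypothetical names,
and the slot layout is reduced to the two slots discussed above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SlotDesc:
    """Hypothetical stand-in for a planner slot descriptor."""
    label: str
    materialized: bool
    children: List["SlotDesc"] = field(default_factory=list)


def is_fully_materialized(slot: SlotDesc) -> bool:
    """True only if the slot and every descendant are materialised."""
    return slot.materialized and all(
        is_fully_materialized(child) for child in slot.children
    )


def slots_for_sort_tuple(candidates: List[SlotDesc]) -> List[SlotDesc]:
    """Keep only the candidates that are safe to add to the sorting tuple."""
    return [slot for slot in candidates if is_fully_materialized(slot)]


# Main tuple: the array slot is materialised, but its child 'small' is not.
small = SlotDesc("small", materialized=False)
arr = SlotDesc("arr_contains_nested_struct", materialized=True, children=[small])

# Array item tuple: the slot that the ORDER BY actually needs.
order_key = SlotDesc("inner_struct1.str", materialized=True)

selected = slots_for_sort_tuple([arr, order_key])
print([slot.label for slot in selected])  # prints ['inner_struct1.str']
```

A check that only looked at the top-level flag would accept the array
slot here, which corresponds to the failing case described above; the
recursive check rejects it because of the unmaterialised child.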

Testing:
 - added test queries in sort-complex.test.

Change-Id: I71d1fa28ad4ff2e1a8fc5b91d3fc271c33765190
Reviewed-on: http://gerrit.cloudera.org:8080/21643
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>


> Analytic function of collections can lead to crash
> --------------------------------------------------
>
>                 Key: IMPALA-13272
>                 URL: https://issues.apache.org/jira/browse/IMPALA-13272
>             Project: IMPALA
>          Issue Type: Bug
>    Affects Versions: Impala 4.4.0
>            Reporter: Csaba Ringhofer
>            Assignee: Daniel Becker
>            Priority: Critical
>
> Using Impala's test data, the following query hits a DCHECK in debug builds 
> and may cause more subtle issues in RELEASE builds:
> {code}
> select
>   row_no
> from (
>          select
>                arr.small,
>                row_number() over (
>                 order by arr.inner_struct1.str) as row_no
>          from functional_parquet.collection_struct_mix t,
>               t.arr_contains_nested_struct arr
>        ) res
> {code}
> The following DCHECK is hit:
> {code}
> tuple.h:296 Check failed: offset != -1
> {code}
> The problem seems to be with arr.small, which is referenced in the inline 
> view but not used in the outer query: removing it from the inline view or 
> adding it to the outer select avoids the bug. The problem seems related to 
> materialization: offset == -1 means that the slot is not materialized, but 
> the Parquet scanner still tries to materialize it.
> It is not clear yet which commit introduced the bug or whether this is a bug 
> in the planner or the backend.


