[
https://issues.apache.org/jira/browse/IMPALA-2490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Gabor Kaszab updated IMPALA-2490:
---------------------------------
Labels: complextype debugging supportability usability (was: debugging
supportability usability)
> RowsReturned profile counter may be wrong with nested types
> -----------------------------------------------------------
>
> Key: IMPALA-2490
> URL: https://issues.apache.org/jira/browse/IMPALA-2490
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Affects Versions: Impala 2.3.0
> Reporter: Matthew Jacobs
> Assignee: Abhishek Rawat
> Priority: Major
> Labels: complextype, debugging, supportability, usability
>
> We don't have a consistent way of accounting for rows returned by operators
> across Close/Reset cycles. We should be able to determine the total number
> of rows returned by an operator, but the counter is reset in
> ExecNode::Reset, and the accounting appears to be wrong in some places.
> When executing the following query, the row count reported by the nested
> loop join (NLJ) operator appears to be wrong:
> {code}
> create table test3 as select t1.field_102.field_104.field_107 c1
> FROM table_3 t1
> INNER JOIN t1.field_86 t2
> INNER JOIN t1.field_102.field_104.field_108.field_110 t3
> INNER JOIN table_5 t4
> WHERE
> NOT EXISTS (SELECT
> tt1.pos AS int_col
> FROM t1.field_102.field_104.field_108.field_110 tt1
> CROSS JOIN t1.field_86 tt2
> WHERE
> ((tt1.pos) IN (tt1.pos, -581.8)) AND (((t1.field_85) = (tt2.key)) AND
> ((t1.field_82) = (tt2.value.field_94))))
> {code}
> The number of rows inserted by the table sink (RowsInserted: 615.57M) does
> not match the number of rows returned by its child, the NLJ
> (RowsReturned: 1.14B):
> {code}
> HdfsTableSink:(Total: 1m31s, non-child: 1m31s, % non-child: 100.00%)
> - BytesWritten: 5.36 GB (5760571200)
> - CompressTimer: 0ns
> - EncodeTimer: 1m22s
> - FilesCreated: 1 (1)
> - FinalizePartitionFileTimer: 14.38ms
> - HdfsWriteTimer: 8s058ms
> - PartitionsCreated: 1 (1)
> - PeakMemoryUsage: 50.00 KB (51200)
> - RowsInserted: 615.57M (615574800)
> - TmpFileCreateTimer: 14.754ms
> NESTED_LOOP_JOIN_NODE (id=12):(Total: 1m33s, non-child: 1m31s, % non-child: 98.01%)
> - BuildRows: 600 (600)
> - BuildTime: 32.750us
> - PeakMemoryUsage: 4.09 MB (4284416)
> - ProbeRows: 1.02K (1024)
> - ProbeTime: 0ns
> - RowsReturned: 1.14B (1136695648)
> - RowsReturnedRate: 12.22 M/sec
> {code}
> The code used to increment/set the rows_returned_counter_ does not appear to
> be correct.
> There is no workaround.
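
To make the accounting problem concrete, here is a minimal sketch of one way to keep a per-cycle count and a running total separate, so that a reset does not lose rows already returned. The RowCounter class and all of its member names are hypothetical and for illustration only; this is not Impala's ExecNode code and not necessarily how the issue was or should be fixed.

```cpp
#include <cstdint>

// Hypothetical stand-in for an operator's row accounting.
// This is NOT Impala's ExecNode; all names here are assumptions.
class RowCounter {
 public:
  // Record rows produced during the current cycle.
  void AddRows(int64_t n) { rows_this_cycle_ += n; }

  // Called when the operator is reset for another cycle.
  // The per-cycle count is folded into the running total before it is
  // cleared, so rows from earlier cycles are not lost.
  void Reset() {
    total_rows_ += rows_this_cycle_;
    rows_this_cycle_ = 0;
  }

  // Rows produced since the last Reset().
  int64_t rows_this_cycle() const { return rows_this_cycle_; }

  // Rows produced over the operator's whole lifetime, including the
  // cycle that is still in progress.
  int64_t total_rows() const { return total_rows_ + rows_this_cycle_; }

 private:
  int64_t rows_this_cycle_ = 0;
  int64_t total_rows_ = 0;
};
```

With this sketch, calling AddRows(3), Reset(), AddRows(4) leaves rows_this_cycle() at 4 and total_rows() at 7. A profile counter that is meant to be compared against a parent's input, such as RowsInserted above, would correspond to the lifetime total rather than the per-cycle value.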