[ https://issues.apache.org/jira/browse/IMPALA-9176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17046011#comment-17046011 ]
Tim Armstrong commented on IMPALA-9176:
---------------------------------------
I think the way I'd like to tackle this is with an auxiliary structure of
FlatRowPtr objects. I looked at the amount of code required to have separate
iterators, and it would add a lot of duplication. Iterating over the stream
also has a lot of overhead: we do it once per probe row, and each pass has to
fix up all the string and collection pointers for every row. The auxiliary
structure does add 8 bytes per row, but I think it's worth it.
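
A rough sketch of the idea, for illustration only. It does not use Impala's
actual classes: PinnedStream, Row, BuildRowIndex and CountMatches are
hypothetical stand-ins, and a plain pointer (8 bytes on a 64-bit build) plays
the role that a FlatRowPtr would play. The point is that the stream's single
stateful read iterator is used once to build the index, after which every
probe is a read-only scan that can be shared.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical row layout. Real rows would be tuples containing string and
// collection pointers; two integers are enough to show the access pattern.
struct Row {
  int64_t key;
  int64_t payload;
};

// Hypothetical stand-in for a pinned stream. It exposes a single built-in
// read iterator whose position is mutable state inside the stream.
class PinnedStream {
 public:
  void AddRow(const Row& row) { rows_.push_back(row); }

  void PrepareForRead() { read_pos_ = 0; }

  // Advances the one shared read position, so concurrent readers would
  // interfere with each other.
  bool GetNext(const Row** row) {
    if (read_pos_ >= rows_.size()) return false;
    *row = &rows_[read_pos_++];
    return true;
  }

 private:
  std::vector<Row> rows_;
  std::size_t read_pos_ = 0;
};

// Builds the auxiliary structure with one pass over the stream. The pointers
// stay valid only while no more rows are added, which models "pinned".
std::vector<const Row*> BuildRowIndex(PinnedStream* stream) {
  assert(stream != nullptr);
  std::vector<const Row*> index;
  stream->PrepareForRead();
  const Row* row = nullptr;
  while (stream->GetNext(&row)) index.push_back(row);
  return index;
}

// Read-only probe over the index. It never touches the stream's iterator
// state, so several callers can run it against the same index.
std::size_t CountMatches(const std::vector<const Row*>& index,
                         int64_t probe_key) {
  std::size_t matches = 0;
  for (const Row* row : index) {
    if (row->key == probe_key) ++matches;
  }
  return matches;
}
```

In this sketch the cost of walking the stream is paid once in BuildRowIndex
rather than once per probe row, at the price of one pointer per build row.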
> Make access to null-aware partition from PartitionedHashJoinNode read-only
> --------------------------------------------------------------------------
>
> Key: IMPALA-9176
> URL: https://issues.apache.org/jira/browse/IMPALA-9176
> Project: IMPALA
> Issue Type: Improvement
> Components: Backend
> Reporter: Tim Armstrong
> Assignee: Tim Armstrong
> Priority: Major
> Labels: multithreading
>
> Currently the accesses to null_aware_partition() are logically read-only
> (since the rows and other state are not mutated) and only touch the build
> rows when they are pinned, but they are implemented using the built-in read
> iterator of BufferedTupleStream. This would prevent sharing of the build
> side for null-aware anti-join.
> We need to either allow multiple read iterators for a pinned stream, or build
> an auxiliary structure, e.g. an array of Tuple ptrs or FlatRowPtr.