[
https://issues.apache.org/jira/browse/DRILL-6754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16624335#comment-16624335
]
Paul Rogers commented on DRILL-6754:
------------------------------------
As it turns out, the idea of using an SV2 to reorder a record batch is not
actually that useful. A query is distributed across multiple parallel
fragments, each of which produces a stream of batches. SQL ORDER BY applies to
the whole set, not just a single batch.
The only place that an SV2 is used to reorder a batch is within the external
sort. These batches are then merged (to create an SV4) or merged with batches
spilled to disk.
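
To make the per-batch reordering concrete, here is a minimal sketch of sorting
a batch through an indirection vector. It is not Drill code: the class and
method names are hypothetical, the sort key is assumed to be a single long
column, and the SV2 is modelled as a plain int[] of row indices rather than
Drill's own selection vector class.

```java
import java.util.Arrays;

// Hypothetical illustration, not part of Drill.
final class Sv2SortSketch {
    private Sv2SortSketch() {}

    /**
     * Builds a selection vector that lists the rows of one batch in ascending
     * key order. The key column itself is left untouched; only the returned
     * index array says in which order the rows should be read.
     */
    static int[] sortedSelection(long[] keys) {
        // Start with the identity mapping: row i is read at position i.
        Integer[] idx = new Integer[keys.length];
        for (int i = 0; i < idx.length; i++) {
            idx[i] = i;
        }
        // Sort the indices by the key they point at (stable for objects).
        Arrays.sort(idx, (a, b) -> Long.compare(keys[a], keys[b]));

        // Unbox into the int[] that stands in for the SV2.
        int[] sv2 = new int[idx.length];
        for (int i = 0; i < sv2.length; i++) {
            sv2[i] = idx[i];
        }
        return sv2;
    }
}
```

The result orders a single batch only. Several such batches still have to be
merged before the output satisfies ORDER BY.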
Nowhere else in Drill will we ever use an SV2 for batch reordering. And, in
sort, we would not know if the SV2 actually reorders a batch, or if the batch
was already sorted, without doing a scan of the batch.
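
The scan mentioned above could look roughly like the following sketch. It is
not Drill code: the helper name is hypothetical, and the SV2 is again modelled
as a plain int[] of row indices. The point is only that the answer costs one
pass over every entry of the vector.

```java
// Hypothetical illustration, not part of Drill.
final class Sv2OrderCheck {
    private Sv2OrderCheck() {}

    /**
     * Returns true if the selection vector visits rows out of their original
     * order, that is, if any index is smaller than the one before it.
     * A filter-only vector (indices in increasing order) returns false.
     */
    static boolean reordersBatch(int[] sv2) {
        for (int i = 1; i < sv2.length; i++) {
            if (sv2[i] < sv2[i - 1]) {
                return true; // found a row read before an earlier row
            }
        }
        return false; // indices never decrease: original order preserved
    }
}
```

Setting a "reorders" flag on the SV2 from inside sort would require running
a check like this on every batch, whether or not anyone reads the flag.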
Perhaps this ticket could explain the situation in which batch reordering might
be useful?
> Add a field to SV2 to indicate if the SV2 reorders the Record Batch
> -------------------------------------------------------------------
>
> Key: DRILL-6754
> URL: https://issues.apache.org/jira/browse/DRILL-6754
> Project: Apache Drill
> Issue Type: Improvement
> Reporter: Karthikeyan Manivannan
> Assignee: Karthikeyan Manivannan
> Priority: Major
>
> The optimization in DRILL-6687 is not correct if an SV2 is used to re-order
> rows in the record batch. Currently, this is not a problem because none of
> the reordering operators (SORT, TOPN) use an SV2. SORT has code for SV2 but
> it is disabled.
> Adding a field to SV2 to indicate if the SV2 reorders the Record Batch would
> allow the safe application of the DRILL-6687 optimization.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)