[ 
https://issues.apache.org/jira/browse/DRILL-5350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers updated DRILL-5350:
-------------------------------
    Fix Version/s:     (was: 1.11.0)
                   1.12.0

Feature was implemented, but implementation had a bug. Deferring until a proper 
solution can be worked out. (Turns out to be quite hard to simply shuffle 
vectors from one container to another...)

> Performance: skip merge for single-batch sort
> ---------------------------------------------
>
>                 Key: DRILL-5350
>                 URL: https://issues.apache.org/jira/browse/DRILL-5350
>             Project: Apache Drill
>          Issue Type: Sub-task
>    Affects Versions: 1.10.0
>            Reporter: Paul Rogers
>            Assignee: Paul Rogers
>            Priority: Minor
>             Fix For: 1.12.0
>
>
> The external sort uses the classic two-step sort/merge process:
> * Sort each incoming batch. (Optionally spill batches when needed.)
> * Merge batches to create the final output.
> The external sort uses two distinct merge phases: one if all batches are in 
> memory, another if some batches were spilled. The in-memory merge is the 
> faster of the two.
> A special case occurs when the sort sees only a single batch of data. In this 
> case, that one batch is already sorted: there is no reason to also run the 
> merge phase. Skipping the merge will speed up small "operational" queries.
> The effect of the optimization was measured using low-level unit tests that 
> set up the sort and measured just the sort run time, omitting normal query 
> overhead. Each run consisted of two phases. In the first phase, the test code 
> was run five times to warm the JVM and Drill code cache. Then, the "money" 
> run ran another five times. Run times were then averaged.
> Data consisted of 64K rows of a very simple schema: (INT, VARCHAR(5)).
> Run time without the optimization: 39 ms.
> Run time with the optimization: 25 ms.
> The result is about a 36% reduction in run time (39 ms down to 25 ms).
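
The single-batch shortcut described in the quoted issue can be sketched roughly as follows. This is a minimal illustration on plain int arrays, not Drill's external sort code: the class SingleBatchSortSketch and its methods are hypothetical names, and real Drill batches are vector containers rather than arrays.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical sketch of sort/merge with a single-batch shortcut; not Drill code. */
public class SingleBatchSortSketch {

    /** Sorts each batch, then merges the sorted runs unless only one batch arrived. */
    public static int[] sortBatches(List<int[]> batches) {
        List<int[]> sorted = new ArrayList<>();
        for (int[] batch : batches) {
            int[] copy = batch.clone();
            Arrays.sort(copy);              // step 1: sort each incoming batch
            sorted.add(copy);
        }
        if (sorted.isEmpty()) {
            return new int[0];
        }
        if (sorted.size() == 1) {
            return sorted.get(0);           // one batch: already sorted, skip the merge
        }
        return merge(sorted);               // step 2: k-way merge of the sorted runs
    }

    /** In-memory k-way merge using a min-heap keyed on the current value of each run. */
    private static int[] merge(List<int[]> runs) {
        int total = 0;
        for (int[] run : runs) {
            total += run.length;
        }
        // Heap entries are {value, run index, position within that run}.
        PriorityQueue<int[]> heap =
            new PriorityQueue<>((a, b) -> Integer.compare(a[0], b[0]));
        for (int i = 0; i < runs.size(); i++) {
            if (runs.get(i).length > 0) {
                heap.add(new int[] {runs.get(i)[0], i, 0});
            }
        }
        int[] out = new int[total];
        int n = 0;
        while (!heap.isEmpty()) {
            int[] top = heap.poll();
            out[n++] = top[0];
            int[] run = runs.get(top[1]);
            int next = top[2] + 1;
            if (next < run.length) {
                heap.add(new int[] {run[next], top[1], next});
            }
        }
        return out;
    }
}
```

In this sketch the shortcut is a single size check. As the comment at the top of this message notes, the equivalent step in Drill (handing vectors from one container to another) turned out to be harder than it looks.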

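The measurement procedure in the quoted description (five warm-up runs, then five timed runs, averaged) could look something like the sketch below. The class WarmupTimingSketch and the method averageMillis are hypothetical names for illustration. The actual Drill unit tests are not shown in this message and may be structured differently.

```java
/** Hypothetical timing harness: warm up first, then average the timed runs. */
public class WarmupTimingSketch {

    /**
     * Runs the task warmupRuns times without timing it, then measuredRuns
     * times with timing, and returns the mean measured time in milliseconds.
     */
    public static double averageMillis(Runnable task, int warmupRuns, int measuredRuns) {
        for (int i = 0; i < warmupRuns; i++) {
            task.run();                     // warm the JVM; results are discarded
        }
        long totalNanos = 0;
        for (int i = 0; i < measuredRuns; i++) {
            long start = System.nanoTime();
            task.run();
            totalNanos += System.nanoTime() - start;
        }
        return totalNanos / 1_000_000.0 / measuredRuns;
    }
}
```

Calling averageMillis(sortTask, 5, 5) with a task that sets up and runs the sort would mirror the procedure described, under the assumption that each run is independent of the others.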


--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
