Pritesh Maker updated DRILL-5350:
    Fix Version/s:     (was: 1.13.0)

> Performance: skip merge for single-batch sort
> ---------------------------------------------
>                 Key: DRILL-5350
>                 URL: https://issues.apache.org/jira/browse/DRILL-5350
>             Project: Apache Drill
>          Issue Type: Improvement
>    Affects Versions: 1.10.0
>            Reporter: Paul Rogers
>            Assignee: Paul Rogers
>            Priority: Minor
> The external sort uses the classic two-step sort/merge process:
> * Sort each incoming batch. (Optionally spill batches when needed.)
> * Merge batches to create the final output.
> The external sort uses two distinct merge phases: one if all batches are in 
> memory, another if some batches were spilled. The in-memory merge is the 
> faster of the two.
> A special case occurs when the sort sees only a single batch of data. In this 
> case, that one batch is already sorted: there is no reason to also run the 
> merge phase. Skipping the merge will speed up small "operational" queries.
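The single-batch shortcut described above can be sketched as follows. This is only an illustration, not Drill's actual code: the class and method names are hypothetical, and plain int arrays stand in for record batches.

```java
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch: int[] stands in for a Drill record batch.
class SingleBatchSortSketch {

    // Step 1: sort each batch. Step 2: merge, unless only one batch arrived.
    static int[] sortBatches(List<int[]> batches) {
        for (int[] batch : batches) {
            Arrays.sort(batch);
        }
        if (batches.isEmpty()) {
            return new int[0];
        }
        if (batches.size() == 1) {
            // The lone batch is already fully sorted, so the merge is skipped.
            return batches.get(0);
        }
        return merge(batches);
    }

    // K-way merge of sorted batches using a min-heap.
    // Heap entries are {value, batchIndex, offsetInBatch}.
    static int[] merge(List<int[]> batches) {
        int total = 0;
        for (int[] batch : batches) {
            total += batch.length;
        }
        PriorityQueue<int[]> heap =
            new PriorityQueue<>((a, b) -> Integer.compare(a[0], b[0]));
        for (int i = 0; i < batches.size(); i++) {
            if (batches.get(i).length > 0) {
                heap.add(new int[] {batches.get(i)[0], i, 0});
            }
        }
        int[] out = new int[total];
        int n = 0;
        while (!heap.isEmpty()) {
            int[] top = heap.poll();
            out[n++] = top[0];
            int[] source = batches.get(top[1]);
            int next = top[2] + 1;
            if (next < source.length) {
                heap.add(new int[] {source[next], top[1], next});
            }
        }
        return out;
    }
}
```

In this sketch the single-batch path returns the sorted input array itself, so no copy and no heap are involved; that is the work the optimization avoids.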
> The effect of the optimization was measured using low-level unit tests that 
> set up the sort and measured just the sort run time, omitting normal query 
> overhead. Each run consisted of two phases. In the first phase, the test code 
> was run five times to warm the JVM and Drill code cache. Then, the "money" 
> run ran another five times. Run times were then averaged.
> Data consisted of 64K rows of a very simple schema: (INT, VARCHAR(5)).
> Run time without the optimization: 39 ms.
> Run time with the optimization: 25 ms.
> The result is about a 36% reduction in run time (39 ms to 25 ms), or roughly 
> a 1.56x speedup.
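A minimal sketch of the measurement procedure described above (five warm-up runs, then five timed runs, averaged), together with the arithmetic on the two reported run times. The class and method names are hypothetical; the actual Drill unit tests are not shown in this message.

```java
// Hypothetical timing helper; not taken from the Drill test code.
class SortTimingSketch {

    // Runs the work warmupRuns times untimed, then measuredRuns times timed,
    // and returns the average timed duration in milliseconds.
    static double averageMillis(Runnable work, int warmupRuns, int measuredRuns) {
        for (int i = 0; i < warmupRuns; i++) {
            work.run();
        }
        long totalNanos = 0;
        for (int i = 0; i < measuredRuns; i++) {
            long start = System.nanoTime();
            work.run();
            totalNanos += System.nanoTime() - start;
        }
        return totalNanos / 1_000_000.0 / measuredRuns;
    }

    // Percentage by which the run time dropped, relative to the original time.
    static double percentReduction(double beforeMs, double afterMs) {
        return (beforeMs - afterMs) / beforeMs * 100.0;
    }

    // How many times faster the new run is than the old one.
    static double speedup(double beforeMs, double afterMs) {
        return beforeMs / afterMs;
    }
}
```

With the figures quoted above, percentReduction(39, 25) is about 35.9 and speedup(39, 25) is 1.56.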

This message was sent by Atlassian JIRA
