[jira] [Updated] (DRILL-5350) Performance: skip merge for single-batch sort

Paul Rogers (JIRA) Thu, 06 Apr 2017 14:06:06 -0700

     [ 
https://issues.apache.org/jira/browse/DRILL-5350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Paul Rogers updated DRILL-5350:
-------------------------------
    Issue Type: Sub-task  (was: Improvement)
        Parent: DRILL-5325

> Performance: skip merge for single-batch sort
> ---------------------------------------------
>
>                 Key: DRILL-5350
>                 URL: https://issues.apache.org/jira/browse/DRILL-5350
>             Project: Apache Drill
>          Issue Type: Sub-task
>    Affects Versions: 1.10.0
>            Reporter: Paul Rogers
>            Assignee: Paul Rogers
>            Priority: Minor
>             Fix For: 1.11.0
>
>
> The external sort uses the classic two-step sort/merge process:
> * Sort each incoming batch. (Optionally spill batches when needed.)
> * Merge batches to create the final output.
> The external sort uses two distinct merge phases: one if all batches are in 
> memory, another if some batches were spilled. The in-memory merge is the 
> faster of the two.
> A special case occurs when the sort sees only a single batch of data. In this 
> case, that one batch is already sorted: there is no reason to also run the 
> merge phase. Skipping the merge will speed up small "operational" queries.
> The effect of the optimization was measured using low-level unit tests that 
> set up the sort and measured just the sort run time, omitting normal query 
> overhead. Each run consisted of two phases. In the first phase, the test code 
> was run five times to warm the JVM and Drill code cache. Then, the "money" 
> run was executed another five times. Run times were then averaged.
> Data consisted of 64K rows of a very simple schema: (INT, VARCHAR(5)).
> Run time without the optimization: 39 ms.
> Run time with the optimization: 25 ms.
> The result is about a 36% reduction in run time (39 ms down to 25 ms).
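
As a rough illustration of the single-batch shortcut described in the issue, the Java sketch below sorts each batch and merges only when more than one batch arrived. The class SingleBatchSortSketch, its methods, and the use of int[] arrays as "batches" are hypothetical stand-ins, not Drill's actual external sort code, which works on record batches and may spill to disk.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

/** Illustrative only: a hypothetical stand-in, not Drill's external sort. */
final class SingleBatchSortSketch {

    /** Sorts each batch on its own, then merges only if more than one batch arrived. */
    static int[] sortBatches(List<int[]> batches) {
        List<int[]> sorted = new ArrayList<>();
        for (int[] batch : batches) {
            int[] copy = batch.clone();
            Arrays.sort(copy);            // step 1: sort each incoming batch
            sorted.add(copy);
        }
        if (sorted.isEmpty()) {
            return new int[0];
        }
        if (sorted.size() == 1) {
            return sorted.get(0);         // single batch: already sorted, skip the merge
        }
        return merge(sorted);             // step 2: merge the sorted batches
    }

    /** In-memory k-way merge using a min-heap of {value, batchIndex, offset} cursors. */
    static int[] merge(List<int[]> sorted) {
        PriorityQueue<int[]> heap =
            new PriorityQueue<>((a, b) -> Integer.compare(a[0], b[0]));
        int total = 0;
        for (int i = 0; i < sorted.size(); i++) {
            int[] batch = sorted.get(i);
            total += batch.length;
            if (batch.length > 0) {
                heap.add(new int[] {batch[0], i, 0});
            }
        }
        int[] out = new int[total];
        int n = 0;
        while (!heap.isEmpty()) {
            int[] cur = heap.poll();      // smallest remaining value across all batches
            out[n++] = cur[0];
            int[] batch = sorted.get(cur[1]);
            int next = cur[2] + 1;
            if (next < batch.length) {
                heap.add(new int[] {batch[next], cur[1], next});
            }
        }
        return out;
    }
}
```

With a single batch, sortBatches returns right after the per-batch sort; the heap-based merge runs only when two or more batches are present.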

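The measurement procedure described in the issue (five warm-up runs, then five timed runs whose times are averaged) might look something like the following Java sketch. SortTimingSketch and averageMillis are hypothetical names chosen for illustration; the actual Drill unit tests are not shown in this message.

```java
/** Illustrative only: a hypothetical timing helper, not Drill's test code. */
final class SortTimingSketch {

    /**
     * Runs the task warmupRuns times untimed (to warm the JVM), then
     * measuredRuns times timed, and returns the mean run time in milliseconds.
     */
    static double averageMillis(Runnable task, int warmupRuns, int measuredRuns) {
        if (measuredRuns <= 0) {
            throw new IllegalArgumentException("measuredRuns must be positive");
        }
        for (int i = 0; i < warmupRuns; i++) {
            task.run();                               // warm-up: result discarded
        }
        long totalNanos = 0;
        for (int i = 0; i < measuredRuns; i++) {
            long start = System.nanoTime();
            task.run();                               // timed run
            totalNanos += System.nanoTime() - start;
        }
        return totalNanos / 1_000_000.0 / measuredRuns;
    }
}
```

A call such as averageMillis(sortTask, 5, 5) would correspond to the five warm-up and five measured runs described above, where sortTask is whatever Runnable sets up and runs the sort under test.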


--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
