Paul Rogers created DRILL-5366:
----------------------------------
Summary: Use generic copier for wide rows in external sort
Key: DRILL-5366
URL: https://issues.apache.org/jira/browse/DRILL-5366
Project: Apache Drill
Issue Type: Sub-task
Affects Versions: 1.10.0
Reporter: Paul Rogers
Assignee: Paul Rogers
Fix For: 1.11.0
The external sort makes use of a "priority copier" to copy rows at two times:
* When merging data during spilling
* When merging data for an in-memory sort
As with all such Drill operators, the code works by generating two local
variables per column, then generating two blocks of code per column (one for
setup, one for the actual copy.)
This works fine for rows with few columns. But, in rows with many columns (such
as for queries against JSON documents), the amount of code produced becomes
very large. This introduces extra overhead to generate, compile and store the
extra code.
DRILL-5125 found the same issue in the Selection Vector remover. By applying
the fix from that ticket to the external sort, we reap 24% savings.
Consider a unit test that runs only the copier. Create a single record batch
with 1000 columns and 64K rows. Use the copier to produce a set of smaller
output batches. Such a test factors out all the overhead of running a query.
* Run time for a generated copier: 17 secs.
* Run time with the generic copier: 13 secs
* Savings: 4 seconds or 24%.
(13 seconds is still a very long time to process 64K rows. There may be
optimizations to be had in the priority queue implementation as well, but that
is a separate issue.)
To be conservative, provide a config option to enable the feature, perhaps by
setting a threshold of the number of columns that must be present to use the
generic version. That way, if folks feel that the generated version is faster
for narrow rows, the generated version can be used. And each user can decide
the point at which the costs of bulky code outweighs the performance costs.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)