justinborromeo opened a new issue #7036: [Proposal] K-Way Merge for 
Time-Ordered Scans
URL: https://github.com/apache/incubator-druid/issues/7036
 
 
   # Motivation
   
   The motivation is to support time-ordering for Scan queries that return more than 100,000 rows.  PR #7024 (currently in review) added the ability to time-order small result sets (<100K rows) in memory.  It would be nice to extend this capability to larger result sets so that SELECT queries can be avoided entirely (time ordering is the only advantage SELECT has over Scan).  However, sorting larger result sets in memory runs the risk of Broker nodes encountering OOMEs, so a different approach is needed.  See #6088 for more information.
   
   # Proposed Changes
   
   ## Overview
   
   Scan queries return rows in a streaming manner.  Each Historical executes a query using a single-threaded sequential scan of the target segments, and the query runner streams rows to the client/Broker as they're read.
   
   ## Required changes
   
   To properly implement this feature, a k-way merge needs to occur at both the Historical level (to merge time-ordered rows from each segment) and the Broker level (to merge rows from each Historical and from the cache) before time-ordered results can be returned to the client.
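
   As background, the k-way merge that both levels rely on can be sketched with a min-heap holding the current head of each sorted input.  This is a generic illustration, not Druid code; all names here are made up for the example:

   ```java
   import java.util.*;

   public class KWayMergeDemo {
       // Generic k-way merge of already-sorted timestamp lists using a min-heap.
       // Heap entries are {timestamp, listIndex, elementIndex}.
       static List<Long> kWayMerge(List<List<Long>> sortedLists) {
           PriorityQueue<long[]> heap =
               new PriorityQueue<>(Comparator.comparingLong((long[] e) -> e[0]));
           for (int i = 0; i < sortedLists.size(); i++) {
               if (!sortedLists.get(i).isEmpty()) {
                   heap.add(new long[]{sortedLists.get(i).get(0), i, 0});
               }
           }
           List<Long> merged = new ArrayList<>();
           while (!heap.isEmpty()) {
               long[] top = heap.poll();
               merged.add(top[0]);           // smallest remaining timestamp
               int list = (int) top[1];
               int next = (int) top[2] + 1;
               if (next < sortedLists.get(list).size()) {
                   // Refill the heap with the next row from the same input.
                   heap.add(new long[]{sortedLists.get(list).get(next), list, next});
               }
           }
           return merged;
       }

       public static void main(String[] args) {
           List<List<Long>> perSegmentTimestamps = Arrays.asList(
               Arrays.asList(1L, 4L, 9L),
               Arrays.asList(2L, 3L, 10L),
               Arrays.asList(5L, 6L, 7L, 8L));
           System.out.println(kWayMerge(perSegmentTimestamps));
           // [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
       }
   }
   ```

   The heap holds at most one entry per input, so the merge bookkeeping itself is only O(k) in memory; the OOME risk comes from materializing the inputs, not from the merge.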
   
   ## Principles
   
   - The change should not affect the performance of non-time-ordered scan 
queries.
   - The performance impact of sorting should be as small as possible.
   
   ## Implementation
   
   ### Historical-level
   
   Since the existing scan query runner on the Historical cannot see all the target segments at once, a k-way merge is impossible with the existing query runner.  I propose creating a new query runner factory, called something like `OrderedScanQueryRunnerFactory`, which would return a single runner when mergeRunners() is called.  This runner would open each segment file, perform a k-way merge across the segments, and return a Sequence of ScanResultValues.  The number of segments queried would have to be capped at some value to prevent the server from running out of memory.
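
   A rough sketch of what such a runner could do, assuming each segment yields a time-ordered iterator: lazily merge the per-segment heads into one ordered stream, failing fast when the segment count exceeds the cap.  All names are illustrative, not Druid's actual API:

   ```java
   import java.util.*;

   public class OrderedSegmentMerge {
       // Hypothetical sketch: merge per-segment, already time-ordered iterators
       // into one lazy, time-ordered iterator, enforcing a segment-count cap.
       static Iterator<Long> mergeSegments(List<Iterator<Long>> segments, int maxSegmentsToTimeOrder) {
           if (segments.size() > maxSegmentsToTimeOrder) {
               throw new IllegalStateException(
                   "Cannot time-order " + segments.size()
                   + " segments; limit is " + maxSegmentsToTimeOrder);
           }
           // Min-heap holds one "head" entry per segment: {timestamp, segmentIndex}.
           PriorityQueue<long[]> heads =
               new PriorityQueue<>(Comparator.comparingLong((long[] e) -> e[0]));
           for (int i = 0; i < segments.size(); i++) {
               if (segments.get(i).hasNext()) {
                   heads.add(new long[]{segments.get(i).next(), i});
               }
           }
           return new Iterator<Long>() {
               @Override public boolean hasNext() { return !heads.isEmpty(); }
               @Override public Long next() {
                   long[] top = heads.poll();
                   int seg = (int) top[1];
                   if (segments.get(seg).hasNext()) {
                       // Pull the next row only when needed, keeping the merge streaming.
                       heads.add(new long[]{segments.get(seg).next(), seg});
                   }
                   return top[0];
               }
           };
       }
   }
   ```

   Because rows are pulled one at a time, this keeps only one in-flight row per segment in memory, which is why the cap applies to the number of segments rather than the number of rows.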
   
   ### Broker-level
   
   The k-way merge at the Broker level is easier to implement because the existing `Sequence.flatMerge()` can be used.
   
   # Changed Interfaces
   
   - No breaking API changes.
   - Rows won't be returned with their segmentId (same as in #7024).
   - New properties:
   
       - `druid.query.scan.maxSegmentsToTimeOrder`
           - Description: The maximum number of segments that a Historical is allowed to scan in a time-ordered query.  The query fails if the number of segments is above this limit.
           - Where: Historical
           - Default: TBD
           - Format: int
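
   For illustration, the property would be set in a Historical's runtime.properties like any other `druid.query.scan.*` property (the value shown is a placeholder, since the default is TBD):

   ```properties
   # Cap on segments per time-ordered Scan query; value here is illustrative only.
   druid.query.scan.maxSegmentsToTimeOrder=100
   ```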
   
   # Migration / Operational impact
   
   No migration needed.
   
   # Future work
   
   A file-based k-way merge will be needed for large queries that touch more segments than the threshold allows.
   
   
   
   Feel free to let me know if something I wrote is incorrect from a technical 
point of view.
