paul-rogers commented on PR #13168: URL: https://github.com/apache/druid/pull/13168#issuecomment-1272683109
Now we can check our understanding of how your feature fits into the stack. There are two parts: sorts and merges. Let's tackle the sort first. There are multiple ways to handle the custom sort. One (undesirable) way is to let the Broker do the work:

```text
3. Broker: unbatch results from h historicals into a single array (or list), sort it, and return the sorted results as a sequence
4. Historical: concat results from s segments into a single sequence
5. Segment: concat results from c cursors into a single sequence
6. Cursor: unordered read of r rows into b batches
```

This works, but it puts an unwanted amount of load (in memory and compute) on the Broker. Better to distribute the sort. My earlier note suggested doing the sort at the segment level -- in part because that is easiest to explain:

```text
3. Broker: ordered merge of results from h historicals, merging the rows from `ScanResultValue` batches to create new batches
4. Historical: as above, for the s segments
5. Segment: unbatch results from c cursors into a single array (or list), sort it, and return the sorted results as a sequence
6. Cursor: unordered read of r rows into b batches
```

The advantage of the above is that if steps 5 and 6 run in a single thread, we do one big sort rather than a bunch of small sorts followed by a big merge. My hunch is that the single big sort would be faster. Of course, if we ran step 6 in a separate thread for each cursor, then we'd want maximum parallelism, and so we'd want to use the approach which I think you prefer:

```text
4. Historical: ordered merge of s segments into a single sequence
5. Segment: ordered merge of c cursors into a single sequence
5.5. Ordered merge of b batches into a single sequence
6. Cursor: unordered read of r rows into b batches, sort each batch
```

There is one more variation possible:

```text
5. Segment: ordered merge of c cursors into a single sequence
5.5. Combine all r rows into a single list and sort
6. Cursor: unordered read of r rows into b batches
```

All of these variations work. There is just a trade-off between the cost of the sort and the cost of the merge. The memory footprint is the same: all rows need to be in memory somewhere in order to be sorted. They are either all in one big list (the Broker approach) or in multiple smaller lists (the other approaches).

*Editorial aside: this kind of detailed cost trade-off and query planning is best done by software, such as the Calcite planner. We humans find the above mind-numbingly complex. As a result, a human-generated query plan is often some combination of messy, buggy, and sub-optimal. This is one reason we're discussing moving to a more traditional operator-and-planner based approach: let the computer do the boring stuff.*
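Steps like "ordered merge of b batches" above amount to a k-way merge of already-sorted runs. A minimal sketch of that idea in Java, with plain integer lists standing in for `ScanResultValue` batches -- the class and method names here are hypothetical, not Druid APIs:

```java
import java.util.*;

public class BatchMerge {
    /**
     * K-way merge of already-sorted batches into one sorted list,
     * using a heap keyed on each batch's next unconsumed value.
     */
    static List<Integer> mergeSortedBatches(List<List<Integer>> batches) {
        // Heap entries: {value, batchIndex, offsetWithinBatch}
        PriorityQueue<int[]> heap = new PriorityQueue<>(Comparator.comparingInt(e -> e[0]));
        for (int b = 0; b < batches.size(); b++) {
            if (!batches.get(b).isEmpty()) {
                heap.add(new int[]{batches.get(b).get(0), b, 0});
            }
        }
        List<Integer> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            int[] top = heap.poll();
            out.add(top[0]);
            // Advance within the batch the smallest value came from.
            List<Integer> batch = batches.get(top[1]);
            int next = top[2] + 1;
            if (next < batch.size()) {
                heap.add(new int[]{batch.get(next), top[1], next});
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Three pre-sorted batches, e.g. one per cursor.
        List<List<Integer>> batches = List.of(
            List.of(1, 4, 9),
            List.of(2, 3, 10),
            List.of(5, 6, 7)
        );
        System.out.println(mergeSortedBatches(batches));
        // prints [1, 2, 3, 4, 5, 6, 7, 9, 10]
    }
}
```

The same shape works at every level of the stack: merging batches within a cursor, cursors within a segment, segments within a historical, and historicals at the Broker.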
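To make the trade-off concrete, here is a hedged sketch contrasting the two extremes: one big sort of all unbatched rows (the Broker-style plan) versus sorting each batch locally and then merging (the distributed plan). The class and method names are illustrative only; the point is that both paths return the same ordered rows, so the choice is purely about where the cost lands:

```java
import java.util.*;

public class SortVsMerge {
    /** Variant A (Broker-style): unbatch everything into one list, then one big sort. */
    static List<Integer> bigSort(List<List<Integer>> batches) {
        List<Integer> all = new ArrayList<>();
        batches.forEach(all::addAll);
        Collections.sort(all);
        return all;
    }

    /** Variant B (distributed-style): sort each batch locally, then a 2-way ordered merge. */
    static List<Integer> sortThenMerge(List<Integer> left, List<Integer> right) {
        List<Integer> a = new ArrayList<>(left);
        List<Integer> b = new ArrayList<>(right);
        Collections.sort(a);   // small local sorts...
        Collections.sort(b);
        List<Integer> merged = new ArrayList<>();
        int i = 0, j = 0;
        while (i < a.size() || j < b.size()) {   // ...followed by a merge
            if (j >= b.size() || (i < a.size() && a.get(i) <= b.get(j))) {
                merged.add(a.get(i++));
            } else {
                merged.add(b.get(j++));
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        // Unordered rows as they might arrive from two cursor reads.
        List<Integer> batch1 = List.of(9, 1, 4);
        List<Integer> batch2 = List.of(10, 2, 3);
        System.out.println(bigSort(List.of(batch1, batch2)));
        System.out.println(sortThenMerge(batch1, batch2));
        // Both print [1, 2, 3, 4, 9, 10]: same rows, same order, different cost profile.
    }
}
```

Either way, every row is held in memory at some point, which matches the observation above that the memory footprint is the same across all the variations.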
