paul-rogers commented on PR #13168:
URL: https://github.com/apache/druid/pull/13168#issuecomment-1272673941

   @599166320, you've taken on a helpful improvement in a complex area. It 
would be great if we could be in the same room and use a whiteboard to reach a 
common understanding of how Druid works in this area, and of the design of your 
improvement. The next best thing is for me to state my assumptions, so we can 
see whether we have that shared understanding.
   
   We need sorting to fit into the existing scan query structure. Here's my 
understanding of the gist of a scan query "logical plan", from the root 
(client) down to the leaf (scan):
   
   ```text
   1. Project scan rows into SQL rows.
   2. Unbatch `ScanResultValue` batches into rows.
   3. Broker: merge results from h historicals into a single sequence
   4. Historical: merge results from s segments into a single sequence
   5. Segment: merge results from c cursors into a single sequence
   6. Cursor: read r rows into b batches each of size < ~20K rows, so that b = 
ceil(r / 20K).
   ```
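   
   The batch arithmetic in step 6 can be sketched as below. This is only an 
illustration: the class name `BatchCountSketch` and the constant `BATCH_SIZE` 
are hypothetical, and the value 20,000 is taken from the "~20K" figure above 
rather than from any actual Druid setting.
   
   ```java
   class BatchCountSketch {
       // Assumption: a fixed limit of 20,000 rows per batch, from the "~20K" above.
       static final long BATCH_SIZE = 20_000;

       /** Returns b = ceil(r / BATCH_SIZE) using integer arithmetic only. */
       static long batchCount(long rows) {
           return (rows + BATCH_SIZE - 1) / BATCH_SIZE;
       }

       public static void main(String[] args) {
           // 50,000 rows need two full batches plus one partial batch.
           System.out.println(batchCount(50_000)); // prints 3
       }
   }
   ```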
   
   If our query has no sort order, the results are unordered. In SQL, this 
means that the order is non-deterministic. So, the easy way to do the above 
merges is just a first-in, first-out concatenation.
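   
   A minimal sketch of such an unordered merge, using plain Java iterators as 
a stand-in for Druid's sequences. The class `ConcatSketch` and its `concat` 
method are hypothetical names for illustration, not Druid APIs.
   
   ```java
   import java.util.ArrayList;
   import java.util.Arrays;
   import java.util.Iterator;
   import java.util.List;
   import java.util.NoSuchElementException;

   class ConcatSketch {
       /** Lazily drains each child in arrival order; rows are never compared. */
       static <T> Iterator<T> concat(List<Iterator<T>> children) {
           Iterator<Iterator<T>> outer = children.iterator();
           return new Iterator<T>() {
               private Iterator<T> current = null;

               @Override
               public boolean hasNext() {
                   // Skip over exhausted (or empty) children.
                   while ((current == null || !current.hasNext()) && outer.hasNext()) {
                       current = outer.next();
                   }
                   return current != null && current.hasNext();
               }

               @Override
               public T next() {
                   if (!hasNext()) {
                       throw new NoSuchElementException();
                   }
                   return current.next();
               }
           };
       }

       public static void main(String[] args) {
           List<Iterator<Integer>> children = new ArrayList<>();
           children.add(Arrays.asList(3, 1).iterator());
           children.add(Arrays.asList(2).iterator());
           List<Integer> out = new ArrayList<>();
           concat(children).forEachRemaining(out::add);
           System.out.println(out); // prints [3, 1, 2]
       }
   }
   ```
   
   The output order depends only on the order in which the children are 
listed, which is why this form of merge gives no ordering guarantee.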
   
   In Druid, segments are always sorted by time. So, if our query is ordered by 
`__time`, then the logical plan would be:
   
   ```text
   3. Broker: ordered merge of results from h historicals: merging the rows 
from `ScanResultValue` batches to create new batches
   4. Historical: as above, for the s segments
   5. Segment: concatenate the results from the c cursors since, I believe, the 
cursors are already time-ordered
   6. Cursor: read r rows which are time ordered
   ```
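   
   The ordered merge in steps 3 and 4 can be sketched as a heap-based k-way 
merge that re-batches its output. This is a sketch under stated assumptions, 
not Druid's implementation: `Row`, `Head`, `mergeByTime` and `batchSize` are 
hypothetical names, and the inputs are assumed to be already unbatched and 
individually sorted by time.
   
   ```java
   import java.util.ArrayList;
   import java.util.Arrays;
   import java.util.Iterator;
   import java.util.List;
   import java.util.PriorityQueue;

   class OrderedMergeSketch {
       /** Hypothetical row type: only the timestamp matters for ordering. */
       static final class Row {
           final long time;
           final String value;
           Row(long time, String value) { this.time = time; this.value = value; }
       }

       /** Pairs the next unread row of one input with the rest of that input. */
       private static final class Head {
           final Row row;
           final Iterator<Row> rest;
           Head(Row row, Iterator<Row> rest) { this.row = row; this.rest = rest; }
       }

       /** Merges time-sorted inputs into new batches of at most batchSize rows. */
       static List<List<Row>> mergeByTime(List<Iterator<Row>> inputs, int batchSize) {
           PriorityQueue<Head> heap =
               new PriorityQueue<>((a, b) -> Long.compare(a.row.time, b.row.time));
           for (Iterator<Row> in : inputs) {
               if (in.hasNext()) {
                   heap.add(new Head(in.next(), in));
               }
           }
           List<List<Row>> batches = new ArrayList<>();
           List<Row> batch = new ArrayList<>();
           while (!heap.isEmpty()) {
               Head h = heap.poll();          // earliest row across all inputs
               batch.add(h.row);
               if (batch.size() == batchSize) {
                   batches.add(batch);
                   batch = new ArrayList<>();
               }
               if (h.rest.hasNext()) {        // refill from the same input
                   heap.add(new Head(h.rest.next(), h.rest));
               }
           }
           if (!batch.isEmpty()) {
               batches.add(batch);
           }
           return batches;
       }

       public static void main(String[] args) {
           List<Iterator<Row>> inputs = new ArrayList<>();
           inputs.add(Arrays.asList(new Row(1, "a"), new Row(4, "d")).iterator());
           inputs.add(Arrays.asList(new Row(2, "b"), new Row(3, "c")).iterator());
           for (List<Row> b : mergeByTime(inputs, 3)) {
               StringBuilder sb = new StringBuilder();
               for (Row r : b) {
                   sb.append(r.time).append(' ');
               }
               System.out.println(sb.toString().trim());
           }
       }
   }
   ```
   
   The heap holds at most one row per input, so memory stays proportional to 
the number of inputs (h historicals or s segments) rather than to the number 
of rows.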
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

