599166320 commented on PR #13168:
URL: https://github.com/apache/druid/pull/13168#issuecomment-1270349887

   @paul-rogers 
   I have done the following work in this commit:
   
   1. For sorting on ordinary columns, I prevent the ScanQuery object from 
passing the orderBy/limit parameters down to the cursor while traversing the 
segment (that is, the simplest solution mentioned above).
   
   2. Improved existing unit tests and added new ones.
   
   3. Applied the performance optimization mentioned above.
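
   Item 1 above can be sketched roughly as follows. This is a minimal 
illustration with hypothetical names (`ScanQuery`, `withoutOrderByForCursor` 
are stand-ins, not Druid's actual classes): before the per-segment cursor is 
built, the orderBy/limit are stripped so the cursor performs a plain scan, and 
ordering is applied later, above the cursor.
   
   ```java
   import java.util.Collections;
   import java.util.List;
   
   public class StripOrderByExample {
     // Hypothetical stand-in for the query object; not Druid's ScanQuery.
     static class ScanQuery {
       final List<String> orderByColumns;
       final long limit;
   
       ScanQuery(List<String> orderByColumns, long limit) {
         this.orderByColumns = orderByColumns;
         this.limit = limit;
       }
   
       // Return a copy with no ordering and no limit: this is what the
       // per-segment cursor sees, so it just does a plain scan.
       ScanQuery withoutOrderByForCursor() {
         return new ScanQuery(Collections.emptyList(), Long.MAX_VALUE);
       }
     }
   
     public static void main(String[] args) {
       ScanQuery query = new ScanQuery(List.of("countryName"), 100);
       ScanQuery forCursor = query.withoutOrderByForCursor();
       // The cursor never receives the orderBy, so it cannot act on it;
       // sorting happens in a later operator.
       System.out.println(forCursor.orderByColumns.isEmpty());
     }
   }
   ```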
   
   
   
   The following points should be noted:
   
   I added an `n-way merge strategy` to ScanQueryRunnerFactory. In the end, I 
did not apply a limit to the generated Sequence<ScanResultValue> the way 
nWayMergeAndLimit does, because I don't think it is necessary there.
   
   This is because the merge operation for both data nodes 
`(Historical/Realtime)` and `Brokers` is implemented in 
ScanQueryOrderByLimitRowIterator, which is where the limit is enforced.
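
   The n-way merge idea can be sketched like this (a simplified illustration, 
not Druid's actual code; `mergeSorted` and the use of plain integers are 
assumptions standing in for per-segment iterators of rows): each input is 
already sorted on the orderBy key, and a priority queue merges them into one 
globally sorted stream. Note that no limit is applied here; that is left to 
the downstream iterator, as described above.
   
   ```java
   import java.util.ArrayList;
   import java.util.Comparator;
   import java.util.List;
   import java.util.PriorityQueue;
   
   public class NWayMergeExample {
     // Merge k sorted lists into one sorted list using a min-heap.
     // Heap entries are {value, inputIndex, positionWithinInput}.
     static List<Integer> mergeSorted(List<List<Integer>> sortedInputs) {
       PriorityQueue<int[]> heap =
           new PriorityQueue<>(Comparator.comparingInt(e -> e[0]));
       for (int i = 0; i < sortedInputs.size(); i++) {
         if (!sortedInputs.get(i).isEmpty()) {
           heap.add(new int[]{sortedInputs.get(i).get(0), i, 0});
         }
       }
       List<Integer> merged = new ArrayList<>();
       while (!heap.isEmpty()) {
         int[] top = heap.poll();
         merged.add(top[0]);                      // emit the smallest head
         List<Integer> src = sortedInputs.get(top[1]);
         int next = top[2] + 1;
         if (next < src.size()) {
           heap.add(new int[]{src.get(next), top[1], next}); // refill from same input
         }
       }
       return merged;
     }
   
     public static void main(String[] args) {
       List<List<Integer>> inputs =
           List.of(List.of(1, 4, 7), List.of(2, 5), List.of(3, 6, 8));
       System.out.println(mergeSorted(inputs)); // [1, 2, 3, 4, 5, 6, 7, 8]
     }
   }
   ```
   
   The merge is O(N log k) for N total rows across k inputs, which is why 
per-segment sorted results can be combined cheaply at a higher level.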
   
   
   
   Another thing to note is that I did not combine the sorting of `__time` 
with the sorting of ordinary columns, because `__time` is special. In Druid, 
`__time` is effectively a special index, whereas sorting on an ordinary column 
must traverse all of the data before it can produce sorted results.
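
   To illustrate the contrast (a hypothetical sketch with made-up names, not 
Druid's implementation): rows in a segment are stored in `__time` order, so a 
`__time` ordering can stream rows and stop early at the limit, while ordering 
on an ordinary column has to read every row into memory and sort before 
anything can be emitted.
   
   ```java
   import java.util.ArrayList;
   import java.util.Comparator;
   import java.util.Iterator;
   import java.util.List;
   
   public class TimeVsOrdinaryOrdering {
     record Row(long time, String country) {}
   
     // __time ordering: input is already time-sorted, so just take the
     // first `limit` rows and stop; no full traversal needed.
     static List<Row> orderByTime(Iterator<Row> timeOrderedRows, int limit) {
       List<Row> out = new ArrayList<>();
       while (timeOrderedRows.hasNext() && out.size() < limit) {
         out.add(timeOrderedRows.next());
       }
       return out;
     }
   
     // Ordinary column: must traverse every row, then sort, then limit.
     static List<Row> orderByCountry(Iterator<Row> rows, int limit) {
       List<Row> all = new ArrayList<>();
       rows.forEachRemaining(all::add);   // full traversal is unavoidable
       all.sort(Comparator.comparing(Row::country));
       return all.subList(0, Math.min(limit, all.size()));
     }
   
     public static void main(String[] args) {
       List<Row> segment = List.of(
           new Row(1L, "NZ"), new Row(2L, "AU"), new Row(3L, "FR"));
       System.out.println(orderByTime(segment.iterator(), 2));
       System.out.println(orderByCountry(segment.iterator(), 2));
     }
   }
   ```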
   
   
   When you review again, please let me know if there is anything else to 
improve.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

