gortiz opened a new pull request, #17328:
URL: https://github.com/apache/pinot/pull/17328

   This PR reverts a theoretical optimization introduced in #2237.
   
   This optimization affects queries like:
   
   ```sql
   SELECT whatever
   FROM table
   ORDER BY col
   LIMIT x
   ```
   
   MSE blindly adds an exchange here (I think it is sometimes not needed, but 
that is not the point of this PR). We need a sort-with-limit in the 
receiver stage to preserve semantics, but we can also add a sort-with-limit in 
the sender stage to avoid sending extra blocks and, eventually, to enable 
k-merge sort.
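   
   As a rough illustration of why the sender-side sort-with-limit helps, here 
is a minimal Python sketch. It is not Pinot code: the function names and the 
in-memory lists standing in for sender stages are assumptions made for this 
example only.

```python
import heapq
import random


def sort_with_limit(rows, limit):
    """Return the `limit` smallest rows in ascending order."""
    return heapq.nsmallest(limit, rows)


def receiver_only(senders, limit):
    """Senders ship every row; only the receiver sorts and limits."""
    sent = [row for rows in senders for row in rows]
    return sort_with_limit(sent, limit), len(sent)


def sender_and_receiver(senders, limit):
    """Each sender sorts and limits first; the receiver does it again."""
    blocks = [sort_with_limit(rows, limit) for rows in senders]
    sent = [row for block in blocks for row in block]
    # The receiver-side sort-with-limit is still required for correctness.
    # Each block arrives already sorted, which is what a k-merge sort
    # could exploit later.
    return sort_with_limit(sent, limit), len(sent)


rng = random.Random(0)
# Four hypothetical sender stages with 50,000 rows each.
senders = [[rng.randrange(1_000_000) for _ in range(50_000)] for _ in range(4)]
limit = 20_000  # deliberately above the old 10k threshold

baseline, baseline_sent = receiver_only(senders, limit)
pushed, pushed_sent = sender_and_receiver(senders, limit)

assert baseline == pushed          # same final result
print(baseline_sent, pushed_sent)  # rows crossing the exchange
```

In this sketch each sender ships at most `limit` rows, so the exchange 
carries at most `limit * number_of_senders` rows instead of the full input.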
   
   PR #2237 changed the `PinotSortExchangeCopyRule` rule so that the 
sender-side sort-with-limit is added only when:
   - There is a limit
   - The limit is smaller than 10k rows
   
   The first condition is justified by the fact that we don't use k-merge 
sort, so without a limit the sender-side sort is useless.
   I don't understand why the <10k rows condition was added. We have found 
cases where huge limits are used, and because the sender-side sort-with-limit 
is not added for them, we end up sending many rows that we don't actually need.
   
   This PR removes the second condition, so as long as there is a limit, the 
sender stage will sort and apply the limit on its own side.
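   
   The change in the condition can be sketched as follows. This is a 
hypothetical Python model, not the actual `PinotSortExchangeCopyRule` code: 
the function names, the constant name, and the strict `<` comparison are 
assumptions based only on the description above.

```python
from typing import Optional

# Assumed name for the "<10k rows" threshold described above.
OLD_MAX_LIMIT = 10_000


def sender_sort_before(limit: Optional[int]) -> bool:
    """Condition as described for #2237: a limit exists and is below 10k."""
    return limit is not None and limit < OLD_MAX_LIMIT


def sender_sort_after(limit: Optional[int]) -> bool:
    """Condition after this PR: any limit enables the sender-side sort."""
    return limit is not None


# Compare both conditions for no limit, a small limit, and a huge limit.
for limit in (None, 100, 50_000):
    print(limit, sender_sort_before(limit), sender_sort_after(limit))
```

Under these assumptions only the large-limit case changes behavior: the 
sender now sorts and limits instead of forwarding all of its rows.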


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

