gortiz commented on issue #18667: URL: https://github.com/apache/pinot/issues/18667#issuecomment-4750846595
> As part of this work I also wanted to propose multi-column sort leveraged by Minions and storing segment metadata to detect the same. While the realtime segments can commit based on the sort order for uploaded segments or segments with older table config we can rely on minions to re-sort the segments.

That is cool. But is it actually necessary? I can see 2 scenarios when ORDER BY A, B is executed on a segment sorted by A:

- A is already selective enough. All rows with equal A will be stored in order. Inside this partition, they won't be sorted by B, but there shouldn't be that many elements, so sorting inside the partition should be pretty fast.
- A is not very selective (imagine 2 values for 1M rows). Then sorting by A is not useful, we could be sorting by B instead.

I'm not saying the multi-col sorter isn't useful, but it's probably not a high priority.
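The first scenario can be illustrated with a minimal sketch. It is not Pinot code: the row layout (tuples of `(a, b)`) and the helper names `order_by_a_b` and `largest_run` are assumptions made only for illustration. Given rows already sorted by A, ORDER BY A, B only needs to sort each run of equal A values by B, so the cost is driven by the size of the largest run.

```python
from itertools import groupby
from operator import itemgetter


def order_by_a_b(rows_sorted_by_a):
    """Return rows ordered by (a, b), given rows already ordered by a.

    Hypothetical helper: each row is an (a, b) tuple.
    """
    result = []
    # groupby only merges adjacent equal keys, which is exactly what a
    # segment sorted by `a` provides.
    for _, run in groupby(rows_sorted_by_a, key=itemgetter(0)):
        # Only the rows inside one run of equal `a` are sorted by `b`.
        result.extend(sorted(run, key=itemgetter(1)))
    return result


def largest_run(rows_sorted_by_a):
    """Size of the biggest group of equal `a`, i.e. the largest per-run sort."""
    runs = groupby(rows_sorted_by_a, key=itemgetter(0))
    return max((sum(1 for _ in run) for _, run in runs), default=0)


# Selective A: small runs, so each per-run sort is cheap.
rows = [(1, "z"), (1, "x"), (2, "b"), (2, "a"), (2, "c"), (3, "m")]
ordered = order_by_a_b(rows)

# Non-selective A (2 distinct values): each run holds half of the rows,
# so the existing sort on A saves very little work.
skewed = sorted(((i % 2, -i) for i in range(1000)), key=itemgetter(0))
skewed_largest = largest_run(skewed)
```

With a selective A the per-run sorts stay tiny, whereas in the skewed case the largest run is half of the segment, which matches the second scenario, where sorting the segment by B instead would be the better choice.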
