siddharthteotia opened a new pull request #4535: Implement DISTINCT clause
URL: https://github.com/apache/incubator-pinot/pull/4535

Implementation of DISTINCT in the Pinot execution engine.

Design doc: https://docs.google.com/document/d/1Tv51HO5M5S0e18W6GzLYqkdESYHpOvExVbRZISSjj-k/edit#

Testing:
(1) Added unit tests to the existing InnerSegment and InterSegment query tests.
(2) Additional unit tests using a custom data generator and result verification.
(3) Cluster integration tests, with results verified by running the same queries against H2.

Not done in this PR:
(1) Handling ORDER BY for DISTINCT queries. Once the 'group by + order by' feature PR merges, a follow-up PR will integrate with the generic order-by utility/service being implemented as part of that work.
(2) Dictionary-based results. As indicated in the design doc, we can potentially get the result set from the dictionary if the query is on a single column and has no filter.
(3) Handling the CalciteSQL parser/compiler -> Pinot query path. Once we build PinotQuery with the appropriate information for executing DISTINCT, the underlying execution engine code doesn't have to change to execute SQL.
(4) Some special handling is probably needed for floats/doubles. Looking into it; this might be done in the same PR.
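
For readers unfamiliar with the shape of the problem, here is a rough sketch of DISTINCT evaluated in two phases: deduplicate within each segment, then merge across segments. This is an assumption about the general approach, suggested only by the InnerSegment/InterSegment test split mentioned above. It is not the code in this PR, and DistinctSketch, distinctInSegment, and mergeSegments are hypothetical names, not Pinot APIs.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; none of these names exist in Pinot.
final class DistinctSketch {

    // Phase 1 (per segment): keep each projected row once.
    // Arrays.asList gives element-wise equals/hashCode over the row values.
    static Set<List<Object>> distinctInSegment(List<Object[]> rows) {
        Set<List<Object>> seen = new LinkedHashSet<>();
        for (Object[] row : rows) {
            seen.add(Arrays.asList(row));
        }
        return seen;
    }

    // Phase 2 (across segments): union the per-segment sets so that a row
    // appearing in several segments is still reported once.
    static Set<List<Object>> mergeSegments(List<Set<List<Object>>> perSegment) {
        Set<List<Object>> merged = new LinkedHashSet<>();
        for (Set<List<Object>> segmentResult : perSegment) {
            merged.addAll(segmentResult);
        }
        return merged;
    }

    public static void main(String[] args) {
        List<Object[]> segment1 = Arrays.asList(
            new Object[] {"a", 1}, new Object[] {"a", 1}, new Object[] {"b", 2});
        List<Object[]> segment2 = Arrays.asList(
            new Object[] {"b", 2}, new Object[] {"c", 3});

        Set<List<Object>> result = mergeSegments(Arrays.asList(
            distinctInSegment(segment1), distinctInSegment(segment2)));

        System.out.println(result); // prints [[a, 1], [b, 2], [c, 3]]
    }
}
```

The sketch uses a LinkedHashSet only to keep the output order stable for the example; it says nothing about ORDER BY support, which this PR leaves for a follow-up.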
