siddharthteotia opened a new pull request #4535: Implement DISTINCT clause
URL: https://github.com/apache/incubator-pinot/pull/4535
 
 
   Implementation of DISTINCT in the Pinot query execution engine.
   
   Design doc -- 
https://docs.google.com/document/d/1Tv51HO5M5S0e18W6GzLYqkdESYHpOvExVbRZISSjj-k/edit#
   
   Testing:
   
   (1) Added unit tests to the existing InnerSegment and InterSegment query 
tests.
   (2) Additional unit tests using a custom data generator and result 
verification.
   (3) Cluster integration tests, with results verified by running the same 
queries against H2.
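
   Since ORDER BY is not handled yet, the rows of a DISTINCT result can come 
back in any order, so a comparison against H2 has to be order-insensitive. 
The following is a minimal sketch of such a check, not the code in this PR; 
the class and method names are hypothetical, and each row is assumed to be a 
list of column values.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of the PR: compares two DISTINCT result
// sets without depending on row order.
class DistinctResultComparison {

    // Each row is a list of column values; List.equals/hashCode compare
    // element-wise, so rows can be used directly as set members.
    static boolean sameDistinctRows(List<List<Object>> actual,
                                    List<List<Object>> expected) {
        Set<List<Object>> actualSet = new HashSet<>(actual);
        Set<List<Object>> expectedSet = new HashSet<>(expected);
        // A DISTINCT result must not contain duplicate rows, so each list
        // must be exactly as long as its de-duplicated set.
        return actualSet.size() == actual.size()
            && expectedSet.size() == expected.size()
            && actualSet.equals(expectedSet);
    }

    public static void main(String[] args) {
        List<List<Object>> pinotRows = Arrays.asList(
            Arrays.<Object>asList("a", 1), Arrays.<Object>asList("b", 2));
        List<List<Object>> h2Rows = Arrays.asList(
            Arrays.<Object>asList("b", 2), Arrays.<Object>asList("a", 1));
        System.out.println(sameDistinctRows(pinotRows, h2Rows));
    }
}
```

   The size checks also catch the case where one side returns a duplicate 
row, which a plain set comparison would hide.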
   
   Not done in this PR:
   
   (1) Handling ORDER BY for DISTINCT queries. Once the 'group by + order by' 
feature PR merges, a follow-up PR will integrate DISTINCT with the generic 
ORDER BY utility/service being implemented as part of that work.
   
   (2) Dictionary-based results. As indicated in the design doc, we can 
potentially get the result set directly from the dictionary if the query is 
on a single column and has no filter.
   
   (3) Handling the CalciteSQL parser/compiler -> Pinot query path. Once we 
build PinotQuery with the appropriate information for executing DISTINCT, the 
underlying execution engine code doesn't have to change to execute SQL.
   
   (4) Some special handling is probably needed for floats/doubles. I am 
looking into it and might address it in this same PR.
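
   As an illustration of why floats/doubles can need care in a DISTINCT 
implementation: the PR does not say which cases are meant, so the following 
is only an assumed example. In Java, boxed Double values used as hash-set 
keys treat 0.0 and -0.0 as different values, even though 0.0 == -0.0 is true 
for primitives. The class, the method names, and the normalization step are 
hypothetical and are not taken from the PR.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch, not the PR's code: shows how signed zero affects a
// hash-set based distinct count over double values.
class DoubleDistinctSketch {

    // Assumed normalization: collapse -0.0 into 0.0 before hashing.
    static double normalize(double v) {
        // The primitive comparison is true for both 0.0 and -0.0.
        return v == 0.0d ? 0.0d : v;
    }

    static int countDistinct(double[] values, boolean normalizeFirst) {
        Set<Double> seen = new HashSet<>();
        for (double v : values) {
            // Double.equals compares bit patterns, so 0.0 and -0.0 are
            // separate set members unless normalized first.
            seen.add(normalizeFirst ? normalize(v) : v);
        }
        return seen.size();
    }

    public static void main(String[] args) {
        double[] values = {0.0d, -0.0d, 1.5d};
        System.out.println(countDistinct(values, false)); // 3
        System.out.println(countDistinct(values, true));  // 2
    }
}
```

   Whether the engine should report one or two zero values is a semantic 
choice; the sketch only shows that the choice has to be made explicitly.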

