[ 
https://issues.apache.org/jira/browse/HIVE-23031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17060439#comment-17060439
 ] 

Zoltan Haindrich commented on HIVE-23031:
-----------------------------------------

I should have been more brief in the description :)
Thank you [~bslim] for thinking it through; I wanted to first concentrate on 
doing only the rewriting, and to do it for one concrete sketch implementation 
(HLL) - and see how well that works.

bq. One, sketches return an approximation and users want exact reporting.

I don't want to force every query into this world - probably a feature toggle 
could be used to enable it.

bq. how you will be mapping the sketching implementation to actual execution 
given that there are multiple sketch algorithms

I thought that rewriting to the UDFs that belong to the desired sketch family 
would make it happen.

bq. let's treat whatever sketch you have in mind as a UDF and maybe add some as 
default UDFs that are trusted by the system

Right now I'm not sure how this could be incorporated, but I'll keep this idea 
in mind - it could make the rewrite more easily customizable...



> Add option to enable transparent rewrite of count(distinct) into sketch 
> functions
> ---------------------------------------------------------------------------------
>
>                 Key: HIVE-23031
>                 URL: https://issues.apache.org/jira/browse/HIVE-23031
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Zoltan Haindrich
>            Assignee: Zoltan Haindrich
>            Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)
