[
https://issues.apache.org/jira/browse/HIVE-23031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17060439#comment-17060439
]
Zoltan Haindrich commented on HIVE-23031:
-----------------------------------------
I should have been more brief in the description :)
Thank you [~bslim] for thinking it through; I wanted to first concentrate on
doing only the rewriting, do it for one concrete sketch implementation (HLL),
and see how well that works.
bq. One sketches return an approximate and user want exact reporting.
I don't want to force every query into this world - probably a feature toggle
could be used to enable it.
bq. how you will be mapping the sketching implementation to actual execution
given that there is multiple sketches algorithms
I thought that applying the rewrite only to the UDFs which belong to the
desired sketch family would make it happen.
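To make the idea a bit more concrete, here is a minimal Python sketch of what
such a toggle-gated rewrite could look like at the query-text level. It is only
an illustration under assumptions: the toggle name is hypothetical, the target
functions ds_hll_sketch / ds_hll_estimate are assumed to be the HLL family
UDFs, and a real implementation would operate on the query plan rather than on
SQL strings.

```python
import re

# Hypothetical toggle; no configuration key has been decided in this thread.
REWRITE_COUNT_DISTINCT = False

# Matches count(distinct <expr>) where <expr> contains no nested parentheses.
_COUNT_DISTINCT = re.compile(
    r"count\s*\(\s*distinct\s+([^()]+?)\s*\)", re.IGNORECASE
)


def rewrite_count_distinct(sql: str, enabled: bool = REWRITE_COUNT_DISTINCT) -> str:
    """Replace count(distinct x) with an HLL sketch estimate when enabled."""
    if not enabled:
        # Default: leave the query untouched so results stay exact.
        return sql
    # Assumed UDF names for the HLL sketch family.
    return _COUNT_DISTINCT.sub(r"ds_hll_estimate(ds_hll_sketch(\1))", sql)
```

With the toggle off the query is returned unchanged, so only users who opt in
get approximate results; picking a different sketch family would mean swapping
the two function names in the replacement.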
bq. let's treat whatever sketch you have in mind as a UDF and maybe add some as
defaults udf that are trusted by the system
Right now I'm not sure how this could be incorporated, but I'll keep this idea
in mind - it could make the rewrite more easily customizable...
> Add option to enable transparent rewrite of count(distinct) into sketch
> functions
> ---------------------------------------------------------------------------------
>
> Key: HIVE-23031
> URL: https://issues.apache.org/jira/browse/HIVE-23031
> Project: Hive
> Issue Type: Sub-task
> Reporter: Zoltan Haindrich
> Assignee: Zoltan Haindrich
> Priority: Major
>