[
https://issues.apache.org/jira/browse/HIVE-20683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nishant Bangarwa updated HIVE-20683:
------------------------------------
Attachment: HIVE-20683.patch
> Add the Ability to push Dynamic Between and Bloom filters to Druid
> ------------------------------------------------------------------
>
> Key: HIVE-20683
> URL: https://issues.apache.org/jira/browse/HIVE-20683
> Project: Hive
> Issue Type: New Feature
> Components: Druid integration
> Reporter: Nishant Bangarwa
> Assignee: Nishant Bangarwa
> Priority: Major
> Attachments: HIVE-20683.patch
>
>
> To optimize joins, Hive generates a BETWEEN filter (with min and max values)
> and a BLOOM filter to filter one side of a semi-join.
> Druid 0.13.0 will support Bloom filters (added via
> https://github.com/apache/incubator-druid/pull/6222).
> Implementation details:
> # Hive generates the filters and passes them as part of 'filterExpr' in TableScan.
> # DruidQueryBasedRecordReader receives this filter as part of the conf.
> # During the execution phase, before sending the query to Druid,
> DruidQueryBasedRecordReader will deserialize this filter, translate it
> into a DruidDimFilter, and add it to the existing DruidQuery. The Tez executor
> already ensures that all dynamic values are initialized by the time we start
> reading results from the record reader.
> # Explaining a Druid query also prints the query sent to Druid as
> {{druid.json.query}}, so the query shown there must also be updated
> with the filters. During explain the actual values of the dynamic values
> are not available, so the dynamic expression itself will be printed in
> the Druid query in place of each value.
> Note: This work requires Druid to be updated to version 0.13.0.
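
The translation and explain behaviour described in the implementation details can be sketched roughly as follows. This is a minimal illustration and not the code in the attached patch: the class DynamicBetweenSketch, its nested DynamicValue type, and the helper names are all hypothetical, and JSON is built by hand to keep the example self-contained. It assumes the BETWEEN filter maps onto Druid's native "bound" filter (inclusive by default) and that it is combined with any existing filter through an "and" filter. An unresolved value is rendered as the dynamic expression itself, which is what explain would print.

```java
// Hypothetical sketch: names below are illustrative, not Hive or Druid APIs.
class DynamicBetweenSketch {

    /** Stand-in for a dynamic value that is only known once execution starts. */
    static final class DynamicValue {
        private final String id;   // illustrative identifier of the dynamic value
        private final Long value;  // null while the plan is only being explained

        private DynamicValue(String id, Long value) {
            this.id = id;
            this.value = value;
        }

        static DynamicValue resolved(String id, long value) {
            return new DynamicValue(id, value);
        }

        static DynamicValue unresolved(String id) {
            return new DynamicValue(id, null);
        }

        /** Render the actual value if known, otherwise the expression itself. */
        String render() {
            return value != null ? String.valueOf(value) : "DynamicValue(" + id + ")";
        }
    }

    /** Translate "dimension BETWEEN min AND max" into a Druid bound filter. */
    static String toBoundFilter(String dimension, DynamicValue min, DynamicValue max) {
        return "{\"type\":\"bound\",\"dimension\":\"" + dimension + "\","
            + "\"lower\":\"" + min.render() + "\","
            + "\"upper\":\"" + max.render() + "\","
            + "\"ordering\":\"numeric\"}";
    }

    /** Combine the query's existing filter with the dynamic one. */
    static String and(String existingFilter, String dynamicFilter) {
        return "{\"type\":\"and\",\"fields\":[" + existingFilter + "," + dynamicFilter + "]}";
    }
}
```

With resolved values 10 and 20, toBoundFilter("key", ...) yields a bound filter with "lower":"10" and "upper":"20". With unresolved values the same call yields "lower":"DynamicValue(key_min)", mirroring the explain output described above. A Bloom filter would need the serialized filter bytes rather than a pair of bounds, so it is left out of this sketch.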