[ 
https://issues.apache.org/jira/browse/KUDU-3286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bankim Bhavsar updated KUDU-3286:
---------------------------------
    Docs Text: 
Updated hash computation for empty strings in the FastHash implementation to
conform with the handling in Apache Impala. For the Bloom filter predicate
pushdown feature, which uses FastHash, this makes Kudu clients older than
version 1.15.0 incompatible with Kudu server version 1.15.0, and Kudu clients
at version 1.15.0 or newer incompatible with Kudu servers older than version
1.15.0. Both the client library and the Kudu server need to be updated to
version 1.15.0 or above if using the Bloom filter predicate feature.

This incompatibility manifests as the following messages in the logs:

- "Not implemented: call requires unsupported application feature flags: 4".
- "Not implemented: call requires unsupported application feature flags: 5".

> Add special handling for empty strings for Bloom filter predicate push down
> ---------------------------------------------------------------------------
>
>                 Key: KUDU-3286
>                 URL: https://issues.apache.org/jira/browse/KUDU-3286
>             Project: Kudu
>          Issue Type: Improvement
>    Affects Versions: 1.13.0
>            Reporter: Bankim Bhavsar
>            Assignee: Bankim Bhavsar
>            Priority: Major
>             Fix For: 1.15.0
>
>
> Fast hash used with Bloom filter predicate pushdown has special handling for
> nullptr.
> [https://github.com/apache/kudu/blob/master/src/kudu/util/hash_util.h#L95]
> However, there is no special handling for empty objects/strings. Fast hash
> for an empty string with seed=0 generates a hash value of 0. This doesn't set
> any bits in the Bloom filter, and as a result empty strings are reported as
> not present.
> Impala uses the direct Bloom filter approach and includes special handling
> for empty strings.
> [https://github.com/apache/impala/blob/master/be/src/runtime/raw-value.inline.h#L352]
> This leads to a discrepancy between Impala and Kudu and returns incorrect
> join results.
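
To illustrate why both sides must agree on how an empty string is hashed, here
is a minimal Python sketch. It is not Kudu's FastHash or Impala's code: the
hash, the filter layout, and every name in it (old_hash, new_hash,
EMPTY_PLACEHOLDER, build_filter, might_contain) are assumptions made up for
the example. The toy hash only mimics the one property described above, that
an empty input with seed=0 hashes to 0.

```python
MASK64 = (1 << 64) - 1
NUM_BITS = 1024               # size of the toy filter, in bits
NUM_PROBES = 3                # bits set per inserted value
EMPTY_PLACEHOLDER = b"\x01"   # hypothetical stand-in hashed for empty input


def old_hash(data, seed=0):
    """Toy 64-bit hash (not FastHash): empty input with seed 0 gives 0."""
    h = seed & MASK64
    for byte in data:
        h = ((h ^ byte) * 0x100000001B3) & MASK64
    return h


def new_hash(data, seed=0):
    """Same toy hash, but empty input is replaced by a fixed placeholder."""
    return old_hash(data if data else EMPTY_PLACEHOLDER, seed)


def bit_positions(h):
    """Derive the probe positions from one 64-bit hash (double hashing)."""
    h1, h2 = h & 0xFFFFFFFF, h >> 32
    return [(h1 + i * h2) % NUM_BITS for i in range(NUM_PROBES)]


def build_filter(values, hash_fn):
    """Return the filter as an int bitmask with the bits of all values set."""
    bits = 0
    for value in values:
        for pos in bit_positions(hash_fn(value)):
            bits |= 1 << pos
    return bits


def might_contain(bits, value, hash_fn):
    """True if every bit the value maps to is set in the filter."""
    return all((bits >> pos) & 1 for pos in bit_positions(hash_fn(value)))


# One side builds the filter with the old hash, the other probes it.
filter_bits = build_filter([b""], old_hash)
print(might_contain(filter_bits, b"", old_hash))  # same hash on both sides: True
print(might_contain(filter_bits, b"", new_hash))  # mismatched hashes: False
```

In this sketch the mismatched probe looks at different bits than the ones the
builder set, so the empty string is reported as absent even though it was
inserted. Non-empty values hash identically under both functions, so only
empty strings are affected.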



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
