Bankim Bhavsar has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/17376 )

Change subject: [util] Add special handling for empty strings in FastHash
......................................................................

[util] Add special handling for empty strings in FastHash

Impala uses special handling for empty strings while hashing
https://github.com/apache/impala/blob/master/be/src/runtime/raw-value.inline.h#L352

This leads to discrepancy between hash values for empty
strings between Impala and Kudu.
This change basically matches the behavior with Impala.

Changes to the implementation of hashing function for empty
strings generates a different value and hence incorrect
results across different Kudu client and server versions.
To resolve this issue, a new feature flag BLOOM_FILTER_PREDICATE_V2 has
been added and the older one BLOOM_FILTER_PREDICATE has been marked as
deprecated.

Since FastHash is not used for any persistent data structures
so there are no backward compatibility issues that needs special
handling during upgrade.

Discovered bugs in the Java implementation when a negative
byte is promoted to a long and left shifted. Discovery
was because of the magic value for empty strings/buffer.
Thankfully the Java implementation of the FastHash
is not used currently.

Also fixed a bug in predicate test where flush would be missed
due to increase in number of values which affected TSAN.

Change-Id: Idb2d401ab692decd5bc03b48a96d710546c0039e
Reviewed-on: http://gerrit.cloudera.org:8080/17376
Tested-by: Kudu Jenkins
Reviewed-by: Grant Henke <[email protected]>
Reviewed-by: Wenzhe Zhou <[email protected]>
Reviewed-by: Andrew Wong <[email protected]>
---
M java/kudu-client/src/main/java/org/apache/kudu/util/HashUtil.java
M java/kudu-client/src/test/java/org/apache/kudu/util/TestFashHash.java
M src/kudu/client/CMakeLists.txt
M src/kudu/client/predicate-test.cc
M src/kudu/client/scanner-internal.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/tserver.proto
M src/kudu/util/hash_util-test.cc
M src/kudu/util/hash_util.h
9 files changed, 99 insertions(+), 32 deletions(-)

Approvals:
  Kudu Jenkins: Verified
  Grant Henke: Looks good to me, but someone else must approve
  Wenzhe Zhou: Looks good to me, but someone else must approve
  Andrew Wong: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/17376
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Idb2d401ab692decd5bc03b48a96d710546c0039e
Gerrit-Change-Number: 17376
Gerrit-PatchSet: 6
Gerrit-Owner: Bankim Bhavsar <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Bankim Bhavsar <[email protected]>
Gerrit-Reviewer: Grant Henke <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Wenzhe Zhou <[email protected]>

Reply via email to