Hello Michael Smith, Impala Public Jenkins,
I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/20396
to look at the new patch set (#2).
Change subject: IMPALA-12393: Fix inconsistent hash for TimestampValue in
DictEncoder
......................................................................
IMPALA-12393: Fix inconsistent hash for TimestampValue in DictEncoder
Currently, DictEncoder uses the default hash function for
TimestampValue, which means it is hashing the entire
TimestampValue struct. This can be inconsistent, because
TimestampValue contains some padding that may not be zero
in some cases. For TimestampValues that are part of a Tuple,
the padding is zero, so this is mainly present in test cases.
This was discovered when fixing a Clang Tidy performance-for-range-copy
warning by iterating with a const reference rather than
making a copy of the value. DictTest.TestTimestamps became
flaky with that change, because the hash was no longer
consistent. The copy must have had consistent content for
the padding through the iteration, but the const reference
did not.
This adds a template specialization of the Hash function
for TimestampValue. The specialization uses TimestampValue::Hash(),
which hashes only the non-padding pieces of the struct. This
also includes the change to dict-test.cc that uncovered the
issue. This fix is mostly to unblock IMPALA-12390.
Testing:
- Ran dict-test in a loop for a few hundred iterations
- Hand tested inserting many timestamps into a Parquet table
with dictionary encoding and verified that the performance didn't
change.
Change-Id: Iad86e9b0f645311c3389cf2804dcc1a346ff10a9
---
M be/src/util/dict-encoding.h
M be/src/util/dict-test.cc
2 files changed, 8 insertions(+), 1 deletion(-)
git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/96/20396/2
--
To view, visit http://gerrit.cloudera.org:8080/20396
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iad86e9b0f645311c3389cf2804dcc1a346ff10a9
Gerrit-Change-Number: 20396
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>