Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8417 )
Change subject: IMPALA-2281: Replace FNV with FastHash in exchange nodes ...................................................................... Patch Set 2: (12 comments) Did a first pass over it. http://gerrit.cloudera.org:8080/#/c/8417/2/be/src/codegen/gen_ir_descriptions.py File be/src/codegen/gen_ir_descriptions.py: http://gerrit.cloudera.org:8080/#/c/8417/2/be/src/codegen/gen_ir_descriptions.py@99 PS2, Line 99: ["HASH_FAST", "IrFastHash"], Do we use this? I don't see any places where it's used currently. http://gerrit.cloudera.org:8080/#/c/8417/2/be/src/runtime/data-stream-sender.cc File be/src/runtime/data-stream-sender.cc: http://gerrit.cloudera.org:8080/#/c/8417/2/be/src/runtime/data-stream-sender.cc@a468 PS2, Line 468: Lol, weird. http://gerrit.cloudera.org:8080/#/c/8417/2/be/src/runtime/data-stream-sender.cc@474 PS2, Line 474: partition_expr_evals_[j Use 'eval' directly. http://gerrit.cloudera.org:8080/#/c/8417/2/be/src/runtime/raw-value.cc File be/src/runtime/raw-value.cc: http://gerrit.cloudera.org:8080/#/c/8417/2/be/src/runtime/raw-value.cc@220 PS2, Line 220: return HashUtil::FastHash64(v, static_cast<size_t>(type.GetByteSize()), seed); I think this might be slightly slower - with the previous approach I think we get a specialised version of the hash function for that input length for each data type, whereas here we have to go through another switch in GetByteSize(). http://gerrit.cloudera.org:8080/#/c/8417/2/be/src/util/hash-util.h File be/src/util/hash-util.h: http://gerrit.cloudera.org:8080/#/c/8417/2/be/src/util/hash-util.h@235 PS2, Line 235: /* The MIT License I think this should go at the top of the file beneath the apache license. Then we can just say that the FastHash64 implementation came with that license. E.g. /* FastHash64 implementation derived from MIT-licensed code written by Zilong Tan The MIT License ... */ http://gerrit.cloudera.org:8080/#/c/8417/2/be/src/util/hash-util.h@263 PS2, Line 263: size_t use int64_t - we generally use signed integers for lengths. http://gerrit.cloudera.org:8080/#/c/8417/2/be/src/util/hash-util.h@266 PS2, Line 266: (const uint64_t *) We use c-style casts - i.e. reinterpret_cast<> http://gerrit.cloudera.org:8080/#/c/8417/2/be/src/util/hash-util.h@267 PS2, Line 267: pos I believe the C++ standard doesn't allow pointer arithmetic on void - we should convert to uint8_t for doing that. http://gerrit.cloudera.org:8080/#/c/8417/2/be/src/util/hash-util.h@278 PS2, Line 278: (const unsigned char* See above comments. http://gerrit.cloudera.org:8080/#/c/8417/2/be/src/util/hash-util.h@282 PS2, Line 282: (uint64_t) should use static_cast<uint64_t>() http://gerrit.cloudera.org:8080/#/c/8417/2/testdata/workloads/functional-query/queries/QueryTest/nested-types-runtime.test File testdata/workloads/functional-query/queries/QueryTest/nested-types-runtime.test: http://gerrit.cloudera.org:8080/#/c/8417/2/testdata/workloads/functional-query/queries/QueryTest/nested-types-runtime.test@145 PS2, Line 145: 8,'k1',-1,'k1',1 Why does this make a difference when we have VERIFY_IS_EQUAL_SORTED above? http://gerrit.cloudera.org:8080/#/c/8417/2/testdata/workloads/tpcds/queries/tpcds-q77a.test File testdata/workloads/tpcds/queries/tpcds-q77a.test: http://gerrit.cloudera.org:8080/#/c/8417/2/testdata/workloads/tpcds/queries/tpcds-q77a.test@125 PS2, Line 125: 'catalog channel',NULL,538912.55,2050279.74,-1383554.73 I'll file a JIRA to make this test deterministically pass: IMPALA-6155. In the meantime, can you change this to ---- RESULTS: VERIFY_IS_EQUAL_SORTED That will make it ignore the order of results. -- To view, visit http://gerrit.cloudera.org:8080/8417 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I778317d982dcdb94173a369a65b39f32b4f7ded2 Gerrit-Change-Number: 8417 Gerrit-PatchSet: 2 Gerrit-Owner: Tianyi Wang <tw...@cloudera.com> Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-Comment-Date: Sat, 04 Nov 2017 00:28:16 +0000 Gerrit-HasComments: Yes