Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/21597 )
Change subject: IMPALA-13246: Smallify strings during broadcast exchange ...................................................................... Patch Set 5: (5 comments) http://gerrit.cloudera.org:8080/#/c/21597/4//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21597/4//COMMIT_MSG@14 PS4, Line 14: it turned out that this is not true in some : cases) > Could you please add more details for this? In DIRECTED mode I wouldn't exp I was wrong here, the problematic tuple was in a different part of the query in test_iceberg.py::TestIcebergV2Table::test_delete_partitioned / DELETE FROM evolve_part WHERE length(s) < 4; The DCHECK was hit in the fragment with the union that merges the rows that don't need anti join with deletes + the ones that do. http://gerrit.cloudera.org:8080/#/c/21597/4//COMMIT_MSG@17 PS4, Line 17: Measured >1% improvement in TPCH(42). > Above you mentioned that removing Tuple::SmallifyStrings() makes TPC-H slow There was still some improvement after removing SmallifyStrings. Note that the perf job I used (perf-AB-test-ub2004) didn't seem that reliable to me recently, it can be ~0.5% off, even with large iteration counts. Running another test to check the current improvement. http://gerrit.cloudera.org:8080/#/c/21597/4/be/src/runtime/tuple.h File be/src/runtime/tuple.h: http://gerrit.cloudera.org:8080/#/c/21597/4/be/src/runtime/tuple.h@123 PS4, Line 123: ) cons > IMO the default parameter makes the code a bit unsafe, as it is too easy to agree, done http://gerrit.cloudera.org:8080/#/c/21597/4/be/src/runtime/tuple.h@140 PS4, Line 140: e > nit: redundant space Done http://gerrit.cloudera.org:8080/#/c/21597/4/be/src/runtime/tuple.cc File be/src/runtime/tuple.cc: http://gerrit.cloudera.org:8080/#/c/21597/4/be/src/runtime/tuple.cc@68 PS4, Line 68: > nit: you could add a method 'PossibleToSmallify()' or 'FitToSmall()' to Str Added SmallifiedLen() to StringVal and moved the calculation there. -- To view, visit http://gerrit.cloudera.org:8080/21597 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I281d77c7a241ebfe8716eb5c975f0660601aec1b Gerrit-Change-Number: 21597 Gerrit-PatchSet: 5 Gerrit-Owner: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]> Gerrit-Comment-Date: Wed, 14 Aug 2024 07:22:20 +0000 Gerrit-HasComments: Yes
