Dimitris Tsirogiannis has posted comments on this change.

Change subject: Enable TPC-H workload for Kudu tables
......................................................................

Patch Set 5:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/3633/5/testdata/datasets/tpch/tpch_schema_template.sql
File testdata/datasets/tpch/tpch_schema_template.sql:

PS5, Line 46: distribute by hash (l_orderkey, l_partkey, l_suppkey, l_linenumber) into 9 buckets
> How did you come to choose 9 buckets?
Kind of ad-hoc. We run the tests in a pseudo-cluster of size 3, so I picked a multiple of that for the buckets. Couldn't find any meaningful guidelines, so I just picked something.

PS5, Line 51: 'kudu.key_columns' = 'l_orderkey, l_partkey, l_suppkey, l_linenumber'
> Was there any deep thought on how you chose the primary keys? In some of th
I used the official TPC-H spec for that.

http://gerrit.cloudera.org:8080/#/c/3633/5/tests/query_test/test_tpch_queries.py
File tests/query_test/test_tpch_queries.py:

PS5, Line 30: def test_tpch_q1(self, vector):
> Any interest in doing some good citizen maintenance for these tests? If so,
Sure I can take a look at it. Thanks for pointing that out.

--
To view, visit http://gerrit.cloudera.org:8080/3633
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3a5de71fefa92a78970226d8f49ef445d28f9289
Gerrit-PatchSet: 5
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Dimitris Tsirogiannis <[email protected]>
Gerrit-Reviewer: David Knupp <[email protected]>
Gerrit-Reviewer: Dimitris Tsirogiannis <[email protected]>
Gerrit-Reviewer: Michael Brown <[email protected]>
Gerrit-Reviewer: Sailesh Mukil <[email protected]>
Gerrit-HasComments: Yes
