Dimitris Tsirogiannis has posted comments on this change.

Change subject: Enable TPC-H workload for Kudu tables
......................................................................


Patch Set 5:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/3633/5/testdata/datasets/tpch/tpch_schema_template.sql
File testdata/datasets/tpch/tpch_schema_template.sql:

PS5, Line 46: distribute by hash (l_orderkey, l_partkey, l_suppkey, 
l_linenumber) into 9 buckets
> How did you come to choose 9 buckets?
It was fairly ad hoc. We run the tests on a pseudo-cluster of 3 nodes, so I 
picked a multiple of 3 for the bucket count. I couldn't find any meaningful 
guidelines, so 9 is an arbitrary choice rather than a tuned value.
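
A rough sketch of the reasoning, for illustration only: with a bucket count 
that is a multiple of the node count, buckets can be spread evenly over the 
nodes. The CRC32 hash and the round-robin placement below are stand-in 
assumptions, not Kudu's actual hash function or tablet placement policy, and 
the helper names are hypothetical.

```python
import zlib
from collections import Counter

NUM_NODES = 3    # pseudo-cluster size used by the tests
NUM_BUCKETS = 9  # bucket count from the schema template


def bucket_for(key, num_buckets=NUM_BUCKETS):
    """Map a composite key to a bucket index.

    CRC32 is only a stand-in here; it is not the hash Kudu uses.
    """
    encoded = "|".join(str(part) for part in key).encode("utf-8")
    return zlib.crc32(encoded) % num_buckets


def buckets_per_node(num_buckets=NUM_BUCKETS, num_nodes=NUM_NODES):
    """Count buckets per node under a hypothetical round-robin placement."""
    return Counter(bucket % num_nodes for bucket in range(num_buckets))


# Example key in column order:
# (l_orderkey, l_partkey, l_suppkey, l_linenumber)
example_bucket = bucket_for((1, 155190, 7706, 1))
placement = buckets_per_node()
```

Under these assumptions, 9 buckets give each of the 3 nodes the same number 
of buckets, while a count such as 10 would leave one node with an extra 
bucket.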


PS5, Line 51:   'kudu.key_columns' = 'l_orderkey, l_partkey, l_suppkey, 
l_linenumber'
> Was there any deep thought on how you chose the primary keys? In some of th
I used the official TPC-H spec for that.


http://gerrit.cloudera.org:8080/#/c/3633/5/tests/query_test/test_tpch_queries.py
File tests/query_test/test_tpch_queries.py:

PS5, Line 30:   def test_tpch_q1(self, vector):
> Any interest in doing some good citizen maintenance for these tests? If so,
Sure, I can take a look at it. Thanks for pointing that out.


-- 
To view, visit http://gerrit.cloudera.org:8080/3633
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3a5de71fefa92a78970226d8f49ef445d28f9289
Gerrit-PatchSet: 5
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Dimitris Tsirogiannis <[email protected]>
Gerrit-Reviewer: David Knupp <[email protected]>
Gerrit-Reviewer: Dimitris Tsirogiannis <[email protected]>
Gerrit-Reviewer: Michael Brown <[email protected]>
Gerrit-Reviewer: Sailesh Mukil <[email protected]>
Gerrit-HasComments: Yes
