Michael Brown has posted comments on this change.

Change subject: Enable TPC-H workload for Kudu tables
......................................................................


Patch Set 5:

(3 comments)

Hey Dimitris, some of the questions here are for my own information rather 
than critical review comments. Hope that's OK.

http://gerrit.cloudera.org:8080/#/c/3633/5/testdata/datasets/tpch/tpch_schema_template.sql
File testdata/datasets/tpch/tpch_schema_template.sql:

PS5, Line 46: distribute by hash (l_orderkey, l_partkey, l_suppkey, 
l_linenumber) into 9 buckets
How did you come to choose 9 buckets?


PS5, Line 51:   'kudu.key_columns' = 'l_orderkey, l_partkey, l_suppkey, 
l_linenumber'
Was there any deep thought behind your choice of primary keys? In some of the 
queries, the choice seems pretty clear. This one is different, though: for 
example, you included l_linenumber.


http://gerrit.cloudera.org:8080/#/c/3633/5/tests/query_test/test_tpch_queries.py
File tests/query_test/test_tpch_queries.py:

PS5, Line 30:   def test_tpch_q1(self, vector):
Any interest in doing some good-citizen maintenance on these tests? If so, 
check out http://docs.pytest.org/en/latest/example/parametrize.html . It seems 
this could be written as one test, with the query ID (or file name) 
parametrized via pytest.mark.parametrize. Not required, just a thought.
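
A rough sketch of the idea, not a drop-in patch. The class name, the 
workload_file_name helper, the 'tpch-q<N>' file naming, and the stand-in 
run_test_case are all assumptions for illustration; the real test would keep 
the existing vector argument and call the suite's own runner.

```python
import pytest

# TPC-H defines 22 queries; the suite may want a different subset.
TPCH_QUERY_IDS = list(range(1, 23))


def workload_file_name(query_id):
    """Build an assumed test file name for a TPC-H query, e.g. 'tpch-q1'."""
    return "tpch-q{0}".format(query_id)


class TestTpchQuery(object):
    def run_test_case(self, test_file_name):
        # Hypothetical stand-in for the suite's real runner, so that this
        # sketch can run on its own. It just echoes the file name back.
        return test_file_name

    # One test function; pytest generates one test case per query ID.
    @pytest.mark.parametrize("query_id", TPCH_QUERY_IDS)
    def test_tpch(self, query_id):
        expected = "tpch-q{0}".format(query_id)
        assert self.run_test_case(workload_file_name(query_id)) == expected
```

With this shape, each query still shows up as its own test case in the 
report, and a single one can be selected with something like -k "test_tpch and 7".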


-- 
To view, visit http://gerrit.cloudera.org:8080/3633
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3a5de71fefa92a78970226d8f49ef445d28f9289
Gerrit-PatchSet: 5
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Dimitris Tsirogiannis <[email protected]>
Gerrit-Reviewer: David Knupp <[email protected]>
Gerrit-Reviewer: Dimitris Tsirogiannis <[email protected]>
Gerrit-Reviewer: Michael Brown <[email protected]>
Gerrit-Reviewer: Sailesh Mukil <[email protected]>
Gerrit-HasComments: Yes
