Michael Brown has posted comments on this change.

Change subject: IMPALA-5376: Implement all TPCDS test cases or alternates for 
Impala.
......................................................................


Patch Set 1:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/8102/1//COMMIT_MSG
Commit Message:

PS1, Line 22: CDH-59396
This looks like internal Cloudera information. Please excise all internal 
Cloudera information from the commit message and diff, as we have non-Cloudera 
community members not privy to that information.


PS1, Line 25: --- tpcds-q39.test / MULTIPLE RESULT SET not recognized by test 
framework / MARKED XFAIL.
            : --- tpcds-q47.test / RESULT MISMATCH in LSD of DECIMAL values / 
ADDED ROUND(2) TO 8th COLUMN OF WITH TABLE, TAKE ACTUAL RESULT AS EXPECTED.
            : --- tpcds-q49.test / RESULT MISMATCH in LSD of DECIMAL values / 
MARKED XFAIL, IMPALA-5945
            : --- tpcds-q57.test / RESULT MISMATCH, excess scale in DECIMAL 
values / FIXED, ADDED TRUNCATE(2) AROUND 6th COLUMN.
These are good comments, but they are not line-wrapped. They could also stand 
to have some vertical white space. Could you please make this section 
cosmetically cleaner?


PS1, Line 32: --- tpcds-q63.test / RESULT MISMATCH, excess scale in DECIMAL 
values / ADDED CAST(DECIMAL(7, 2)) TO 3rd COLUMN
Comments like this are useful. It would be good for you to note this in 
tpcds-q63.test directly in a comment. Please do the same for all annotations.


http://gerrit.cloudera.org:8080/#/c/8102/1/testdata/workloads/tpcds/queries/tpcds-q12.test
File testdata/workloads/tpcds/queries/tpcds-q12.test:

PS1, Line 4:       ,i_item_desc 
           :       ,i_category 
           :       ,i_class 
Gerrit highlights trailing white space in files. Click on this file in the 
WebUI to see what I mean.

It's best not to have trailing white space since it makes diffing harder to 
read over time. Can you please remove the trailing white space from all files? 
I'm sure there are Vim, sed, or perl tricks that you know that could do it 
quickly.


PS1, Line 12:   web_sales
            :           ,item 
            :           ,date_dim
I believe these characters (visible in Gerrit WebUI when you look at the file 
in its diff viewer) indicate the presence of tab characters in the file. I 
expect it's better to replace these with spaces; I'm not aware that we use tabs 
anywhere in the code base. For all files, can you replace with spaces and make 
sure the alignment is clean?


http://gerrit.cloudera.org:8080/#/c/8102/1/testdata/workloads/tpcds/queries/tpcds-q23.test
File testdata/workloads/tpcds/queries/tpcds-q23.test:

PS1, Line 51:  limit 100;
            :  
            : with frequent_ss_items as
The stress test gets its bank of queries to run from files such as this, but 
it's not clear to me how the stress test handles multiple queries and results 
gathering. This matters both when collecting runtime info and when running the 
stress test proper. Can you please investigate how the stress test handles 
having two queries here? You can talk to Matt Mulder or me about this.


http://gerrit.cloudera.org:8080/#/c/8102/1/testdata/workloads/tpcds/queries/tpcds-q24.test
File testdata/workloads/tpcds/queries/tpcds-q24.test:

PS1, Line 48: having sum(netpaid) > (select 0.05*avg(netpaid)
            :                                  from ssales)
            : ;
            : 
            : with ssales as
Highlighting another place where there are 2 queries and the stress test's 
behavior isn't clear.


PS1, Line 104: --
Is this a results delimiter? If yes, why doesn't q23 have a results delimiter?


http://gerrit.cloudera.org:8080/#/c/8102/1/testdata/workloads/tpcds/queries/tpcds-q39.test
File testdata/workloads/tpcds/queries/tpcds-q39.test:

PS1, Line 27: ;
            : with inv as
Another multi-query file, like previously.


http://gerrit.cloudera.org:8080/#/c/8102/1/tests/query_test/test_tpcds_queries.py
File tests/query_test/test_tpcds_queries.py:

PS1, Line 87:   @pytest.mark.xfail(reason="IMPALA-5226")
            :   def test_tpcds_q10(self, vector):
            :     self.run_test_case(self.get_workload() + '-q10', vector)
While this works for functional end-to-end tests, we need a way to tell the 
stress test to skip this (and other) queries so as not to include it when 
collecting runtime info or as part of a stress run. You can talk to Matt Mulder 
or me about this.


-- 
To view, visit http://gerrit.cloudera.org:8080/8102
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6e284888600a7a69d1f23fcb7dac21cbb13b7d66
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Wood <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: David Knupp <[email protected]>
Gerrit-Reviewer: Michael Brown <[email protected]>
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Mostafa Mokhtar <[email protected]>
Gerrit-HasComments: Yes

Reply via email to