Henry Robinson has posted comments on this change.

Change subject: Runtime filters tests
......................................................................


Patch Set 3:

(13 comments)

http://gerrit.cloudera.org:8080/#/c/2297/3/testdata/workloads/functional-query/queries/QueryTest/runtime_filters.test
File testdata/workloads/functional-query/queries/QueryTest/runtime_filters.test:

Line 2: ---- COMMENT
> what's the reason for adding a new section for this? we can already put com
We can't put comments *anywhere* - they have to be within a section. I added 
COMMENT to have something that wasn't obviously scoped to a single query. 
However, I've removed it, and now the larger comments are moved into the QUERY 
block.


Line 12: select count(*) from alltypes p join [BROADCAST] alltypestiny b
> if you're not testing the planner, always use straight_join to avoid unwant
Done


Line 17: row_regex: .*RowsRead: 2.43K .*
> how does this work? is it looking for any line that fits the regex?
https://issues.cloudera.org/browse/IMPALA-3190


Line 28: row_regex: .*RowsRead: 206 .*
> is the fact that a filter/a number of filters was applied recorded more dir
Yes, the number of files / scan ranges / row groups / rows processed and 
filtered is recorded. I've added those counters to the tests, but for tests 
where there are no filters, the counters are absent. In that case we have to 
keep using the number of rows read as an indicator that no filtering was done.


Line 30: 
> does this really need to have 2 blank lines? you could also expand the ====
I find our test files extremely hard to read because there's no grouping of 
related test cases. I don't know we have to retrospectively move all test files 
to this, but the visual separation makes it much easier for me to navigate the 
file and make updates. Now that I've removed COMMENT as a section, is it ok to 
leave the whitespace separation?


Line 196: row_regex: .*Files rejected: 7 .*
> 'files rejected' seems a bit brittle (we could change the number of files i
True, there's a lot of brittle things about these tests: the number of files, 
the number of backends and so on (although other tests elsewhere are similarly 
brittle).

But we don't actually ever evaluate filters per-partition, it's always on a 
per-file basis (this could be a future improvement, but I'm not sure it's the 
right thing), so that information isn't available to expose.


Line 219: # Test case 9: filters do pass through ROJ.
> foj case missing
Done


Line 277: row_regex: .*RowsRead: 333 .*
> so this passes if any scan returned 333 rows?
Yes. This is not ideal. I also added "Files rejected" here.


Line 285: # join as its root, which produces filters for the scan of t1.
> what's the plan for this? it can't have an hj at the root (there's an agg).
typo - should have said "in" not "as".


http://gerrit.cloudera.org:8080/#/c/2297/3/testdata/workloads/functional-query/queries/QueryTest/runtime_filters_wait.test
File 
testdata/workloads/functional-query/queries/QueryTest/runtime_filters_wait.test:

Line 4: # Regression test for IMPALA-3078: Don't wait for global filters
> why not roll this into runtime-filters.test?
Because we need to measure how long this test took to determine if it's 
failing, which means a separate test case (and therefore a separate test file).


Line 6: # but fail the test if the query took more than a minute.
> the latter part of that sentence isn't obvious from the test case
Done


http://gerrit.cloudera.org:8080/#/c/2297/3/testdata/workloads/functional-query/queries/QueryTest/runtime_row_filters.test
File 
testdata/workloads/functional-query/queries/QueryTest/runtime_row_filters.test:

Line 31: 
> does this really need to have 2 blank lines?
I find it a lot easier to read and write test files with this structure 
(otherwise there's so much visual noise it's hard to tell where the breaks are).


Line 117: row_regex: .*Rows rejected: 2.43K .*
> why 'rows rejected' here? (as opposed to 'rows read' elsewhere)
Rows rejected measures the number of rows rejected by the filter explicitly. 
Should have used this for test 3 as well.


-- 
To view, visit http://gerrit.cloudera.org:8080/2297
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I94d617c6d23ffa394a6eb7ead56f1cfb701e0d90
Gerrit-PatchSet: 3
Gerrit-Project: Impala
Gerrit-Branch: cdh5-2.5.0_5.7.0
Gerrit-Owner: Henry Robinson <[email protected]>
Gerrit-Reviewer: Henry Robinson <[email protected]>
Gerrit-Reviewer: Marcel Kornacker <[email protected]>
Gerrit-Reviewer: Mostafa Mokhtar <[email protected]>
Gerrit-HasComments: Yes

Reply via email to