Henry Robinson has posted comments on this change. Change subject: Runtime filters tests ......................................................................
Patch Set 3: (13 comments) http://gerrit.cloudera.org:8080/#/c/2297/3/testdata/workloads/functional-query/queries/QueryTest/runtime_filters.test File testdata/workloads/functional-query/queries/QueryTest/runtime_filters.test: Line 2: ---- COMMENT > what's the reason for adding a new section for this? we can already put com We can't put comments *anywhere* - they have to be within a section. I added COMMENT to have something that wasn't obviously scoped to a single query. However, I've removed it, and now the larger comments are moved into the QUERY block. Line 12: select count(*) from alltypes p join [BROADCAST] alltypestiny b > if you're not testing the planner, always use straight_join to avoid unwant Done Line 17: row_regex: .*RowsRead: 2.43K .* > how does this work? is it looking for any line that fits the regex? https://issues.cloudera.org/browse/IMPALA-3190 Line 28: row_regex: .*RowsRead: 206 .* > is the fact that a filter/a number of filters was applied recorded more dir Yes, the number of files / scan ranges / row groups / rows processed and filtered is recorded. I've added those counters to the tests, but for tests where there are no filters, the counters are absent. In that case we have to keep using the number of rows read as an indicator that no filtering was done. Line 30: > does this really need to have 2 blank lines? you could also expand the ==== I find our test files extremely hard to read because there's no grouping of related test cases. I don't know we have to retrospectively move all test files to this, but the visual separation makes it much easier for me to navigate the file and make updates. Now that I've removed COMMENT as a section, is it ok to leave the whitespace separation? Line 196: row_regex: .*Files rejected: 7 .* > 'files rejected' seems a bit brittle (we could change the number of files i True, there's a lot of brittle things about these tests: the number of files, the number of backends and so on (although other tests elsewhere are similarly brittle). But we don't actually ever evaluate filters per-partition, it's always on a per-file basis (this could be a future improvement, but I'm not sure it's the right thing), so that information isn't available to expose. Line 219: # Test case 9: filters do pass through ROJ. > foj case missing Done Line 277: row_regex: .*RowsRead: 333 .* > so this passes if any scan returned 333 rows? Yes. This is not ideal. I also added "Files rejected" here. Line 285: # join as its root, which produces filters for the scan of t1. > what's the plan for this? it can't have an hj at the root (there's an agg). typo - should have said "in" not "as". http://gerrit.cloudera.org:8080/#/c/2297/3/testdata/workloads/functional-query/queries/QueryTest/runtime_filters_wait.test File testdata/workloads/functional-query/queries/QueryTest/runtime_filters_wait.test: Line 4: # Regression test for IMPALA-3078: Don't wait for global filters > why not roll this into runtime-filters.test? Because we need to measure how long this test took to determine if it's failing, which means a separate test case (and therefore a separate test file). Line 6: # but fail the test if the query took more than a minute. > the latter part of that sentence isn't obvious from the test case Done http://gerrit.cloudera.org:8080/#/c/2297/3/testdata/workloads/functional-query/queries/QueryTest/runtime_row_filters.test File testdata/workloads/functional-query/queries/QueryTest/runtime_row_filters.test: Line 31: > does this really need to have 2 blank lines? I find it a lot easier to read and write test files with this structure (otherwise there's so much visual noise it's hard to tell where the breaks are). Line 117: row_regex: .*Rows rejected: 2.43K .* > why 'rows rejected' here? (as opposed to 'rows read' elsewhere) Rows rejected measures the number of rows rejected by the filter explicitly. Should have used this for test 3 as well. -- To view, visit http://gerrit.cloudera.org:8080/2297 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I94d617c6d23ffa394a6eb7ead56f1cfb701e0d90 Gerrit-PatchSet: 3 Gerrit-Project: Impala Gerrit-Branch: cdh5-2.5.0_5.7.0 Gerrit-Owner: Henry Robinson <[email protected]> Gerrit-Reviewer: Henry Robinson <[email protected]> Gerrit-Reviewer: Marcel Kornacker <[email protected]> Gerrit-Reviewer: Mostafa Mokhtar <[email protected]> Gerrit-HasComments: Yes
