Kathey Marsden wrote:
> As for how to incorporate this into RuntimeStatisticsParser, the only
> thing I can think of is to add a boolean orderedSearchStrings(String[]
> searchStrings) method to RuntimeStatisticsParser that will search for
> the specified strings in order and return true if they are there in the
> order they are in the array.
I think this works so long as:
1) the table names in the query do not appear elsewhere in the query
plan (ex. a table name of "T" would match the first letter of the word
"Table" in "Hash Table ResultSet", which we wouldn't want), and
2) the argument array passed to the new function includes *ALL*
tables in the query, not just a subset.
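To make condition #1 concrete, here is a minimal sketch of how the proposed method might behave. The class name is hypothetical and the body (a plain String.indexOf scan over the plan text) is only my assumption of an implementation, not existing RuntimeStatisticsParser code:

```java
// Hypothetical sketch; not part of RuntimeStatisticsParser.
class OrderedSearchSketch {

    // Returns true if every string in searchStrings occurs in planText,
    // each match starting at or after the end of the previous match.
    static boolean orderedSearchStrings(String planText, String[] searchStrings) {
        int from = 0;
        for (String s : searchStrings) {
            int idx = planText.indexOf(s, from);
            if (idx < 0) {
                return false;          // this string does not occur after the previous match
            }
            from = idx + s.length();   // continue searching after this match
        }
        return true;
    }

    public static void main(String[] args) {
        // Made-up plan text in which the actual join order is T2, then T.
        String plan = "Hash Table ResultSet: T2, T";

        // The search for { "T", "T2" } succeeds anyway, because "T"
        // matches the first letter of "Table".
        System.out.println(orderedSearchStrings(plan, new String[] { "T", "T2" })); // prints "true"
    }
}
```

So a table named "T" gives a false positive as soon as the plan text contains a word like "Table" ahead of the real table names.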
With respect to #2, if my query is of the form:
select ... from
  (select ... from t2, t1, t3 where ...) X1,
  (select ... from t1, t2 where ...) X2
where ...
Assume a test wants to verify that the tables in subquery X2 have a join
order of { T2, T1 }, but doesn't really care about the join order of the
subquery in X1, nor does it care about the order of X1 w.r.t. X2. You'd
*still* have to make sure that the array passed into the ordered search
method includes the join order for X1, as well, otherwise the test might
incorrectly pass.
For example, if we only check for the join order of the "targeted"
subquery X2, meaning we pass ["T2", "T1"] into the proposed method and
ignore X1 altogether, then the test would IN-correctly PASS for the
following query plan:
ProjectRestrict:
+++ JoinNode_0:
++++++ LeftResultSet: <== This corresponds to X1
+++++++++ JoinNode_1:
++++++++++++ LeftResultSet:
+++++++++++++++ JoinNode_2:
++++++++++++++++++ LeftResultSet: T3
++++++++++++++++++ RightResultSet: T2
++++++++++++ RightResultSet: T1
++++++ RightResultSet: <== This corresponds to X2
+++++++++ JoinNode_3:
++++++++++++ LeftResultSet: T1
++++++++++++ RightResultSet: T2
If you just search for "T2" followed by "T1", the test will pass because
the join order for X1 matches--but that's wrong because it's really X2
that we wanted to check.
If instead of ["T2", "T1"] you pass in ["T3", "T2", "T1", "T2",
"T1"]--i.e. include *ALL* tables in the query, even the ones that aren't
necessarily targeted--then I think you'd get the desired behavior. The
downside to this is that the test will fail if a join order about which
we "don't care" changes (ex. the join order for X1 in this case). But
that's how things work today with the canon-based test, as well, so even
if it's not ideal, at least it wouldn't really be any worse...
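To illustrate the difference between the two argument arrays, here is a small sketch run against the example plan. The orderedSearchStrings body is only my assumption of how the proposed method might be implemented (a plain String.indexOf scan), the class name is hypothetical, and the plan text is simplified from the plan shown earlier (the "<==" annotations are dropped):

```java
// Hypothetical sketch; not part of RuntimeStatisticsParser.
class OrderedSearchPlanDemo {

    // Assumed implementation: each string must occur after the previous match.
    static boolean orderedSearchStrings(String planText, String[] searchStrings) {
        int from = 0;
        for (String s : searchStrings) {
            int idx = planText.indexOf(s, from);
            if (idx < 0) {
                return false;
            }
            from = idx + s.length();
        }
        return true;
    }

    // The example plan: X1 is joined as T3, T2, T1 and X2 as T1, T2.
    static final String PLAN = String.join("\n",
        "ProjectRestrict:",
        "+++ JoinNode_0:",
        "++++++ LeftResultSet:",
        "+++++++++ JoinNode_1:",
        "++++++++++++ LeftResultSet:",
        "+++++++++++++++ JoinNode_2:",
        "++++++++++++++++++ LeftResultSet: T3",
        "++++++++++++++++++ RightResultSet: T2",
        "++++++++++++ RightResultSet: T1",
        "++++++ RightResultSet:",
        "+++++++++ JoinNode_3:",
        "++++++++++++ LeftResultSet: T1",
        "++++++++++++ RightResultSet: T2");

    public static void main(String[] args) {
        // Targeted subset only: matches T2, T1 inside X1, so the test
        // passes even though X2 is actually joined as T1, T2.
        System.out.println(orderedSearchStrings(PLAN,
            new String[] { "T2", "T1" }));                   // prints "true"

        // Full list with the expected order { T2, T1 } for X2:
        // no T1 follows the last T2, so the test fails as it should.
        System.out.println(orderedSearchStrings(PLAN,
            new String[] { "T3", "T2", "T1", "T2", "T1" })); // prints "false"
    }
}
```

With the full list, the only way for the search to succeed is for every base table to appear in exactly the expected sequence, which is why a change to X1's join order would also make the test fail.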
To get the ideal behavior (where the test fails if and only if the
"targeted" subquery's join order is not what is expected) with the
proposed orderedSearchStrings() approach, one would have to ensure that
the table names used in the targeted subquery do not appear anywhere
else in the query. My guess is that you would have to rewrite a good
number of tests to guarantee that, which would probably be non-trivial.
So it seems like the easiest approach would be to follow Kathey's
suggestion, but make sure that all tests which use the new method pass
in a full list of all base table names in the query (not just a targeted
subset).
Army