Kathey Marsden wrote:

As for how to incorporate this into RuntimeStatisticsParser, the only thing I can think of is to add an boolean orderedSearchStrings(String[] searchStrings) method to RuntimeStatistics parser that will search for the specified strings in order and return true if they are there in the order they are in the array.

I think this works so long as:

1) the table names in the query do not appear elsewhere in the query plan (ex. a table name of "T" would match the first letter of the word "Table" in "Hash Table ResultSet", which we wouldn't want), and

2) the argument array passed to the new function includes *ALL* tables in the query, not just a subset.

With respect to #2, if my query is of the form:

  select ... from
    (select ... from t2, t1, t3 where ...) X1
    (select ... from t1, t2 where ...) X2
  where ...

Assume a test wants to verify that the tables in subquery X2 have a join order of { T2, T1 }, but doesn't really care about the join order of the subquery in X1, nor does it care about the order of X1 w.r.t. X2. You'd *still* have to make sure that the array passed into the ordered search method includes the join order for X1, as well, otherwise the test might incorrectly pass.

For example, if we only check for the join order of the "targeted" subquery X2, meaning we pass ["T2", "T1"] into the proposed method and ignore X1 altogether, then the test would IN-correctly PASS for the following query plan:

  ProjectRestrict:
  +++ JoinNode_0:
  ++++++ LeftResultSet:                  <== This corresponds to X1
  +++++++++ JoinNode_1:
  ++++++++++++ LeftResultSet:
  +++++++++++++++ JoinNode_2:
  ++++++++++++++++++ LeftResultSet: T3
  ++++++++++++++++++ RightResultSet: T2
  ++++++++++++ RightResultSet: T1
  ++++++ RightResultSet:                 <== This corresponds to X2
  +++++++++ JoinNode_3:
  ++++++++++++ LeftResultSet: T1
  ++++++++++++ RightResultSet: T2


If you just search for "T1" followed by "T2", the test will pass because the join order for X1 matches--but that's wrong because it's really X2 that we wanted to check.

If instead of ["T1", "T2"] you pass in ["T3", "T2", "T1", "T2", "T1"]--i.e. include *ALL* tables in the query, even the ones that aren't necessarily targeted--then I think you'd get the desired behavior. The downside to this is that the test will fail if a join order about which we "don't care" changes (ex. the join order for X1 in this case). But that's how things work today with the canon-based test, as well, so even if it's not ideal, at least it wouldn't really be any worse...

To get the ideal behavior (where the test fails if and only if the "targeted" subquery's join order is not what is expected) with the proposed orderedSearchStrings() approach, one would have to ensure that the table names used in the targeted subquery do not appear anywhere else in the query. My guess is that you would have to rewrite a good number of tests to guarantee that, which would probably be non-trivial.

So it seems like the easiest approach would be to follow Kathey's suggestion, but make sure that all tests which use the new method pass in a full list of all base table names in the query (not just a targeted subset).

Army

Reply via email to