[ 
https://issues.apache.org/jira/browse/DERBY-2493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12484923
 ] 

Andrew McIntyre commented on DERBY-2493:
----------------------------------------

Thanks for the review, Knut! I appreciate it.

checkDistinctRows does only check the number of rows returned, and it is some 
cause for concern. Theoretically, I suppose a change could cause a row to be 
eliminated from one set of duplicates and then not removed from another for 
some reason, and the number of rows as checked by assertRowCount() would be the 
same, masking an actual failure. But because the data and queries here are so 
simple, I think it is more likely that duplicates would just fail to be 
eliminated, and such a change would cause some testcases to fail with higher 
row counts. So, while some detail has now been lost in the translation to 
JUnit, I think confidence in the test lies in its simple data sets and in the 
redundancy of the queries run over them.

As for additional checks in the new test, the only thing that immediately came 
to mind was to add a check into checkDistinctRows/assertRowCount that some key 
row in each query has returned only unique values. However, the row that is 
'key' for the select * from ... queries changes depending on the data set, so 
it might be hard to nail down a generic solution that provides a good, simple 
assert method.

There are several tests that assert full result sets in the new test, but it 
might be worthwhile to add some tests that select distinct on one column, and 
then order by another column so that differences in row order are eliminated, 
and then assert the full result set in the test. If you have any ideas or time 
for that, feel free to add them to the new JUnit test once it is committed 
(which will be very shortly).

I'll correct the two items as regards checkDistinctRows(). After having some 
time away from the test, I'm thinking a better approach for 
RuntimeStatisticsParser is to let it do the parsing. :-)  i.e. instead of:

assertEquals(-1, rtsp.indexOf("Distinct Scan"));
assertTrue(rtsp.indexOf("Eliminate duplicates = true") > 0);

in the test, have RuntimeStatisticsParser make the proper String calls and 
store the results as private instance variables whose value can be asserted 
using getter methods. The above would then become:

assertFalse(rtsp.usedDistinctScan());
assertTrue(rtsp.eliminatedDuplicates());

which I think would give the test better readability and make 
RuntimeStatisticsParser more useful.

Let me know if you have any other comments, otherwise I'll check the test in 
once I've made the changes mentioned above.

> Use unsynchronized collections in BackingStoreHashtable
> -------------------------------------------------------
>
>                 Key: DERBY-2493
>                 URL: https://issues.apache.org/jira/browse/DERBY-2493
>             Project: Derby
>          Issue Type: Improvement
>          Components: Performance, Store
>    Affects Versions: 10.3.0.0
>            Reporter: Knut Anders Hatlen
>         Assigned To: Knut Anders Hatlen
>            Priority: Minor
>
> BackingStoreHashtable uses a Vector and a Hashtable, but doesn't need the 
> synchronization provided by these classes (I think). Replacing them with 
> ArrayList and HashMap could improve performance for some kinds of operations.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to