[
https://issues.apache.org/jira/browse/DERBY-6011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13558697#comment-13558697
]
Knut Anders Hatlen commented on DERBY-6011:
-------------------------------------------
Thanks, Mike and Mamta. Mike's description of the problem and Mamta's
description of fix agree with my understanding this code.
The set of queries that will use this particular optimizer tweak after the
suggested fix (set A), are those that access a table that has a unique index,
and that have equality predicates for all the columns in that index. All plans
which do not use such an index to access that table, will get their cost
adjusted (scan cost is multiplied by 2, scan row count before selectivity
adjustment is incremented by 2).
The set of queries that will see changes in cost estimates for some plans
because of the suggested fix (set B), is the subset of A where the optimizer
believes one or more of the scan alternatives will have to read more than one
row. This includes these two categories:
i) Queries where the row count statistics actually do indicate that multiple
rows will have to be read. That is, the table contains multiple rows.
ii) Queries where the fix for beetle 5006 makes an empty or almost empty
table appear not so empty. That is, if the alternatives include index scans
with no constant start or stop keys, or full table scans. (Note: parameters are
not considered constant start or stop keys, so "our" query falls into this
category.)
The set of queries that will see that a different plan being picked after the
fix (set C), is the subset of B where all of the following conditions apply:
i) The unique index is not covering, so that an index-to-base-row result set
will be needed.
ii) One of the alternative scans is expected to return < 1 row (after
selectivity has been applied).
iii) The estimated cost of reading a row from the base table is so high that
the difference between reading less than 1 row and reading exactly 1 row, is
greater than the difference in cost between the unique index lookup and the
alternative scan.
Set C should be a lot smaller than set B, as most queries are against non-empty
tables. Those queries should already be using the unique index because the
optimizer understands that the non-unique alternatives will have to fetch at
least as many rows from the base table. The costs of the plans using non-unique
scans are still adjusted, but they're already believed to be more expensive, so
it doesn't change the outcome.
> Derby performs very badly (seems to deadlock and timeout) in very simple
> multi-threaded tests
> ---------------------------------------------------------------------------------------------
>
> Key: DERBY-6011
> URL: https://issues.apache.org/jira/browse/DERBY-6011
> Project: Derby
> Issue Type: Bug
> Affects Versions: 10.7.1.1, 10.8.2.2, 10.9.1.0
> Environment: Lenovo laptop with SSD's, Windows 7, 64-bit, Sun JDK
> 1.6.xx
> Reporter: Karl Wright
> Attachments: derby.log, force-specific-index.diff, manifoldcf.log,
> prefer-unique-index-v1.diff
>
>
> The Apache ManifoldCF project supports Derby as one of its underlying
> databases. Simple tests, however, demonstrate that Derby is apparently
> deadlocking and timing out repeatedly under multi-thread conditions. This
> problem is long-standing, and is not exhibited by any other database
> ManifoldCF supports, and makes a simple test take between 6x and 12x as long.
> There is a trivial test with demonstrates the problem vs. other databases.
> Please do the following (once you have java 1.6+, svn 1.7+, and ant 1.7+
> available):
> (1) Check out https://svn.apache.org/repos/asf/manifoldcf/trunk
> (2) Run the following ant target to download the dependencies: "ant
> make-core-deps"
> (3) Run the Derby test: "ant run-rss-tests-derby" . Note the time required -
> at least 180 seconds, can be up to 360 seconds.
> (4) Run the equivalent HSQLDB test: "ant run-rss-tests-HSQLDB". This test
> takes about 31 seconds to run.
> The output of the Derby test can be found in the directory
> "tests/rss/test-derby-output". Have a look at manifoldcf.log, where all
> long-running queries are reported. Derby.log is also included, which shows
> only that during the test's cleanup phase the database is deleted before it
> is shutdown, which is not pertinent to the performance issue.
> I am available to assist with ManifoldCF, if that seems to be required.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira