[
https://issues.apache.org/jira/browse/DERBY-6011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13552834#comment-13552834
]
Knut Anders Hatlen commented on DERBY-6011:
-------------------------------------------
I enabled optimizer tracing and ran the following script:
connect 'jdbc:derby:memory:db;create=true';
CREATE TABLE jobs(endtime BIGINT,reseedinterval BIGINT,outputspec
CLOB,errortext CLOB,starttime BIGINT,reseedtime BIGINT,status CHAR(1) NOT
NULL,startmethod CHAR(1) NOT NULL,windowend BIGINT,connectionname VARCHAR(32)
NOT NULL /*CONSTRAINT c1357832449702 REFERENCES repoconnections(connectionname)
ON DELETE RESTRICT*/,type CHAR(1) NOT NULL,expirationtime BIGINT,docspec
CLOB,id BIGINT NOT NULL CONSTRAINT c1357832449701 PRIMARY KEY,lasttime BIGINT
NOT NULL,priority BIGINT NOT NULL,description VARCHAR(255) NOT NULL,outputname
VARCHAR(32) NOT NULL /*CONSTRAINT c1357832449700 REFERENCES
outputconnections(connectionname) ON DELETE RESTRICT*/,lastchecktime
BIGINT,hopcountmode CHAR(1),intervaltime BIGINT);
CREATE TABLE jobqueue(docpriority FLOAT,id BIGINT NOT NULL CONSTRAINT
c1357832449692 PRIMARY KEY,priorityset BIGINT,docid CLOB NOT NULL,failcount
BIGINT,status CHAR(1) NOT NULL,dochash VARCHAR(40) NOT NULL,isseed
CHAR(1),checktime BIGINT,checkaction CHAR(1),jobid BIGINT NOT NULL CONSTRAINT
c1357832449691 REFERENCES jobs(id) ON DELETE RESTRICT,failtime BIGINT);
CREATE UNIQUE INDEX I1357832449682 ON jobqueue (dochash,jobid);
prepare ps as 'SELECT id,status,checktime FROM jobqueue WHERE dochash=? AND
jobid=? FOR UPDATE';
The optimizer trace for the two plans we're interested in looked like this:
Considering conglomerate ConglomerateDescriptor: conglomerateNumber = 1233 name
= SQL130114161941611 uuid = ce8780ad-013c-39a4-fecd-000002dacb70 indexable =
true, key columns = {JOBID} for table 0
Estimating cost of conglomerate: ConglomerateDescriptor: conglomerateNumber =
1233 name = SQL130114161941611 uuid = ce8780ad-013c-39a4-fecd-000002dacb70
indexable = true, key columns = {JOBID} for table 0
Cost of conglomerate ConglomerateDescriptor: conglomerateNumber = 1233 name =
SQL130114161941611 uuid = ce8780ad-013c-39a4-fecd-000002dacb70 indexable =
true, key columns = {JOBID} scan for table number 0 is :
Level2CostEstimateImpl: at 2080388391, cost == 20.039500000000004, rowCount ==
6.0, singleScanRowCount == 6.0
Number of extra first column predicates is : 0, extra first column
selectivity is : 0.1
Number of extra start/stop predicates is : 0, extra start/stop
selectivity is : 1.0
Number of start/stop statistics predicates is : 1, statistics
start/stop selectivity is : 1.0
Number of extra qualifiers is : 0, extra qualifier selectivity is : 1.0
Number of extra non-qualifiers is : 1, extra non-qualifier selectivity
is : 0.1
Cost including extra first column selectivity is : Level2CostEstimateImpl: at
2080388391, cost == 38.50750000000001, rowCount == 0.6000000000000001,
singleScanRowCount == 0.6000000000000001 for table 0
Index does not cover query: cost including row fetch is:
Level2CostEstimateImpl: at 2080388391, cost == 118.95070000000003, rowCount ==
0.6000000000000001, singleScanRowCount == 0.6000000000000001 for table 0
Cost of 1.0 scans is: Level2CostEstimateImpl: at 2080388391, cost ==
118.95070000000003, rowCount == 0.6000000000000001, singleScanRowCount ==
0.6000000000000001 for table 0
Cost including extra non-qualifier start/stop selectivity is :
Level2CostEstimateImpl: at 2080388391, cost == 118.95070000000003, rowCount ==
0.6000000000000001, singleScanRowCount == 0.06000000000000001 for table 0
Considering conglomerate ConglomerateDescriptor: conglomerateNumber = 1249 name
= I1357832449682 uuid = 4fe880b5-013c-39a4-fecd-000002dacb70 indexable = true,
key columns = {DOCHASH, JOBID} for table 0
Estimating cost of conglomerate: ConglomerateDescriptor: conglomerateNumber =
1249 name = I1357832449682 uuid = 4fe880b5-013c-39a4-fecd-000002dacb70
indexable = true, key columns = {DOCHASH, JOBID} for table 0
Guaranteed to match a single row - cost is: 1.5715 for table 0
Lock mode set to MODE_RECORD because all start and stop positions are constant
Cost of 1.0 scans is: Level2CostEstimateImpl: at 2080388391, cost == 1.5715,
rowCount == 1.0, singleScanRowCount == 1.0 for table 0
Index does not cover query - cost including base row fetch is: 134.072 for
table 0
If I read this correctly, the optimizer estimates the predicate JOBID = ? to
match 10% of the rows in the {JOBID} index, and the size of the index is
estimated to 6 (it's actually empty), so that's where the expected 0.6 matching
rows come from.
With the {DOCHASH,JOBID} index, on the other hand, it simply concludes that
it's guaranteed to return one row (although, strictly speaking, it's only
guaranteed to return at most one row, as it could return zero rows).
So when the estimated size of the table is very small, the optimizer thinks the
non-unique index returns fewer rows than the unique index. Which is not correct.
> Derby performs very badly (seems to deadlock and timeout) in very simple
> multi-threaded tests
> ---------------------------------------------------------------------------------------------
>
> Key: DERBY-6011
> URL: https://issues.apache.org/jira/browse/DERBY-6011
> Project: Derby
> Issue Type: Bug
> Affects Versions: 10.7.1.1, 10.8.2.2, 10.9.1.0
> Environment: Lenovo laptop with SSD's, Windows 7, 64-bit, Sun JDK
> 1.6.xx
> Reporter: Karl Wright
> Attachments: derby.log, force-specific-index.diff, manifoldcf.log
>
>
> The Apache ManifoldCF project supports Derby as one of its underlying
> databases. Simple tests, however, demonstrate that Derby is apparently
> deadlocking and timing out repeatedly under multi-thread conditions. This
> problem is long-standing, and is not exhibited by any other database
> ManifoldCF supports, and makes a simple test take between 6x and 12x as long.
> There is a trivial test with demonstrates the problem vs. other databases.
> Please do the following (once you have java 1.6+, svn 1.7+, and ant 1.7+
> available):
> (1) Check out https://svn.apache.org/repos/asf/manifoldcf/trunk
> (2) Run the following ant target to download the dependencies: "ant
> make-core-deps"
> (3) Run the Derby test: "ant run-rss-tests-derby" . Note the time required -
> at least 180 seconds, can be up to 360 seconds.
> (4) Run the equivalent HSQLDB test: "ant run-rss-tests-HSQLDB". This test
> takes about 31 seconds to run.
> The output of the Derby test can be found in the directory
> "tests/rss/test-derby-output". Have a look at manifoldcf.log, where all
> long-running queries are reported. Derby.log is also included, which shows
> only that during the test's cleanup phase the database is deleted before it
> is shutdown, which is not pertinent to the performance issue.
> I am available to assist with ManifoldCF, if that seems to be required.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira