31.03.2010 13:56, Knut Anders Hatlen пишет:
On 03/31/10 08:26 AM, Fedd Kraft wrote:
Hello,
When profiling an application I've noticed that sometimes when performing a
simple query to one table without joins, like
select kee from res where qid=? and kee=?
it looks like Derby searching for a row makes a lot of comparisons by
converting binary data to hex strings and then comparing them.
And (here I am not sure) chooses to do a full scan. As the result, the whole
thing works too slow.
The table is defined like this.
create table res(
qid char (16) for bit data not null,
kee char (32) for bit data not null,
dnk char (32) for bit data not null
);
create unique index res_unq on res (qid, kee, dnk);
create index res_idx on res (qid, dnk);
create index res_ix2 on res (qid, kee);
It also looks like it does this hex string comparison when the table is
quite big. And when it is quite small, call normal byte array comparison,
and there is no 'fullscan' classes in the stack trace, and it works
ultrafast.
Sorry there no reproducing scripts yet ready; please tell me may be I just
missing something. I thought that converting makes sense when we compare
values of different types or which may be null, but mine is not that case
(?)
Hi Fyodor,
Comparisons of binary data are supposed be performed byte-wise, without
converting the data to strings first, so if you can provide scripts or
code to reproduce this, please file a bug report here so that it can be
investigated:https://issues.apache.org/jira/browse/DERBY
You may first want to check that you're not using a debug build of
Derby, though, since it might be that the debug builds convert byte
arrays to hex strings to do some sanity checks. (If derby.jar contains a
class called SanityManager, it's a debug build.)
Hi Knut,
unfortunately I cannot yet reproduce this without my bigger project. My
version in use has no SanityManager, and if I'd want to see which method
Derby uses for the comparison, I'd have to use the debug version...
Anyway, looking at the code, I see (db-derby-10.5.3.0-src.zip) that in
the method org.apache.derby.iapi.types.SQLBinary.compare(int op,
DataValueDescriptor other, boolean orderedNulls, boolean unknownRV) is
called with orderedNulls parameter set to false, it always creates hex
strings, see:
public final boolean compare(int op,
DataValueDescriptor other,
boolean orderedNulls,
boolean unknownRV)
throws StandardException
{
if (!orderedNulls) // nulls are unordered
{
// sanity manager stuff skipped
String otherString = other.getString(); // <==== line 574 HERE IS HEX
STRING CREATION
if (this.getString() == null || otherString == null) // <=== AND HERE
return unknownRV;
}
return super.compare(op, other, orderedNulls, unknownRV);
}
The hex string is created through StringBuffer tens of thounsands times
according to profiler; (and the debugger stopped on line 574, when used
a debug build). Well, I was wrong that hex strings are compared; they
are prepared only to know whether any of the values is null.
There is also a query plan problem with this table, and also when the
table grows big:
create table res(
qid char (16) for bit data not null,
kee char (32) for bit data not null,
dnk char (32) for bit data not null
);
create unique index res_unq on res (qid, kee, dnk);
create index res_idx on res (qid, dnk);
create index res_ix2 on res (qid, kee);
The query is
select kee from res where dnk=? and qid=?
And here are two plans, one for big "res" table, and another for small.
The first plan shows that the wrong index was chosen, and we read too
many records to find the result. The table is populated right before it
gets queried; Maybe there is some statistic stuff, but I would like to
avoid running some Derby-specific calls in my application...
Thank you!
=================================
2010-04-05 18:27:29.062 GMT
Thread[mety.store-("zipcode",("city",(*VALUE*,(&5022740b-a32a-4d4a-8a38-511811e32dc5)))),5,m
ain] (XID = 1364486), (SESSIONID = 5), select kee from res where dnk=?
and qid=? ******* Project-Restrict ResultSet (2):
Number of opens = 34
Rows seen = 34
Rows filtered = 0
restriction = false
projection = true
constructor time (milliseconds) = 0
open time (milliseconds) = 0
next time (milliseconds) = 0
close time (milliseconds) = 0
restriction time (milliseconds) = 0
projection time (milliseconds) = 0
optimizer estimated row count: 1488,65
optimizer estimated cost: 12706,02
Source result set:
Index Scan ResultSet for RES using index RES_UNQ at read committed
isolation level using instantaneous share row
locking chosen by the optimizer
Number of opens = 34
Rows seen = 34
Rows filtered = 0
Fetch Size = 16
constructor time (milliseconds) = 0
open time (milliseconds) = 0
next time (milliseconds) = 0
close time (milliseconds) = 0
next time in milliseconds/row = 0
scan information:
Bit set of columns fetched={0, 1, 2}
Number of columns fetched=3
Number of deleted rows visited=0
Number of pages visited=1711
Number of rows qualified=1
Number of rows visited=43192
Scan type=btree
Tree height=4
start position:
>= on first 1 column(s).
Ordered null semantics on the following columns:
stop position:
> on first 1 column(s).
Ordered null semantics on the following columns:
qualifiers:
Column[0][0] Id: 2
Operator: =
Ordered nulls: false
Unknown return value: false
Negate comparison result: false
optimizer estimated row count: 1488,65
optimizer estimated cost: 12706,02
==========================================================================
2010-04-05 18:54:19.171 GMT
Thread[mety.store-("zipcode",("city",(*VALUE*,(&42d96ab6-4dc3-45d0-87d9-2cdea975db97)))),5,m
ain] (XID = 33823), (SESSIONID = 5), select kee from res where dnk=? and
qid=? ******* Project-Restrict ResultSet (3):
Number of opens = 214
Rows seen = 274
Rows filtered = 0
restriction = false
projection = true
constructor time (milliseconds) = 0
open time (milliseconds) = 0
next time (milliseconds) = 0
close time (milliseconds) = 0
restriction time (milliseconds) = 0
projection time (milliseconds) = 0
optimizer estimated row count: 28,53
optimizer estimated cost: 277,09
Source result set:
Index Row to Base Row ResultSet for RES:
Number of opens = 214
Rows seen = 274
Columns accessed from heap = {0, 1, 2}
constructor time (milliseconds) = 0
open time (milliseconds) = 0
next time (milliseconds) = 0
close time (milliseconds) = 0
optimizer estimated row count: 28,53
optimizer estimated cost: 277,09
Index Scan ResultSet for RES using index RES_IDX at read committed
isolation level using instantaneous s
hare row locking chosen by the optimizer
Number of opens = 214
Rows seen = 274
Rows filtered = 0
Fetch Size = 16
constructor time (milliseconds) = 0
open time (milliseconds) = 0
next time (milliseconds) = 0
close time (milliseconds) = 0
next time in milliseconds/row = 0
scan information:
Bit set of columns fetched=All
Number of columns fetched=3
Number of deleted rows visited=0
Number of pages visited=4
Number of rows qualified=1
Number of rows visited=2
Scan type=btree
Tree height=3
start position:
>= on first 2 column(s).
Ordered null semantics on the following columns:
stop position:
> on first 2 column(s).
Ordered null semantics on the following columns:
qualifiers:
None
optimizer estimated row count: 28,53
optimizer estimated cost: 277,09