[ https://issues.apache.org/jira/browse/DERBY-6045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13578175#comment-13578175 ]
Mamta A. Satoor commented on DERBY-6045: ---------------------------------------- Made some changes to the junit test and committed it into trunk with revision 1446048 with following commit comments ******************** DERBY-6045 (in list multi-probe by primary key not chosen on tables with >256 rows) Added a new test scenario where we add about 10K rows into a table with primary key and check if we are still using index scan for the queries being tested Additionally, cleaned up the test a little bit. ******************** > in list multi-probe by primary key not chosen on tables with >256 rows > ---------------------------------------------------------------------- > > Key: DERBY-6045 > URL: https://issues.apache.org/jira/browse/DERBY-6045 > Project: Derby > Issue Type: Bug > Components: Store > Affects Versions: 10.9.1.0, 10.10.0.0 > Environment: Linux Debian 6.0.5 > Reporter: Tony Brusseau > Priority: Critical > > I have a table with a long integer primary key field and 11 million rows. I > seem to be unable to load large chunks of rows via id in a reasonably > efficient manner. > 1. If I do individual lookups via the primary key, then a fast indexed > lookup occurs. However, if I do large numbers of such queries, then the time > is overwhelmed by round-trip overhead which makes everything incredibly slow. > 2. If I use a single query with a disjunction of the primary keys of > interest, then a table scan is performed (even if the clause only contains > 1-3 items), which walks over 11 million rows...incredibly inefficient. > 3. If I use an IN clause, then a table scan is performed (even if the > clause only contains 1-3 items), which walks over 11 million > rows...incredibly inefficient. > I'm guessing that this might have something to do with the fact that I'm > using large integers and really big numbers that don't start anywhere at or > about 1 for my keys. Could this possibly be confusing the optimizer? > Here are the unlimited query plans for the 3 cases that I enumerated: > ********************************************************************************************* > [EL Fine]: 2013-01-17 > 11:09:53.384--ServerSession(582235416)--Connection(1430986883)--Thread(Thread["Initial > Lisp Listener",5,SubL Thread Group])--SELECT TERM_ID, ARG0, ARG1, ARG2, > ARG3, FORMULA_HASH, FORMULA_LENGTH, FORMULA_TYPE, KB_STATUS FROM > KB.FORMULA_TERM WHERE (TERM_ID = ?) > bind => [2251799814033500] > Thu Jan 17 11:09:53 CST 2013 Thread["Initial Lisp Listener",5,SubL Thread > Group] (XID = 4711079), (SESSIONID = 3), SELECT TERM_ID, ARG0, ARG1, ARG2, > ARG3, FORMULA_HASH, FORMULA_LENGTH, FORMULA_TYPE, KB_STATUS FROM > KB.FORMULA_TERM WHERE (TERM_ID = ?) ******* Project-Restrict ResultSet (3): > Number of opens = 1 > Rows seen = 1 > Rows filtered = 0 > restriction = false > projection = true > constructor time (milliseconds) = 0 > open time (milliseconds) = 0 > next time (milliseconds) = 0 > close time (milliseconds) = 0 > restriction time (milliseconds) = 0 > projection time (milliseconds) = 0 > optimizer estimated row count: 1.00 > optimizer estimated cost: 6.59 > Source result set: > Index Row to Base Row ResultSet for FORMULA_TERM: > Number of opens = 1 > Rows seen = 1 > Columns accessed from heap = {1, 2, 3, 4, 5, 6, 7, 8} > constructor time (milliseconds) = 0 > open time (milliseconds) = 0 > next time (milliseconds) = 0 > close time (milliseconds) = 0 > optimizer estimated row count: 1.00 > optimizer estimated cost: 6.59 > Index Scan ResultSet for FORMULA_TERM using constraint > KB_FORMULA_TERM_TERM_ID_PK at read committed isolation level using share row > locking chosen by the optimizer > Number of opens = 1 > Rows seen = 1 > Rows filtered = 0 > Fetch Size = 1 > constructor time (milliseconds) = 0 > open time (milliseconds) = 0 > next time (milliseconds) = 0 > close time (milliseconds) = 0 > next time in milliseconds/row = 0 > scan information: > Bit set of columns fetched=All > Number of columns fetched=2 > Number of deleted rows visited=0 > Number of pages visited=3 > Number of rows qualified=1 > Number of rows visited=1 > Scan type=btree > Tree height=-1 > start position: > >= on first 1 column(s). > Ordered null semantics on the following > columns: > stop position: > > on first 1 column(s). > Ordered null semantics on the following > columns: > qualifiers: > None > optimizer estimated row count: 1.00 > optimizer estimated cost: 6.59 > [EL Fine]: 2013-01-17 > 11:01:00.732--ServerSession(1237006689)--Connection(927179828)--Thread(Thread["Initial > Lisp Listener",5,SubL Thread Group])--SELECT TERM_ID, ARG0, ARG1, ARG2, > ARG3, FORMULA_HASH, FORMULA_LENGTH, FORMULA_TYPE, KB_STATUS FROM > KB.FORMULA_TERM WHERE (((TERM_ID = ?) OR (TERM_ID = ?)) OR (TERM_ID = ?)) > bind => [2251799814033500, 2251799814033501, 2251799814033499] > Thu Jan 17 11:01:10 CST 2013 Thread["Initial Lisp Listener",5,SubL Thread > Group] (XID = 4711078), (SESSIONID = 3), SELECT TERM_ID, ARG0, ARG1, ARG2, > ARG3, FORMULA_HASH, FORMULA_LENGTH, FORMULA_TYPE, KB_STATUS FROM > KB.FORMULA_TERM WHERE (((TERM_ID = ?) OR (TERM_ID = ?)) OR (TERM_ID = ?)) > ******* Project-Restrict ResultSet (3): > Number of opens = 1 > Rows seen = 3 > Rows filtered = 0 > restriction = false > projection = true > constructor time (milliseconds) = 0 > open time (milliseconds) = 0 > next time (milliseconds) = 0 > close time (milliseconds) = 0 > restriction time (milliseconds) = 0 > projection time (milliseconds) = 0 > optimizer estimated row count: 1176730.30 > optimizer estimated cost: 5931065.54 > Source result set: > Project-Restrict ResultSet (2): > Number of opens = 1 > Rows seen = 11767298 > Rows filtered = 11767295 > restriction = true > projection = false > constructor time (milliseconds) = 0 > open time (milliseconds) = 0 > next time (milliseconds) = 0 > close time (milliseconds) = 0 > restriction time (milliseconds) = 0 > projection time (milliseconds) = 0 > optimizer estimated row count: 1176730.30 > optimizer estimated cost: 5931065.54 > Source result set: > Table Scan ResultSet for FORMULA_TERM at read committed > isolation level using instantaneous share row locking chosen by the optimizer > Number of opens = 1 > Rows seen = 11767298 > Rows filtered = 0 > Fetch Size = 16 > constructor time (milliseconds) = 0 > open time (milliseconds) = 0 > next time (milliseconds) = 0 > close time (milliseconds) = 0 > next time in milliseconds/row = 0 > scan information: > Bit set of columns fetched=All > Number of columns fetched=9 > Number of pages visited=34358 > Number of rows qualified=11767298 > Number of rows visited=11767298 > Scan type=heap > start position: > null > stop position: > null > qualifiers: > None > optimizer estimated row count: 1176730.30 > optimizer estimated cost: 5931065.54 > [EL Fine]: 2013-01-17 > 11:27:00.627--ServerSession(1237006689)--Connection(1688096771)--Thread(Thread["Initial > Lisp Listener",5,SubL Thread Group])--SELECT TERM_ID, ARG0, ARG1, ARG2, > ARG3, FORMULA_HASH, FORMULA_LENGTH, FORMULA_TYPE, KB_STATUS FROM > KB.FORMULA_TERM WHERE (TERM_ID IN (?,?,?)) > bind => [2251799814033500, 2251799814033501, 2251799814033499] > Thu Jan 17 11:47:26 CST 2013 Thread["Initial Lisp Listener",5,SubL Thread > Group] (XID = 4711080), (SESSIONID = 3), SELECT TERM_ID, ARG0, ARG1, ARG2, > ARG3, FORMULA_HASH, FORMULA_LENGTH, FORMULA_TYPE, KB_STATUS FROM > KB.FORMULA_TERM WHERE (TERM_ID IN (?,?,?)) ******* Project-Restrict ResultSet > (3): > Number of opens = 1 > Rows seen = 3 > Rows filtered = 0 > restriction = false > projection = true > constructor time (milliseconds) = 0 > open time (milliseconds) = 0 > next time (milliseconds) = 0 > close time (milliseconds) = 0 > restriction time (milliseconds) = 0 > projection time (milliseconds) = 0 > optimizer estimated row count: 1176730.30 > optimizer estimated cost: 5931065.54 > Source result set: > Project-Restrict ResultSet (2): > Number of opens = 1 > Rows seen = 11767298 > Rows filtered = 11767295 > restriction = true > projection = false > constructor time (milliseconds) = 0 > open time (milliseconds) = 0 > next time (milliseconds) = 0 > close time (milliseconds) = 0 > restriction time (milliseconds) = 0 > projection time (milliseconds) = 0 > optimizer estimated row count: 1176730.30 > optimizer estimated cost: 5931065.54 > Source result set: > Table Scan ResultSet for FORMULA_TERM at read committed > isolation level using instantaneous share row locking chosen by the optimizer > Number of opens = 1 > Rows seen = 11767298 > Rows filtered = 0 > Fetch Size = 16 > constructor time (milliseconds) = 0 > open time (milliseconds) = 0 > next time (milliseconds) = 0 > close time (milliseconds) = 0 > next time in milliseconds/row = 0 > scan information: > Bit set of columns fetched=All > Number of columns fetched=9 > Number of pages visited=34358 > Number of rows qualified=11767298 > Number of rows visited=11767298 > Scan type=heap > start position: > null > stop position: > null > qualifiers: > None > optimizer estimated row count: 1176730.30 > optimizer estimated cost: 5931065.54 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira