[
https://issues.apache.org/jira/browse/DERBY-6132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618568#comment-13618568
]
Dag H. Wanvik commented on DERBY-6132:
--------------------------------------
Thanks, Tony! Yes, we could consider making an alternate implementation using a
hash table for such cases.
I'll file another issue and close this one.
> FETCH NEXT extremely slow when used with DISTINCT queries
> ---------------------------------------------------------
>
> Key: DERBY-6132
> URL: https://issues.apache.org/jira/browse/DERBY-6132
> Project: Derby
> Issue Type: Improvement
> Components: Store
> Affects Versions: 10.9.1.0
> Environment: Debian Linux
> Reporter: Tony Brusseau
>
> I tried an expensive count query that counted almost 300k rows from a table
> with > 15 million rows. It took 5.8 seconds. I then tried optimizing the
> count query to only fetch the first 129 results (because I only care if
> there are more than 128 results) and the query took 5.9 seconds...even
> longer! In both cases the query plan finds all 300k rows before returning
> results which severely limits the utility of the FETCH NEXT command.
> SELECT COUNT(DISTINCT a.term_id) FROM kb.gaf_assertion_term a,
> kb.formula_entries fe
> WHERE (fe.formula_id = a.formula_id) AND (fe.arg_term = 1407374883553721)
> AND (fe.formula_type = 1)
> AND (fe.arg_num > -1)
> Execution finished after 5.787 s, 0 error(s) occurred.
> 284960 result
> SELECT COUNT(*) FROM (SELECT DISTINCT a.term_id FROM kb.gaf_assertion_term a,
> kb.formula_entries fe
> WHERE (fe.formula_id = a.formula_id) AND (fe.arg_term = 1407374883553721)
> AND (fe.formula_type = 1)
> AND (fe.arg_num > -1) FETCH NEXT 129 ROWS ONLY) x
> Execution finished after 5.855 s, 0 error(s) occurred.
> 129 result
> *******************************************************************************************
> QUERY PLAN: WITH FETCH NEXT
> Fri Mar 29 17:30:28 CDT 2013 Thread[DRDAConnThread_3,5,main] (XID =
> 76255884), (SESSIONID = 1), SELECT COUNT(*) FROM (SELECT DISTINCT a.term_id
> FROM kb.gaf_assertion_term a, kb.formula_entries fe
> WHERE (fe.formula_id = a.formula_id) AND (fe.arg_term = 1407374883553721)
> AND (fe.formula_type = 1)
> AND (fe.arg_num > -1) FETCH NEXT 129 ROWS ONLY) x ******* Project-Restrict
> ResultSet (12):
> Number of opens = 1
> Rows seen = 1
> Rows filtered = 0
> restriction = false
> projection = true
> constructor time (milliseconds) = 0
> open time (milliseconds) = 0
> next time (milliseconds) = 0
> close time (milliseconds) = 0
> restriction time (milliseconds) = 0
> projection time (milliseconds) = 0
> optimizer estimated row count: 1.00
> optimizer estimated cost: 1205122.64
> Source result set:
> Scalar Aggregate ResultSet:
> Number of opens = 1
> Rows input = 129
> constructor time (milliseconds) = 0
> open time (milliseconds) = 0
> next time (milliseconds) = 0
> close time (milliseconds) = 0
> optimizer estimated row count: 91812.27
> optimizer estimated cost: 1205122.64
> Index Key Optimization = false
> Source result set:
> Project-Restrict ResultSet (11):
> Number of opens = 1
> Rows seen = 129
> Rows filtered = 0
> restriction = false
> projection = true
> constructor time (milliseconds) = 0
> open time (milliseconds) = 0
> next time (milliseconds) = 0
> close time (milliseconds) = 0
> restriction time (milliseconds) = 0
> projection time (milliseconds) = 0
> optimizer estimated row count: 91812.27
> optimizer estimated cost: 1205122.64
> Source result set:
> Row Count (2):
> Number of opens = 1
> Rows seen = 129
> Rows filtered = 0
> constructor time (milliseconds) = 0
> open time (milliseconds) = 0
> next time (milliseconds) = 0
> close time (milliseconds) = 0
> optimizer estimated row count: 91812.27
> optimizer estimated cost: 1205122.64
> Source result set:
> Sort ResultSet:
> Number of opens = 1
> Rows input = 284973
> Rows returned = 129
> Eliminate duplicates = true
> In sorted order = false
> Sort information:
> Number of rows input=284973
> Number of rows output=284960
> Sort type=internal
> constructor time (milliseconds) = 0
> open time (milliseconds) = 0
> next time (milliseconds) = 0
> close time (milliseconds) = 0
> optimizer estimated row count: 97588.77
> optimizer estimated cost: 1205122.64
> Source result set:
> Project-Restrict ResultSet (10):
> Number of opens = 1
> Rows seen = 284973
> Rows filtered = 0
> restriction = false
> projection = true
> constructor time (milliseconds)
> = 0
> open time (milliseconds) = 0
> next time (milliseconds) = 0
> close time (milliseconds) = 0
> restriction time (milliseconds)
> = 0
> projection time (milliseconds)
> = 0
> optimizer estimated row count:
> 97588.77
> optimizer estimated cost:
> 1205122.64
> Source result set:
> Nested Loop Join ResultSet:
> Number of opens = 1
> Rows seen from the left = 284688
> Rows seen from the right =
> 284973
> Rows filtered = 0
> Rows returned = 284973
> constructor time
> (milliseconds) = 0
> open time
> (milliseconds) = 0
> next time
> (milliseconds) = 0
> close time
> (milliseconds) = 0
> optimizer estimated row
> count: 97588.77
> optimizer estimated
> cost: 1205122.64
> Left result set:
> Project-Restrict
> ResultSet (7):
> Number of opens = 1
> Rows seen = 284688
> Rows filtered = 0
> restriction = true
> projection = true
> constructor
> time (milliseconds) = 0
> open time
> (milliseconds) = 0
> next time
> (milliseconds) = 0
> close time
> (milliseconds) = 0
> restriction
> time (milliseconds) = 0
> projection time
> (milliseconds) = 0
> optimizer
> estimated row count: 91812.27
> optimizer
> estimated cost: 554694.34
> Source result set:
> Index Row to
> Base Row ResultSet for FORMULA_ENTRIES:
> Number of opens
> = 1
> Rows seen =
> 284688
> Columns
> accessed from heap = {1, 4}
>
> constructor time (milliseconds) = 0
> open
> time (milliseconds) = 0
> next
> time (milliseconds) = 0
> close
> time (milliseconds) = 0
>
> optimizer estimated row count: 91812.27
>
> optimizer estimated cost: 554694.34
> Index
> Scan ResultSet for FORMULA_ENTRIES using index
> KB_FORMULA_ENTRIES_FORMULA_TERM_TYPE at read committed isolation level using
> share row locking chosen by the optimizer
> Number
> of opens = 1
> Rows
> seen = 284688
> Rows
> filtered = 0
> Fetch
> Size = 1
>
> constructor time (milliseconds) = 0
>
> open time (milliseconds) = 0
>
> next time (milliseconds) = 0
>
> close time (milliseconds) = 0
>
> next time in milliseconds/row = 0
> scan
> information:
>
> Bit set of columns fetched=All
>
> Number of columns fetched=3
>
> Number of deleted rows visited=0
>
> Number of pages visited=277
>
> Number of rows qualified=284688
>
> Number of rows visited=284689
>
> Scan type=btree
>
> Tree height=3
>
> start position:
>
> >= on first 2 column(s).
>
> Ordered null semantics on the following columns:
>
> 0 1
>
> stop position:
>
> > on first 2 column(s).
>
> Ordered null semantics on the following columns:
>
> 0 1
>
> qualifiers:
>
> None
>
> optimizer estimated row count: 91812.27
>
> optimizer estimated cost: 554694.34
> Right result set:
> Index Row to Base Row
> ResultSet for GAF_ASSERTION_TERM:
> Number of opens = 284688
> Rows seen = 284973
> Columns accessed from
> heap = {0}
> constructor
> time (milliseconds) = 0
> open time
> (milliseconds) = 0
> next time
> (milliseconds) = 0
> close time
> (milliseconds) = 0
> optimizer
> estimated row count: 97588.77
> optimizer
> estimated cost: 650428.30
> Index Scan
> ResultSet for GAF_ASSERTION_TERM using constraint
> KB_GAF_ASSERTION_TERM_FORMULA_MT_UNIQUE at read committed isolation level
> using share row locking chosen by the optimizer
> Number of opens
> = 284688
> Rows seen =
> 284973
> Rows filtered = > 0
> Fetch Size = 1
>
> constructor time (milliseconds) = 0
> open
> time (milliseconds) = 0
> next
> time (milliseconds) = 0
> close
> time (milliseconds) = 0
> next
> time in milliseconds/row = 0
> scan
> information:
> Bit set
> of columns fetched={0, 2}
> Number
> of columns fetched=2
> Number
> of deleted rows visited=0
> Number
> of pages visited=854700
> Number
> of rows qualified=284973
> Number
> of rows visited=569661
> Scan
> type=btree
> Tree
> height=3
> start
> position:
>
> >= on first 1 column(s).
>
> Ordered null semantics on the following columns:
>
> 0
> stop
> position:
>
> > on first 1 column(s).
>
> Ordered null semantics on the following columns:
>
> 0
>
> qualifiers:
>
> None
>
> optimizer estimated row count: 97588.77
>
> optimizer estimated cost: 650428.30
> *******************************************************************************************
> QUERY PLAN: WITHOUT FETCH NEXT
> Fri Mar 29 17:28:15 CDT 2013 Thread[DRDAConnThread_3,5,main] (XID =
> 76255878), (SESSIONID = 1), SELECT COUNT(DISTINCT a.term_id) FROM
> kb.gaf_assertion_term a, kb.formula_entries fe
> WHERE (fe.formula_id = a.formula_id) AND (fe.arg_term = 1407374883553721)
> AND (fe.formula_type = 1)
> AND (fe.arg_num > -1) ******* Project-Restrict ResultSet (9):
> Number of opens = 1
> Rows seen = 1
> Rows filtered = 0
> restriction = false
> projection = true
> constructor time (milliseconds) = 0
> open time (milliseconds) = 0
> next time (milliseconds) = 0
> close time (milliseconds) = 0
> restriction time (milliseconds) = 0
> projection time (milliseconds) = 0
> optimizer estimated row count: 1.00
> optimizer estimated cost: 1205122.64
> Source result set:
> Distinct Scalar Aggregate ResultSet:
> Number of opens = 1
> Rows input = 284973
> constructor time (milliseconds) = 0
> open time (milliseconds) = 0
> next time (milliseconds) = 0
> close time (milliseconds) = 0
> optimizer estimated row count: 97588.77
> optimizer estimated cost: 1205122.64
> Source result set:
> Project-Restrict ResultSet (8):
> Number of opens = 1
> Rows seen = 284973
> Rows filtered = 0
> restriction = false
> projection = true
> constructor time (milliseconds) = 0
> open time (milliseconds) = 0
> next time (milliseconds) = 0
> close time (milliseconds) = 0
> restriction time (milliseconds) = 0
> projection time (milliseconds) = 0
> optimizer estimated row count: 97588.77
> optimizer estimated cost: 1205122.64
> Source result set:
> Nested Loop Join ResultSet:
> Number of opens = 1
> Rows seen from the left = 284688
> Rows seen from the right = 284973
> Rows filtered = 0
> Rows returned = 284973
> constructor time (milliseconds) = 0
> open time (milliseconds) = 0
> next time (milliseconds) = 0
> close time (milliseconds) = 0
> optimizer estimated row count: 97588.77
> optimizer estimated cost: 1205122.64
> Left result set:
> Project-Restrict ResultSet (5):
> Number of opens = 1
> Rows seen = 284688
> Rows filtered = 0
> restriction = true
> projection = true
> constructor time (milliseconds) = 0
> open time (milliseconds) = 0
> next time (milliseconds) = 0
> close time (milliseconds) = 0
> restriction time (milliseconds) = 0
> projection time (milliseconds) = 0
> optimizer estimated row count: 91812.27
> optimizer estimated cost: 554694.34
> Source result set:
> Index Row to Base Row ResultSet for
> FORMULA_ENTRIES:
> Number of opens = 1
> Rows seen = 284688
> Columns accessed from heap = {1, 3, 4,
> 5}
> constructor time (milliseconds)
> = 0
> open time (milliseconds) = 0
> next time (milliseconds) = 0
> close time (milliseconds) = 0
> optimizer estimated row count:
> 91812.27
> optimizer estimated cost:
> 554694.34
> Index Scan ResultSet for
> FORMULA_ENTRIES using index KB_FORMULA_ENTRIES_FORMULA_TERM_TYPE at read
> committed isolation level using instantaneous share row locking chosen by the
> optimizer
> Number of opens = 1
> Rows seen = 284688
> Rows filtered = 0
> Fetch Size = 16
> constructor time
> (milliseconds) = 0
> open time
> (milliseconds) = 0
> next time
> (milliseconds) = 0
> close time
> (milliseconds) = 0
> next time in
> milliseconds/row = 0
> scan information:
> Bit set of columns
> fetched=All
> Number of columns
> fetched=3
> Number of deleted rows
> visited=0
> Number of pages
> visited=277
> Number of rows
> qualified=284688
> Number of rows
> visited=284689
> Scan type=btree
> Tree height=3
> start position:
> >= on first 2
> column(s).
> Ordered null
> semantics on the following columns:
> 0 1
> stop position:
> > on first 2
> column(s).
> Ordered null
> semantics on the following columns:
> 0 1
> qualifiers:
> None
> optimizer estimated row
> count: 91812.27
> optimizer estimated
> cost: 554694.34
> Right result set:
> Index Row to Base Row ResultSet for
> GAF_ASSERTION_TERM:
> Number of opens = 284688
> Rows seen = 284973
> Columns accessed from heap = {0, 6}
> constructor time (milliseconds) = 0
> open time (milliseconds) = 0
> next time (milliseconds) = 0
> close time (milliseconds) = 0
> optimizer estimated row count: 97588.77
> optimizer estimated cost: 650428.30
> Index Scan ResultSet for
> GAF_ASSERTION_TERM using constraint KB_GAF_ASSERTION_TERM_FORMULA_MT_UNIQUE
> at read committed isolation level using instantaneous share row locking
> chosen by the optimizer
> Number of opens = 284688
> Rows seen = 284973
> Rows filtered = 0
> Fetch Size = 16
> constructor time (milliseconds)
> = 0
> open time (milliseconds) = 0
> next time (milliseconds) = 0
> close time (milliseconds) = 0
> next time in milliseconds/row = > 0
> scan information:
> Bit set of columns fetched={0,
> 2}
> Number of columns fetched=2
> Number of deleted rows visited=0
> Number of pages visited=854700
> Number of rows qualified=284973
> Number of rows visited=569661
> Scan type=btree
> Tree height=3
> start position:
> >= on first 1 column(s).
> Ordered null semantics
> on the following columns:
> 0
> stop position:
> > on first 1 column(s).
> Ordered null semantics
> on the following columns:
> 0
> qualifiers:
> None
> optimizer estimated row count:
> 97588.77
> optimizer estimated cost:
> 650428.30
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
