[
https://issues.apache.org/jira/browse/DERBY-6784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14251938#comment-14251938
]
Mike Matrigali edited comment on DERBY-6784 at 12/18/14 6:00 PM:
-----------------------------------------------------------------
Good questions. I am somewhat out of my depth in this code, so I don't have
answers for sure yet.
I do know that some part of the code does sort the values in the IN LIST. This
was originally done to implement a first-level optimization of IN LISTs before
the multi-probe work. The code would sort the IN LIST and then, rather than
doing a full scan of the index, it would use the sorted values to set the start
and stop parameters for the scan, hopefully eliminating part of the index scan.
I am hoping this still happens, which would also lead to better localized
caching for subsequent probes in the many-term multi-probe case.
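
To make the difference concrete, here is a rough sketch of the two approaches
described above. It is not Derby code: a TreeMap stands in for an ordered
index, and the class and method names (InListScanSketch, boundedScan,
multiProbe) are made up for illustration. The one-element "visited" array just
counts how many index entries or probes each approach touches.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical stand-in: a TreeMap plays the role of an ordered index
// (key -> row). None of these names come from the Derby code base.
class InListScanSketch {

    // Older optimization: sort the IN list, then scan only the index range
    // [smallest value, largest value], qualifying every key in that range.
    static List<String> boundedScan(NavigableMap<Integer, String> index,
                                    int[] inList, int[] visited) {
        List<String> rows = new ArrayList<>();
        if (inList.length == 0) {
            return rows;
        }
        int[] sorted = inList.clone();
        Arrays.sort(sorted);
        int start = sorted[0];                 // scan start position
        int stop = sorted[sorted.length - 1];  // scan stop position
        for (Map.Entry<Integer, String> e
                : index.subMap(start, true, stop, true).entrySet()) {
            visited[0]++;
            if (Arrays.binarySearch(sorted, e.getKey()) >= 0) {
                rows.add(e.getValue());
            }
        }
        return rows;
    }

    // Multi-probe: sort the IN list, then do one index lookup per distinct
    // value. Probing in sorted order keeps consecutive lookups close together.
    static List<String> multiProbe(NavigableMap<Integer, String> index,
                                   int[] inList, int[] visited) {
        List<String> rows = new ArrayList<>();
        int[] sorted = inList.clone();
        Arrays.sort(sorted);
        for (int i = 0; i < sorted.length; i++) {
            if (i > 0 && sorted[i] == sorted[i - 1]) {
                continue; // skip duplicate IN list values
            }
            visited[0]++;
            String row = index.get(sorted[i]);
            if (row != null) {
                rows.add(row);
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        NavigableMap<Integer, String> index = new TreeMap<>();
        for (int i = 1; i <= 100; i++) {
            index.put(i, "row" + i);
        }
        int[] inList = {40, 10, 20};
        int[] scanVisited = {0};
        int[] probeVisited = {0};
        System.out.println(boundedScan(index, inList, scanVisited)
                + " visited=" + scanVisited[0]);
        System.out.println(multiProbe(index, inList, probeVisited)
                + " visited=" + probeVisited[0]);
    }
}
```

In this toy setup both approaches return the same rows, but the bounded scan
still walks every key between the smallest and largest IN list value, while
the multi-probe touches one entry per distinct value.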
> change optimizer to choose in list multiprobe more often
> --------------------------------------------------------
>
> Key: DERBY-6784
> URL: https://issues.apache.org/jira/browse/DERBY-6784
> Project: Derby
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 10.11.1.1
> Reporter: Mike Matrigali
> Assignee: Mike Matrigali
>
> Using the multi-probe join strategy is an obvious performance win when
> the optimizer chooses it. There are cases currently where the costing
> makes the optimizer choose other plans which do not perform as well as
> the multi-probe strategy.
> The class of queries that are affected are those where the number of terms
> in the IN LIST is large relative to the number of rows in the table, and there
> is a useful index to probe for the column that is referenced by the IN LIST.
> There are multiple benefits to choosing the multi-probe strategy, including
> the following:
> 1) often better execution time, where the alternative is to do a full table
> merge on the column.
> 2) The multi-probe strategy results in "pushing" the work into the store,
> and this may result in more concurrent behavior (see DERBY-6300 and
> DERBY-6301). First, fewer rows may be locked by probing rather than by a
> full table scan (and in the worst case the same number, if the query manages
> to probe on every value in the table).
> Second, depending on the isolation level of the query, store will only lock
> matching rows, while in the current implementation all rows that are returned
> by store for qualification above store will remain locked whether they
> qualify or not. Especially in small table cases, other query plan choices
> have been changed to favor probing indexes rather than full table scans,
> even if pure CPU cost is better with a table scan.