[ 
https://issues.apache.org/jira/browse/DERBY-6301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13897615#comment-13897615
 ] 

Mamta A. Satoor commented on DERBY-6301:
----------------------------------------

Spend some time in code generation and run time code to see what happens for 
the query below
select * from t1 where c11  in 
(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30);
when table has only 200 rows and there is a primary key defined on column c11.

At code generation time, ProjectRestrictNode has in list predicate classified 
as start and stop predicate but since we found that bulk table scan is more 
appropriate for this query instead of index scan(because number of elements in 
the in list are 30 and table has only200 rows), we remove the predicate from 
the list of start/stop predicate and put it as restriction at 
ProjectRestrictNode level(the stack trace of that code sequence is as follows). 
This is where we loose the information about predicate between start/stop 
predicate which is right anyways because start/stop predicate only makes sense 
for index scan and not for table scan.
Thread [main] (Suspended)       
        ProjectRestrictNode.generateMinion(ExpressionClassBuilder, 
MethodBuilder, boolean) line: 1365   
        ProjectRestrictNode.generate(ActivationClassBuilder, MethodBuilder) 
line: 1301  
        ProjectRestrictNode.generateMinion(ExpressionClassBuilder, 
MethodBuilder, boolean) line: 1348   
        ProjectRestrictNode.generate(ActivationClassBuilder, MethodBuilder) 
line: 1301  
        ScrollInsensitiveResultSetNode.generate(ActivationClassBuilder, 
MethodBuilder) line: 86 
        CursorNode.generate(ActivationClassBuilder, MethodBuilder) line: 628    
        CursorNode(StatementNode).generate(ByteArray) line: 317 
        GenericStatement.prepMinion(LanguageConnectionContext, boolean, 
Object[], SchemaDescriptor, boolean) line: 547  
        GenericStatement.prepare(LanguageConnectionContext, boolean) line: 99   
        
GenericLanguageConnectionContext.prepareInternalStatement(SchemaDescriptor, 
String, boolean, boolean) line: 1116        
        EmbedStatement.execute(String, boolean, boolean, int, int[], String[]) 
line: 682        
        EmbedStatement.execute(String) line: 631        
        ij.executeImmediate(String) line: 367   
        utilMain.doCatch(String) line: 527      
        utilMain.runScriptGuts() line: 372      
        utilMain.go(LocalizedInput[], LocalizedOutput) line: 245        
        Main.go(LocalizedInput, LocalizedOutput) line: 229      
        Main.mainCore(String[], Main) line: 184 
        Main.main(String[]) line: 75    
        ij.main(String[]) line: 59      
restriction at ProjectRestrictNode looks like following
restriction     BinaryRelationalOperatorNode  (id=1139) 
        beginOffset     -1      
        betweenSelectivity      false   
        btnVis  BaseTableNumbersVisitor  (id=1708)      
        cm      ContextManager  (id=1993)       
        constantActionFactory   null    
        dataTypeServices        DataTypeDescriptor  (id=1165)   
        endOffset       -1      
        forQueryRewrite false   
        inListProbeSource       InListOperatorNode  (id=1533)   
                beginOffset     22      
                cm      ContextManager  (id=1993)       
                constantActionFactory   null    
                dataTypeServices        DataTypeDescriptor  (id=1861)   
                endOffset       113     
                isOrdered       true    
                isPrivilegeCollectionRequired   true    
                lcc     GenericLanguageConnectionContext  (id=1194)     
                leftOperand     ColumnReference  (id=1215)      
                methodName      "in" (id=1122)  
                operator        "IN" (id=1974)  
                rightOperandList        ValueNodeList  (id=2443)        
                sortDescending  false   
                transformed     false   
                visitableTags   null    
        isPrivilegeCollectionRequired   true    
        kind    2       
        kind    0       
        lcc     GenericLanguageConnectionContext  (id=1194)     
        leftInterfaceType       
"org.apache.derby.iapi.types.DataValueDescriptor" (id=1089)     
        leftOperand     ColumnReference  (id=1215)      
        methodName      "equals" (id=2102)      
        operator        "=" (id=1349)   
        optBaseTables   JBitSet  (id=1080)      
        receiver        null    
        relOpType       1       
        resultInterfaceType     null    
        rightInterfaceType      
"org.apache.derby.iapi.types.DataValueDescriptor" (id=1089)     
        rightOperand    ParameterNode  (id=1783)        
        transformed     false   
        valNodeBaseTables       JBitSet  (id=1065)      
        visitableTags   null    
        xmlQuery        null    

We continue in ProjectRestrictNode.generateMinion and convert the restriction 
to a method call at runtime. This restriction method will be run on every row 
returned from the store since store does not know anything about restrictions 
since no start/stop/qualifier has been passed to bulk table scan calls to 
store. So at the end of code generation, we have generated two kinds of 
resultsets,
1)BulkTableScanResultSet with no qualifers/start/stop keys, so all the rows 
from the table will be returned by store and
2)ProjectRestrictResultSet which will use BulkTableScanResultSet generated 
earlier as the source. And for each row returned by BulkTableScanResultSet, it 
will apply the restriction method generated earlier which will go through the 
row returned and see if returns true for the inlist elements. So, by this 
point, we have lost all the information about inlist. It is only available 
indirectly as restriction method but there is no way to get to it at this point 
for the way the code is written. Following is the relevant code from 
ProjectRestrictResultSet.getNextRowCore
    do 
    {
        candidateRow = source.getNextRowCore(); //this is the call to 
BulkTableScanResultSet to get next row
        if (candidateRow != null) 
        {
                beginRT = getCurrentTimeMillis();
                /* If restriction is null, then all rows qualify */
                if (restriction == null)
                {
                        restrict = true;
                }
                else
                {
                        setCurrentRow(candidateRow);
                        ////Following is the call to restriction method to see 
if row returned by 
                        //BulkTableScanResultSet qualifies or not
                        restrictBoolean = (DataValueDescriptor) 
                                   restriction.invoke(activation);

The above blurb just explains what is happening at code generation and 
execution time for an in list which was identified by optimizer as not a good 
candidate for multi-probing and subsequently, table scan was found to be the 
most efficient plan for it.

> SQL layer should push down IN list predicates to store when doing a scan
> ------------------------------------------------------------------------
>
>                 Key: DERBY-6301
>                 URL: https://issues.apache.org/jira/browse/DERBY-6301
>             Project: Derby
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 10.10.1.1
>            Reporter: Mike Matrigali
>
> The store interface allows for OR and AND qualifiers to be passed down to 
> store as part of either
> a heap or btree scan.  It is more efficient to qualify the rows at the lowest 
> levels.  The SQL level
> does not seem to  push any qualifier in the case of IN lists.
> This does not matter if the optimizer choses the multi-probe execution 
> strategy for the IN list as that also
> qualifies the row at the lowest level.
> The problem arises when the optimizer chooses not to do multi-probe, for 
> instance if it determines there
> are too many terms in the in-list relative to the size of the table and the 
> cardinality of the terms.  In this
> case it chooses a scan with no qualifiers which results in all rows being 
> returned to the sql layer and qualified there.  
> In addition to performance considerations this presents a locking problem 
> with respect to the repeatable read isolation level.   It is optimal in 
> repeatable read to not maintain locks on those
> rows that do not qualify.  Currently this locking optimization only takes 
> place for those rows that
> are qualified in the store vs. those qualified in the upper SQL layer.  So in 
> the case of a non-multi-probe IN-LIST plan all non-qualified rows looked at 
> as part of the execution will remain locked in repeatable 
> read.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

Reply via email to