[ 
https://issues.apache.org/jira/browse/IGNITE-4518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15821548#comment-15821548
 ] 

ASF GitHub Bot commented on IGNITE-4518:
----------------------------------------

GitHub user nva opened a pull request:

    https://github.com/apache/ignite/pull/1426

    IGNITE-4518 Fixed parallel load cache.

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gridgain/apache-ignite ignite-4518

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/ignite/pull/1426.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1426
    
----
commit 74208e984bb473ca6a9838f117fe5d6b0fc3e86d
Author: Andrey Novikov <anovi...@gridgain.com>
Date:   2017-01-13T04:10:19Z

    IGNITE-4518 Fixed parallel load cache.

----


> Wrong WHERE clause generated by 
> org.apache.ignite.cache.store.jdbc.dialect.BasicJdbcDialect#loadCacheRangeQuery
>  for upper bounds
> --------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: IGNITE-4518
>                 URL: https://issues.apache.org/jira/browse/IGNITE-4518
>             Project: Ignite
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.8
>            Reporter: Anghel Botos
>            Assignee: Andrey Novikov
>
> Scenario:
> * Cache entity with 3 key fields mapped to database as columns {{KC1}}, 
> {{KC2}}, {{KC3}}
> * The following data is in the database table {{MY_TABLE}} (the values 0 to 4 
> encoded as binary):
> ||KC1||KC2||KC3||
> |0|0|0|
> |0|0|1|
> |0|1|0|
> |0|1|1|
> |1|0|0|
> When running {{org.apache.ignite.IgniteCache#loadCache(null)}} (i.e. with no 
> custom SQL query arguments) the following happens:
> * In {{org.apache.ignite.cache.store.jdbc.CacheAbstractJdbcStore#loadCache}}, 
> the query produced by 
> {{org.apache.ignite.cache.store.jdbc.dialect.JdbcDialect#loadCacheSelectRangeQuery}}
>  is used to determine the range boundaries for the load. In our case the 
> query would be:
> {{SELECT KC1,KC2,KC3 FROM (SELECT KC1,KC2,KC3, ROWNUM AS rn FROM (SELECT 
> KC1,KC2,KC3 FROM MY_TABLE ORDER BY KC1,KC2,KC3)) WHERE mod(rn, 2) = 0}} (I 
> used {{parallelLoadCacheMinThreshold=2}} just as an example so that the 
> problem is visible for this small data set).
> The results of this query are:
> ||KC1||KC2||KC3||
> |0|0|1|
> |0|1|1|
> (which is correct, due to https://issues.apache.org/jira/browse/IGNITE-4163 
> being fixed now)
> These results will be used by 
> {{org.apache.ignite.cache.store.jdbc.CacheAbstractJdbcStore#loadCacheRange}} 
> to generate queries using 
> {{org.apache.ignite.cache.store.jdbc.dialect.JdbcDialect#loadCacheRangeQuery}}
>  which will be executed in parallel. Since there were 2 rows in the above 
> result, this means that the load will occur using 3 ranges/queries:
> # No lower bound, upper bound using first row of the selectRangeQuery result
> # Lower bound using the first row of the selectRangeQuery result, upper bound 
> using the second row of the selectRangeQuery result
> # Lower bound using the second row of the selectRangeQuery result, no upper 
> bound
> The queries are the following:
> # {{SELECT KC1,KC2,KC3 FROM MY_TABLE WHERE (KC1 = ? AND KC2 = ? AND KC3 <= ?  
> OR KC1 = ? AND KC2 <= ?  OR KC1 <= ? )}}
> # {{SELECT KC1,KC2,KC3 FROM MY_TABLE WHERE (KC1 = ? AND KC2 = ? AND KC3 > ? 
> OR KC1 = ? AND KC2 > ? OR KC1 > ? ) AND (KC1 = ? AND KC2 = ? AND KC3 <= ?  OR 
> KC1 = ? AND KC2 <= ?  OR KC1 <= ? )}}
> # {{SELECT KC1,KC2,KC3 FROM MY_TABLE WHERE (KC1 = ? AND KC2 = ? AND KC3 > ? 
> OR KC1 = ? AND KC2 > ? OR KC1 > ? )}}
> Replacing the first bound in the first query yields the following query: 
> {{SELECT KC1,KC2,KC3 FROM MY_TABLE WHERE (KC1 = 0 AND KC2 = 0 AND KC3 <= 1  
> OR KC1 = 0 AND KC2 <= 0  OR KC1 <= 0 )}}, which executed, yields the 
> following results:
> ||KC1||KC2||KC3||
> |0|0|0|
> |0|0|1|
> |0|1|0|
> |0|1|1|
> which is not correct, as the result set should have contained only 2 rows 
> (due to {{parallelLoadCacheMinThreshold=2}}). Something similar will happen 
> also to the second query (the one with both lower and upper bound).
> The problem arises due to how the upper bound part of the WHERE clause is 
> being generated, namely the conditions when NOT ALL of the key columns are 
> constrained. These conditions should use *<* (strictly less), not *<=* (less 
> or equal).
> The correct queries should be (please note the {{OR KC1 = ? AND KC2 < ?  OR 
> KC1 < ?}} part which makes the difference):
> # {{SELECT KC1, KC2, KC3 FROM MY_TABLE WHERE (KC1 = ? AND KC2 = ? AND KC3 <= 
> ?  OR KC1 = ? AND KC2 < ?  OR KC1 < ? )}}
> # {{SELECT KC1, KC2, KC3 FROM MY_TABLE WHERE (KC1 = ? AND KC2 = ? AND KC3 > ? 
> OR KC1 = ? AND KC2 > ? OR KC1 > ? ) AND (KC1 = ? AND KC2 = ? AND KC3 <= ?  OR 
> KC1 = ? AND KC2 < ?  OR KC1 < ? )}}
> # {{SELECT KC1, KC2, KC3 FROM MY_TABLE WHERE (KC1 = ? AND KC2 = ? AND KC3 > ? 
> OR KC1 = ? AND KC2 > ? OR KC1 > ? )}}
> My fix to generate the correct queries was to use the following code instead 
> of 
> {{org.apache.ignite.cache.store.jdbc.dialect.BasicJdbcDialect#loadCacheRangeQuery:193-205}}:
> {code}
>         sb.append("(");
>         for (int cnt = keyCols.size(); cnt > 0; cnt--) {
>             for (int j = 0; j < cnt; j++) {
>                 if (j == cnt - 1) {
>                     if (cnt == keyCols.size()) {
>                         sb.append(cols[j]).append(" <= ? ");
>                     } else {
>                         sb.append(cols[j]).append(" < ? ");
>                     }
>                 } else {
>                     sb.append(cols[j]).append(" = ? AND ");
>                 }
>             }
>             if (cnt != 1) {
>                 sb.append(" OR ");
>             }
>         }
>         sb.append(")");
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to