[ https://issues.apache.org/jira/browse/IGNITE-4518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15821548#comment-15821548 ]
ASF GitHub Bot commented on IGNITE-4518: ---------------------------------------- GitHub user nva opened a pull request: https://github.com/apache/ignite/pull/1426 IGNITE-4518 Fixed parallel load cache. You can merge this pull request into a Git repository by running: $ git pull https://github.com/gridgain/apache-ignite ignite-4518 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/ignite/pull/1426.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1426 ---- commit 74208e984bb473ca6a9838f117fe5d6b0fc3e86d Author: Andrey Novikov <anovi...@gridgain.com> Date: 2017-01-13T04:10:19Z IGNITE-4518 Fixed parallel load cache. ---- > Wrong WHERE clause generated by > org.apache.ignite.cache.store.jdbc.dialect.BasicJdbcDialect#loadCacheRangeQuery > for upper bounds > -------------------------------------------------------------------------------------------------------------------------------- > > Key: IGNITE-4518 > URL: https://issues.apache.org/jira/browse/IGNITE-4518 > Project: Ignite > Issue Type: Bug > Components: SQL > Affects Versions: 1.8 > Reporter: Anghel Botos > Assignee: Andrey Novikov > > Scenario: > * Cache entity with 3 key fields mapped to database as columns {{KC1}}, > {{KC2}}, {{KC3}} > * The following data is in the database table {{MY_TABLE}} (the values 0 to 4 > encoded as binary): > ||KC1||KC2||KC3|| > |0|0|0| > |0|0|1| > |0|1|0| > |0|1|1| > |1|0|0| > When running {{org.apache.ignite.IgniteCache#loadCache(null)}} (i.e. with no > custom SQL query arguments) the following happens: > * In {{org.apache.ignite.cache.store.jdbc.CacheAbstractJdbcStore#loadCache}}, > the query produced by > {{org.apache.ignite.cache.store.jdbc.dialect.JdbcDialect#loadCacheSelectRangeQuery}} > is used to determine the range boundaries for the load. In our case the > query would be: > {{SELECT KC1,KC2,KC3 FROM (SELECT KC1,KC2,KC3, ROWNUM AS rn FROM (SELECT > KC1,KC2,KC3 FROM MY_TABLE ORDER BY KC1,KC2,KC3)) WHERE mod(rn, 2) = 0}} (I > used {{parallelLoadCacheMinThreshold=2}} just as an example so that the > problem is visible for this small data set). > The results of this query are: > ||KC1||KC2||KC3|| > |0|0|1| > |0|1|1| > (which is correct, due to https://issues.apache.org/jira/browse/IGNITE-4163 > being fixed now) > These results will be used by > {{org.apache.ignite.cache.store.jdbc.CacheAbstractJdbcStore#loadCacheRange}} > to generate queries using > {{org.apache.ignite.cache.store.jdbc.dialect.JdbcDialect#loadCacheRangeQuery}} > which will be executed in parallel. Since there were 2 rows in the above > result, this means that the load will occur using 3 ranges/queries: > # No lower bound, upper bound using first row of the selectRangeQuery result > # Lower bound using the first row of the selectRangeQuery result, upper bound > using the second row of the selectRangeQuery result > # Lower bound using the second row of the selectRangeQuery result, no upper > bound > The queries are the following: > # {{SELECT KC1,KC2,KC3 FROM MY_TABLE WHERE (KC1 = ? AND KC2 = ? AND KC3 <= ? > OR KC1 = ? AND KC2 <= ? OR KC1 <= ? )}} > # {{SELECT KC1,KC2,KC3 FROM MY_TABLE WHERE (KC1 = ? AND KC2 = ? AND KC3 > ? > OR KC1 = ? AND KC2 > ? OR KC1 > ? ) AND (KC1 = ? AND KC2 = ? AND KC3 <= ? OR > KC1 = ? AND KC2 <= ? OR KC1 <= ? )}} > # {{SELECT KC1,KC2,KC3 FROM MY_TABLE WHERE (KC1 = ? AND KC2 = ? AND KC3 > ? > OR KC1 = ? AND KC2 > ? OR KC1 > ? )}} > Replacing the first bound in the first query yields the following query: > {{SELECT KC1,KC2,KC3 FROM MY_TABLE WHERE (KC1 = 0 AND KC2 = 0 AND KC3 <= 1 > OR KC1 = 0 AND KC2 <= 0 OR KC1 <= 0 )}}, which executed, yields the > following results: > ||KC1||KC2||KC3|| > |0|0|0| > |0|0|1| > |0|1|0| > |0|1|1| > which is not correct, as the result set should have contained only 2 rows > (due to {{parallelLoadCacheMinThreshold=2}}). Something similar will happen > also to the second query (the one with both lower and upper bound). > The problem arises due to how the upper bound part of the WHERE clause is > being generated, namely the conditions when NOT ALL of the key columns are > constrained. These conditions should use *<* (strictly less), not *<=* (less > or equal). > The correct queries should be (please note the {{OR KC1 = ? AND KC2 < ? OR > KC1 < ?}} part which makes the difference): > # {{SELECT KC1, KC2, KC3 FROM MY_TABLE WHERE (KC1 = ? AND KC2 = ? AND KC3 <= > ? OR KC1 = ? AND KC2 < ? OR KC1 < ? )}} > # {{SELECT KC1, KC2, KC3 FROM MY_TABLE WHERE (KC1 = ? AND KC2 = ? AND KC3 > ? > OR KC1 = ? AND KC2 > ? OR KC1 > ? ) AND (KC1 = ? AND KC2 = ? AND KC3 <= ? OR > KC1 = ? AND KC2 < ? OR KC1 < ? )}} > # {{SELECT KC1, KC2, KC3 FROM MY_TABLE WHERE (KC1 = ? AND KC2 = ? AND KC3 > ? > OR KC1 = ? AND KC2 > ? OR KC1 > ? )}} > My fix to generate the correct queries was to use the following code instead > of > {{org.apache.ignite.cache.store.jdbc.dialect.BasicJdbcDialect#loadCacheRangeQuery:193-205}}: > {code} > sb.append("("); > for (int cnt = keyCols.size(); cnt > 0; cnt--) { > for (int j = 0; j < cnt; j++) { > if (j == cnt - 1) { > if (cnt == keyCols.size()) { > sb.append(cols[j]).append(" <= ? "); > } else { > sb.append(cols[j]).append(" < ? "); > } > } else { > sb.append(cols[j]).append(" = ? AND "); > } > } > if (cnt != 1) { > sb.append(" OR "); > } > } > sb.append(")"); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)