[
https://issues.apache.org/jira/browse/IGNITE-10855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16740122#comment-16740122
]
Gourav Agrawal commented on IGNITE-10855:
-----------------------------------------
Is somebody looking into the issue?
> Delayed execution when running SQL query involving JOINS and ORDER BY
> eventually leading to heap OOME.
> ------------------------------------------------------------------------------------------------------
>
> Key: IGNITE-10855
> URL: https://issues.apache.org/jira/browse/IGNITE-10855
> Project: Ignite
> Issue Type: Bug
> Components: cache, sql
> Affects Versions: 2.7
> Reporter: Gourav Agrawal
> Priority: Blocker
> Fix For: None
>
>
> To simplify our use-case, we created two caches using the SQL query and
> loaded data consisting of about 4 million records and 60k records
> approximately, in the respective caches with INDEX created on all the
> columns. Ignite is set up to run on a single node, meaning all the data is
> present on the same node. The query used for testing/the one we are facing
> issue with is of the type -
> _SELECT * FROM CACHE1 C1, CACHE2 C2 WHERE C1.JOINCol = C2.JOINCol AND
> C1.COL1 = 'someValue' ORDER BY C1.COL2_
> The above query execution leads to the Ignite thread memory rising
> extensively, eventually leading to heap OOME. When the heap memory was
> increased to about 14GB, we were able to get the results back, but the
> processing time of the query was too long, about 2-4 minutes ( with CPUs =2).
> We ran an EXPLAIN for the above query and found out that INDEX was created on
> COL1 for C1 cache and on JOINCol for C2 cache. There was no index on the
> sorted column. We think the problem of 'slow querying and huge heap memory
> requirement' is because of the absence of an index in the sorted column.
> Whenever there is a condition present in the WHERE clause ( in our example
> C1.COL1='someValue'), Ignite is using an INDEX for that column and there is
> no INDEX being created on the ORDER BY column.
> And for our use-case, it is imperative that we have a condition in the where
> clause ( to filter out the data) and a join condition apart from the order by
> clause.
> We tried the multiple column indexing strategy on the COL1, COL2 as per our
> use case.
> In case of a composite index with the order as (COL1, COL2), INDEX was
> created only for the COL1.
> While for the composite index order as (COL2, COL1), INDEX was getting
> created for both COL1 and COL2 and the results were index sorted. ( But only
> in case of the absence of an INDEX for COL1, it looks for the ORDER BY clause
> column and uses a composite index). But, if we don't have a separate INDEX
> for COL1, it again poses a problem as COL1 is something which is heavily used
> for filtering in all other queries. So an INDEX on COL1 is necessary.
> To summarize, In case there is a condition present in the WHERE clause,
> Ignite uses the WHERE clause column for indexing, and therefore there is no
> INDEX in the sorting column, resulting in severe query performance, which can
> eventually lead us to our system going down.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)