[ 
https://issues.apache.org/jira/browse/IGNITE-10855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16740122#comment-16740122
 ] 

Gourav Agrawal commented on IGNITE-10855:
-----------------------------------------

Is somebody looking into the issue?

> Delayed execution when running SQL query involving JOINS and ORDER BY 
> eventually leading to heap OOME.
> ------------------------------------------------------------------------------------------------------
>
>                 Key: IGNITE-10855
>                 URL: https://issues.apache.org/jira/browse/IGNITE-10855
>             Project: Ignite
>          Issue Type: Bug
>          Components: cache, sql
>    Affects Versions: 2.7
>            Reporter: Gourav Agrawal
>            Priority: Blocker
>             Fix For: None
>
>
> To simplify our use-case, we created two caches using the SQL query and 
> loaded data consisting of about 4 million records and 60k records 
> approximately, in the respective caches with INDEX created on all the 
> columns. Ignite is set up to run on a single node, meaning all the data is 
> present on the same node. The query used for testing/the one we are facing 
> issue with is of the type -
> _SELECT * FROM CACHE1 C1, CACHE2 C2  WHERE  C1.JOINCol = C2.JOINCol AND 
> C1.COL1 = 'someValue' ORDER BY C1.COL2_
> The above query execution leads to the Ignite thread memory rising 
> extensively, eventually leading to heap OOME. When the heap memory was 
> increased to about 14GB,  we were able to get the results back, but the 
> processing time of the query was too long, about 2-4 minutes ( with CPUs =2).
> We ran an EXPLAIN for the above query and found out that INDEX was created on 
> COL1 for C1 cache and on JOINCol for C2 cache. There was no index on the 
> sorted column. We think the problem of 'slow querying and huge heap memory 
> requirement' is because of the absence of an index in the sorted column. 
> Whenever there is a condition present in the WHERE clause ( in our example 
> C1.COL1='someValue'), Ignite is using an INDEX for that column and there is 
> no INDEX being created on the ORDER BY column.
> And for our use-case, it is imperative that we have a condition in the where 
> clause ( to filter out the data) and a join condition apart from the order by 
> clause.
>  We tried the multiple column indexing strategy on the COL1, COL2 as per our 
> use case.
>  In case of a composite index with the order as (COL1, COL2), INDEX was 
> created only for the COL1.
> While for the composite index order as (COL2, COL1), INDEX was getting 
> created for both COL1 and COL2 and the results were index sorted. ( But only 
> in case of the absence of an INDEX for COL1, it looks for the ORDER BY clause 
> column and uses a composite index). But, if we don't have a separate INDEX 
> for COL1, it again poses a problem as COL1 is something which is heavily used 
> for filtering in all other queries. So an INDEX on COL1 is necessary.
> To summarize, In case there is a condition present in the WHERE clause, 
> Ignite uses the WHERE clause column for indexing, and therefore there is no 
> INDEX in the sorting column, resulting in severe query performance, which can 
> eventually lead us to our system going down.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to