[ https://issues.apache.org/jira/browse/IGNITE-10855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16740122#comment-16740122 ]
Gourav Agrawal commented on IGNITE-10855: ----------------------------------------- Is somebody looking into the issue? > Delayed execution when running SQL query involving JOINS and ORDER BY > eventually leading to heap OOME. > ------------------------------------------------------------------------------------------------------ > > Key: IGNITE-10855 > URL: https://issues.apache.org/jira/browse/IGNITE-10855 > Project: Ignite > Issue Type: Bug > Components: cache, sql > Affects Versions: 2.7 > Reporter: Gourav Agrawal > Priority: Blocker > Fix For: None > > > To simplify our use-case, we created two caches using the SQL query and > loaded data consisting of about 4 million records and 60k records > approximately, in the respective caches with INDEX created on all the > columns. Ignite is set up to run on a single node, meaning all the data is > present on the same node. The query used for testing/the one we are facing > issue with is of the type - > _SELECT * FROM CACHE1 C1, CACHE2 C2 WHERE C1.JOINCol = C2.JOINCol AND > C1.COL1 = 'someValue' ORDER BY C1.COL2_ > The above query execution leads to the Ignite thread memory rising > extensively, eventually leading to heap OOME. When the heap memory was > increased to about 14GB, we were able to get the results back, but the > processing time of the query was too long, about 2-4 minutes ( with CPUs =2). > We ran an EXPLAIN for the above query and found out that INDEX was created on > COL1 for C1 cache and on JOINCol for C2 cache. There was no index on the > sorted column. We think the problem of 'slow querying and huge heap memory > requirement' is because of the absence of an index in the sorted column. > Whenever there is a condition present in the WHERE clause ( in our example > C1.COL1='someValue'), Ignite is using an INDEX for that column and there is > no INDEX being created on the ORDER BY column. > And for our use-case, it is imperative that we have a condition in the where > clause ( to filter out the data) and a join condition apart from the order by > clause. > We tried the multiple column indexing strategy on the COL1, COL2 as per our > use case. > In case of a composite index with the order as (COL1, COL2), INDEX was > created only for the COL1. > While for the composite index order as (COL2, COL1), INDEX was getting > created for both COL1 and COL2 and the results were index sorted. ( But only > in case of the absence of an INDEX for COL1, it looks for the ORDER BY clause > column and uses a composite index). But, if we don't have a separate INDEX > for COL1, it again poses a problem as COL1 is something which is heavily used > for filtering in all other queries. So an INDEX on COL1 is necessary. > To summarize, In case there is a condition present in the WHERE clause, > Ignite uses the WHERE clause column for indexing, and therefore there is no > INDEX in the sorting column, resulting in severe query performance, which can > eventually lead us to our system going down. -- This message was sent by Atlassian JIRA (v7.6.3#76005)