Konstantin Orlov created IGNITE-21286:
-----------------------------------------

             Summary: Sql. Enable correlated join
                 Key: IGNITE-21286
                 URL: https://issues.apache.org/jira/browse/IGNITE-21286
             Project: Ignite
          Issue Type: Improvement
          Components: sql
            Reporter: Konstantin Orlov


Currently, the implementation of correlated join has a number of performance 
problems:
# Opening a cursor over the store is quite expensive. Given that a table is 
split into partitions, the actual number of lookups should be multiplied by the 
number of partitions. This should be accounted for by the cost function.
# Integration with storage (not strictly a problem of this particular 
implementation of correlated join, but it affects it indirectly): every lookup 
to storage schedules a task in a different thread pool. When the scan result is 
ready, it schedules a task in the SQL query task executor. Given that we process 
only one correlate at a time, we currently schedule `partCount * 2` tasks for 
every row from the left side of the join. This is very inefficient for 
single-row lookups of a small table on the right side (we spend significantly 
more time on task coordination than on the actual work).
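
To make the overhead described in the two items above concrete, here is a 
rough back-of-the-envelope sketch. It is not the actual Ignite cost function; 
the function names and the example figures (1000 left rows, 25 partitions) are 
assumptions chosen purely for illustration. It only encodes the two 
multipliers named above: one lookup per partition per left row, and 
`partCount * 2` scheduled tasks per left row.

```python
def lookup_count(left_rows: int, part_count: int) -> int:
    """Storage lookups: each left row opens a cursor on every partition."""
    return left_rows * part_count


def scheduled_tasks(left_rows: int, part_count: int) -> int:
    """Tasks scheduled when only one correlate is processed at a time.

    Per left row and per partition there are two tasks: one in the storage
    thread pool and one in the SQL query task executor (partCount * 2).
    """
    return left_rows * part_count * 2


if __name__ == "__main__":
    # Hypothetical figures, for illustration only.
    left_rows, part_count = 1000, 25
    print("lookups:", lookup_count(left_rows, part_count))
    print("tasks:  ", scheduled_tasks(left_rows, part_count))
```

Both figures grow linearly with the partition count even when each lookup 
returns a single row, which is the term the first item says the cost function 
should account for.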

We need to improve the performance of correlated join in general, or at least 
identify the cases where it performs better than other types of joins and enable 
correlated join only for those cases.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
