Konstantin Orlov created IGNITE-21286:
-----------------------------------------
Summary: Sql. Enable correlated join
Key: IGNITE-21286
URL: https://issues.apache.org/jira/browse/IGNITE-21286
Project: Ignite
Issue Type: Improvement
Components: sql
Reporter: Konstantin Orlov
Currently, the implementation of correlated join has a number of performance
problems:
# Opening a cursor over the store is quite expensive. Given that a table is
split into partitions, the actual number of lookups must be multiplied by the
number of partitions. This should be accounted for by the cost function.
# Integration with storage (not strictly a problem of this particular
implementation of correlated join, but it affects it indirectly): every lookup
to the storage schedules a task in a different thread pool. When the scan
result is ready, it schedules a task in the sql query task executor. Given that
we process only one correlate at a time, we currently schedule `partCount * 2`
tasks for every row from the left side of the join. This is very inefficient
for single-row lookups of a small table on the right side (we spend
significantly more time on task coordination than on the actual work).
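To make the arithmetic in the two points above concrete, here is a minimal
sketch of the overhead estimate. It is not Ignite code: the class and method
names are hypothetical, and the row and partition counts in `main` are
illustrative assumptions only. It encodes just what is stated above: one cursor
per partition for every left row, and two scheduled tasks per lookup.

```java
// Hypothetical helper, not part of Ignite: back-of-the-envelope overhead of
// the current correlated join as described in this ticket.
public final class CorrelatedJoinOverhead {
    private CorrelatedJoinOverhead() {
    }

    /** Cursors opened: one per partition for every row of the left input. */
    public static long partitionLookups(long leftRows, int partCount) {
        return Math.multiplyExact(leftRows, (long) partCount);
    }

    /**
     * Tasks scheduled: each lookup schedules one task in the storage thread
     * pool and one in the sql query task executor, i.e. partCount * 2 per row.
     */
    public static long scheduledTasks(long leftRows, int partCount) {
        return Math.multiplyExact(partitionLookups(leftRows, partCount), 2L);
    }

    public static void main(String[] args) {
        long leftRows = 1_000L; // assumed size of the left input
        int partCount = 25;     // assumed partition count of the right table

        System.out.println("lookups = " + partitionLookups(leftRows, partCount));
        System.out.println("tasks   = " + scheduledTasks(leftRows, partCount));
    }
}
```

Under these assumed inputs, both counts grow linearly with the partition count
even when each lookup returns a single row, which is the term the cost function
would need to account for.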
We need to improve the performance of correlated join in general, or at least
identify the cases where it performs better than other types of joins and
enable correlated join only for those cases.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)