[
https://issues.apache.org/jira/browse/IGNITE-8732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Stanislav Lukyanov reassigned IGNITE-8732:
------------------------------------------
Assignee: Stanislav Lukyanov
> SQL: REPLICATED cache cannot be left-joined to PARTITIONED
> ----------------------------------------------------------
>
> Key: IGNITE-8732
> URL: https://issues.apache.org/jira/browse/IGNITE-8732
> Project: Ignite
> Issue Type: Improvement
> Components: sql
> Affects Versions: 2.5
> Reporter: Vladimir Ozerov
> Assignee: Stanislav Lukyanov
> Priority: Major
> Labels: sql-engine
>
> *Steps to reproduce*
> # Run
> {{org.apache.ignite.sqltests.ReplicatedSqlTest#testLeftJoinReplicatedPartitioned}}
> # Observe that we have 2x results on 2-node cluster
> *Root Cause*
> {{left LEFT JOIN right ON cond}} operation assumes full scan of of a left
> expression. Currently we perform this scan on every node and then simply
> merge results on reducer. Two nodes, two scans of {{REPLICATED}} cache, 2x
> results.
> *Potential Solutions*
> We may consider several solutions. Deeper analysis is required to understand
> which is the right one.
> # Perform deduplication on reducer - this most prospective and general
> technique, described in more details below
> # Treat {{REPLICATED}} cache as {{PARTITIONED}}. Essentially, we just need to
> pass proper backup filter. But what if {{REPLICATED}} cache spans more nodes
> than {{PARTITIONED}}? We cannot rely on primary/backup in this case
> # Implement additional execution phase as follows:
> {code}
> SELECT left.cols, right.cols FROM left INNER JOIN right ON cond;
> // Get "inner join" part
> UNION
> UNICAST SELECT left.cols, [NULL].cols FROM left WHERE left.id NOT IN ([ids
> from the first phase]) // Get "outer join" part
> {code}
> *Reducer Deduplication*
> The idea is to get all data locally and then perform final deduplication.
> This may incur high network overhead, because of lot of duplicated left parts
> would be transferred. However, this could be optimized greatly with the
> following techniques applied one after another
> # Semi-jions: {{left}} is {{joined}} on mapper node, but instead of sending
> {{(left, right)}} relation, we send {{(left) + (right)}}
> # In case {{left}} part is known to be idempotent (i.e. it produces the same
> result set on all nodes), only one node will send {{(left) + (right)}}, other
> nodes will send {{(right)}} only
> # Merge {{left}} results with if needed (i.e. if idempotence-related opto was
> not applicable)
> # Join {{left}} and {{right}} parts on reducer
--
This message was sent by Atlassian Jira
(v8.3.4#803005)