[ https://issues.apache.org/jira/browse/IGNITE-24995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Pavel Pereslegin reassigned IGNITE-24995:
-----------------------------------------
Assignee: Pavel Pereslegin
> Sql. Rework correlates serialization and propagation to another node.
> ---------------------------------------------------------------------
>
> Key: IGNITE-24995
> URL: https://issues.apache.org/jira/browse/IGNITE-24995
> Project: Ignite
> Issue Type: Improvement
> Components: sql
> Affects Versions: 3.0
> Reporter: Andrey Mashenkov
> Assignee: Pavel Pereslegin
> Priority: Major
> Labels: ignite-3, performance, tech-debt
>
> *Motivation.*
> Currently, the SharedState class stores correlates in the execution context
> and is used by the CorrelatedNestedLoopJoinNode (CNLJN) execution node.
> It seems that CorrelatedNestedLoopJoinNode was designed to batch correlate
> variables in order to transfer many rows at a time, but the batching is
> implemented incorrectly and does not work.
> There are a few related issues:
> 1. The class implements the Serializable interface and can be transferred to
> another node.
> As a result, DefaultUserObjectMarshaller is used to serialize the class in the
> messaging system. Although the SharedState class contains BinaryTuple objects,
> they are not converted to byte[] during serialization, which is inefficient.
> Making the class Externalizable might mitigate the issue.
> 2. We don't need to put a whole SQL row into a correlate variable; only the
> required row columns (a projection) are needed, which reduces network pressure.
> It is important that all nodes create the same projection for the same
> correlate.
> 3. We should fix the SharedState class to make batching possible by allowing
> multiple rows to be set for the same correlate id.
> Most likely, we must keep the correlate hierarchy order to preserve CNLJN
> collation (the correlate id number doesn't give this guarantee) when there is
> more than one correlate.
> It may turn out that passing batches for parent correlates is useless,
> because we can spool only a child batch at a time to preserve the collation.
> Thus, SharedState may be split, or its structure changed, to separate the
> correlates that were received from the parent fragment from the current
> correlates to be passed to the child fragment.
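
Points 2 and 3 above could be sketched in Java roughly as follows. This is an illustration under stated assumptions, not the actual SharedState code: the class name CorrelateBatches, the Object[] row representation, and the int[] projection are all hypothetical. It keeps several projected rows per correlate id and iterates correlates in registration order rather than by id number.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical holder: several projected rows per correlate id (not Ignite's SharedState). */
final class CorrelateBatches {
    /** Insertion-ordered map, so iteration follows the order in which correlates were first added. */
    private final Map<Integer, List<Object[]>> rowsByCorrelate = new LinkedHashMap<>();

    /** Copies only the columns listed in projection and appends them to the batch of corrId. */
    void add(int corrId, Object[] fullRow, int[] projection) {
        Object[] projected = new Object[projection.length];
        for (int i = 0; i < projection.length; i++) {
            projected[i] = fullRow[projection[i]];
        }
        rowsByCorrelate.computeIfAbsent(corrId, id -> new ArrayList<>()).add(projected);
    }

    /** Rows of one correlate in arrival order; empty if the id is unknown. */
    List<Object[]> rows(int corrId) {
        List<Object[]> rows = rowsByCorrelate.get(corrId);
        return rows == null ? Collections.emptyList() : Collections.unmodifiableList(rows);
    }

    /** Correlate ids in the order they were first registered, independent of their numeric value. */
    List<Integer> correlateIds() {
        return new ArrayList<>(rowsByCorrelate.keySet());
    }
}
```

Whether registration order matches the correlate hierarchy order needed for CNLJN collation is exactly the open question raised in point 3; the sketch only shows that ordering can be decoupled from the id number.
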
> *Suggestion.*
> Let's improve the SharedState class structure to support batching by allowing
> multiple rows for the same correlate, and resolve the ordering issue (if it
> exists).
> Let's resolve the serialization issue by adding a message class for this (or
> by using Externalizable at least).
> Let's avoid transferring whole rows.
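
For the "Externalizable at least" option, a minimal sketch using only the JDK's java.io.Externalizable might look like this. The class name CorrelateBatchMessage and its fields are hypothetical, and rows are assumed to be already encoded as byte[] (for example, the bytes behind a BinaryTuple); nothing here reflects Ignite's own messaging API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical carrier for one correlate's batch; rows are assumed to be pre-encoded byte arrays. */
final class CorrelateBatchMessage implements Externalizable {
    private int correlateId;
    private List<byte[]> rows = new ArrayList<>();

    /** Externalizable requires a public no-arg constructor for deserialization. */
    public CorrelateBatchMessage() {
    }

    CorrelateBatchMessage(int correlateId, List<byte[]> rows) {
        this.correlateId = correlateId;
        this.rows = new ArrayList<>(rows);
    }

    int correlateId() {
        return correlateId;
    }

    List<byte[]> rows() {
        return rows;
    }

    /** Writes the id, the row count, then each row as a length-prefixed byte array. */
    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeInt(correlateId);
        out.writeInt(rows.size());
        for (byte[] row : rows) {
            out.writeInt(row.length);
            out.write(row);
        }
    }

    /** Reads the fields back in the same order they were written. */
    @Override
    public void readExternal(ObjectInput in) throws IOException {
        correlateId = in.readInt();
        int count = in.readInt();
        rows = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            byte[] row = new byte[in.readInt()];
            in.readFully(row);
            rows.add(row);
        }
    }

    /** Round-trip helper: serializes a message with standard Java serialization. */
    static byte[] toBytes(CorrelateBatchMessage msg) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(msg);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    /** Round-trip helper: restores a message produced by toBytes. */
    static CorrelateBatchMessage fromBytes(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (CorrelateBatchMessage) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The point of the explicit writeExternal/readExternal pair is that each row travels as raw bytes with a length prefix instead of being walked field by field by a generic marshaller. A dedicated message class, as suggested above, would follow the same layout idea.
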