[ 
https://issues.apache.org/jira/browse/IGNITE-18225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Orlov updated IGNITE-18225:
--------------------------------------
    Description: 
Currently, ModifyNode can only have the distribution "single". This means that 
the node is executed on a single cluster node, and its input has to be gathered 
in one place. Consider the following query: UPDATE t SET a = a + 1. Such a query 
is executed in two steps: first we select the rows to update, and then we do the 
update. With ModifyNode as "single", all rows of table T are sent to the 
reducer, and then the updated versions of the rows are sent back to the data 
nodes.

We could eliminate this round trip by pushing down the ModifyNode (i.e. 
allowing this node to have a distribution matching the distribution of the 
table being modified).
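
To make the cost of the round trip concrete, here is a rough Java sketch of a 
back-of-the-envelope model. It is not Ignite code: the class and method names 
are hypothetical, and it assumes that with the pushed-down plan each data node 
sends back only a single row count, and that no rows already reside on the 
reducer.

```java
public class RoundTripCostSketch {
    /** Rows shipped when ModifyNode is "single": every row goes to the reducer and back. */
    static long rowsShippedSingle(long[] rowsPerDataNode) {
        long total = 0;
        for (long rows : rowsPerDataNode) {
            total += 2 * rows; // one trip to the reducer, one trip back with the updated row
        }
        return total;
    }

    /** Values shipped when the modify runs on the data nodes (assumption: one row count per node). */
    static long valuesShippedPushedDown(long[] rowsPerDataNode) {
        return rowsPerDataNode.length;
    }

    public static void main(String[] args) {
        long[] rowsPerDataNode = {1_000, 2_000, 3_000};
        System.out.println(rowsShippedSingle(rowsPerDataNode));       // prints 12000
        System.out.println(valuesShippedPushedDown(rowsPerDataNode)); // prints 3
    }
}
```

Under these assumptions the traffic of the "single" plan grows with the table 
size, while the pushed-down plan grows only with the number of data nodes.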

Two approaches come to mind:
 * as with aggregates, we can introduce two physical versions of a logical 
modify: SingleModify (NB: not colocated!) and Map- + ReduceModify (I hope the 
rest of the necessary changes are clear)
 * make the ModifyNode have the same distribution as the table being modified. 
In that case we need to put a SUM aggregate on top of the ModifyNode to reduce 
the result.
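
The second option can be illustrated with a minimal Java sketch. This is a toy 
model, not the Ignite execution engine: DataNode, modifyLocally and 
updateWithPushdown are hypothetical names, and a plain loop stands in for the 
SUM aggregate. Each data node applies UPDATE t SET a = a + 1 to its own rows 
and returns only the number of affected rows, and the counts are then summed 
into the final result.

```java
import java.util.ArrayList;
import java.util.List;

public class ModifyPushdownSketch {
    /** Hypothetical stand-in for one data node holding a partition of table T (column a only). */
    static final class DataNode {
        final List<Integer> rows;

        DataNode(List<Integer> rows) {
            this.rows = new ArrayList<>(rows);
        }

        /** Applies UPDATE t SET a = a + 1 locally and returns the number of affected rows. */
        long modifyLocally() {
            for (int i = 0; i < rows.size(); i++) {
                rows.set(i, rows.get(i) + 1);
            }
            return rows.size();
        }
    }

    /** Plays the role of the SUM aggregate on top: reduces per-node counts to one result. */
    static long updateWithPushdown(List<DataNode> nodes) {
        long total = 0;
        for (DataNode node : nodes) {
            total += node.modifyLocally();
        }
        return total;
    }

    public static void main(String[] args) {
        List<DataNode> nodes = List.of(
            new DataNode(List.of(1, 2, 3)),
            new DataNode(List.of(10, 20)),
            new DataNode(List.of()));
        System.out.println(updateWithPushdown(nodes)); // prints 5
    }
}
```

In this sketch no row ever leaves its data node; only the per-node counts 
travel to the place where they are summed.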

Personally, I would prefer the second option, because in that case we can get 
rid of {{FragmentMapping#updatingTableAssignments()}}, which was introduced more 
as a hack.

  was:
Given the plan tree, we can easily check whether the final modification can be 
executed directly on the data nodes. We should implement this kind of 
optimization.

The proposed solution is to push MODIFY down under the exchange and add a 
single SUM aggregate on top to reduce the result.


> Sql. Pushdown MODIFY to data node
> ---------------------------------
>
>                 Key: IGNITE-18225
>                 URL: https://issues.apache.org/jira/browse/IGNITE-18225
>             Project: Ignite
>          Issue Type: Improvement
>          Components: sql
>            Reporter: Konstantin Orlov
>            Priority: Major
>              Labels: ignite-3



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
