[ 
https://issues.apache.org/jira/browse/IGNITE-11498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Plekhanov updated IGNITE-11498:
---------------------------------------
    Fix Version/s:     (was: 2.9)

> SQL: Rework DML data distribution logic
> ---------------------------------------
>
>                 Key: IGNITE-11498
>                 URL: https://issues.apache.org/jira/browse/IGNITE-11498
>             Project: Ignite
>          Issue Type: Task
>          Components: sql
>            Reporter: Vladimir Ozerov
>            Priority: Major
>
> The current DML implementation has a number of problems:
> 1) We fetch the whole data set to the originator node. There is a 
> "skipDmlOnReducer" flag to avoid this in some cases, but it is still 
> experimental and is not enabled by default.
> 2) Updates are deadlock-prone: we update entries in batches of size 
> {{SqlFieldsQuery.pageSize}}, so we can easily deadlock with concurrent cache 
> operations (see the sketch after this list).
> 3) We have very strange retry logic. It is not clear why it is needed in the 
> first place, given that DML is not transactional and no guarantees are 
> required.
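> For illustration, a minimal sketch (cache name, value type and SQL schema 
> are hypothetical) of how an application issues a DML statement today; the 
> point is only that {{SqlFieldsQuery.pageSize}} implicitly becomes the update 
> batch size applied from the originator:
> {code:java}
> import java.util.List;
> import org.apache.ignite.Ignite;
> import org.apache.ignite.IgniteCache;
> import org.apache.ignite.Ignition;
> import org.apache.ignite.cache.query.SqlFieldsQuery;
>
> public class DmlPageSizeExample {
>     public static void main(String[] args) {
>         Ignite ignite = Ignition.start(); // default config, for brevity
>
>         IgniteCache<Integer, Object> cache = ignite.cache("PERSON_CACHE");
>
>         // Matching rows are fetched page by page (pageSize rows at a time)
>         // and each page is applied as one update batch, which is where
>         // concurrent cache operations can deadlock against DML.
>         SqlFieldsQuery qry = new SqlFieldsQuery(
>             "UPDATE Person SET salary = salary * 2 WHERE salary < ?")
>             .setArgs(50_000)
>             .setPageSize(1024);
>
>         List<List<?>> res = cache.query(qry).getAll();
>
>         long updated = (Long)res.get(0).get(0); // aggregated update count
>
>         System.out.println("Updated rows: " + updated);
>     }
> }
> {code}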
> Proposal:
> # Implement proper routing logic: if a request can be executed on data 
> nodes, skipping the reducer, do this. Otherwise, fetch all data to the 
> reducer. This decision should be made in exactly the same way as for MVCC 
> (see {{GridNearTxQueryEnlistFuture}} as a starting point).
> # Distribute updates to primary data nodes in batches, but apply them one by 
> one, similar to the data streamer with {{allowOverwrite=false}} (see the 
> sketch after this list). Do not perform any partition state or 
> {{AffinityTopologyVersion}} checks, since DML is not transactional. Return 
> update counts and aggregate them on the originator.
> # Remove, or at least rethink, the retry logic. Why do we need it in the 
> first place?
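> For reference, a minimal sketch of the data streamer usage this item refers 
> to (cache name and key/value types are hypothetical): entries are shipped to 
> primary nodes in per-node batches, but with {{allowOverwrite=false}} they 
> are applied per entry rather than under a batch-wide lock, which is the 
> behavior the new DML path should mimic:
> {code:java}
> import org.apache.ignite.Ignite;
> import org.apache.ignite.IgniteDataStreamer;
> import org.apache.ignite.Ignition;
>
> public class StreamerStyleUpdates {
>     public static void main(String[] args) {
>         Ignite ignite = Ignition.start(); // default config, for brevity
>
>         try (IgniteDataStreamer<Integer, String> streamer =
>                  ignite.dataStreamer("PERSON_CACHE")) {
>             // Default mode: existing entries are not overwritten and
>             // updates are applied entry by entry on the primary node.
>             streamer.allowOverwrite(false);
>
>             for (int i = 0; i < 1_000_000; i++)
>                 streamer.addData(i, "value-" + i);
>         } // close() flushes the remaining buffered entries
>     }
> }
> {code}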



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
