[
https://issues.apache.org/jira/browse/IGNITE-11498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Aleksey Plekhanov updated IGNITE-11498:
---------------------------------------
Fix Version/s: (was: 2.9)
> SQL: Rework DML data distribution logic
> ---------------------------------------
>
> Key: IGNITE-11498
> URL: https://issues.apache.org/jira/browse/IGNITE-11498
> Project: Ignite
> Issue Type: Task
> Components: sql
> Reporter: Vladimir Ozerov
> Priority: Major
>
> Current DML implementation has a number of problems:
> 1) We fetch the whole data set to the originator node. There is a
> "skipDmlOnReducer" flag to avoid this in some cases, but it is still
> experimental and is not enabled by default.
> 2) Updates are deadlock-prone: we update entries in batches of
> {{SqlFieldsQuery.pageSize}}, so we can easily deadlock with concurrent cache
> operations (see the example after this list).
> 3) We have very strange retry logic. It is not clear why it is needed in the
> first place, given that DML is not transactional and no guarantees are
> needed.
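> For context, the batch size that drives those per-page updates is the public
> {{SqlFieldsQuery.pageSize}} property. A minimal illustration (the "Person"
> cache/table name is hypothetical) of how a DML statement is issued through the
> cache query API and how the update count comes back as a single-cell result:
> {code:java}
> import java.util.List;
>
> import org.apache.ignite.Ignite;
> import org.apache.ignite.IgniteCache;
> import org.apache.ignite.Ignition;
> import org.apache.ignite.cache.query.SqlFieldsQuery;
>
> public class DmlPageSizeExample {
>     public static void main(String[] args) {
>         try (Ignite ignite = Ignition.start()) {
>             // Hypothetical "Person" cache/table.
>             IgniteCache<?, ?> cache = ignite.cache("Person");
>
>             // The page size below is the batch size the current implementation
>             // uses while it fetches rows to the originator and applies updates.
>             SqlFieldsQuery qry = new SqlFieldsQuery("UPDATE Person SET salary = salary * 1.1")
>                 .setPageSize(512);
>
>             List<List<?>> res = cache.query(qry).getAll();
>
>             // DML statements return a single row holding the update count.
>             System.out.println("Updated rows: " + res.get(0).get(0));
>         }
>     }
> }
> {code}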
> Proposal:
> # Implement proper routing logic: if a request can be executed on data nodes,
> skipping the reducer, do this; otherwise fetch all data to the reducer. This
> decision should be made in exactly the same way as for MVCC
> (see {{GridNearTxQueryEnlistFuture}} as a starting point).
> # Distribute updates to the primary data nodes in batches, but apply them one
> by one, similarly to the data streamer with {{allowOverwrite=false}}. Do not
> perform any partition state or {{AffinityTopologyVersion}} checks, since DML
> is not transactional. Return update counts and aggregate them on the
> originator (see the sketch after this list).
> # Remove, or at least rethink, the retry logic. Why do we need it in the
> first place?
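> A hypothetical sketch of item 2, using only the public API (the class and
> method names {{DmlDistributionSketch}} / {{applyUpdates}} are illustrative,
> not part of Ignite): group pending updates by primary node via the
> {{Affinity}} API, ship each batch to its primary, apply the entries there one
> by one and aggregate the per-node update counts on the originator. No
> partition-state or {{AffinityTopologyVersion}} checks are performed, matching
> the non-transactional nature of DML. The real change would live in the
> internal DML engine; this only shows the intended data flow.
> {code:java}
> import java.util.HashMap;
> import java.util.Map;
>
> import org.apache.ignite.Ignite;
> import org.apache.ignite.IgniteCache;
> import org.apache.ignite.Ignition;
> import org.apache.ignite.cluster.ClusterNode;
>
> public class DmlDistributionSketch {
>     /** Applies (key -> new value) pairs on their primary nodes and returns the total update count. */
>     public static long applyUpdates(Ignite ignite, String cacheName, Map<Object, Object> updates) {
>         // 1. Route every pending entry to its current primary node.
>         Map<ClusterNode, Map<Object, Object>> batches = new HashMap<>();
>
>         for (Map.Entry<Object, Object> e : updates.entrySet()) {
>             ClusterNode primary = ignite.affinity(cacheName).mapKeyToNode(e.getKey());
>
>             batches.computeIfAbsent(primary, n -> new HashMap<>()).put(e.getKey(), e.getValue());
>         }
>
>         long total = 0;
>
>         // 2. Apply each batch on its primary node, one entry at a time
>         //    (data-streamer style), and aggregate the counts on the originator.
>         for (Map.Entry<ClusterNode, Map<Object, Object>> batch : batches.entrySet()) {
>             Map<Object, Object> entries = batch.getValue();
>
>             total += ignite.compute(ignite.cluster().forNode(batch.getKey())).call(() -> {
>                 IgniteCache<Object, Object> cache = Ignition.localIgnite().cache(cacheName);
>
>                 long cnt = 0;
>
>                 for (Map.Entry<Object, Object> e : entries.entrySet()) {
>                     cache.put(e.getKey(), e.getValue()); // Per-entry apply, no page-sized batching.
>
>                     cnt++;
>                 }
>
>                 return cnt;
>             });
>         }
>
>         return total;
>     }
> }
> {code}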
--
This message was sent by Atlassian Jira
(v8.3.4#803005)