[
https://issues.apache.org/jira/browse/IGNITE-9314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16771100#comment-16771100
]
Ivan Pavlukhin edited comment on IGNITE-9314 at 2/20/19 11:49 AM:
------------------------------------------------------------------
After a discussion with [~gvvinblade], [~amashenkov], [~agoncharuk], and
[~ivan.glukos], it was found that streaming via the primary node can lead to
unhandled data inconsistency between partition replicas, so this approach
cannot guarantee data consistency in failover scenarios.
It is also known that the current implementation of {{IgniteDataStreamer}} has
multiple problems which can lead to inconsistent data. The most promising
approach is streaming to an exclusively locked table. In that case inserts can
be done using a special _initial version_, incrementing counters one by one at
the time of each insert.
Other approaches can be studied, among them special _write-conflict-free_
semantics for streamer updates. But any other approach should be thoroughly
evaluated.
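To make the locked-table idea concrete, here is a minimal toy sketch in plain Java. It is not Ignite code: {{PartitionReplica}}, {{Row}}, {{INITIAL_VERSION}} and {{streamToBoth}} are hypothetical names invented for illustration, and a {{ReentrantLock}} merely stands in for an exclusive table lock. It only shows that when every streamed row gets the same initial version and the counter is incremented once per insert under the lock, two replicas that receive the same batch end up with the same counter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

/** Toy model only: none of these types are part of Apache Ignite. */
class LockedTableStreamSketch {
    /** Hypothetical fixed version stamped on every streamed row. */
    static final long INITIAL_VERSION = 1L;

    /** One stored row: key, value, version, and the counter assigned at insert. */
    static final class Row {
        final String key;
        final String val;
        final long ver;
        final long cntr;

        Row(String key, String val, long ver, long cntr) {
            this.key = key;
            this.val = val;
            this.ver = ver;
            this.cntr = cntr;
        }
    }

    /** One replica of a partition; the lock stands in for an exclusive table lock. */
    static final class PartitionReplica {
        private final ReentrantLock tableLock = new ReentrantLock();
        private final List<Row> rows = new ArrayList<>();
        private long updateCntr;

        /** Inserts the batch under the lock, incrementing the counter once per row. */
        void stream(List<String[]> batch) {
            tableLock.lock();

            try {
                for (String[] kv : batch)
                    rows.add(new Row(kv[0], kv[1], INITIAL_VERSION, ++updateCntr));
            }
            finally {
                tableLock.unlock();
            }
        }

        long updateCounter() {
            return updateCntr;
        }

        List<Row> rows() {
            return rows;
        }
    }

    /** Streams the same batch to a primary and a backup and returns both counters. */
    static long[] streamToBoth(List<String[]> batch) {
        PartitionReplica primary = new PartitionReplica();
        PartitionReplica backup = new PartitionReplica();

        primary.stream(batch);
        backup.stream(batch);

        return new long[] {primary.updateCounter(), backup.updateCounter()};
    }

    public static void main(String[] args) {
        List<String[]> batch = Arrays.asList(
            new String[] {"k1", "v1"},
            new String[] {"k2", "v2"},
            new String[] {"k3", "v3"});

        long[] cntrs = streamToBoth(batch);

        System.out.println("primary=" + cntrs[0] + ", backup=" + cntrs[1]);
    }
}
```

The sketch leaves out everything that makes the real problem hard (failover, concurrent transactions, rebalancing); it only illustrates the counter assignment rule described above.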
> MVCC TX: Datastreamer operations
> --------------------------------
>
> Key: IGNITE-9314
> URL: https://issues.apache.org/jira/browse/IGNITE-9314
> Project: Ignite
> Issue Type: Task
> Components: mvcc
> Reporter: Igor Seliverstov
> Assignee: Ivan Pavlukhin
> Priority: Major
> Fix For: 2.8
>
>
> Need to change DataStreamer semantics.
> {{allowOverwrite=false}} mode is currently inconsistent with the interval
> _partition counters_ update approach used by MVCC transactions.
> {{allowOverwrite=true}} mode is terribly slow when using single {{cache.put}}
> operations (snapshot request, tx commit on coordinator overhead). Batched
> mode using {{cache.putAll}} should handle write conflicts and possible
> deadlocks.
> Also, there is a problem when {{DataStreamer}} with {{allowOverwrite ==
> false}} does not insert a value when versions for the entry exist but are all
> aborted. Proper transactional semantics should be developed for this case.
> After that, attention should be paid to Cache.size method behavior. Cache.size,
> addressed in https://issues.apache.org/jira/browse/IGNITE-8149, could be
> decremented improperly in the
> {{org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManager#mvccRemoveAll}}
> method (called during streamer processing) when all existing mvcc row
> versions are aborted or the last committed one is a _remove_ version.
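One possible reading of the aborted-versions case in the description can be sketched as a toy model in plain Java. Nothing here is Ignite code: {{RowVersion}}, {{TxState}}, {{shouldInsert}} and {{sizeDeltaOnRemoveAll}} are hypothetical names, and the rule itself (a key counts as present only if its newest non-aborted version is not a remove) is an assumption about what the intended semantics might be.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Toy model only: these types are illustrative and not part of Apache Ignite. */
class AbortedVersionsSketch {
    /** Outcome of the transaction that wrote a row version. */
    enum TxState { COMMITTED, ABORTED }

    /** A single version of a row; {@code remove} marks a remove version. */
    static final class RowVersion {
        final TxState state;
        final boolean remove;

        RowVersion(TxState state, boolean remove) {
            this.state = state;
            this.remove = remove;
        }
    }

    /** True if the newest non-aborted version exists and is not a remove. */
    static boolean hasVisibleValue(List<RowVersion> newestFirst) {
        for (RowVersion v : newestFirst) {
            if (v.state == TxState.ABORTED)
                continue; // Aborted versions never count as existing data.

            return !v.remove;
        }

        return false; // No versions at all, or every version was aborted.
    }

    /** allowOverwrite == false: insert only when no visible value exists. */
    static boolean shouldInsert(List<RowVersion> newestFirst) {
        return !hasVisibleValue(newestFirst);
    }

    /** Size change when the chain is removed: -1 only if a value was visible. */
    static int sizeDeltaOnRemoveAll(List<RowVersion> newestFirst) {
        return hasVisibleValue(newestFirst) ? -1 : 0;
    }

    public static void main(String[] args) {
        List<RowVersion> allAborted = Arrays.asList(
            new RowVersion(TxState.ABORTED, false),
            new RowVersion(TxState.ABORTED, false));

        List<RowVersion> lastCommittedIsRemove = Arrays.asList(
            new RowVersion(TxState.ABORTED, false),
            new RowVersion(TxState.COMMITTED, true),
            new RowVersion(TxState.COMMITTED, false));

        System.out.println("insert over aborted chain: " + shouldInsert(allAborted));
        System.out.println("size delta, aborted chain: " + sizeDeltaOnRemoveAll(allAborted));
        System.out.println("size delta, removed row: " + sizeDeltaOnRemoveAll(lastCommittedIsRemove));
        System.out.println("insert into empty chain: " + shouldInsert(Collections.<RowVersion>emptyList()));
    }
}
```

Under this assumed rule the streamer would insert over a chain of aborted versions, and the size counter would stay unchanged in both cases named in the description.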