Mikhail Petrov created IGNITE-27676:
---------------------------------------
Summary: FULL_SYNC mode guarantee may be broken for ATOMIC caches
in case any update errors on primary node
Key: IGNITE-27676
URL: https://issues.apache.org/jira/browse/IGNITE-27676
Project: Ignite
Issue Type: Task
Reporter: Mikhail Petrov
Consider the following scenario:
a cluster with 3 nodes (node0, node1, node2) and an
ATOMIC, FULL_SYNC cache with 1 backup.
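A cache matching this setup could be configured roughly as in the sketch
below. The class name, the cache name and the failing interceptor are
hypothetical and serve only to emulate the partial failure described in the
steps that follow. The sketch is not taken from an existing test.

```java
import javax.cache.Cache;
import org.apache.ignite.cache.CacheAtomicityMode;
import org.apache.ignite.cache.CacheInterceptorAdapter;
import org.apache.ignite.cache.CacheWriteSynchronizationMode;
import org.apache.ignite.configuration.CacheConfiguration;

class ScenarioCacheConfig {
    /** Hypothetical interceptor that rejects key 3 to emulate an update failure. */
    static class FailingInterceptor extends CacheInterceptorAdapter<Integer, String> {
        @Override public String onBeforePut(Cache.Entry<Integer, String> entry, String newVal) {
            if (entry.getKey() == 3)
                throw new RuntimeException("Simulated interceptor failure for key 3");

            return newVal;
        }
    }

    /** Builds an ATOMIC, FULL_SYNC cache configuration with 1 backup. */
    static CacheConfiguration<Integer, String> cacheConfiguration() {
        // The cache name is an assumption made for this sketch.
        CacheConfiguration<Integer, String> ccfg = new CacheConfiguration<>("test-cache");

        ccfg.setAtomicityMode(CacheAtomicityMode.ATOMIC);
        ccfg.setWriteSynchronizationMode(CacheWriteSynchronizationMode.FULL_SYNC);
        ccfg.setBackups(1);
        ccfg.setInterceptor(new FailingInterceptor());

        return ccfg;
    }
}
```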
1. node0 accepts putAll request, maps all keys to corresponding primary nodes
and sends GridNearAtomicFullUpdateRequest to node1 and node2.
2. node1 successfully updates only some of the entries. The update of the
others fails with an exception (e.g. a user-defined cache interceptor throws
an exception). Let's say that entries with keys 1 and 2 were successfully
updated and key 3 failed.
3. node1 sends GridDhtAtomicUpdateRequest with the successfully updated
entries (1, 2) to the backup node (node2).
4. node1 sends GridNearAtomicUpdateResponse with the failed keys (3) to node0
(near).
5. As soon as node0 receives GridNearAtomicUpdateResponse, it completes the
putAll operation with a CachePartialUpdateException that contains all failed
entries (3) from GridNearAtomicUpdateResponse and a message that the user
should retry the insert operation for them (see
GridNearAtomicUpdateFuture#onPrimaryResponse).
Therefore, if node0 receives GridNearAtomicUpdateResponse before entries 1 and
2, which were successfully updated on the primary node, are processed on the
backup node (node2), the putAll operation completes with an exception that
reports only the failure of key 3. At that point entries 1 and 2 may not yet
be applied on the backup, which violates the FULL_SYNC guarantee for them.
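
The ordering problem can be illustrated without Ignite at all. The sketch
below is a simplified stand-alone model, not Ignite code: the class
FullSyncRaceSketch, its Outcome type and the latch that delays the backup are
all assumptions made for illustration. It forces the interleaving described
above, where the caller is completed from the primary response before the
backup has applied keys 1 and 2.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class FullSyncRaceSketch {
    /** What the putAll caller observes: reported failures and the backup state at completion time. */
    static final class Outcome {
        final Set<Integer> failedKeys;
        final Set<Integer> backupKeysAtCompletion;

        Outcome(Set<Integer> failedKeys, Set<Integer> backupKeysAtCompletion) {
            this.failedKeys = failedKeys;
            this.backupKeysAtCompletion = backupKeysAtCompletion;
        }
    }

    static Outcome run() {
        Set<Integer> backupStore = ConcurrentHashMap.newKeySet(); // Models the state of node2 (backup).
        CountDownLatch nearCompleted = new CountDownLatch(1);
        ExecutorService backupNode = Executors.newSingleThreadExecutor();

        try {
            // Step 2: the primary updates keys 1 and 2, key 3 fails.
            List<Integer> updatedOnPrimary = List.of(1, 2);
            Set<Integer> failedOnPrimary = Set.of(3);

            // Step 3: the primary sends the update to the backup. The latch models
            // a backup that processes it only after the near node has completed.
            Future<Void> backupUpdate = backupNode.submit(() -> {
                nearCompleted.await();
                backupStore.addAll(updatedOnPrimary);
                return null;
            });

            // Steps 4-5: the near node completes the operation from the primary
            // response alone, without waiting for the backup.
            Outcome outcome = new Outcome(failedOnPrimary, new HashSet<>(backupStore));
            nearCompleted.countDown();

            backupUpdate.get(); // The backup catches up only after the caller was notified.

            return outcome;
        }
        catch (Exception e) {
            throw new IllegalStateException(e);
        }
        finally {
            backupNode.shutdownNow();
        }
    }

    public static void main(String[] args) {
        Outcome o = run();

        System.out.println("Failed keys reported to caller: " + o.failedKeys);
        System.out.println("Keys on backup when putAll completed: " + o.backupKeysAtCompletion);
    }
}
```

In this model the caller learns only that key 3 failed, while the backup holds
neither key 1 nor key 2 at the moment the operation completes.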