Anton,

I did not know about the mechanics of locking entries on backups during
the prepare phase. Thank you for pointing that out!

Fri, Jul 12, 2019 at 22:45, Ivan Rakov <ivan.glu...@gmail.com>:
>
> Hi Anton,
>
> > Each get method now checks consistency.
> > The check means:
> > 1) the tx lock is acquired on the primary
> > 2) data is gathered from each owner (primary and backups)
> > 3) the data is compared
> Did you consider acquiring locks on backups as well during your check,
> just like the 2PC prepare phase does?
> If there's a happens-before (HB) between steps 1 (lock primary) and 2
> (update primary + lock backup + update backup), you can be sure that
> there will be no false-positive results and no deadlocks either. The
> protocol won't be complicated: a checking read from a backup will just
> wait for the commit if one is in progress.
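To make the suggestion concrete, here is a minimal, self-contained sketch (toy Java, not Ignite API; `Replica` and its lock field are invented for illustration) of a check that takes the entry lock on every owner in the same primary-then-backup order a committing tx uses, so a checking read blocks behind an in-flight commit instead of reporting a false positive:

```java
import java.util.List;
import java.util.Objects;
import java.util.concurrent.locks.ReentrantLock;

// Toy model only: in real Ignite the "entry lock" on a backup would be the
// lock held since the 2PC prepare phase, not a ReentrantLock per replica.
public class LockAllOwnersCheck {

    static class Replica {
        final ReentrantLock entryLock = new ReentrantLock();
        volatile Integer value;
        Replica(Integer v) { value = v; }
    }

    /** Locks primary first, then each backup (same order as a committing tx,
     *  so no deadlock), then compares the values. */
    static boolean isConsistent(Replica primary, List<Replica> backups) {
        primary.entryLock.lock();
        try {
            for (Replica b : backups) b.entryLock.lock();
            try {
                for (Replica b : backups)
                    if (!Objects.equals(primary.value, b.value))
                        return false;
                return true;
            } finally {
                for (Replica b : backups) b.entryLock.unlock();
            }
        } finally {
            primary.entryLock.unlock();
        }
    }

    public static void main(String[] args) {
        Replica primary = new Replica(7);
        Replica backup = new Replica(7);
        System.out.println("consistent = " + isConsistent(primary, List.of(backup)));
    }
}
```

If a commit is in flight and still holds a backup's entry lock, this check simply waits on that lock, which is exactly the "checking read will just wait for commit" behavior described above.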
>
> Best Regards,
> Ivan Rakov
>
> On 12.07.2019 9:47, Anton Vinogradov wrote:
> > Igniters,
> >
> > Let me explain the problem in detail.
> > Read Repair in a pessimistic tx (locks acquired on primary, full sync, 2PC)
> > is able to see a consistency violation because the backups are not updated yet.
> > It does not seem to be a good idea to "fix" the code to unlock the primary
> > only once the backups are updated; this would definitely cause a performance drop.
> > Currently, there is no explicit sync feature that allows waiting until the
> > backups are updated by the previous tx.
> > The previous tx just sends GridNearTxFinishResponse to the originating node.
> >
> > Bad ideas for handling this:
> > - retry several times (a false positive is still possible)
> > - lock the tx entry on backups (would definitely break the failover logic)
> > - wait for the same entry version on backups within some timeout (would require
> > huge changes to the "get" logic, and a false positive would still be possible)
> >
> > Is there any simple fix for this issue?
> > Thanks in advance for any tips.
> >
> > Ivan,
> > thanks for your interest.
> >
> >>> 4. Very fast and lucky txB writes a value 2 for the key on primary and
> > backup.
> > AFAIK, such reordering is not possible since the backups are "prepared" before
> > the primary releases the lock.
> > So, consistency is guaranteed by failover and by the "prepare" phase of 2PC.
> > It seems the problem is NOT with consistency in AI, but with the consistency
> > detection implementation (RR) and possible "false positive" results.
> > BTW, I checked the 1PC case (only one data node in the test) and saw no issues.
> >
> > On Fri, Jul 12, 2019 at 9:26 AM Ivan Pavlukhin <vololo...@gmail.com> wrote:
> >
> >> Anton,
> >>
> >> Is such behavior observed for 2PC or for the 1PC optimization? Doesn't it
> >> mean that things can be even worse and an inconsistent write is
> >> possible on a backup? E.g. in this scenario:
> >> 1. txA writes a value 1 for the key on primary.
> >> 2. txA unlocks the key on primary.
> >> 3. txA freezes before updating backup.
> >> 4. Very fast and lucky txB writes a value 2 for the key on primary and
> >> backup.
> >> 5. txA wakes up and writes 1 for the key on the backup.
> >> 6. As a result, there is 2 on primary and 1 on backup.
> >>
> >> Naively it seems that locks should be released after all replicas are
> >> updated.
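For illustration only, the interleaving above can be replayed deterministically with toy key/value maps standing in for the primary and backup copies (none of this is Ignite code); it shows the divergence this schedule would produce if the primary lock were really released before the backup update (which, as noted elsewhere in the thread, the 2PC prepare phase actually prevents):

```java
import java.util.HashMap;
import java.util.Map;

// Deterministic, single-threaded replay of the six-step interleaving.
// "primary" and "backup" are toy stores; txA/txB are the two transactions
// from the scenario. Names are illustrative, not Ignite API.
public class LostBackupUpdateDemo {

    static int[] replay() {
        Map<String, Integer> primary = new HashMap<>();
        Map<String, Integer> backup = new HashMap<>();

        primary.put("key", 1);   // 1. txA writes 1 on primary
                                 // 2. txA unlocks the key on primary
                                 // 3. txA freezes before updating backup
        primary.put("key", 2);   // 4. txB writes 2 on primary...
        backup.put("key", 2);    //    ...and on backup
        backup.put("key", 1);    // 5. txA wakes up, writes its stale 1 to backup

        return new int[]{primary.get("key"), backup.get("key")};
    }

    public static void main(String[] args) {
        int[] r = replay();
        System.out.println("primary=" + r[0] + ", backup=" + r[1]); // 6. diverged: 2 vs 1
    }
}
```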
> >>
> >> Wed, Jul 10, 2019 at 16:36, Anton Vinogradov <a...@apache.org>:
> >>> Folks,
> >>>
> >>> I'm now investigating unexpected repairs [1] when ReadRepair is used in
> >>> testAccountTxNodeRestart.
> >>> I updated the test [2] to check whether any repairs happen.
> >>> The test is now named "testAccountTxNodeRestartWithReadRepair".
> >>>
> >>> Each get method now checks consistency.
> >>> The check means:
> >>> 1) the tx lock is acquired on the primary
> >>> 2) data is gathered from each owner (primary and backups)
> >>> 3) the data is compared
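As a rough sketch (toy Java; `Replica` and its fields are invented stand-ins for the real owner/partition machinery, not Ignite code), the three-step check could look like:

```java
import java.util.List;
import java.util.Objects;
import java.util.concurrent.locks.ReentrantLock;

// Illustrative model of the check: lock the primary, read every owner,
// compare. Note the backups are read WITHOUT any lock, which is exactly
// what allows a false positive while a previous commit is still in flight.
public class ReadRepairCheckSketch {

    static class Replica {
        final ReentrantLock txLock = new ReentrantLock();
        volatile Integer value;
        Replica(Integer value) { this.value = value; }
    }

    /** Returns true if all owners agree on the value for the key. */
    static boolean isConsistent(Replica primary, List<Replica> backups) {
        primary.txLock.lock();                // 1) tx lock acquired on primary
        try {
            Integer expected = primary.value; // 2) data gathered from each owner...
            for (Replica backup : backups) {
                if (!Objects.equals(expected, backup.value)) // 3) ...and compared
                    return false;
            }
            return true;
        } finally {
            primary.txLock.unlock();
        }
    }

    public static void main(String[] args) {
        Replica primary = new Replica(42);
        // A backup still holding the old value: the check reports a violation
        // even though an in-flight commit may simply not have reached it yet.
        boolean ok = isConsistent(primary, List.of(new Replica(42), new Replica(41)));
        System.out.println("consistent = " + ok);
    }
}
```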
> >>>
> >>> Sometimes, a backup may have an obsolete value during such a check.
> >>>
> >>> It seems this happens because the tx commit on the primary goes the
> >>> following way (see the code [3] for details):
> >>> 1) performing localFinish (releases tx lock)
> >>> 2) performing dhtFinish (commits on backups)
> >>> 3) transferring control back to the caller
> >>>
> >>> So, it seems the problem here is that "tx lock released on primary" does
> >>> not mean that the backups are updated, while "commit() method finished in
> >>> the caller's thread" does.
> >>> This means that, currently, there is no happens-before between
> >>> 1) thread 1 commits data on the primary and the tx lock can be re-obtained
> >>> 2) thread 2 reads from a backup
> >>> but there is still a strong HB between "commit() finished" and "backup updated".
> >>>
> >>> So it seems possible, for example, to get a notification from a
> >>> continuous query, then read from a backup and get an obsolete value.
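A deterministic toy model of that window (plain Java threads and latches; the `localFinish`/`dhtFinish` steps are only marked by comments, nothing here is the real GridDhtTxLocal code) shows a reader re-obtaining the primary lock while the backup still holds the old value:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

// Latches pin the reader into the window between "tx lock released" and
// "backup committed", making the race reproducible instead of timing-based.
public class CommitWindowDemo {

    static int observeStaleBackup() throws InterruptedException {
        ReentrantLock primaryLock = new ReentrantLock();
        AtomicInteger primary = new AtomicInteger(0);
        AtomicInteger backup = new AtomicInteger(0);
        CountDownLatch lockReleased = new CountDownLatch(1);
        CountDownLatch backupRead = new CountDownLatch(1);

        Thread committer = new Thread(() -> {
            primaryLock.lock();
            primary.set(1);              // update primary under the tx lock
            primaryLock.unlock();        // 1) localFinish: tx lock released
            lockReleased.countDown();
            try { backupRead.await(); } catch (InterruptedException ignored) { }
            backup.set(1);               // 2) dhtFinish: backup committed
        });                              // 3) only now control returns to the caller
        committer.start();

        lockReleased.await();
        primaryLock.lock();              // reader can re-obtain the primary lock...
        int staleValue = backup.get();   // ...yet still read the old backup value
        primaryLock.unlock();
        backupRead.countDown();
        committer.join();
        return staleValue;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("backup seen by reader = " + observeStaleBackup()); // 0, not 1
    }
}
```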
> >>>
> >>> Is this "partial happens-before" behavior expected?
> >>>
> >>> [1] https://issues.apache.org/jira/browse/IGNITE-11973
> >>> [2] https://github.com/apache/ignite/pull/6679/files
> >>> [3] org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxLocal#finishTx
> >>
> >>
> >>
> >> --
> >> Best regards,
> >> Ivan Pavlukhin
> >>



-- 
Best regards,
Ivan Pavlukhin
