It seems that the current upsert does not follow the deadlock-free locking protocol when the query in the upsert statement has a blocking operator in the job pipeline. According to the deadlock-free locking protocol, when a trylock() fails for a record to be upserted during the search phase of the query clause, all records that have already acquired locks must be pushed to the commit operator before the blocking lock() call is made for the trylock-failed record, in order to avoid the hold-and-wait situation. However, when there is a blocking operator in the pipeline, the pushed records may never reach the commit operator. So, an upsert with a blocking operator in its pipeline breaks the deadlock-free locking protocol. This should be fixed.
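To make the protocol concrete, here is a minimal sketch in Java, assuming hypothetical names (LockManager, tryLock, CommitWriter.flush); the actual AsterixDB lock-manager and operator interfaces differ:

interface LockManager {
    boolean tryLock(long primaryKey);                        // non-blocking attempt
    void lock(long primaryKey) throws InterruptedException;  // blocking acquire
}

interface CommitWriter {
    void flush(); // push all buffered records through to the commit operator
}

final class UpsertSearch {
    private final LockManager locks;
    private final CommitWriter commit;

    UpsertSearch(LockManager locks, CommitWriter commit) {
        this.locks = locks;
        this.commit = commit;
    }

    void onRecord(long primaryKey) throws InterruptedException {
        if (!locks.tryLock(primaryKey)) {
            // Before blocking, push every record whose lock we already hold
            // down to the commit operator so those locks get released.
            // This avoids holding locks while waiting (hold-and-wait).
            commit.flush();
            // Safe to block now: this job holds no other locks.
            locks.lock(primaryKey);
        }
        // ... perform the upsert for this record and forward it ...
    }
}

With a blocking operator (e.g., a sort) sitting between the search and commit operators, the flush in the sketch cannot actually push the already-locked records through to the commit operator, so their locks remain held while lock() waits, which is exactly the hold-and-wait condition the protocol is supposed to rule out.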
For the same reason (the hold-and-wait hazard just described), the delete should not be modified to hold non-instant locks; that, too, would break the deadlock-free locking protocol.

Best,
Young-Seok

On Sat, Jun 25, 2016 at 11:42 AM, abdullah alamoudi <[email protected]> wrote:
> Not difficult at all. We do something similar for Upserts.
> Essentially, all we need to do is:
> 1. Exclude the delete case from the introduce-materialize rule.
> 2. Use different callbacks for the search and delete operations (see the
> sketch below). This means that the lock taken on the search in this case
> will be a write lock and the delete will take no lock. The commit will
> then release the lock.
>
> Should be a couple of hours of work.
> ~Abdullah.
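A minimal sketch of the callback split Abdullah outlines, assuming hypothetical interface and method names (TxnLockManager, onMatch, before, releaseAll) rather than the actual AsterixDB callback APIs:

interface TxnLockManager {
    void writeLock(long txnId, long primaryKey); // blocking exclusive lock
    void releaseAll(long txnId);                 // release at commit
}

final class SearchCallback {
    private final TxnLockManager locks;
    private final long txnId;
    SearchCallback(TxnLockManager locks, long txnId) { this.locks = locks; this.txnId = txnId; }
    // Called for each matching record during the primary-index search:
    // take a non-instant write lock up front and keep holding it.
    void onMatch(long primaryKey) { locks.writeLock(txnId, primaryKey); }
}

final class DeleteCallback {
    // Called by the delete operator: deliberately a no-op, because the
    // write lock was already acquired by the search callback.
    void before(long primaryKey) { }
}

final class CommitOperator {
    private final TxnLockManager locks;
    private final long txnId;
    CommitOperator(TxnLockManager locks, long txnId) { this.locks = locks; this.txnId = txnId; }
    void commit() {
        // ... write the commit log record ...
        locks.releaseAll(txnId); // locks held since search time end here
    }
}

The point of the split is that the write lock moves from delete time to search time, closing the window in which another transaction can change the record, while strict 2PL is preserved because the commit operator still releases everything.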
>
> On Sat, Jun 25, 2016 at 6:25 PM, Mike Carey <[email protected]> wrote:
>> I wonder how easy this would be to fit into the current S/W structure,
>> though?
>>
>> (Side note: My wife will be sad; Halloween is her favorite holiday... :-))
>>
>> On 6/24/16 7:22 PM, abdullah alamoudi wrote:
>>> There is a better alternative.
>>>
>>> When the search is performed, we acquire a write lock and perform the
>>> deletion right away without materialization. Currently, the
>>> materialization is introduced by the
>>> IntroduceMaterializeForInsertWithSelfReadRule. The rule is meant to
>>> prevent a Halloween situation. However, with a delete, Halloween is
>>> cancelled.
>>>
>>> Cheers,
>>> Abdullah.
>>>
>>> On Sat, Jun 25, 2016 at 3:59 AM, Young-Seok Kim <[email protected]> wrote:
>>>> Non-instant read locks will break the deadlock-free locking protocol
>>>> that we currently achieve, since the necessary condition for deadlock,
>>>> i.e., the hold-and-wait situation, will come back alive.
>>>>
>>>> Best,
>>>> Young-Seok
>>>>
>>>> On Fri, Jun 24, 2016 at 5:49 PM, Mike Carey <[email protected]> wrote:
>>>>> Got it and agreed. Would another solution be to retain the (no longer
>>>>> instant) lock? (What can of worms would that open?)
>>>>>
>>>>> On 6/24/16 5:09 PM, Young-Seok Kim wrote:
>>>>>> Please see inline below.
>>>>>>
>>>>>> Best,
>>>>>> Young-Seok
>>>>>>
>>>>>> On Fri, Jun 24, 2016 at 4:35 PM, Mike Carey <[email protected]> wrote:
>>>>>>> So to clarify, record-level consistency (and primary/secondary index
>>>>>>> consistency) is guaranteed and will work "correctly" in all cases if
>>>>>>> a record R is updated (or deleted) by T2 after being targeted (by
>>>>>>> primary key) for deletion by T1.
>>>>>>
>>>>>> Yes, agreed.
>>>>>>
>>>>>>> The only semantic issue is that there is a (hopefully very, very
>>>>>>> small) window of time between when T1 sees R in a secondary index
>>>>>>> and when it acquires the lock on R's primary key - during which T2
>>>>>>> could change R in a way that makes it no longer query-compliant.
>>>>>>
>>>>>> Here, let me clarify the above sentence: "when it acquires the lock
>>>>>> on R's primary key" means that the lock is acquired and then released
>>>>>> by T1, since the lock is an instant shared-mode (read) lock. So T2
>>>>>> can change R after acquiring an exclusive lock, and consequently R
>>>>>> may no longer qualify for the query predicate.
>>>>>>
>>>>>>> (However, at its time of being observed - which happened under
>>>>>>> read-committed - it was a correct candidate for deletion. So this is
>>>>>>> kind of "expected" but admittedly kind of weird. It seems like this
>>>>>>> could maybe be fixed in the future via a mechanism similar to the
>>>>>>> index-only branch's way of handling locks?)
>>>>>>
>>>>>> This expected but undesired situation can be avoided by introducing a
>>>>>> version number, stored as a field (not exposed to users) in each
>>>>>> entry of the primary index and the secondary indexes, such that the
>>>>>> version number can be used to verify that the record found during the
>>>>>> search phase is the same record being deleted during the delete
>>>>>> phase. If the verification succeeds, the delete is performed;
>>>>>> otherwise, it is not.
>>>>>>
>>>>>>> On 6/24/16 10:59 AM, Young-Seok Kim wrote:
>>>>>>>> This is a somewhat expected issue given our read-committed
>>>>>>>> isolation level based on the strict 2PL locking protocol.
>>>>>>>> Strict 2PL guarantees that all exclusive locks acquired by a
>>>>>>>> transaction are released only after the transaction commits.
>>>>>>>> But read locks don't follow this.
>>>>>>>> So, as you described in the email, a record read by a transaction
>>>>>>>> T1 during search can be modified by another transaction T2 before
>>>>>>>> the record is deleted by T1. This is a possible situation under the
>>>>>>>> read-committed isolation level.
>>>>>>>> However, there is no inconsistency between a primary index and the
>>>>>>>> secondary indexes of the kind where the record modified by T2 is
>>>>>>>> deleted by T1 from the primary index while the corresponding
>>>>>>>> secondary index entries are not. This is because when T1 starts the
>>>>>>>> deleting process through the job pipeline, an exclusive lock for
>>>>>>>> the record is first acquired and then the delete operations on the
>>>>>>>> primary and secondary indexes are performed. So, either case 1) a
>>>>>>>> record with the identical primary key exists and is deleted by T1
>>>>>>>> (since the search delivers the primary key, not the complete
>>>>>>>> record), or case 2) the record is not deleted by T1 because no
>>>>>>>> record with that primary key exists.
>>>>>>>>
>>>>>>>> For case 1), once a record is deleted from the primary index, all
>>>>>>>> of the remaining secondary indexes in the job pipeline correctly
>>>>>>>> find and delete the corresponding secondary index entries.
>>>>>>>> For case 2), I need to check whether the job pipeline throws an
>>>>>>>> exception for trying to delete the non-existing record and stops by
>>>>>>>> aborting the job, or whether the exception is just swallowed and
>>>>>>>> the job proceeds to the next record.
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Young-Seok
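One way to picture Young-Seok's version-number proposal is the following sketch; the VersionedRecord layout and the deleteIfVersionMatches name are hypothetical illustrations, not the actual AsterixDB design:

import java.util.HashMap;
import java.util.Map;

final class VersionedRecord {
    final long primaryKey;
    final long version; // hidden system field, not exposed to users
    VersionedRecord(long primaryKey, long version) {
        this.primaryKey = primaryKey;
        this.version = version;
    }
}

final class PrimaryIndex {
    private final Map<Long, VersionedRecord> records = new HashMap<>();

    void insert(VersionedRecord r) {
        records.put(r.primaryKey, r);
    }

    // Delete only if the stored version still matches the version observed
    // by the search phase; otherwise the record changed in between and the
    // delete is skipped.
    boolean deleteIfVersionMatches(long primaryKey, long observedVersion) {
        VersionedRecord current = records.get(primaryKey);
        if (current == null || current.version != observedVersion) {
            return false; // updated or gone since the search phase: skip
        }
        records.remove(primaryKey);
        return true;      // also safe to remove the matching secondary entries
    }
}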
>>>>>>>>
>>>>>>>> On Fri, Jun 24, 2016 at 10:14 AM, abdullah alamoudi <[email protected]> wrote:
>>>>>>>>> Hi everyone,
>>>>>>>>>
>>>>>>>>> I think we have a problem related to the delete's transactional
>>>>>>>>> behavior; here is the problem:
>>>>>>>>>
>>>>>>>>> Our delete starts by searching the tree to identify delete tuples
>>>>>>>>> based on the delete statement's conditional clause. It follows
>>>>>>>>> that by inserting delete tuples into the primary index, followed
>>>>>>>>> by updating the secondary indexes, followed by a commit on the PK.
>>>>>>>>>
>>>>>>>>> The problem happens if, after searching the tree and identifying
>>>>>>>>> the records to be deleted, one of those records is updated. This
>>>>>>>>> will cause the record to be deleted from the primary index even
>>>>>>>>> though it might no longer meet the conditional clause. Moreover,
>>>>>>>>> the new entries in the secondary indexes will remain without their
>>>>>>>>> record in the primary index.
>>>>>>>>>
>>>>>>>>> In order to fix this, we need to do one of the following:
>>>>>>>>> 1. lock the records when we do the search to identify the records
>>>>>>>>> to be deleted, OR
>>>>>>>>> 2. when performing the delete, double-check that the record we're
>>>>>>>>> deleting is the same as the record we find when we do the actual
>>>>>>>>> delete (see the sketch after this message).
>>>>>>>>>
>>>>>>>>> A better way would be to perform the delete as we do the search,
>>>>>>>>> since there is no need to do the whole search, materialize, and
>>>>>>>>> then perform the delete.
>>>>>>>>>
>>>>>>>>> There is a chance I got something wrong. Did I? Thoughts?
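For comparison, a minimal sketch of fix option 2 (the double check at delete time), again with hypothetical names; a real implementation would compare the stored record against the one the search phase delivered:

import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

final class DoubleCheckingDelete {
    private final Map<Long, byte[]> primaryIndex = new HashMap<>();

    // 'recordSeenBySearch' is the serialized record captured during the
    // search phase of the delete job.
    boolean delete(long primaryKey, byte[] recordSeenBySearch) {
        byte[] current = primaryIndex.get(primaryKey);
        if (current == null || !Arrays.equals(current, recordSeenBySearch)) {
            // Updated or deleted after the search phase: deleting now could
            // strand new secondary-index entries, so skip this record.
            return false;
        }
        primaryIndex.remove(primaryKey);
        return true; // the matching secondary entries can now be removed
    }
}

Note Young-Seok's point above that the current search delivers only the primary key, not the complete record, so option 2 would require carrying the full (or versioned) record through the pipeline.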
