Hi!

>>>>> "Kristian" == Kristian Nielsen <[email protected]> writes:

Kristian> Michael Widenius <[email protected]> writes:
>> +      if (list->wakeup_subsequent_commits_running)
>> +      {
>> +        mysql_mutex_lock(&list->LOCK_wait_commit);
>> +        list->wakeup_subsequent_commits_running= false;
>> +        mysql_mutex_unlock(&list->LOCK_wait_commit);
>> 
>> Why do we need the mutex above?

Kristian> I checked this again.

Kristian> We _do_ need a full memory barrier here (memory barrier is implied in 
taking a
Kristian> mutex). Otherwise the compiler or CPU could re-order the setting of 
the
Kristian> wakeup_subsequent_commits_running flag with the reads and writes done 
in the
Kristian> list manipulations. This could cause the two threads to corrupt the 
list as
Kristian> they both manipulate it at the same time.

This I would like to understand better.
(It always nice to know what the cpu/compiler is really doing)..

In this case the code is looping over a list and just setting
list->wakeup_subsequent_commits_running= false;

I don't see how the compiler or CPU could reorder the code to things
in different order (as we are not updating anything else than this
flag in the loop).

This is especially true as setting of
cur->wakeup_subsequent_commits_running= true;
is done within a mutex and that is the last write to this element of
the list.

So there is a barrier between we set it and potentially clear it.
As the 'clear' may now happen 'any time' (from other threads point of
view) I don't see why it needs to be protected.


Kristian> But after carefully checking, it seems to me a barrier would be 
enough, we do
Kristian> not need to lock and unlock the mutex.

Kristian> Unfortunately, we do not have any good mechanisms for doing memory 
barriers in
Kristian> the MariaDB source currently, AFAIK. I will put a comment here that 
the
Kristian> lock/unlock is only needed for the memory barrier, and could be 
changed
Kristian> later. Then we can re-visit it when the more critical things have 
been fixed
Kristian> (and if we get memory barrier primitives in the source tree). What do 
you
Kristian> think of this suggestion?

That is fine with me.

Kristian> [I was wondering why I did it like this in the first place.

Kristian> Due to chaining between the waiter's mutex and the waitees mutex, the 
extra
Kristian> lock/unlock does prevent that wakeup_subsequent_commits2() can change 
the flag
Kristian> in the middle of unregister_wait_for_prior_commit2() doing its
Kristian> operation. Maybe that is why I did it this way. But I could not 
actually think
Kristian> of any problems from such a change in the middle.

Kristian> Or maybe I did it this way to emphasise that the code conceptually is 
locking
Kristian> both the waitee and the waiter - but to prevent deadlocks, it 
temporarily
Kristian> releases the waitee lock in the middle. So instead of

Kristian>    LOCK a; LOCK b; UNLOCK b; UNLOCK a;

Kristian> we inject a temporary unlock of a:

Kristian>    LOCK a; [UNLOCK a]; LOCK b; UNLOCK b; [LOCK a]; UNLOCK a;

Kristian> But it does not really make things any clearer.]

>> The only place where we test this variable, as far as I can find, is in
>> wait_for_commit::register_wait_for_prior_commit()

Kristian> The critical place is in 
wait_for_commit::unregister_wait_for_prior_commit2()

>> This thread doesn't know if the other thread is just before
>> mysql_mutex_lock(&waitee->LOCK_wait_commit), inside the lock or after
>> the lock.  So when we reset the variable should not matter.

The same is true for unregister_wait_for_prior_commit2.

Kristian> What we need to ensure is that

Kristian>  - All the reads of the list in thread 1 see only writes from other 
threads
Kristian>    that happened _before_ the variable was reset.

Kristian>  - All changes to the list in thread 2 happen only _after_ the 
variable was
Kristian>    reset in thread 1.

Kristian> So a memory barrier is needed, but as you say, lock/unlock should not 
be
Kristian> needed.

I still don't understand the current code fully.

The issue is that unregister_wait_for_prior_commit2() can be called by
thread 2 just before or just after thread 1 is resetting
wakeup_subsequent_commits_running.  There is not other variables
involved.

This means that in unregister_wait_for_prior_commit2() if thread 2 is there
just before list->wakeup_subsequent_commits_running is set to false, we will go
into this code:

    if (loc_waitee->wakeup_subsequent_commits_running)
    {
      /*
        When a wakeup is running, we cannot safely remove ourselves from the
        list without corrupting it. Instead we can just wait, as wakeup is
        already in progress and will thus be immediate.

        See comments on wakeup_subsequent_commits2() for more details.
      */
      mysql_mutex_unlock(&loc_waitee->LOCK_wait_commit);
      while (waiting_for_commit)
        mysql_cond_wait(&COND_wait_commit, &LOCK_wait_commit);
    }

I don't however see any code in :queue_for_group_commit() that will
signal COND_wait_commit.  What code do we have that will wake up the above
code ?

Looking at wait_for_commit::wakeup() that makes this:

  mysql_mutex_lock(&LOCK_wait_commit);
  waiting_for_commit= false;
  mysql_cond_signal(&COND_wait_commit);
  mysql_mutex_unlock(&LOCK_wait_commit);

However, in
wait_for_commit::register_wait_for_prior_commit()

We set wait_for_commit() without a mutex protection ?
Is this right?

I assume that the code works because things are called in a specific
order and some threads can't call other functions until at some
certain point when things are safe.

Like that wakeup() for a thread can never be called when that thread
is just before the first mutex in

wait_for_commit::register_wait_for_prior_commit()

as in this case 'waiting_for_cond' variable may be set to false (in
wakeup()) even if it should be true.

Regards,
Monty

_______________________________________________
Mailing list: https://launchpad.net/~maria-developers
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~maria-developers
More help   : https://help.launchpad.net/ListHelp

Reply via email to