On Thu, Oct 01, 2020 at 12:15:29PM -0400, Alan Stern wrote:
> On Wed, Sep 30, 2020 at 09:51:16PM -0700, Paul E. McKenney wrote:
> > Hello!
> > 
> > Al Viro posted the following query:
> > 
> > ------------------------------------------------------------------------
> > 
> > <viro> fun question regarding barriers, if you have time for that
> > <viro>         V->A = V->B = 1;
> > <viro>
> > <viro> CPU1:
> > <viro>         to_free = NULL
> > <viro>         spin_lock(&LOCK)
> > <viro>         if (!smp_load_acquire(&V->B))
> > <viro>                 to_free = V
> > <viro>         V->A = 0
> > <viro>         spin_unlock(&LOCK)
> > <viro>         kfree(to_free)
> > <viro>
> > <viro> CPU2:
> > <viro>         to_free = V;
> > <viro>         if (READ_ONCE(V->A)) {
> > <viro>                 spin_lock(&LOCK)
> > <viro>                 if (V->A)
> > <viro>                         to_free = NULL
> > <viro>                 smp_store_release(&V->B, 0);
> > <viro>                 spin_unlock(&LOCK)
> > <viro>         }
> > <viro>         kfree(to_free);
> > <viro> 1) is it guaranteed that V will be freed exactly once and that
> >       no accesses to *V will happen after freeing it?
> > <viro> 2) do we need smp_store_release() there?  I.e. will anything
> >       break if it's replaced with plain V->B = 0?
> 
> Here are my answers to Al's questions:
> 
> 1) It is guaranteed that V will be freed exactly once.  It is not 
> guaranteed that no accesses to *V will occur after it is freed, because 
> the test contains a data race.  CPU1's plain "V->A = 0" write races with 
> CPU2's READ_ONCE; if the plain write were replaced with 
> "WRITE_ONCE(V->A, 0)" then the guarantee would hold.  Equally well, 
> CPU1's smp_load_acquire could be replaced with a plain read while the 
> plain write is replaced with smp_store_release.
> 
> 2) The smp_store_release in CPU2 is not needed.  Replacing it with a 
> plain V->B = 0 will not break anything.
> 
> Analysis: Apart from the kfree calls themselves, the only access to a 
> shared variable outside of a critical section is CPU2's READ_ONCE of 
> V->A.  So let's consider two possibilities:
> 
> 1: The READ_ONCE returns 0.  Then CPU2 doesn't execute its critical 
> section and does kfree(V).  However, the fact that the READ_ONCE got 0 
> means that CPU1 has already entered its critical section, has already 
> written to V->A (but with a plain write!) and therefore has already seen 
> V->B = 1 (because of the smp_load_acquire), and therefore will not free 
> V.  This case shows that the ordering we require is for CPU1 to read 
> V->B before it writes V->A.  The ordering can be enforced by using 
> either a load-acquire (as in the litmus test) or a store-release.
> 
> 2: The READ_ONCE returns 1.  Then CPU2 does execute its critical 
> section, and we can simply treat this case the same as if the critical 
> section was executed unconditionally.  Whichever CPU runs its critical 
> section second will free V, and the other CPU won't try to access V 
> after leaving its own critical section (and thus won't access V after it 
> has been freed).
> 
> > ------------------------------------------------------------------------
> > 
> > Of course herd7 supports neither structures nor arrays, but I was
> > crazy enough to try individual variables with made-up address and data
> > dependencies.  This litmus test must also detect use-after-free bugs,
> > but a simple variable should be able to do that.  So here is a
> > prototype:
> > 
> > ------------------------------------------------------------------------
> > 
> > C C-viro-2020.09.29a
> > 
> > {
> >     int a = 1;
> >     int b = 1;
> >     int v = 1;
> > }
> 
> Not the way I would have done it, but okay.  I would have modeled the 
> kfree by setting a and b both to some sentinel value.

Might be well worth pursuing!  But how would you model the address
dependencies in that approach?

> > P0(int *a, int *b, int *v, spinlock_t *l)
> > {
> >     int r0;
> >     int r1;
> >     int r2 = 2;
> >     int r8;
> >     int r9a = 2;
> >     int r9b = 2;
> > 
> >     r0 = 0;
> >     spin_lock(l);
> >     r9a = READ_ONCE(*v); // Use after free?
> >     r8 = r9a - r9a; // Restore address dependency
> >     r8 = b + r8;
> >     r1 = smp_load_acquire(r8);
> >     if (r1 == 0)
> >             r0 = 1;
> >     r9b = READ_ONCE(*v); // Use after free?
> >     WRITE_ONCE(*a, r9b - r9b); // Use data dependency
> >     spin_unlock(l);
> >     if (r0) {
> >             r2 = READ_ONCE(*v);
> >             WRITE_ONCE(*v, 0); /* kfree(). */
> >     }
> > }
> > 
> > P1(int *a, int *b, int *v, spinlock_t *l)
> > {
> >     int r0;
> >     int r1;
> >     int r1a;
> >     int r2 = 2;
> >     int r8;
> >     int r9a = 2;
> >     int r9b = 2;
> >     int r9c = 2;
> > 
> >     r0 = READ_ONCE(*v);
> >     r9a = r0; // Use after free?
> 
> Wrong.  This should be:
> 
>       r0 = 1;
>       r9a = READ_ONCE(*v);

Thank you!  I was definitely suffering from a severe case of Programmer's
Blindness.  Fixed!

> >     r8 = r9a - r9a; // Restore address dependency
> >     r8 = a + r8;
> >     r1 = READ_ONCE(*r8);
> >     if (r1) {
> >             spin_lock(l);
> >             r9b = READ_ONCE(*v); // Use after free?
> >             r8 = r9b - r9b; // Restore address dependency
> >             r8 = a + r8;
> >             r1a = READ_ONCE(*r8);
> >             if (r1a)
> >                     r0 = 0;
> >             r9c = READ_ONCE(*v); // Use after free?
> >             smp_store_release(b, r9c - rc9); // Use data dependency
> >             spin_unlock(l);
> >     }
> >     if (r0) {
> >             r2 = READ_ONCE(*v);
> >             WRITE_ONCE(*v, 0); /* kfree(). */
> >     }
> > }
> > 
> > locations [a;b;v;0:r1;0:r8;1:r1;1:r8]
> > exists (0:r0=1:r0 \/ (* Both or neither did kfree(). *)
> >     v=1 \/ (* Neither did kfree, redundant check. *)
> >     0:r2=0 \/ 1:r2=0 \/  (* Both did kfree, redundant check. *)
> >     0:r9a=0 \/ 0:r9b=0 \/ 1:r9a=0 \/ (* CPU1 use after free. *)
> >     1:r9b=0 \/ 1:r9c=0) (* CPU2 use after free. *)
> > 
> > ------------------------------------------------------------------------
> > 
> > This "exists" clause is satisfied:
> > 
> > ------------------------------------------------------------------------
> > 
> > $ herd7 -conf linux-kernel.cfg 
> > ~/paper/scalability/LWNLinuxMM/litmus/manual/kernel/C-viro-2020.09.29a.litmus
> > Test C-viro-2020.09.29a Allowed
> > States 5
> > 0:r0=0; 0:r1=1; 0:r2=2; 0:r8=b; 0:r9a=0; 0:r9b=0; 1:r0=1; 1:r1=0; 1:r2=1; 
> > 1:r8=a; 1:r9a=1; 1:r9b=2; 1:r9c=2; a=0; b=1; v=0;
> > 0:r0=0; 0:r1=1; 0:r2=2; 0:r8=b; 0:r9a=1; 0:r9b=0; 1:r0=1; 1:r1=0; 1:r2=1; 
> > 1:r8=a; 1:r9a=1; 1:r9b=2; 1:r9c=2; a=0; b=1; v=0;
> > 0:r0=0; 0:r1=1; 0:r2=2; 0:r8=b; 0:r9a=1; 0:r9b=1; 1:r0=0; 1:r1=1; 1:r2=2; 
> > 1:r8=a; 1:r9a=1; 1:r9b=1; 1:r9c=1; a=0; b=1; v=1;
> 
> The values for this case don't make sense.  I haven't checked the other 
> four cases.  Printing a graph of the relations for this case (the only 
> state with v=1 at the end) might help.
> 
> > 0:r0=0; 0:r1=1; 0:r2=2; 0:r8=b; 0:r9a=1; 0:r9b=1; 1:r0=1; 1:r1=0; 1:r2=1; 
> > 1:r8=a; 1:r9a=1; 1:r9b=2; 1:r9c=2; a=0; b=1; v=0;
> > 0:r0=0; 0:r1=1; 0:r2=2; 0:r8=b; 0:r9a=1; 0:r9b=1; 1:r0=1; 1:r1=1; 1:r2=1; 
> > 1:r8=a; 1:r9a=1; 1:r9b=1; 1:r9c=1; a=0; b=1; v=0;
> > Ok
> > Witnesses
> > Positive: 3 Negative: 2
> > Condition exists (0:r0=1:r0 \/ v=1 \/ 0:r2=0 \/ 1:r2=0 \/ 0:r9a=0 \/ 
> > 0:r9b=0 \/ 1:r9a=0 \/ 1:r9b=0 \/ 1:r9c=0)
> > Observation C-viro-2020.09.29a Sometimes 3 2
> > Time C-viro-2020.09.29a 14.33
> > Hash=89f74abff4de682ee0bea8ee6dd53134
> 
> Why didn't this flag the data race?

Because I turned Al's simple assignments into *_ONCE() or better.
In doing this, I was following the default KCSAN settings which
(for better or worse) forgive the stores from data races.

> > ------------------------------------------------------------------------
> > 
> > So did we end up with herd7 not respecting "fake" dependencies like
> > those shown above, or have I just messed up the translation from Al's
> > example to the litmus test?  (Given one thing and another over the past
> > couple of days, my guess would be that I just messed up the translation,
> > especially given that I don't see a reference to fake dependencies in
> > the documentation, but I figured that I should ask.)
> 
> What do you get if you fix up the litmus test?

With your suggested change and using simple assignments where Al
indicated them:

------------------------------------------------------------------------

$ herd7 -conf linux-kernel.cfg 
~/paper/scalability/LWNLinuxMM/litmus/manual/kernel/C-viro-2020.09.29a.litmus
Test C-viro-2020.09.29a Allowed
States 5
0:r0=0; 0:r1=1; 0:r2=2; 0:r8=b; 0:r9a=0; 0:r9b=0; 1:r0=1; 1:r1=0; 1:r2=1; 
1:r8=a; 1:r9a=1; 1:r9b=2; 1:r9c=2; a=0; b=1; v=0;
0:r0=0; 0:r1=1; 0:r2=2; 0:r8=b; 0:r9a=1; 0:r9b=0; 1:r0=1; 1:r1=0; 1:r2=1; 
1:r8=a; 1:r9a=1; 1:r9b=2; 1:r9c=2; a=0; b=1; v=0;
0:r0=0; 0:r1=1; 0:r2=2; 0:r8=b; 0:r9a=1; 0:r9b=1; 1:r0=0; 1:r1=1; 1:r2=2; 
1:r8=a; 1:r9a=1; 1:r9b=1; 1:r9c=1; a=0; b=1; v=1;
0:r0=0; 0:r1=1; 0:r2=2; 0:r8=b; 0:r9a=1; 0:r9b=1; 1:r0=1; 1:r1=0; 1:r2=1; 
1:r8=a; 1:r9a=1; 1:r9b=2; 1:r9c=2; a=0; b=1; v=0;
0:r0=0; 0:r1=1; 0:r2=2; 0:r8=b; 0:r9a=1; 0:r9b=1; 1:r0=1; 1:r1=1; 1:r2=1; 
1:r8=a; 1:r9a=1; 1:r9b=1; 1:r9c=1; a=0; b=1; v=0;
Ok
Witnesses
Positive: 3 Negative: 2
Flag data-race
Condition exists (0:r0=1:r0 \/ v=1 \/ 0:r2=0 \/ 1:r2=0 \/ 0:r9a=0 \/ 0:r9b=0 \/ 
1:r9a=0 \/ 1:r9b=0 \/ 1:r9c=0)
Observation C-viro-2020.09.29a Sometimes 3 2
Time C-viro-2020.09.29a 17.95
Hash=14ded51102b668bc38b790e8c3692227

------------------------------------------------------------------------

So still "Sometimes", but the "Flag data-race" you expected is there.

I posted the updated litmus test below.  Additional or other thoughts?

                                                        Thanx, Paul

------------------------------------------------------------------------

C C-viro-2020.09.29a

{
        int a = 1;
        int b = 1;
        int v = 1;
}


P0(int *a, int *b, int *v, spinlock_t *l)
{
        int r0;
        int r1;
        int r2 = 2;
        int r8;
        int r9a = 2;
        int r9b = 2;

        r0 = 0;
        spin_lock(l);
        r9a = READ_ONCE(*v); // Use after free?
        r8 = r9a - r9a; // Restore address dependency
        r8 = b + r8;
        r1 = smp_load_acquire(r8);
        if (r1 == 0)
                r0 = 1;
        r9b = READ_ONCE(*v); // Use after free?
        // WRITE_ONCE(*a, r9b - r9b); // Use data dependency
        *a = r9b - r9b; // Use data dependency
        spin_unlock(l);
        if (r0) {
                r2 = READ_ONCE(*v);
                WRITE_ONCE(*v, 0); /* kfree(). */
        }
}

P1(int *a, int *b, int *v, spinlock_t *l)
{
        int r0;
        int r1;
        int r1a;
        int r2 = 2;
        int r8;
        int r9a = 2;
        int r9b = 2;
        int r9c = 2;

        r0 = 1;
        r9a = READ_ONCE(*v); // Use after free?
        r8 = r9a - r9a; // Restore address dependency
        r8 = a + r8;
        r1 = READ_ONCE(*r8);
        if (r1) {
                spin_lock(l);
                r9b = READ_ONCE(*v); // Use after free?
                r8 = r9b - r9b; // Restore address dependency
                r8 = a + r8;
                // r1a = READ_ONCE(*r8);
                r1a = *r8;
                if (r1a)
                        r0 = 0;
                r9c = READ_ONCE(*v); // Use after free?
                smp_store_release(b, r9c - rc9); // Use data dependency
                spin_unlock(l);
        }
        if (r0) {
                r2 = READ_ONCE(*v);
                WRITE_ONCE(*v, 0); /* kfree(). */
        }
}

locations [a;b;v;0:r1;0:r8;1:r1;1:r8]
exists (0:r0=1:r0 \/ (* Both or neither did kfree(). *)
        v=1 \/ (* Neither did kfree, redundant check. *)
        0:r2=0 \/ 1:r2=0 \/  (* Both did kfree, redundant check. *)
        0:r9a=0 \/ 0:r9b=0 \/ 1:r9a=0 \/ (* CPU1 use after free. *)
        1:r9b=0 \/ 1:r9c=0) (* CPU2 use after free. *)

Reply via email to