Can you expand on that, please? All previous discussion of IRIW I have seen 
indicated that the property, while a consequence of the existing JMM rules, 
had no practical use.

Thanks,
David
  -----Original Message-----
  From: Roman Elizarov [mailto:eliza...@devexperts.com]
  Sent: Wednesday, 26 November 2014 6:49 PM
  To: dhol...@ieee.org; Hans Boehm
  Cc: concurrency-inter...@cs.oswego.edu; core-libs-dev
  Subject: RE: [concurrency-interest] RFR: 8065804: JEP171: Clarifications/corrections for fence intrinsics


  There is no conceivable way to kill the IRIW consistency requirement while 
  retaining the ability to prove correctness of large software systems. If 
  IRIW of volatile variables is not consistent, then volatile reads and 
  writes are not linearizable, which breaks the linearizability of all 
  higher-level primitives built on top of them and makes formal reasoning 
  about the behavior of concurrent systems practically impossible. There are 
  many fields where this is not acceptable.
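
  For concreteness, here is the IRIW litmus test rendered with Java 
  volatiles (a minimal sketch; the class and thread structure are invented 
  for illustration). Because the JMM puts all volatile accesses into a 
  single total order, the outcome printed as "1 0 1 0" is forbidden:

      class IRIW {
          static volatile int x, y;

          public static void main(String[] args) throws InterruptedException {
              int[] r = new int[4];
              Thread t1 = new Thread(() -> x = 1);                    // writer 1
              Thread t2 = new Thread(() -> y = 1);                    // writer 2
              Thread t3 = new Thread(() -> { r[0] = x; r[1] = y; });  // reader 1
              Thread t4 = new Thread(() -> { r[2] = y; r[3] = x; });  // reader 2
              for (Thread t : new Thread[] { t1, t2, t3, t4 }) t.start();
              for (Thread t : new Thread[] { t1, t2, t3, t4 }) t.join();
              // Forbidden under the JMM: 1 0 1 0 -- the two readers would
              // then disagree on the order of the two independent writes.
              System.out.println(r[0] + " " + r[1] + " " + r[2] + " " + r[3]);
          }
      }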

   

  /Roman

   

  From: concurrency-interest-boun...@cs.oswego.edu 
[mailto:concurrency-interest-boun...@cs.oswego.edu] On Behalf Of David Holmes
  Sent: Wednesday, November 26, 2014 5:11 AM
  To: Hans Boehm
  Cc: concurrency-inter...@cs.oswego.edu; core-libs-dev
  Subject: Re: [concurrency-interest] RFR: 8065804: JEP171: Clarifications/corrections for fence intrinsics

   

  Hi Hans,

   

  Given that IRIW is a thorn in everyone's side, has no known useful benefit, 
  and can hopefully be killed off in the future, let's not get bogged down in 
  IRIW. But none of what you say below relates to multi-copy-atomicity.

   

  Cheers,

  David

    -----Original Message-----
    From: hjkhbo...@gmail.com [mailto:hjkhbo...@gmail.com] On Behalf Of Hans Boehm
    Sent: Wednesday, 26 November 2014 12:04 PM
    To: dhol...@ieee.org
    Cc: Stephan Diestelhorst; concurrency-inter...@cs.oswego.edu; core-libs-dev
    Subject: Re: [concurrency-interest] RFR: 8065804: JEP171: Clarifications/corrections for fence intrinsics

    To be concrete here, on Power, loads can normally be ordered by an 
    address dependency or a light-weight fence (lwsync).  However, neither is 
    enough to prevent the questionable outcome for IRIW, since neither 
    ensures that the stores in T1 and T2 will be made visible to other 
    threads in a consistent order.  That outcome can be prevented by using 
    heavyweight fence (sync) instructions between the loads instead.  Peter 
    Sewell's group concluded that to enforce correct volatile behavior on 
    Power, you essentially need a heavyweight fence between every pair of 
    volatile operations.  That cannot be understood based on simple ordering 
    constraints. 
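
    To sketch what that recipe means at the Java level (a hypothetical 
    illustration, not code from this thread; VarHandle.fullFence(), from 
    Java 9, is the public successor of the JEP 171 Unsafe fence intrinsics 
    and is expected to map to sync on Power):

        import java.lang.invoke.VarHandle;

        class IRIWReader {
            static int foo, bar;   // plain (non-volatile) fields

            static int[] readBoth() {
                int r1 = bar;
                VarHandle.fullFence(); // heavyweight fence between the loads.
                                       // An address dependency or lwsync would
                                       // order the two loads locally, but would
                                       // not restore a consistent view of the
                                       // two independent stores.
                int r2 = foo;
                return new int[] { r1, r2 };
            }
        }

    With this fence in both reader threads, the questionable IRIW outcome is 
    excluded; with only a dependency or lwsync in its place, it is not.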

     

    As Stephan pointed out, there are similar issues on ARM, but they're less 
    commonly encountered in a Java implementation.  If you're lucky, you can 
    get to the right implementation recipe by looking only at reordering, I 
    think. 

     

     

    On Tue, Nov 25, 2014 at 4:36 PM, David Holmes <davidchol...@aapt.net.au> 
wrote:

      Stephan Diestelhorst writes:
      >
      > David Holmes wrote:
      > > Stephan Diestelhorst writes:
      > > > On Tuesday, 25 November 2014, at 11:15:36, Hans Boehm wrote:
      > > > > I'm no hardware architect, but fundamentally it seems to me that
      > > > >
      > > > > load x
      > > > > acquire_fence
      > > > >
      > > > > imposes a much more stringent constraint than
      > > > >
      > > > > load_acquire x
      > > > >
      > > > > Consider the case in which the load from x is an L1 hit, but a
      > > > > preceding load (from, say, y) is a long-latency miss.  If we enforce
      > > > > ordering by just waiting for completion of the prior operation, the
      > > > > former has to wait for the load from y to complete, while the
      > > > > latter doesn't.  I find it hard to believe that this doesn't leave
      > > > > an appreciable amount of performance on the table, at least for
      > > > > some interesting microarchitectures.
      > > >
      > > > I agree, Hans, that this is a reasonable assumption.  Load_acquire x
      > > > does allow roach motel, whereas the acquire fence does not.
      > > >
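
      (In Java terms, the contrast reads roughly as below; a hypothetical 
      sketch using the later VarHandle API, which exposes both an acquire 
      fence and an acquire load:)

        import java.lang.invoke.MethodHandles;
        import java.lang.invoke.VarHandle;

        class Acquire {
            int x;
            static final VarHandle X;
            static {
                try {
                    X = MethodHandles.lookup()
                            .findVarHandle(Acquire.class, "x", int.class);
                } catch (ReflectiveOperationException e) {
                    throw new ExceptionInInitializerError(e);
                }
            }

            int fenceVersion() {
                int v = x;                // plain load of x
                VarHandle.acquireFence(); // orders ALL earlier loads before
                                          // later operations; a conservative
                                          // implementation must also wait for
                                          // a pending long-latency load of
                                          // some unrelated y
                return v;
            }

            int acquireLoadVersion() {
                return (int) X.getAcquire(this); // only this load gains acquire
                                                 // semantics; earlier unrelated
                                                 // loads need not be waited for
                                                 // ("roach motel" reordering
                                                 // remains allowed)
            }
        }
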
      > > > >  In addition, for better or worse, fencing requirements on at
      > > > >  least Power are actually driven as much by store atomicity
      > > > >  issues as by the ordering issues discussed in the cookbook.
      > > > >  This was not understood in 2005, and unfortunately doesn't
      > > > >  seem to be amenable to the kind of straightforward explanation
      > > > >  as in Doug's cookbook.
      > > >
      > > > Coming from a strongly ordered architecture to a weakly ordered
      > > > one myself, I also needed some mental adjustment about store
      > > > (multi-copy) atomicity.  I can imagine others will be unaware of
      > > > this difference, too, even in 2014.
      > >
      > > Sorry, I'm missing the connection between fences and multi-copy
      > > atomicity.
      >
      > One example is the classic IRIW, with non-multi-copy-atomic stores
      > but ordered (say, through a dependency) loads, as in the following
      > example:
      >
      > Memory: foo = bar = 0
      > _T1_         _T2_         _T3_                              _T4_
      > st (foo),1   st (bar),1   ld r1, (bar)                      ld r3,(foo)
      >                           <addr dep / local "fence" here>   <addr dep>
      >                           ld r2, (foo)                      ld r4, (bar)
      >
      > You may observe r1 = 1, r2 = 0, r3 = 1, r4 = 0 on non-multi-copy atomic
      > machines.  On TSO boxes, this is not possible.  That means that the
      > memory fence that will prevent such behaviour (DMB on ARM) needs to
      > carry some additional oomph in ensuring multi-copy atomicity, or
      > rather in preventing you from seeing the violation (which is the same
      > thing).

      I take it as given that any code for which you may have ordering
      constraints must first have basic atomicity properties for loads and
      stores. I would not expect any kind of fence to add multi-copy-atomicity
      where there was none.

      David


      > Stephan
      >


     
