Re: [c++std-parallel-1632] Re: Compilers and RCU readers: Once more unto the breach!

2015-05-21 Thread Paul E. McKenney
On Thu, May 21, 2015 at 06:17:43PM +0200, Michael Matz wrote:
 Hi,
 
 On Thu, 21 May 2015, Paul E. McKenney wrote:
 
  The point is -exactly- to codify the current state of affairs.
 
 Ah, I see, so it's not yet about creating a more useful (for compilers, 
 that is) model.

There are several approaches being considered for that as well, but we
do need to codify current usage.

 char * fancy_assign (char *in) { return in; }
 ...
 char *x, *y;
 
 x = atomic_load_explicit(p, memory_order_consume);
 y = fancy_assign (x);
 atomic_store_explicit(q, y, memory_order_relaxed);
   
   So, is there, or is there not a dependency carried from x to y in your 
   proposed model (and which rule in your document states so)?  Clearly, 
   without any other language the compiler would have to assume that there 
   is 
   (because the equivalent 'y = x' assignment would carry the dependency).  
  
  The dependency is not carried, though this is due to the current set
  of rules not covering atomic loads and stores, which I need to fix.
 
 Okay, so with the current regime(s), the dependency carries ...

Yes, that is the intent.

  o   Rule 14 says that if a value is part of a dependency chain and
  is used as the actual parameter of a function call, then the
  dependency chain extends to the corresponding formal parameter,
  namely in of fancy_assign().
  
  o   Rule 15 says that if a value is part of a dependency chain and
  is returned from a function, then the dependency chain extends
  to the returned value in the calling function.
  
  o   And you are right.  I need to make the first and second rules
  cover the relaxed atomic operations, or at least atomic loads and
  stores.  Not that this is an issue for existing Linux-kernel code.
  
  But given such a change, the new version of rule 2 would
  extend the dependency chain to cover the atomic_store_explicit().
 
 ... (if this detail would be fixed).  Okay, that's quite awful ...
 
   If it has to assume this, then the whole model is not going to work 
   very well, as usual with models that assume a certain less-optimal 
   fact (carries-dep is less optimal for code generation purposes that 
   not-carries-dep) unless very specific circumstances say it can be 
   ignored.
  
  Although that is a good general rule of thumb, I do not believe that it 
  applies to this situation, with the exception that I do indeed assume 
  that no one is insane enough to do value-speculation optimizations for 
  non-NULL values on loads from pointers.
  
  So what am I missing here?
 
 ... because you are then missing that if carries-dep can flow through 
 function calls from arguments to return values by default, the compiler 
 has to assume this in fact always happens when it can't see the function 
 body, or can't analyze it.  In effect that's making the whole carries-dep 
 stops at these and those uses a useless excercise because a malicious 
 user (malicious in the sense of abusing the model to show that it's 
 hindering optimizations), i.e. me, can hide all such carries-dep stopping 
 effects inside a function, et voila, the dependecy carries through.  So 
 for a slightly more simple example:
 
   extern void *foo (void *);  // body not available
   x = load
   y = foo (x);
   store (y)
 
 the compiler has to assume that there's a dep-chain from x to y; always.  

Yes, the compiler does have to make this assumption.  And the intent
behind the rules is to ensure that this assumption does not get in the
way of reasonable optimizations.  So although I am sure that you are as
busy as the rest of us, I really do need you to go through the rules in
detail before you get too much more excited about this.

 What's worse, it also has to assume a carries-dep for this:
 
   extern void foo (void *in, void **out1, void **out2);
   x = load
   foo (x, o1, o2);
   store (o1);
   store (o2);
 
 Now the compiler has to assume that the body of 'foo' is just mean enough 
 to make the dep-chain carry from in to *out1 or *out2 (i.e. it has to 
 assume that for both).  This extends to _all_ memory accessible from foo's 
 body, i.e. generally all global and all local address-taken variables, so 
 as soon as you have a function call into which a dep-chain value flows 
 you're creating a dep-chain extension from that value to each and every 
 global piece of memory, because the compiler cannot assume that the 
 black box called foo is not mean.  This could conceivably be stopped by 
 making normal stores not to carry the dependency; then only the return 
 value might be infected; but I don't see that in your rules, as a normal 
 store is just an assigment in your model and hence rules 1 and 2 apply 
 (that is, carries-dep flows through all assignments, incl. loads and 
 stores).
 
 Basically whenever you can construct black boxes for the compiler, you 
 have to limit their effects on such transitive relations like carries-dep 
 by default, 

Re: [c++std-parallel-1632] Re: Compilers and RCU readers: Once more unto the breach!

2015-05-21 Thread Paul E. McKenney
On Thu, May 21, 2015 at 04:22:38PM +0200, Michael Matz wrote:
 Hi,
 
 On Wed, 20 May 2015, Paul E. McKenney wrote:
 
I'm not sure... you'd require the compiler to perform static analysis of
loops to determine the state of the machine when they exit (if they 
exit!)
in order to show whether or not a dependency is carried to subsequent
operations. If it can't prove otherwise, it would have to assume that a
dependency *is* carried, and it's not clear to me how it would use this
information to restrict any subsequent dependency removing 
optimisations.
   
   It'd just convert consume to acquire.
  
  It should not need to, actually.
 
 [with GCC hat, and having only lightly read your document]

Understood.

 Then you need to provide language or at least informal reasons why the 
 compiler is allowed to not do that.  Without that a compiler would have to 
 be conservative, if it can't _prove_ that a dependency chain is stopped, 
 then it has to assume it hasn't.
 
 For instance I can't really make out easily what your document says about 
 the following simple situation (well, actually I have difficulties to 
 differ between what you're proposing as the good-new model of this all, 
 and what you're merely describing as different current states of affair):

The point is -exactly- to codify the current state of affairs.  I expect
a follow-on effort to specify some sort of marking regimen, as noted in
the last paragraph of 7.9 and as discussed with Torvald Riegel.  However,
given that there are not yet any implementations or practical experience
with such markings, I suspect that some time will be required to hammer
out a good marking scheme.

   char * fancy_assign (char *in) { return in; }
   ...
   char *x, *y;
   
   x = atomic_load_explicit(p, memory_order_consume);
   y = fancy_assign (x);
   atomic_store_explicit(q, y, memory_order_relaxed);
 
 So, is there, or is there not a dependency carried from x to y in your 
 proposed model (and which rule in your document states so)?  Clearly, 
 without any other language the compiler would have to assume that there is 
 (because the equivalent 'y = x' assignment would carry the dependency).  

The dependency is not carried, though this is due to the current set
of rules not covering atomic loads and stores, which I need to fix.
Here is the sequence of events:

o   A memory_order_consume load heads a dependency chain.

o   Rule 2 says that if a value is part of a dependency chain and
is used as the right-hand side of an assignment operator,
the expression extends the chain to cover the assignment.
And I switched to numbered bullet items here:
http://www2.rdrop.com/users/paulmck/RCU/consume.2015.05.21a.pdf

o   Rule 14 says that if a value is part of a dependency chain and
is used as the actual parameter of a function call, then the
dependency chain extends to the corresponding formal parameter,
namely in of fancy_assign().

o   Rule 15 says that if a value is part of a dependency chain and
is returned from a function, then the dependency chain extends
to the returned value in the calling function.

o   And you are right.  I need to make the first and second rules
cover the relaxed atomic operations, or at least atomic loads and
stores.  Not that this is an issue for existing Linux-kernel code.

But given such a change, the new version of rule 2 would
extend the dependency chain to cover the atomic_store_explicit().

 If it has to assume this, then the whole model is not going to work very 
 well, as usual with models that assume a certain less-optimal fact 
 (carries-dep is less optimal for code generation purposes that 
 not-carries-dep) unless very specific circumstances say it can be 
 ignored.

Although that is a good general rule of thumb, I do not believe that it
applies to this situation, with the exception that I do indeed assume
that no one is insane enough to do value-speculation optimizations for
non-NULL values on loads from pointers.

So what am I missing here?  Do you have a specific example where the
compiler would need to suppress a production-quality optimization?

Thanx, Paul



Re: [c++std-parallel-1632] Re: Compilers and RCU readers: Once more unto the breach!

2015-05-21 Thread Michael Matz
Hi,

On Wed, 20 May 2015, Paul E. McKenney wrote:

   I'm not sure... you'd require the compiler to perform static analysis of
   loops to determine the state of the machine when they exit (if they exit!)
   in order to show whether or not a dependency is carried to subsequent
   operations. If it can't prove otherwise, it would have to assume that a
   dependency *is* carried, and it's not clear to me how it would use this
   information to restrict any subsequent dependency removing optimisations.
  
  It'd just convert consume to acquire.
 
 It should not need to, actually.

[with GCC hat, and having only lightly read your document]

Then you need to provide language or at least informal reasons why the 
compiler is allowed to not do that.  Without that a compiler would have to 
be conservative, if it can't _prove_ that a dependency chain is stopped, 
then it has to assume it hasn't.

For instance I can't really make out easily what your document says about 
the following simple situation (well, actually I have difficulties to 
differ between what you're proposing as the good-new model of this all, 
and what you're merely describing as different current states of affair):

  char * fancy_assign (char *in) { return in; }
  ...
  char *x, *y;
  
  x = atomic_load_explicit(p, memory_order_consume);
  y = fancy_assign (x);
  atomic_store_explicit(q, y, memory_order_relaxed);

So, is there, or is there not a dependency carried from x to y in your 
proposed model (and which rule in your document states so)?  Clearly, 
without any other language the compiler would have to assume that there is 
(because the equivalent 'y = x' assignment would carry the dependency).  

If it has to assume this, then the whole model is not going to work very 
well, as usual with models that assume a certain less-optimal fact 
(carries-dep is less optimal for code generation purposes that 
not-carries-dep) unless very specific circumstances say it can be 
ignored.


Ciao,
Michael.


Re: [c++std-parallel-1632] Re: Compilers and RCU readers: Once more unto the breach!

2015-05-21 Thread Michael Matz
Hi,

On Thu, 21 May 2015, Paul E. McKenney wrote:

 The point is -exactly- to codify the current state of affairs.

Ah, I see, so it's not yet about creating a more useful (for compilers, 
that is) model.

char * fancy_assign (char *in) { return in; }
...
char *x, *y;

x = atomic_load_explicit(p, memory_order_consume);
y = fancy_assign (x);
atomic_store_explicit(q, y, memory_order_relaxed);
  
  So, is there, or is there not a dependency carried from x to y in your 
  proposed model (and which rule in your document states so)?  Clearly, 
  without any other language the compiler would have to assume that there is 
  (because the equivalent 'y = x' assignment would carry the dependency).  
 
 The dependency is not carried, though this is due to the current set
 of rules not covering atomic loads and stores, which I need to fix.

Okay, so with the current regime(s), the dependency carries ...

 o Rule 14 says that if a value is part of a dependency chain and
   is used as the actual parameter of a function call, then the
   dependency chain extends to the corresponding formal parameter,
   namely in of fancy_assign().
 
 o Rule 15 says that if a value is part of a dependency chain and
   is returned from a function, then the dependency chain extends
   to the returned value in the calling function.
 
 o And you are right.  I need to make the first and second rules
   cover the relaxed atomic operations, or at least atomic loads and
   stores.  Not that this is an issue for existing Linux-kernel code.
 
   But given such a change, the new version of rule 2 would
   extend the dependency chain to cover the atomic_store_explicit().

... (if this detail would be fixed).  Okay, that's quite awful ...

  If it has to assume this, then the whole model is not going to work 
  very well, as usual with models that assume a certain less-optimal 
  fact (carries-dep is less optimal for code generation purposes that 
  not-carries-dep) unless very specific circumstances say it can be 
  ignored.
 
 Although that is a good general rule of thumb, I do not believe that it 
 applies to this situation, with the exception that I do indeed assume 
 that no one is insane enough to do value-speculation optimizations for 
 non-NULL values on loads from pointers.
 
 So what am I missing here?

... because you are then missing that if carries-dep can flow through 
function calls from arguments to return values by default, the compiler 
has to assume this in fact always happens when it can't see the function 
body, or can't analyze it.  In effect that's making the whole carries-dep 
stops at these and those uses a useless excercise because a malicious 
user (malicious in the sense of abusing the model to show that it's 
hindering optimizations), i.e. me, can hide all such carries-dep stopping 
effects inside a function, et voila, the dependecy carries through.  So 
for a slightly more simple example:

  extern void *foo (void *);  // body not available
  x = load
  y = foo (x);
  store (y)

the compiler has to assume that there's a dep-chain from x to y; always.  
What's worse, it also has to assume a carries-dep for this:

  extern void foo (void *in, void **out1, void **out2);
  x = load
  foo (x, o1, o2);
  store (o1);
  store (o2);

Now the compiler has to assume that the body of 'foo' is just mean enough 
to make the dep-chain carry from in to *out1 or *out2 (i.e. it has to 
assume that for both).  This extends to _all_ memory accessible from foo's 
body, i.e. generally all global and all local address-taken variables, so 
as soon as you have a function call into which a dep-chain value flows 
you're creating a dep-chain extension from that value to each and every 
global piece of memory, because the compiler cannot assume that the 
black box called foo is not mean.  This could conceivably be stopped by 
making normal stores not to carry the dependency; then only the return 
value might be infected; but I don't see that in your rules, as a normal 
store is just an assigment in your model and hence rules 1 and 2 apply 
(that is, carries-dep flows through all assignments, incl. loads and 
stores).

Basically whenever you can construct black boxes for the compiler, you 
have to limit their effects on such transitive relations like carries-dep 
by default, at the border of such black boxes; otherwise that transitive 
relation quickly becomes an most-x-everything relation (i.e. 
mostthings carries-dep to everything), and as such is then totally 
useless, because such a universally filled relation (like an empty one) 
doesn't bear any interesting information, at which point it then is 
questionably why the compiler should jump through hoops to analyse the few 
cases that would be allowed to stop the carries-dep flow, when it more 
often than not have to give up anyway and generate slow code.

 Do you have a specific example where the compiler would need to 

Re: [c++std-parallel-1632] Re: Compilers and RCU readers: Once more unto the breach!

2015-05-20 Thread Paul E. McKenney
On Wed, May 20, 2015 at 04:54:51PM +0100, Andrew Haley wrote:
 On 05/20/2015 04:46 PM, Will Deacon wrote:
  I'm not sure... you'd require the compiler to perform static analysis of
  loops to determine the state of the machine when they exit (if they exit!)
  in order to show whether or not a dependency is carried to subsequent
  operations. If it can't prove otherwise, it would have to assume that a
  dependency *is* carried, and it's not clear to me how it would use this
  information to restrict any subsequent dependency removing optimisations.
 
 It'd just convert consume to acquire.

It should not need to, actually.

Thanx, Paul