For those still following along, I'm attaching the revised patch that I'm proposing to submit. The primary difference from the previous version is that I've switched from checking the 'const' flag to checking for 'const in' intent in remoteValueForwarding.cpp and updated the comment to reflect the change.

Please give a shout if there are any reservations against committing this (though I may wait until tomorrow anyway so that any performance impacts will be better isolated from today's changes).

Thanks,
-Brad


On Fri, 14 Mar 2014, Brad Chamberlain wrote:


My concern with this patch's change to passByRef() is that flattening will pass large records by value. Which is exactly what our 'const' intent is intended to eliminate. Seems inconsistent to me.

I don't think that's right. We pass blank and const records by reference at the C level; as far as I know, there's no need to change the formal type of the argument to 'ref' (as this routine is attempting to do) in order to get this effect. As we've discussed, the Chapel IR is intended to be reprenting Chapel semantics (at least at this point), not C semantics.

I can spot-check a specific example/scenario to verify that this is still the case if you have one in mind.



But your mention of 'const ref' and semantic grounds has got me
thinking...  'const' only means that the callee won't modify the argument,
not that the caller won't do so, so I'm not convinced that we can safely
r.v.f. 'const ref' arguments.  In particular, the caller may still be
modifying its copy, and a sync var access may be used to force the callee
to see those updates.

Brad, this reminds me of an off-list discussion we had, where you said that (a) such a scenario is not likely and (b) then the user shouldn't be using 'const'/'const ref'.

If the callee is supposed to see the caller's updates to a 'const ref' formal, then the 'const ref' formal should not be marked with INTENT_FLAG_CONST. Because I'd like our compiler to treat/optimize as consts all things that are marked with INTENT_FLAG_CONST. As opposed to than checking INTENT_FLAG_CONST && !INTENT_FLAG_REF or some such. Because the latter would go against the intention of introducing INTENT_FLAG_CONST, to begin with.

Recall that the last time we discussed this broadly as a group, it was decided that the 'const'-ness of an argument was only supposed to reflect the callee's intentions for that argument, not the caller's. To know that the actual is const, you'd need to analyze the callsites and can't expect to see this in the intent's bitmask (though we could store it in some compiler-generated meta-information created by an analysis pass). The fact that I was checking the bit expecting to know something about the actual was my mistake and the reason that I'm now testing and planning to commit a more specific check for CONST_IN.

-Brad

Index: test/distributions/robust/arithmetic/performance/multilocale/reduceAlias.cyclic.good
===================================================================
--- test/distributions/robust/arithmetic/performance/multilocale/reduceAlias.cyclic.good	(revision 22838)
+++ test/distributions/robust/arithmetic/performance/multilocale/reduceAlias.cyclic.good	(working copy)
@@ -1,2 +1,2 @@
 (get = 0, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 2, fork_fast = 0, fork_nb = 0) (get = 59, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 25, fork_fast = 0, fork_nb = 0) (get = 59, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 24, fork_fast = 0, fork_nb = 0) (get = 59, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 24, fork_fast = 0, fork_nb = 0)
-(get = 9, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 0, fork_fast = 0, fork_nb = 3) (get = 21, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 6, fork_fast = 0, fork_nb = 0) (get = 21, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 6, fork_fast = 0, fork_nb = 0) (get = 21, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 6, fork_fast = 0, fork_nb = 0)
+(get = 9, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 0, fork_fast = 0, fork_nb = 3) (get = 20, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 6, fork_fast = 0, fork_nb = 0) (get = 20, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 6, fork_fast = 0, fork_nb = 0) (get = 20, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 6, fork_fast = 0, fork_nb = 0)
Index: test/distributions/robust/arithmetic/performance/multilocale/reduceSlice.cyclic.good
===================================================================
--- test/distributions/robust/arithmetic/performance/multilocale/reduceSlice.cyclic.good	(revision 22838)
+++ test/distributions/robust/arithmetic/performance/multilocale/reduceSlice.cyclic.good	(working copy)
@@ -1 +1 @@
-(get = 9, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 14, fork_fast = 0, fork_nb = 12) (get = 374, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 3, fork = 99, fork_fast = 0, fork_nb = 0) (get = 374, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 3, fork = 95, fork_fast = 0, fork_nb = 0) (get = 374, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 3, fork = 95, fork_fast = 0, fork_nb = 0)
+(get = 9, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 14, fork_fast = 0, fork_nb = 12) (get = 373, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 3, fork = 99, fork_fast = 0, fork_nb = 0) (get = 373, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 3, fork = 95, fork_fast = 0, fork_nb = 0) (get = 373, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 3, fork = 95, fork_fast = 0, fork_nb = 0)
Index: test/distributions/robust/arithmetic/performance/multilocale/reduce.block.good
===================================================================
--- test/distributions/robust/arithmetic/performance/multilocale/reduce.block.good	(revision 22838)
+++ test/distributions/robust/arithmetic/performance/multilocale/reduce.block.good	(working copy)
@@ -1 +1 @@
-(get = 9, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 0, fork_fast = 0, fork_nb = 3) (get = 6, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 4, fork_fast = 0, fork_nb = 0) (get = 6, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 4, fork_fast = 0, fork_nb = 0) (get = 6, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 4, fork_fast = 0, fork_nb = 0)
+(get = 9, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 0, fork_fast = 0, fork_nb = 3) (get = 5, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 4, fork_fast = 0, fork_nb = 0) (get = 5, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 4, fork_fast = 0, fork_nb = 0) (get = 5, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 4, fork_fast = 0, fork_nb = 0)
Index: test/distributions/robust/arithmetic/performance/multilocale/reduceAlias.block.good
===================================================================
--- test/distributions/robust/arithmetic/performance/multilocale/reduceAlias.block.good	(revision 22838)
+++ test/distributions/robust/arithmetic/performance/multilocale/reduceAlias.block.good	(working copy)
@@ -1,2 +1,2 @@
 (get = 0, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 2, fork_fast = 0, fork_nb = 0) (get = 28, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 1, fork_fast = 0, fork_nb = 0) (get = 28, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 0, fork_fast = 0, fork_nb = 0) (get = 28, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 0, fork_fast = 0, fork_nb = 0)
-(get = 9, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 0, fork_fast = 0, fork_nb = 3) (get = 6, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 4, fork_fast = 0, fork_nb = 0) (get = 6, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 4, fork_fast = 0, fork_nb = 0) (get = 6, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 4, fork_fast = 0, fork_nb = 0)
+(get = 9, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 0, fork_fast = 0, fork_nb = 3) (get = 5, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 4, fork_fast = 0, fork_nb = 0) (get = 5, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 4, fork_fast = 0, fork_nb = 0) (get = 5, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 4, fork_fast = 0, fork_nb = 0)
Index: test/distributions/robust/arithmetic/performance/multilocale/reduceSlice.block.good
===================================================================
--- test/distributions/robust/arithmetic/performance/multilocale/reduceSlice.block.good	(revision 22838)
+++ test/distributions/robust/arithmetic/performance/multilocale/reduceSlice.block.good	(working copy)
@@ -1 +1 @@
-(get = 9, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 11, fork_fast = 0, fork_nb = 15) (get = 210, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 3, fork = 12, fork_fast = 0, fork_nb = 0) (get = 210, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 3, fork = 8, fork_fast = 0, fork_nb = 0) (get = 210, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 3, fork = 8, fork_fast = 0, fork_nb = 0)
+(get = 9, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 11, fork_fast = 0, fork_nb = 15) (get = 209, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 3, fork = 12, fork_fast = 0, fork_nb = 0) (get = 209, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 3, fork = 8, fork_fast = 0, fork_nb = 0) (get = 209, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 3, fork = 8, fork_fast = 0, fork_nb = 0)
Index: test/optimizations/remoteValueForwarding/bradc/testconstlocality.chpl
===================================================================
--- test/optimizations/remoteValueForwarding/bradc/testconstlocality.chpl	(revision 0)
+++ test/optimizations/remoteValueForwarding/bradc/testconstlocality.chpl	(revision 0)
@@ -0,0 +1,27 @@
+use BlockDist;
+
+const GLOBCONST = 2**4;
+
+config const n = 10;
+config const checkSync = false;
+
+var sync$: sync bool = true;
+
+const D = {1..n} dmapped Block({1..n});
+
+var A: [D] real;
+
+testit(GLOBCONST);
+
+proc testit(valblc: int) {
+  forall (i,a) in zip(D,A) {
+    local {
+      a = valblc+i/10.0;
+      if (here.id == 0) {
+        sync$.readFF();
+      }
+    }
+  }
+}
+
+writeln("A is: ", A);
Index: test/optimizations/remoteValueForwarding/bradc/testconstlocality.good
===================================================================
--- test/optimizations/remoteValueForwarding/bradc/testconstlocality.good	(revision 0)
+++ test/optimizations/remoteValueForwarding/bradc/testconstlocality.good	(revision 0)
@@ -0,0 +1 @@
+A is: 16.1 16.2 16.3 16.4 16.5 16.6 16.7 16.8 16.9 17.0
Index: compiler/passes/flattenFunctions.cpp
===================================================================
--- compiler/passes/flattenFunctions.cpp	(revision 22838)
+++ compiler/passes/flattenFunctions.cpp	(working copy)
@@ -69,6 +69,12 @@
 // Otherwise passing by value is more efficient.
 static bool
 passByRef(Symbol* sym) {
+  //
+  // If it's constant, there's no need to pass it by reference
+  //
+  if (sym->isConstant()) {
+    return false;
+  }
 
   if (sym->hasFlag(FLAG_DISTRIBUTION) ||
       sym->hasFlag(FLAG_DOMAIN) ||
@@ -141,6 +147,22 @@
            an LHS expr. */
         type = type->refType;
       SET_LINENO(sym);
+      //
+      // BLC: TODO: This routine is part of the reason that we aren't
+      // consistent in representing 'ref' argument intents in the AST.
+      // In particular, the code above uses a certain test to decide
+      // to pass something by reference and changes the formal's type
+      // to the corresponding reference type if it believes it should.
+      // But the blankIntentForType() call below (and the INTENT_BLANK
+      // that was used before it) may pass the argument by 'const in'
+      // which seems inconsistent (because most 'ref' formals reflect
+      // INTENT_REF in the current compiler).  My current thought is
+      // to only indicate ref-ness through intents for most of the
+      // compilation (at a Chapel level) and only worry about ref
+      // types very close to code generation, primarily to avoid
+      // inconsistencies like this and keep things more
+      // uniform/simple; but we haven't made this switch yet.
+      //
       ArgSymbol* arg = new ArgSymbol(blankIntentForType(type), sym->name, type);
       if (sym->hasFlag(FLAG_ARG_THIS))
         arg->addFlag(FLAG_ARG_THIS);
Index: compiler/optimizations/remoteValueForwarding.cpp
===================================================================
--- compiler/optimizations/remoteValueForwarding.cpp	(revision 22838)
+++ compiler/optimizations/remoteValueForwarding.cpp	(working copy)
@@ -141,6 +141,31 @@
     return true;
   }
 
+  //
+  // See if this argument is 'const in'; if it is, it's a good
+  // candidate for remote value forwarding.  My current thinking is
+  // that we should not forward 'const ref' arguments because the
+  // const-ness only means that the callee will not modify them, not
+  // that the caller won't.  If someone can successfully argue that
+  // I'm being too conservative, I'm open to that.  My thinking is
+  // that I'd rather find a case that we think we could be r.v.f.'ing
+  // later on than to have to chase down a race condition due to
+  // optimizing too aggressively.
+  //
+  // Why the additional check against 'ref' types?  Because some
+  // compiler-created arguments currently indicate ref-ness only via
+  // the type and not the intent.  See the big comment I added in
+  // addVarsToFormals() (flattenFunctions.cpp) in this same commit for
+  // an example.  A case that currently fails without this test is:
+  //
+  //     test/multilocale/bradc/needMultiLocales/remoteReal.chpl
+  //
+  if (arg->intent == INTENT_CONST_IN &&
+      !arg->type->symbol->hasFlag(FLAG_REF)) {
+    return true;
+  }
+
+
   // We may want to add additional cases here as we discover them
 
   // otherwise, conservatively assume it varies
------------------------------------------------------------------------------
Learn Graph Databases - Download FREE O'Reilly Book
"Graph Databases" is the definitive new guide to graph databases and their
applications. Written by three acclaimed leaders in the field,
this first edition is now available. Download your free book today!
http://p.sf.net/sfu/13534_NeoTech
_______________________________________________
Chapel-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/chapel-developers

Reply via email to