Hi Chapel Developers --
This is a request for review.
Size of patch = 2 out of 7.
Complexity = 4 out of 7.
The attached patch makes the insertion of formals to represent variables
when de-nesting functions a little less conservative by strengthening the
check for const-ness and not making the argument type a ref type if it is
const. This reduces the number of unnecessary references we use for
nested functions (user or compiler-introduced).
This in turn improves remote value forwarding because it permits us to
remote value forward constants more aggressively in the presence of sync
variable accesses. For the 'reduce*' tests in the distribution robustness
performance suite, it resulted in one fewer get per locale.
I added an editorial note in the comments pointing out that making a
formal have a 'ref' type but a 'blank' intent, as we currently do in
flattenFunctions.cpp, is arguably inconsistent but left that as future
work (perhaps near-future work, as we've talked about pushing the
insertion of 'ref' types very close to codegen in recent discussions). See
the comment itself for more detail.
I also added a test that presents a simplified version of the case that
caused me to note the lack of remote value forwarding (co-written by
Vass). This test at present works with or without r.v.f., but once formal
temps are removed (coming soon to a trunk near you), lack of r.v.f. breaks
this test (and SSCA2, from which it was derived).
-Brad
Index: test/distributions/robust/arithmetic/performance/multilocale/reduceAlias.cyclic.good
===================================================================
--- test/distributions/robust/arithmetic/performance/multilocale/reduceAlias.cyclic.good (revision 22838)
+++ test/distributions/robust/arithmetic/performance/multilocale/reduceAlias.cyclic.good (working copy)
@@ -1,2 +1,2 @@
(get = 0, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 2, fork_fast = 0, fork_nb = 0) (get = 59, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 25, fork_fast = 0, fork_nb = 0) (get = 59, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 24, fork_fast = 0, fork_nb = 0) (get = 59, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 24, fork_fast = 0, fork_nb = 0)
-(get = 9, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 0, fork_fast = 0, fork_nb = 3) (get = 21, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 6, fork_fast = 0, fork_nb = 0) (get = 21, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 6, fork_fast = 0, fork_nb = 0) (get = 21, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 6, fork_fast = 0, fork_nb = 0)
+(get = 9, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 0, fork_fast = 0, fork_nb = 3) (get = 20, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 6, fork_fast = 0, fork_nb = 0) (get = 20, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 6, fork_fast = 0, fork_nb = 0) (get = 20, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 6, fork_fast = 0, fork_nb = 0)
Index: test/distributions/robust/arithmetic/performance/multilocale/reduceSlice.cyclic.good
===================================================================
--- test/distributions/robust/arithmetic/performance/multilocale/reduceSlice.cyclic.good (revision 22838)
+++ test/distributions/robust/arithmetic/performance/multilocale/reduceSlice.cyclic.good (working copy)
@@ -1 +1 @@
-(get = 9, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 14, fork_fast = 0, fork_nb = 12) (get = 374, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 3, fork = 99, fork_fast = 0, fork_nb = 0) (get = 374, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 3, fork = 95, fork_fast = 0, fork_nb = 0) (get = 374, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 3, fork = 95, fork_fast = 0, fork_nb = 0)
+(get = 9, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 14, fork_fast = 0, fork_nb = 12) (get = 373, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 3, fork = 99, fork_fast = 0, fork_nb = 0) (get = 373, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 3, fork = 95, fork_fast = 0, fork_nb = 0) (get = 373, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 3, fork = 95, fork_fast = 0, fork_nb = 0)
Index: test/distributions/robust/arithmetic/performance/multilocale/reduce.block.good
===================================================================
--- test/distributions/robust/arithmetic/performance/multilocale/reduce.block.good (revision 22838)
+++ test/distributions/robust/arithmetic/performance/multilocale/reduce.block.good (working copy)
@@ -1 +1 @@
-(get = 9, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 0, fork_fast = 0, fork_nb = 3) (get = 6, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 4, fork_fast = 0, fork_nb = 0) (get = 6, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 4, fork_fast = 0, fork_nb = 0) (get = 6, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 4, fork_fast = 0, fork_nb = 0)
+(get = 9, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 0, fork_fast = 0, fork_nb = 3) (get = 5, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 4, fork_fast = 0, fork_nb = 0) (get = 5, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 4, fork_fast = 0, fork_nb = 0) (get = 5, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 4, fork_fast = 0, fork_nb = 0)
Index: test/distributions/robust/arithmetic/performance/multilocale/reduceAlias.block.good
===================================================================
--- test/distributions/robust/arithmetic/performance/multilocale/reduceAlias.block.good (revision 22838)
+++ test/distributions/robust/arithmetic/performance/multilocale/reduceAlias.block.good (working copy)
@@ -1,2 +1,2 @@
(get = 0, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 2, fork_fast = 0, fork_nb = 0) (get = 28, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 1, fork_fast = 0, fork_nb = 0) (get = 28, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 0, fork_fast = 0, fork_nb = 0) (get = 28, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 0, fork_fast = 0, fork_nb = 0)
-(get = 9, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 0, fork_fast = 0, fork_nb = 3) (get = 6, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 4, fork_fast = 0, fork_nb = 0) (get = 6, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 4, fork_fast = 0, fork_nb = 0) (get = 6, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 4, fork_fast = 0, fork_nb = 0)
+(get = 9, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 0, fork_fast = 0, fork_nb = 3) (get = 5, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 4, fork_fast = 0, fork_nb = 0) (get = 5, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 4, fork_fast = 0, fork_nb = 0) (get = 5, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 4, fork_fast = 0, fork_nb = 0)
Index: test/distributions/robust/arithmetic/performance/multilocale/reduceSlice.block.good
===================================================================
--- test/distributions/robust/arithmetic/performance/multilocale/reduceSlice.block.good (revision 22838)
+++ test/distributions/robust/arithmetic/performance/multilocale/reduceSlice.block.good (working copy)
@@ -1 +1 @@
-(get = 9, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 11, fork_fast = 0, fork_nb = 15) (get = 210, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 3, fork = 12, fork_fast = 0, fork_nb = 0) (get = 210, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 3, fork = 8, fork_fast = 0, fork_nb = 0) (get = 210, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 3, fork = 8, fork_fast = 0, fork_nb = 0)
+(get = 9, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 0, fork = 11, fork_fast = 0, fork_nb = 15) (get = 209, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 3, fork = 12, fork_fast = 0, fork_nb = 0) (get = 209, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 3, fork = 8, fork_fast = 0, fork_nb = 0) (get = 209, get_nb = 0, get_nb_test = 0, get_nb_wait = 0, put = 3, fork = 8, fork_fast = 0, fork_nb = 0)
Index: test/optimizations/remoteValueForwarding/bradc/testconstlocality.chpl
===================================================================
--- test/optimizations/remoteValueForwarding/bradc/testconstlocality.chpl (revision 0)
+++ test/optimizations/remoteValueForwarding/bradc/testconstlocality.chpl (revision 0)
@@ -0,0 +1,27 @@
+use BlockDist;
+
+const GLOBCONST = 2**4;
+
+config const n = 10;
+config const checkSync = false;
+
+var sync$: sync bool = true;
+
+const D = {1..n} dmapped Block({1..n});
+
+var A: [D] real;
+
+testit(GLOBCONST);
+
+proc testit(valblc: int) {
+ forall (i,a) in zip(D,A) {
+ local {
+ a = valblc+i/10.0;
+ if (here.id == 0) {
+ sync$.readFF();
+ }
+ }
+ }
+}
+
+writeln("A is: ", A);
Index: test/optimizations/remoteValueForwarding/bradc/testconstlocality.good
===================================================================
--- test/optimizations/remoteValueForwarding/bradc/testconstlocality.good (revision 0)
+++ test/optimizations/remoteValueForwarding/bradc/testconstlocality.good (revision 0)
@@ -0,0 +1 @@
+A is: 16.1 16.2 16.3 16.4 16.5 16.6 16.7 16.8 16.9 17.0
Index: compiler/passes/flattenFunctions.cpp
===================================================================
--- compiler/passes/flattenFunctions.cpp (revision 22838)
+++ compiler/passes/flattenFunctions.cpp (working copy)
@@ -69,6 +69,12 @@
// Otherwise passing by value is more efficient.
static bool
passByRef(Symbol* sym) {
+ //
+ // If it's constant, there's no need to pass it by reference
+ //
+ if (sym->isConstant()) {
+ return false;
+ }
if (sym->hasFlag(FLAG_DISTRIBUTION) ||
sym->hasFlag(FLAG_DOMAIN) ||
@@ -141,6 +147,22 @@
an LHS expr. */
type = type->refType;
SET_LINENO(sym);
+ //
+ // BLC: TODO: This routine is part of the reason that we aren't
+ // consistent in representing 'ref' argument intents in the AST.
+ // In particular, the code above uses a certain test to decide
+ // to pass something by reference and changes the formal's type
+ // to the corresponding reference type if it believes it should.
+ // But the blankIntentForType() call below (and the INTENT_BLANK
+ // that was used before it) may pass the argument by 'const in'
+ // which seems inconsistent (because most 'ref' formals reflect
+ // INTENT_REF in the current compiler). My current thought is
+ // to only indicate ref-ness through intents for most of the
+ // compilation (at a Chapel level) and only worry about ref
+ // types very close to code generation, primarily to avoid
+ // inconsistencies like this and keep things more
+ // uniform/simple; but we haven't made this switch yet.
+ //
ArgSymbol* arg = new ArgSymbol(blankIntentForType(type), sym->name, type);
if (sym->hasFlag(FLAG_ARG_THIS))
arg->addFlag(FLAG_ARG_THIS);
Index: compiler/optimizations/remoteValueForwarding.cpp
===================================================================
--- compiler/optimizations/remoteValueForwarding.cpp (revision 22838)
+++ compiler/optimizations/remoteValueForwarding.cpp (working copy)
@@ -141,6 +141,20 @@
return true;
}
+ //
+ // See if this argument is const; if it is, it's a good candidate
+ // for remote value forwarding. Why the check against 'ref' types?
+ // Because some compiler-created arguments currently indicate
+ // ref-ness only via the type and not the intent. See the big
+ // comment I added in addVarsToFormals() (flattenFunctions.cpp) in
+ // this same commit for an example.
+ //
+ if (arg->intent & INTENT_FLAG_CONST &&
+ !arg->type->symbol->hasFlag(FLAG_REF)) {
+ return true;
+ }
+
+
// We may want to add additional cases here as we discover them
// otherwise, conservatively assume it varies
------------------------------------------------------------------------------
Learn Graph Databases - Download FREE O'Reilly Book
"Graph Databases" is the definitive new guide to graph databases and their
applications. Written by three acclaimed leaders in the field,
this first edition is now available. Download your free book today!
http://p.sf.net/sfu/13534_NeoTech
_______________________________________________
Chapel-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/chapel-developers