Hi Chris -
Sorry for the delayed response. It's feature-freeze week for Chapel 1.11,
so everyone's a bit underwater right now. Vassily Litvinov and I
tag-teamed on composing this response. When something is written in the
first person, it's typically me speaking unless noted otherwise.
I’m asking these questions from the perspective of someone who would
write generic parallel libraries similar to Thrust or Threading Building
Blocks. The examples I've included below were designed as tests to
reveal details of Chapel's semantics. They're meant to help identify
ways to write robust generic code that works predictably in all use
cases.
I'll mention at the outset that while generic programming has been a
motivating theme in Chapel since the outset, the support has not yet
reached the state of maturity that we'd hoped for. Specifically, we've
had a multi-year academic collaboration running with the goal of adding
support for constrained generics to Chapel (similar to the proposed C++
'concepts' feature), but that effort has languished, and we're currently
trying to figure out how to staff it internally for the coming year in
order to gain some traction. We rely on generics heavily in our own
libraries (e.g., our support for domain maps), but in a way that suffers
from many of the similar issues as C++ templates.
More generally in response to your comments, I want to say that those of
us who originally architected the language came at it much more from the
perspective of parallel programmers seeking to significantly improve the
status quo rather than as Programming Language experts (with a capital
"PL"). At times, laxness in our specification can be attributed to this
perspective or lack of expertise. We're very open to having those who are
more "expert PL" types give us feedback to help improve the language,
proposing changes to the specification to make it clearer/more
bullet-proof, etc.
1. Higher-order functions
I wrote a test to see if higher-order functions worked. Compiling this
code gives me internal error CAL0056. What should happen in this code?
Are higher-order functions supported?
proc times2(x) {
return x * 2;
}
proc map(f, t) {
return (f(t[0]), f(t[1]))
}
map(times2, (1, 1+1i))
Chapel's current support for higher-order functions was developed as an
intern project that has not received (m)any additional development cycles
since then. It continues to be something that we would like to support
more fully, but which has not been on the critical path for
current/prospective users or for our own internal use cases.
One limitation of the current implementation is that generic functions,
such as your times2(), cannot be used as higher-order functions. For
details, please see:
$CHPL_HOME/doc/technotes/README.firstClassFns
which calls out this limitation.
Practically speaking, I think the reason that this hasn't been more of a
limiting factor for our own use is that we tend to turn such cases
inside-out, e.g., by providing iterators for user-defined data structures
and calling the 'times2' function in the body of the loop rather than
passing the function into the a procedure to execute the loop. Thus, we'd
end up with something like:
for[all] e in myADT do
times2(e);
And/or, we'd define the data structure to be promotable, in which case,
we'd simply write:
times2(myADT);
However, both of these approaches are currently fairly specific to
homogeneous data structures (and will be until there's more of a
first-class loop unrolling story for user-defined iterators).
None of this is to say that we wouldn't value higher-order functions as
well, I'm simply capturing why I think that their lack hasn't been a major
barrier to our own support for rich (homogeneous) data structures to date.
2. Semantics of references and values
I had trouble understanding when Chapel uses value vs. reference
semantics.
Let's start with your inferences first and then explain what's happening
in your example:
What I infer from this is that
* Domains are mutable objects
Yes, domain variables are mutable. Whether domains are objects depends on
your definition of "objects." They are not objects in the Java sense.
* Domain variables hold references to domains
* Arrays of domains hold references to domains
This isn't quite right. Domain and array variables have value-oriented
semantics in most every context. The value of a domain can be thought of
as the set of indices that it describes; the value of an array is
effectively the collection of elements that it describes and their values.
The identity of a domain also matters in certain contexts -- primarily
when it is the domain that defines an array's index set. In declaring an
array, the identity (not value) of its domain matters and so is captured
and forms an ongoing property of the array throughout its lifetime (we say
that the domain's identity is part of its type).
* Assignment on domains (lhs = rhs;) does not modify the lhs reference, but
copies the value of the rhs object into the value of the lhs object
That's correct. It essentially makes the lhs domain describe the same
indices as the rhs domain, while each retains its individual identity and
type (e.g., one could be a distributed domain and the other local).
For instance, in the following situation, resizing an array exposes the
semantic difference. At the beginning of the code, A[5] is the domain
of B.
// Create array B whose size is given by A[5].
var myDom = {1..5};
var A : [myDom] myDom.type;
var B : [A[5]] int;
// B can be resized by assigning to A[5].
A[5] = {1..3};
writeln(B.domain);
// After resizing A, B can no longer be resized by assigning to A[5].
myDom = {1..4};
myDom = {1..5};
A[5] = {1..2};
writeln(B.domain);
Here's what's going on with your example:
When you shrink A by re-assigning its domain to {1..4}, A[5] is no longer
a valid element of the array. Typically, this means that that domain
would cease to exist; however, since B was declared using A[5] as its
domain and B requires that domain for its definition, the domain that A[5]
described is kept around even though A[5] can't be used to refer to it
anymore.
When you grow A's domain back to {1..5}, a new domain value is created for
that fifth element, but its identity is not in any way associated with the
previous A[5] value, nor with B, so subsequent assignments to it don't
affect B.
If it helps, Chapel's domains are currently implemented as a record that
wraps a reference-counted class. This gives them the mix of value and
reference semantics that I'm describing and you're seeing above. B's
reference to the original A[5] domain keeps that domain value alive, but
when a new A[5] domain is created there is no relationship between the two
things.
I'll mention that there's been discussion of changing Chapel's semantics
so that having A[5] go away like that would render B unusable (i.e., "user
error to refer to an array whose domain no longer exists") rather than the
current scheme, to simplify the semantics and implementation. It's not
clear to me which way this will go yet, and if you have input, we'd be
happy to hear it.
At the end, is it possible to restore the situation that A[5] is the
domain of B?
Nope.
Is there a way to modify the reference held in a domain variable, instead
of modifying the domain that it references?
Not within the language, only by mucking with the internals of the
implementation.
Incidentally, domain assignment leads to some oddness because domain
expressions are lvalues. The statement ({0..1}) = {0..2}; compiles, while
(1) = 2; doesn’t compile. Without the parentheses, both are syntax errors.
I'd call this a bug, and I don't think it's one we're aware of -- thanks
for pointing it out. Personally, I don't view this as a bug in domain
assignment so much in const-ness checking and/or how the current compiler
implements domain literals like {0..1}, but that's just a guess.
3. Subclassing
It looks like Chapel has subclassing with virtual methods. I tried it out
with the code shown below, which tests how generics interact with
subclassing. It gives internal error CHE0496. What does that error mean?
class Base {
proc foo(type T) : int;
}
class Derived1 {
proc foo(type T) : int {
var x : T;
}
};
class Derived2 {
proc foo(type T) : int {
var x : T;
var y : T;
}
};
var x : Base;
if (stdin.read(bool)) {
x = new Derived1();
}
else {
x = new Derived2();
}
x.foo((int, int));
Internal errors mean you hit a bug in the compiler. "CHE0496" directs us
to where in the compiler it occured; this has no intended meaning to the
end user. Throwing the --developer flag decrypts it slightly, but
throwing such cases to us is the right thing to do.
This program should generate a clear user error.
In the current implementation/specification, procedure prototypes (those
with no body like your Base.foo()) are supported for interoperability
purposes only and are not intended as a means of creating a pure virtual
function.
If I change your code as follows:
* add a body to Base.foo()
* add return statements to the foo() overloads
* declare the Derived classes as being derived from Base
then it compiles without errors:
class Base {
proc foo(type T) : int { return -1; }
}
class Derived1 : Base {
proc foo(type T) : int {
var x : T;
return 2;
}
};
class Derived2 : Base {
proc foo(type T) : int {
var x : T;
var y : T;
return 3;
}
};
var x : Base;
if (stdin.read(bool)) {
x = new Derived1();
}
else {
x = new Derived2();
}
x.foo((int, int));
4. Dependent types
One nice feature of Chapel is that function parameters can be used in the
types of other parameters. This makes it possible to impose constraints on
array domains, such as writing a function where all array parameters have
equal domains. But why is it sometimes an error for a parameter to
reference an earlier parameter, while other times it’s an error for a
parameter to reference a later parameter?
I believe the intention is that a later argument should always be able to
refer to an earlier one -- the only cases I'm aware of where this is not
supported are simply cases that we haven't gotten to yet (bugs /
unimplemented features), rather than intentional decisions.
I believe it should be an error for earlier arguments to refer to later
ones, in order to establish a well-defined order of evaluation. One might
argue that cases in which one can currently get away with this should
result in errors.
Why does the error message say “’reindex’ used before defined”?
// error: ‘reindex’ used before defined
proc bad_1(const sizes : [{0..1}] int, ref A : [sizes[0]] int) {}
This is another bug in the compiler, sorry. 'reindex' is a variable
inserted internally by the compiler, in this case incorrectly.
Note also that sizes[0] is an integer and so cannot be used to specify an
array's domain (you may want 0..sizes[0], for example, though this doesn't
fix the bug).
// First parameter references second, this works
proc good1(ref A : [sizes[0]] int, const sizes : [{0..1}] int) {}
This works (with a tweak) because under the hood 'sizes' is used after
good1() has started. good1() resizes the incoming 'A' to the domain
specified by sizes[0]. The tweak is to make 'sizes' be an array of domains
rather than integers.
// Error: ‘D’ used before defined
proc bad_2(const n : D.idxType, D : domain(1,int,false)) {}
// Second parameter references first, this works but is the opposite order
from good1
proc good2(D : domain(1,int,false), const n : D.idxType) {}
Both examples match our intention. We could probably support your bad_2
example; currently we do not. In most cases, we require each variable to
be defined before it is used.
5. Types
The first sentence of the chapter on types in the language specification
0.96 says, “Chapel is a statically typed language with a rich set of
types.” This sentence is misleading because the type system described in
the language spec is not a static type system. From what I’ve seen, I
think Chapel’s type annotations would be best understood as dynamically
checked assertions.
Chapel’s type system is not a static type system because types are
intermingled with evaluation in a way that prevents types from being
checked, once and for all, at compile time. Verifying type-correctness at
compile time is the point of a static type system. Chapel doesn’t do
that. Moreover, types can have side effects and can depend on mutable
values, which pretty much precludes Chapel from doing that in the future.
What are Chapel’s types, really?
Vass writes: I agree, in some cases we rely on types at run time. For
example when casting between class types or when performing array
operations. In most other cases, our intention is to ensure type safety at
compile time, making dynamic type checking unnecessary.
Brad writes:
One closing comment that's also a bit of an open question: You mentioned
using an array argument's formal type as a means of specifying a
constraint on that variable. I.e., one might read:
proc foo(X: [?D] int, Y: [D] real) { ... }
as being a constraint that X and Y have the same domain. At present, when
a formal argument type names a specific rectangular domain like this, it
constrains the array argument to have that size/shape, but it could be a
different index set, and the compiler "reindexes" the array such that it
can be accessed using D's indices within the function. Thus, the above
could be called with:
var A: [1..3] real;
var B: [1..10] real;
foo(A, B[5..7]);
and within foo(), both A and the 5..7 slice of B could be indexed using
the indices 1..3.
I mention this both because it's a common misunderstanding (and one that
your mail seemed to suggest you'd fallen into), but also because we've
been planning on splitting these two cases ("Constrain the actual to have
this domain" vs. "reindex the actual to have this domain") into two
separate things to avoid confusion and because the second interpretation
has a much higher runtime overhead in general than the first (and yet, is
not the common case).
Thanks for your questions and bugs (which we'll file). If you have
follow-up questions or proposed improvements to the language /
specification / implementation based on this, please let us know (if
you're interested in contributing those proposals as patches against the
repository, all the better!)
Thanks,
-Brad and Vass
------------------------------------------------------------------------------
Dive into the World of Parallel Programming The Go Parallel Website, sponsored
by Intel and developed in partnership with Slashdot Media, is your hub for all
things parallel software development, from weekly thought leadership blogs to
news, videos, case studies, tutorials and more. Take a look and join the
conversation now. http://goparallel.sourceforge.net/
_______________________________________________
Chapel-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/chapel-users