On Sun, Mar 25, 2012 at 11:03 AM, Eli Gottlieb <[email protected]> wrote:

> I've seen everything you said about the compilation model happen in my own
> attempts.  My only way of slinking around that issue has been to output to
> a bitcode able to do link-time resolution of opaque/abstracted type
> representations.
>

Yes. There are a couple of variations on this idea that can give you a
place to stand to validate the cross-compilation-unit requirements. For
BitC, there is the added complication that this interacts with unboxed
types. For *any* language carrying a requirement that "thing X must be
defined in exactly one compilation unit" there is a failure to scale at the
developer level: your group and my group, working in mutual isolation, are
very likely to provide type classes for the same thing. That will work fine
until our respective libraries are somehow linked into a single binary.
Murphy's law ensures that both of the respective library authors will have
moved on to something else at that point, making resolution impossible.
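
Rust later ran into exactly this "one instance per type, program-wide" constraint and addressed it up front with its orphan rule, which makes the duplicate-library failure a compile-time error in a single crate rather than a link-time surprise. A minimal sketch (the `Pretty`/`Meters` names are illustrative, not from this thread):

```rust
// Two libraries independently providing an instance for the same
// third-party type would collide when linked into one binary. Rust
// forbids this in advance: you may only impl a trait for a type when
// the trait or the type is local to your crate (the "orphan rule").

struct Meters(f64); // a local type, so a local impl is permitted

trait Pretty {
    fn pretty(&self) -> String;
}

impl Pretty for Meters {
    fn pretty(&self) -> String {
        format!("{} m", self.0)
    }
}

// If a second crate also wrote `impl Pretty for Meters`, the coherence
// checker would reject the pair outright -- the failure-to-scale mode
// described above, detected before both authors have moved on.

fn main() {
    assert_eq!(Meters(3.0).pretty(), "3 m");
}
```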


>  This means that I have to allow for some whole-program tracking of
> abstracted representations.
>

And/or deferred compilation in some degree, but yes.


>  It leaves a bad taste in one's mouth, but it can be... workable.
>

Hmm. I no longer believe so, and that's really the topic of the follow-on
postmortem note that is yet to come.


> Its downside is that representation sizes have to be specified for
> abstracted types exported as binary blobs from libraries...
>

... Or handled as by-reference objects...

Yup. If you hum a few bars I suspect we can find a harmony here. :-)


> Ideally, I think we'd compile a New Systems Language down to a Typed
> Assembly Language, and the portion of the compiler dealing with
> polyinstantiation and type abstraction would retain access to generated TAL
> code.  Work on TALs is ongoing.
>

I've gone back and forth on this. One of the nice parts about working
initially in S-expr syntax is that I had seen S-expression based assembly
syntaxes. It gave me just enough room to see that type classes as
constraints can be used to encode safety requirements on span-dependent
resolutions, and that this is a nice basis for deferred compilation.
Further, it allows a single conceptual type system to proceed all the way
down from source to assembly, which (so far as I know) *hasn't* been
managed in a TAL before.

TAL doesn't alter the whole-program requirement, of course.

But you also have to ask what the important value of TAL actually is if you
are starting from a safe language and/or a high-level type system [as in
CLI]. The value isn't really that high if you've already done the typing
elsewhere. The notion of a continuous type system from top to bottom of the
compiler is much more useful, because it provides a sanity check on the
implementation of the entire compiler. Every current TAL-emitting compiler
has an internal type system, but most have different systems at different
layers of abstraction in the compiler, and the inter-stage translations are
unchecked. This is one of the places I'd like to explore further in
BitC-future.


> On the fortunate side, C++ already had an ABI problem with exporting
> classes from shared libraries.  C++ requires that both programmers use
> C++, and that both use the same C++ ABI.  Yick.
>

I don't see that as horrifying, but I do agree that something comparable to
CLI with unboxed types that are a bit more first class is certainly a
better way to go. Still: one research problem at a time!


> I have ended up having to "add back" regions into my own language just
> this year.  Pointers have ended up being just another type constructor, and
> they do have to be associated with both a region and a mutability. The good
> news is that once you *do* associate them, the region can be a universal
> variable; a function can be polymorphic in the region it touches by taking
> a pointer as a parameter.
>

Yes. But look carefully at what happened in Cyclone. A lot of people found
its regions hard to use/understand, and I don't think anybody has done a
good post-mortem on why polymorphic regions didn't play out well there.
particular, it's not clear whether the problems had to do with defaulting,
typing, or interaction with some other aspect[s] of the language. I really
wish Greg and Trevor would do a post-mortem on that, pro or con and why.

There is a surprising amount of stuff that is doable in the theory that
isn't doable in the programmer's head. Real languages have to bridge that
gap.
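
Rust's lifetimes are the closest surviving relative of Cyclone's regions, and they show the shape of the idea from the quoted paragraph: the pointer type carries a region variable, and a function becomes region-polymorphic simply by accepting the pointer. A sketch in Rust (not Cyclone syntax):

```rust
// A reference type &'r str pairs the pointee type with a region
// ("lifetime") variable 'r. A function taking such a reference is
// implicitly polymorphic over the region it touches.

fn first_word<'r>(s: &'r str) -> &'r str {
    // Works uniformly whether `s` lives in the static region, the
    // caller's stack frame, or a heap allocation: 'r is a universal
    // variable, as in the quoted description.
    s.split_whitespace().next().unwrap_or("")
}

fn main() {
    let heap = String::from("hello world"); // heap region
    assert_eq!(first_word("static region"), "static");
    assert_eq!(first_word(&heap), "hello");
}
```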


> The good news is: inference, polymorphism and subtyping *do* get along.
>  They get along just fine, if you're willing to toss Top, Bottom and
> principal typing out the window *in the special case where HM unification
> would end with a polymorphic generalization*.  Consistently viewing this
> problem as impossible (either a full semi-unification problem or a failure
> of the existence of principal types) has, I've noticed, been holding back
> certain strands of PL thought for a long time.
>

And in practice you don't even need to throw out all of those. For example,
if you require type declarations at module boundaries (which you need for
other reasons), you probably have enough to do the whole thing.

Yes, PL research is being held back by the bias in favor of complete
solutions. In my opinion, *sound* inference is important, but *complete*
inference is less so.
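
Rust makes roughly this trade in practice: full type declarations are mandatory at item (module) boundaries, and inference is purely local within function bodies, which keeps inference sound and predictable without demanding completeness. A small sketch of the division of labor:

```rust
// The exported boundary carries a fully declared type...
pub fn mean(xs: &[f64]) -> f64 {
    // ...while everything inside the body is inferred locally:
    // `n` gets its type from use, with no annotation.
    let sum: f64 = xs.iter().sum();
    let n = xs.len() as f64;
    if n == 0.0 { 0.0 } else { sum / n }
}

fn main() {
    assert_eq!(mean(&[1.0, 2.0, 3.0]), 2.0);
    assert_eq!(mean(&[]), 0.0);
}
```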


> Once you have subtyping, you can introduce existential types (possibly
> with a restriction such as "existentials can only be used in the
> this-pointer place") and recover a fair approximation to objects,
> interfaces, BitC capsules, etc.  Since we're dealing with existentials,
> there's also a strong resemblance and possibly even a full morphism to
> first-class modules, thence to type-classes and instances...  And then we
> all get lost in how deep our thoughts are.
>

Ha! Yes on all of that.
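
For what it's worth, Rust's trait objects are very nearly the restricted existential described in the quote: the witness type is hidden, and it is only reachable through the receiver ("this-pointer") position of the interface's methods. A sketch, with illustrative names:

```rust
trait Shape {
    fn area(&self) -> f64;
}

struct Square(f64);
struct Circle(f64);

impl Shape for Square {
    fn area(&self) -> f64 { self.0 * self.0 }
}
impl Shape for Circle {
    fn area(&self) -> f64 { std::f64::consts::PI * self.0 * self.0 }
}

// Box<dyn Shape> is an existential package: "some type implementing
// Shape". The packed witness type can only be consumed through `&self`,
// i.e. the this-pointer restriction from the quote.
fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

fn main() {
    let shapes: Vec<Box<dyn Shape>> =
        vec![Box::new(Square(2.0)), Box::new(Circle(1.0))];
    let expected = 4.0 + std::f64::consts::PI;
    assert!((total_area(&shapes) - expected).abs() < 1e-9);
}
```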


> The lexical resolution issues of type-classes (especially with respect to
> their "looking like" object instances) can be dealt with as by Odersky et
> al in "Type-Classes as Objects and Implicits".
>

I looked at Odersky's work several times. I think he's on to something (and
I've said so here and elsewhere), but I'm unhappy with his realization for
various reasons. What he lays out smells a lot like a single-inheritance
class system in which derivation only goes one level, all derived classes
are final, and each has exactly one instance. Which smells to me like
either there ought to be a reconciliation of his work and conventional
single inheritance, or we ought to understand why not. I just keep
scratching an itch that there's something in there we haven't quite seen
yet.
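
That "one-level, final, single-instance class" reading becomes concrete if you write the dictionary passing out by hand. A sketch in Rust rather than Odersky's Scala, with illustrative names throughout:

```rust
// A type class lowered to a record of functions (a "dictionary").
struct ShowDict<T> {
    show: fn(&T) -> String,
}

fn show_i32(x: &i32) -> String { x.to_string() }
fn show_bool(b: &bool) -> String {
    String::from(if *b { "yes" } else { "no" })
}

// A class-constrained function becomes an ordinary function taking the
// dictionary as an explicit (rather than implicit) parameter.
fn describe<T>(dict: &ShowDict<T>, v: &T) -> String {
    (dict.show)(v)
}

fn main() {
    // Each "instance" is one final, fully-applied dictionary value:
    // no further derivation, exactly one level below the "class".
    let show_int = ShowDict { show: show_i32 };
    let show_b = ShowDict { show: show_bool };
    assert_eq!(describe(&show_int, &42), "42");
    assert_eq!(describe(&show_b, &true), "yes");
}
```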


> There might be a good hybrid approach in allowing only one truly global
> type-class instance per type, accessible via a *special* way of capturing
> the instance as an implicit parameter/value, but lexically resolve
> instances when we don't want the One True Instance.
>

This rapidly becomes very nasty as type class constraints propagate. In
particular, consider that the '<' operator in the iterator of a sort
algorithm may be of a different instance than the '<' operator that orders
the collection, even though the index and the elements are of the same
type. *Really* easy to make a mess once you start down that path unless the
instances used can be named explicitly, and perhaps even then.

My general conclusion is that using ambient authority for overload
resolution is a bad idea. That's really the punch line of the next note.
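
The escape hatch most languages converge on is exactly explicit instance naming: pass the ordering as a value instead of resolving it ambiently. In Rust terms, which I'm using only as an illustration:

```rust
fn main() {
    let mut scores = vec![3, 1, 2];

    // Ambient resolution: the unique Ord instance for i32.
    scores.sort();
    assert_eq!(scores, vec![1, 2, 3]);

    // An explicitly named "instance": a comparator passed as a value,
    // overriding the ambient '<' at this one call site. The element
    // type is unchanged; only the ordering differs.
    scores.sort_by(|a, b| b.cmp(a));
    assert_eq!(scores, vec![3, 2, 1]);
}
```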


> Moving on, I've got a question: *why* did you tackle the problem of
> inference and mutability together by trying to treat mutability as part of
> the data type?
>

We did and we didn't. The essence of it is that your account of C/C++
behavior isn't quite right. Let's start by ignoring the fact that 'const'
in C/C++ is entirely advisory (and in C# too, because of a construction
mis-specification).

For scalars, mutability is a property of the *location*. That's certainly
true in BitC. It holds up because mutability on the type does *not* unify
at "copy boundaries".
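
Rust scalars behave the same way: mutability belongs to the binding (the location), and it does not propagate across a copy. A small sketch:

```rust
fn main() {
    let frozen: i32 = 5;  // immutable location
    let mut hot = frozen; // copying into a mutable location is fine:
                          // mutability does not unify across the copy
    hot += 1;
    let chilled = hot;    // copying back out "re-freezes" the value
    assert_eq!(frozen, 5);
    assert_eq!(chilled, 6);
}
```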

But for aggregates, mutability on a contained field is inextricably part of
the *type*. Which is actually the root of an enormous mess, because then we
will get into a need for metaconstructors. On the one hand, I want to be
able to define an unboxed datatype having the property that *some* fields
are always frozen (e.g. for invariant preservation). On the other, I want
to be able to specify immutable instances of an unboxed datatype in which
the non-frozen fields of that instance are immutable.

In the end, there are two very separate concepts here. The problem is that
we tried to adopt the keyword conventions that were already in use in other
languages, and those conventions hopelessly conflate two things. I wouldn't
do that again.

And I keep thinking that I want the capability (albeit used rarely) to
specify something like a list whose length is constant but whose elements
are not. At this point the mutability specification starts to bear a
surface resemblance to regions...
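
That "constant length, mutable elements" case does turn up in practice; Rust's boxed slice has precisely that shape. A sketch:

```rust
fn main() {
    // Box<[T]>: the length is fixed at allocation time (no push/pop),
    // but each element remains independently mutable.
    let mut xs: Box<[i32]> = vec![1, 2, 3].into_boxed_slice();
    xs[0] = 10;    // elements are mutable...
    // xs.push(4); // ...but nothing can change the length
    assert_eq!(xs.len(), 3);
    assert_eq!(xs[0], 10);
}
```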


> Look at it from the C/C++ point of view rather than the ML point of view,
> and it sort of stops making sense.  An integer value can be passed to a
> parameter expecting an integer, whether or not the procedure will modify
> the parameter slot in which it received the integer.  In an imperative
> language, if we want to actually modify the parameter, we pass it by
> reference ("var i: integer" in Pascal) or by passing a pointer ("int *i" in
> C).  In ML or Haskell, I'd say it makes sense to classify mutable
> slots/references to mutable memory cells as their own data-type because
> *everything* other than referenced memory cells is immutable.  Those
> mutable cells are not *real* values: the *reference* to one is a real
> value, and the *contents* of one is a real value, but the memory cell
> itself is not a first-class citizen of the language.
>

Unless you want finer-grain controls than "whole cell is [im]mutable"...
Which we did.


> Anyway, my condolences for the death of your language by starvation of
> funding.  It really sucks to see BitC go when you've learned so much about
> how to do BitC right, but you definitely did manage to make a ton of
> important contributions.  Though, since I should ask, did you ever talk to
> or collaborate with the Cyclone folks?  Habit folks?  ATS folk(s)?
>

If I thought I could have re-worked it in less than two years, I might have
continued.

I talk to Mark Jones (Habit) fairly often. Now that I live in Seattle, we
get together once in a while. Trevor Jim was down the hall from me at Penn
during my doctoral work, and I exchange an occasional rare email with Greg.
We actually *did* spend time talking to Frank Pfenning, but not in the
context of BitC. Haven't had much contact with the ATS folks themselves. My
current feeling about ATS is that it's hopelessly complicated for real
users. Still, it's a very interesting set of ideas, and I look forward to
its refinement over time.


shap
_______________________________________________
bitc-dev mailing list
[email protected]
http://www.coyotos.org/mailman/listinfo/bitc-dev
