Re: F#

David MacIver Sun, 11 Nov 2007 07:34:21 -0800

On Nov 11, 2007 1:54 PM, Jon Harrop <[EMAIL PROTECTED]> wrote:
>
> On Sunday 11 November 2007 11:39, David MacIver wrote:
> > On Nov 10, 2007 8:42 PM, Jon Harrop <[EMAIL PROTECTED]> wrote:
> > > That is certainly a different comparison but I do not understand why you
> > > would consider it "fairer".
> >
> > Because a) Using explicitly named types has certain advantages, both
> > in terms of the programmer's usage and encoding invariants in the
> > program
>
> Explicit types also have disadvantages. I think it is only fair to consider
> both approaches.


Certainly. I'm perfectly willing to consider implicit types as a
valuable thing. But it's an independent discussion.

> > and b) Your claim that this is an advantage of the "ML family
> > of languages" then actually holds. Currently you're comparing a very
> > specific feature of OCaml (which its immediate descendant F# has
> > inherited).
>
> The ML family of languages basically is OCaml and F# now:

I'm sure that will come as news to the people working on MLTon, Alice, SML/NJ...

> http://ocamlnews.blogspot.com/2007/11/most-popular-functional-languages-on.html

That's using a rather specific metric. I rarely use apt for language
runtimes/compilers I actually care about because it's usually a few
versions behind.

> > ...
> > Nice requires explicit packages for everything. This is a Java
> > interoperability issue.
>
> Ok. Do any languages offer multiple dispatch without this baggage?

Do I need to jump up and down and shout 'CLOS' a few more times? :-)
There are a wide variety of multiple dispatch object systems for
Scheme as well. Clojure appears to offer one, but I don't know what
its approach to packages is. In general, multiple dispatch seems
rather popular in the Lisp world.

If you want a statically typed version there's also Needle (
http://www.nongnu.org/needle/ ), but I've never used it and the
project seems mostly dead, and Jazz ( http://www.exalead.com/jazz/ ),
which I don't know the current state of.

In general there really is a lack of a good language which combines
these features and is actually still alive. My conjecture is that this
is because the OO fans go "Eww, multiple dispatch! Give me single
dispatch or give me death!", the FP people go "Weird. It looks like
pattern matching but... different" and the Lispers go "Oh noes! Itth
thtatically typed! Run!". :-) It's a style which sits in an odd middle
ground that hasn't got much attention.

> What are the implications of the warning? Is the code no longer statically
> typed?

The idea seems to just be to make it clear when one method overrides
another. Because of Nice's ad hoc overloading on top of everything
else it isn't always.

> > >   | e -> string_of () e
> > >
> > > That translated to:
> > >
> > >   <!T> String toString (Int<!T> i) = toString(i.i);
> > >   <!T> String toString (Var<!T> v) = toString(v.t);
> > >   <!T> String toString (Mult<!T> m) = f(m.fst) + f(m.snd);
> > >
> > > So you had to handle every other constructor individually by hand. That
> > > is asymptotically more code.
> >
> > Either you've misunderstood the translation, or I've misunderstood
> > what your code is doing:
>
> I think I misunderstood the translation. Your first two lines define a
> function called "f" that is equivalent to my "string_of_mul" function:

Yes, sorry, I should have followed a similar naming convention.

> <!T> String f(Expr<!T> t) = toString(t);
> override <!T> String f(Add<!T> t) = "(" toString(t) ")";
>
> The first of those two lines is equivalent to my catchall pattern.

Right. The catch alls translate to a default implementation which
others override.

> > Here's a translation of the Nice code.
> > ...
> > // Differentiation of expressions.
> > Expr<'a> diff(Expr<'a> v, 'a x); // Type parameters are now implicit,
> > indicated by starting with '.
>
> You seem to be using both overloading and type inference. I do not believe
> that is possible.

The return types can be inferred because the function is an override
(you still need to specify the return type of the top level version).
The rest can't. The implicit type parameters aren't inferred - it's
just that 'a introduces a new type parameter where it's not already in
scope. It's just syntactic sugar.

> > `*` (Expr<'a> x, Int<'a> 0) = Int 0; // Still have to repeat this. :(
>
> In general, you will have to factor out these arbitrarily-complicated repeated
> expressions into separate functions.

Yes, I think that's probably the case. See comments about how adding
some form of pattern matching on top of this would probably be really
useful.

> > `*` (Expr<'a> x, Mult<'a> u v) = (x * u) * v;
>
> This is relying upon the ability to destructure one level only, right?

There's in principle no reason why you can't nest this arbitrarily.
It's just that the implementation gets more complicated. This code
didn't seem to need it.

> > > From what I've seen, the multiple dispatch approach seems to be
> > > significantly worse (i.e. several times as much code) at handling a small
> > > subset of patterns (shallow and enumerated constructors) but it doesn't
> > > look as if it
> >
> > I've no idea where this "several times as much code" is coming from.
>
> The original 40 vs 5 line comparison.

Ok. But the Nice example is significantly closer. The pseudocode (of
which about 70% can be mechanically translated into Nice and the
remainder requires some more intelligent compilation). The original
code was basically Java. If you're going to argue that Java is more
verbose than OCaml I don't think you'll gain much except a chorus of
agreement. :-)

> > > Also, how do you distinguish between open and closed sum types and how
> > > does it
> >
> > All sum types are open. Types may be final however.
>
> If all sum types are open then you cannot provide static exhaustiveness and
> redundancy checking?

It's true that you can't provide redundancy checking, because
redundancy isn't actually possible. Exhaustiveness can still be
checked though. It's just that one more often needs a default
implementation to guarantee it than one would need catch all patterns
in OCaml.

> I'm not sure I completely understand the problem but can you not simply have:
>
>   Seq.split : ('a -> bool) -> 'a seq -> 'a seq * 'a seq
>
> and a specialized version for arrays:
>
>   Array.split : ('a -> bool) -> 'a array -> 'a array * 'a array

You can. But then if you want a function which depends only on the
ability to split things you can't write one which accepts an arbitrary
sequence and splits it - every time you want to specialise to arrays
you need to write a separate version of that function which only
accepts arrays in order to make use of that. And if you wanted to
write a specialised split for finger trees you'd need to do that to.

Now suppose we wanted to define (for example. I can't actually think
of a use for it :-) ):

(Sequence<'a> 'b, Sequence<'a> 'c) splits ('a -> bool f, 'b x, 'c y) =
(split(f, x), split(f, y))

In your version we would need to provide potentially 9 different
versions of this! One for each combination of 'finger tree', 'array'
and 'anything else'.

> Are you asking for run-time dispatch to type specialized functions?

Yes (Although it is often resolvable at compile time. e.g. if I apply
split to something known to be an array at compile time it shouldn't
cause a runtime check on the type). That's exactly what multiple
dispatch *does*.

> > What do you mean by 'depth' here? Are we talking about or patterns or
> > nesting constructors? If the former, I think you're right. If the
> > latter, see pseudocode.
>
> I was referring to A(1, B(2, C(3, D 4))) and so on as "deep patterns".

Ok. But there's no reason those are noticably harder (except for at
the implementation level in order to make them efficient) than
destructuring a single level. If I can destructure x in f(x) to be (1,
y) then there's no reason I can't destructure y to be (2, z).

> > Right. But that's just a destructuring. It's not part of the actual match.
>
> Well, destructuring is part of pattern matching, of course.

Yes. But it's also possible independently of pattern matching (even
Javascript can do it, although I don't know if it handles nesting).
It's incredibly easy to implement - what's difficult is inferring the
necessary patterns from each destructuring. In particular I'm trying
to point out that destructuring + multiple dispatch is equivalently
powerful to (basic, i.e. sans or patterns) pattern matching.

> > This is what gets called in the default case.
>
> Ok. So 6 lines of OCaml became 12 lines with multiple dispatch (excluding the
> type declaration, imports etc.).

Yes. In the pseudo-code it becomes more competitive (although still
longer).  (One could, incidentally, cut out a lot of the mess by
defining an overload that `*` (Expr<T> x, Int<T> i) = i * x, but that
changes the semantic meaning in general even though it's ok here, so I
wanted to avoid it).

> Right, but choosing overloading has cost you type inference and automatic
> generalization. That is one of my main gripes with Scala because it leads to

The ad hoc overloading certainly does. I rather wish Nice didn't have
it. Pure ML-sub can retain a reasonable amount of type inference.
Presumably pure ML sub with structural interfaces (the core of Nice's
type system) can too.

> enormous verbosity.
>
> > In general the overhead in Nice over basic pattern matching is
> > approximately equivalent to providing a type signature for each top
> > level function, which is often considered good practice anyway. There
> > is indeed not a good equivalent of or patterns, guards, etc.
>
> Per-function type annotations are considered good practice in Haskell because
> it has an undecideable type system but not in ML, where you generally see no
> type annotations whatsoever.

Haskell's core type system isn't undecidable as far as I know. Type
inference is entirely possible and no less straightforward than in ML
with Haskell's core Hindley Milner + type classes type system. It's
more a case of readability and explicit public contracts for the
module. Haskell's type system + a bazillion add-ons found only in GHC
head is of course. ;-)

> > > If you are not aware of modern pattern matchers (i.e. since SML) then I
> > > think
> >
> > Well, my SML is a bit rusty too. But the only bit of pattern matching
> > that couldn't be done in Haskell (which I know at least moderately
> > well) was the or pattern (which is very nice. I've definitely wanted
> > something like that before). The variants couldn't, but that's a type
> > system issue, not pattern matching per se.
>
> They are tied together:
>
>   http://caml.inria.fr/pub/papers/garrigue-deep-variants-2004.pdf

Argh. Not another paper to read. My pile of such is a mile high at the
moment. ;-)

> This is my attempt at translating Jacques Garrigue's "Code reuse through
> polymorphic variants" paper to this problem:
>
>   type 'a expr =
>     [ `BinOp of [ `Add | `Mul ] * 'a * 'a
>     | `Int of int
>     | `UnOp of [ `Fact | `Fun of string ] * 'a
>     | `Var of string ]
>
>   let symbol_of_unop = function
>     | `Fact -> "!"
>     | `Fun f -> f
>
>   let symbol_of_binop = function
>     | `Add -> "+"
>     | `Mul -> "*"
>
>   let s s unop binop expr =
>     match expr with
>     | `Int n -> string_of_int n
>     | `Var s -> s
>     | `UnOp(op, f) -> s f ^ unop op
>     | `BinOp(op, f, g) -> s f ^ binop op ^ s g
>
>   let rec string_of_expr expr =
>     s string_of_expr symbol_of_unop symbol_of_binop expr
>
> The essence of this solution is to untie the recursive knot of the sum type.
> So the recursive expr type has been replaced by the type "'a expr as 'a".
> This explicit recursion makes the expr type extensible because we can use:
>
>   type expr2 = [ expr2 expr | ... ]

Interesting. I note that the type parameter on expr seems to have
become a String. Is this a necessary change? (Note that mine works
with any type parameter T because the toString method can accept
different types generically).

It also looks rather like you're manually compiling type hierarchies
into pattern matching. ;-)

> > Relevant features: Although unary ops are normally postfix, functions
> > are prefix. Additionally, this needs to be open so we can add new
> > operators in other modules. toString should handle additions without
> > modification unless they require a specific change to the rules.
>
> You can still extend the toString function, albeit with a fixed amount of
> boiler-plate code at each extension to close the recursion again and make it
> usable. For example, adding floats:
>
>   type 'a expr2 = [ 'a expr | `Float of float ]
>
>   let s2 s2 unop binop expr =
>     match expr with
>     | `Float x -> string_of_float x
>     | #expr as expr -> s s2 unop binop expr
>
>   let rec string_of_expr2 (expr : 'a expr2 as 'a) =
>     s2 string_of_expr2 symbol_of_unop symbol_of_binop expr

Ok. So it isn't extensible at all. It merely allows you to build
similar things on top of it, and if I were to write a function which
accepts an expr it wouldn't be able to accept an expr2, even if the
function definition had a perfectly valid catch all term, right?

So this seems to do significantly less than the multiple dispatch
version in terms of allowing for extensibility.

> > It seems to me that while multiple dispatch can only handle shallow
> > patterns, pattern matching can only handle shallow hierarchies. Maybe
> > OCaml has some killer feature which makes this not the case, but I
> > definitely don't see it in what you've shown so far.
>
> I think there are two approaches based upon pattern matching that are relevant
> here:
>
> Firstly, you have ordinary nested variant patterns. Using this approach, you
> must replace all functions that depend upon the type you extend, which is
> asymptotically more code than the OOP approach.
>
> Secondly, you have polymorphic variants in OCaml (not SML, Haskell or F#) that
> provide overlapping sum types, inferred sum types and open sum types. These
> provide the same extensibility as the OOP approach.

No they don't. They provide a different set of extensibilities - some
more, some less. In particular they provide no support for
encapsulation of type (see something like my Sequence example above)
or for one constructor inheriting from another. On the other hand the
structural subtyping and the ability to have overlaps is very nice,
and is something that most (statically typed - it Just Works in
dynamically typed languages, but you lose compiler checking) OOP
languages don't provide.

> However, every time you add any extensibility like this you are undermining
> static checking and performance. If you opt for complete extensibility then

You've not pointed out any problems with static checking in Nice.
Indeed there's a soundness proof for ML-sub. And we wouldn't want to
use a language without a soundness proof for its type system, would
we? ;-)

> that basically obviates all static checking and you can kiss goodbye to
> competitive performance. In all of my time coding, I have never used the

Depends what you mean by "competitive performance". Nice performance
is generally within a factor of two of Java's according to the
language shootout, and its dispatch mechanism could still use a fair
bit of optimising as I understand it (which, I suspect, it will never
receive).

> above design pattern to make my code extensible. In fact, I would strongly
> advise against it because you rapidly end up with an unmaintainable rats nest
> of dependencies. You'd need some pretty fancy whole program analysis and
> refactoring to recover such a project once it gets hairy.

Ah. The "My language deficiency is actually a feature" argument.

> > It also occurs to me that there's absolutely no reason one couldn't
> > combine the two paradigms. You could pretty much drop in multiple
> > dispatch as a replacement for the constructor matching in a normal
> > pattern matching system. I wonder how well that would work. Maybe I
> > should give it a try. I've been wibbling about putting some of my
> > half-baked language ideas into practice for the best part of a year
> > now. :-)
>
> You could probably retrofit that onto OCaml quite easily as a macro.

I doubt it.

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "JVM 
Languages" group.
To post to this group, send email to jvm-languages@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/jvm-languages?hl=en
-~----------~----~----~----~------~----~------~--~---

Re: F#

Reply via email to