RE: Value types (Was: [Caml-list] ocamlopt LLVM support)
Benedikt wrote: Nice. But it would be even better to avoid the dummy, in your example let x = u in let y = v in if b then x - v; y - u This does not only avoid the dummy, but would also allow lowering to cmovcc instructions in the backend selector (atleast for x86-32/64). FWIW, when you're using LLVM (as a library!) to generate equivalent code then you can just represent the tuples as structs (value types) and LLVM will take care of all of this for you. Again, it generates really efficient code with minimal effort. The latest example front-end for HLVM represents tuples as structs so it gets this for free. However, polymorphic recursion is *much* harder to implement with that design choice. Also, there is the question of when boxed vs unboxed tuples are more efficient. Cheers, Jon. ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
RE: [Caml-list] Re: optimize div to right shift (NOT!)
Edwin wrote: http://www.hackersdelight.org/HDcode/magic.c.txt http://www.hackersdelight.org/HDcode/magicu.c.txt

Nice! :-)

Cheers, Jon.
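Those Hacker's Delight routines compute the "magic" multiplier that lets a compiler replace division by a constant with a multiply and a shift. As a minimal hand-worked instance of the idea (a sketch, not taken from the thread: the name div3 is mine, and it assumes a 64-bit OCaml build so the intermediate product fits in the 63-bit native int):

```ocaml
(* Unsigned division by the constant 3 as a multiply-and-shift.
   0xAAAAAAAB = (2^33 + 1) / 3, so (n * 0xAAAAAAAB) lsr 33 equals
   n / 3 plus an error term that vanishes for 0 <= n < 2^30, which
   also keeps the product within a 63-bit int. *)
let div3 n = (n * 0xAAAAAAAB) lsr 33

let () =
  for n = 0 to 100_000 do
    assert (div3 n = n / 3)   (* agrees with the division instruction *)
  done
```

A compiler does the same derivation for any constant divisor; the magic.c/magicu.c routines linked above compute the multiplier and shift amount for the signed and unsigned cases respectively.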
RE: Value types (Was: [Caml-list] ocamlopt LLVM support)
Török Edwin wrote: Do you really need to use Int64 for that though? Won't the 63-bit version do? I'm running 32-bit.

I am unable to reproduce your results. Here, the time falls from 24s to 19.5s (using ocamlopt 3.12.0 on Intel x86), which is still 26× slower than HLVM.

Sorry, I'm actually using an Opteron x86 (logged in from an Intel x86!).

Do you still have 'idiv' in the compiled code? See my attached assembly, and compare it with yours please. I was doing the test on 64-bit, with ocamlopt 3.11.2 and 3.12.0.

I get what appear to be calls to C code:

camlCollatz__collatzLen_1030:
        subl    $8, %esp
.L103:
        movl    %eax, 4(%esp)
        movl    %ebx, 0(%esp)
        pushl   $camlCollatz__10
        pushl   %ebx
        movl    $caml_equal, %eax
        call    caml_c_call
.L104:
        addl    $8, %esp
        cmpl    $1, %eax
        je      .L102
        movl    4(%esp), %eax
        addl    $8, %esp
        ret
        .align  16
.L102:
        pushl   $camlCollatz__8
        movl    4(%esp), %eax
        pushl   %eax
        movl    $caml_int64_and, %eax
        call    caml_c_call
.L105:
        addl    $8, %esp
        pushl   $camlCollatz__9
        pushl   %eax
        movl    $caml_equal, %eax
        call    caml_c_call
.L106:
        addl    $8, %esp
        cmpl    $1, %eax
        je      .L101
        pushl   $3
        movl    4(%esp), %eax
        pushl   %eax
        movl    $caml_int64_shift_right, %eax
        call    caml_c_call
.L107:
        addl    $8, %esp
        movl    %eax, %ebx
        jmp     .L100
        .align  16
.L101:
        movl    0(%esp), %eax
        pushl   %eax
        pushl   $camlCollatz__6
        movl    $caml_int64_mul, %eax
        call    caml_c_call
.L108:
        addl    $8, %esp
        pushl   $camlCollatz__7
        pushl   %eax
        movl    $caml_int64_add, %eax
        call    caml_c_call
.L109:
        addl    $8, %esp
        movl    %eax, %ebx
.L100:
        movl    4(%esp), %eax
        addl    $2, %eax
        jmp     .L103

FWIW the original code took 2.8 seconds here, so only 4x slower (this is an AMD Phenom II x6 1090T CPU). It probably depends how fast/slow the 'idiv' is on your CPU.

The performance of idiv is irrelevant here. The bottleneck may be those C calls, but I don't understand why they are being generated.

Cheers, Jon.
RE: Value types (Was: [Caml-list] ocamlopt LLVM support)
Brian Hurt wrote: On Sun, 12 Dec 2010, Jon Harrop wrote:

let rec collatzLen(c, n) : int =
  if n = 1L then c
  else collatzLen (c+1,
                   if Int64.rem n 2L = 0L then Int64.div n 2L
                   else Int64.add (Int64.mul 3L n) 1L);;

let rec loop(i, (nlen, n)) =
  if i = 1L then n
  else
    let ilen = collatzLen(1, i) in
    let nlen, n = if ilen > nlen then ilen, i else nlen, n in
    loop (Int64.sub i 1L, (nlen, n));;

Congratulations, Jon, you win today's Captain Obvious award. Using Int64s, which are forced to be boxed, really slows things down.

Apparently boxing isn't the issue here, as I had assumed. On 32-bit, OCaml compiles each arithmetic operation on the int64s to a C function call.

Also, uncurrying all your arguments also slows things down.

I see a 3% performance improvement from currying everything.

Running your original code on my 64-bit laptop, it took 6.35s to run the 1M example. The following alternate code only took 0.82s, for a speed up of almost 7.75x. According to Edwin, you should be able to get C-like performance by running the OCaml in 64-bit and replacing the div and mod operations with shifts and logical ANDs. Scaling your timings by a similar amount gives OCaml a running speed of 3.14s in your setup, or competitive with F#.

I'd be wary of scaling timings by measurements made across different architectures. OCaml seems to be doing completely different things on x86 and x64 here.

Cheers, Jon.
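Edwin's suggestion, replacing the boxed Int64 operations with native-int shifts and masks, can be sketched as follows (the names collatz_len and max_collatz are mine; on a 64-bit system OCaml's native int is 63 bits, which is ample for this range):

```ocaml
(* Collatz chain length using OCaml's native (unboxed) int instead of
   boxed Int64: "land 1" replaces "rem n 2" and "asr 1" replaces
   "div n 2", which is safe here because n is always positive. *)
let rec collatz_len c n =
  if n = 1 then c
  else if n land 1 = 0 then collatz_len (c + 1) (n asr 1)
  else collatz_len (c + 1) (3 * n + 1)

(* Starting value in 2..i with the longest chain. *)
let max_collatz i =
  let rec loop i nlen n =
    if i = 1 then n
    else
      let ilen = collatz_len 1 i in
      let nlen, n = if ilen > nlen then ilen, i else nlen, n in
      loop (i - 1) nlen n
  in
  loop i 0 0
```

On a 64-bit build every operation here compiles to inline integer instructions; none of the caml_c_call sequences from the 32-bit Int64 assembly above remain.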
RE: Value types (Was: [Caml-list] ocamlopt LLVM support)
Benedikt wrote: A C compiler would optimize this to a right shift.

Changing that to 'Int64.shift_right n 1' speeds up the code. This is easy to fix in ocamlopt (see attached patch ocamlopt-natint.patch), by applying the same optimizations already used for constant ints to constant natints (Int64 is Nativeint on 64-bit). Note, however, that mod 2 is not really "and 1", nor is div 2 equivalent to "lsr 1"; that would be the case for unsigned arithmetic (doesn't matter in your example, though).

That's great. Does it optimize div and mod by any constant integer, as C compilers do?

1. Unboxing can give huge performance improvements on serial code,

s/Unboxing/arithmetic optimizations/. Please find an example where the performance benefit is due to unboxing, and not due to arithmetic optimizations performed on the unboxed code. The boxing involved is relevant, but boxing in general is not the issue. In this special case, the "let nlen, n = if ..." code requires heap allocation, because of the way the pattern is compiled. This could be fixed by moving the condition out of the code and using two ifs to select n/nlen separately (doesn't speed things up that much). Fixing the pattern compiler to handle these cases might be interesting for general benefit. I have already mentioned this multiple times, but here we go again: unboxing optimizations may indeed prove useful if applied wisely (cmmgen.ml is of special interest here; the unboxing optimizations are more or less special cases, and could be extended to include interesting cases like moving boxing out of if-then-else in return position, etc.). But (here comes the special Harrop note) this has absolutely nothing to do with LLVM (and of course really, really, really nothing to do with HLVM). Using a different data representation for the heap requires a nearly complete rewrite of the OCaml system (you would probably need to start at the Typing level); if one wants to do this, enjoy and come back with code.
But even then, data representation issues will have to be considered long before it comes to actual code generation (if you are serious, you'll have to think about the runtime first, prior to talking about code generation for a non-existent runtime), so even then it has nothing to do with LLVM (or C-- or C or whatever IR you can think of).

OCaml programmers can benefit from more appropriate data representations by using LLVM as a library to generate code from OCaml. HLVM is an example of this that anyone can play with.

Combining allocs across if-then-else constructs further reduces code size in your example (and probably other programs as well); see the attached file ocamlopt-comballoc-ifthenelse.patch. It's quick and dirty, but it should illustrate the idea.

I think that is an even more valuable improvement to ocamlopt than the int optimization.

This doesn't mean that LLVM wouldn't be useful (in fact, I've just started an LLVM backend for ocamlopt). But it is important to note that LLVM is not the solution to everything. As the name implies, it's low level; it does a few higher-level optimizations for C, but these are special cases (and somewhat ugly if you take the time to read the code). It won't make a faster OCaml magically, just like it didn't make a faster Haskell by itself.

Absolutely.

I could go on by quoting common Harrop jokes like "you need types in the low-level IR", etc., trying to tell him that this is simply wrong; but after reading through the Caml/LISP mailing list archives (thanks for the pointers by several readers), I'm pretty much convinced that Jon simply decided to end his war against LISP just to start a new one against ocamlopt.

Suggesting that OCaml programmers use LLVM as a library because you can get huge performance gains is hardly a war against ocamlopt.

Cheers, Jon.
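Benedikt's caveat above, that mod 2 is not "and 1" and div 2 is not "lsr 1" for signed arithmetic, is easy to demonstrate in OCaml: division truncates toward zero while an arithmetic shift floors, so the two disagree on negative operands. A minimal check:

```ocaml
(* For negative n, n / 2 and n asr 1 differ: "/" truncates toward
   zero, "asr" rounds toward negative infinity; likewise mod vs land. *)
let () =
  assert (7 / 2 = 3);
  assert (7 asr 1 = 3);       (* equal for non-negative n *)
  assert ((-7) / 2 = -3);     (* truncation toward zero *)
  assert ((-7) asr 1 = -4);   (* flooring shift *)
  assert ((-7) mod 2 = -1);   (* sign follows the dividend *)
  assert ((-7) land 1 = 1)    (* two's-complement low bit *)
```

This is why the shift/mask rewrite is only valid when the operand is known to be non-negative, as it is in the Collatz example.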
RE: Value types (Was: [Caml-list] ocamlopt LLVM support)
Edwin wrote: On Sun, 12 Dec 2010 22:05:34 - Jon Harrop jonathandeanhar...@googlemail.com wrote:

AFAICT LLVM's OCaml bindings are only good for generating LLVM IR from OCaml, not for actually performing transformations on it (there is no binding to retrieve the type of a value, for example). I'll probably be looking into fixing that in the near future, and this may indirectly help your LLVM backend (if you intend to write OCaml-specific transformations on the LLVM IR).

That's a lot of work. Wouldn't it be preferable to do the passes on the OCaml side and focus on generating high-quality LLVM IR?

Yes, that is probably a better approach (generating the optimized IR in the first place).

FWIW, just the basic existing bindings were still buggy last I looked. For example, HLVM had to use a workaround to add a dummy argument to a function with no arguments because the bindings didn't handle that case correctly. Ironing the bugs out of the existing bindings would be more useful than making them even bigger, IMHO. Going back to Richard's idea, writing an OCaml library that uses LLVM to JIT interface code on-the-fly as a replacement for writing separate C stubs would be incredibly useful, and you could even use it to interface to LLVM itself more easily!

Cheers, Jon.
RE: ocamlopt LLVM support (Was: [Caml-list] OCamlJIT2 vs. OCamlJIT)
Richard Jones wrote: It might kick-start efforts to automatically generate bindings, at least to C code, which wouldn't be a bad thing. camlidl is clumsy and doesn't handle a very important case (pointers to handles), so it's not really useful for this.

Yes, LLVM is ideal for this. Using LLVM to JIT compile OCaml-C interop stubs should be quite easy. I did a bit of this in some of the OCaml Journal articles that were on LLVM. You could also kill two birds with one stone by creating an OCaml-HLVM bridge with it. That would make it easier to create a bigarray from OCaml and call into high-performance parallel numerical code to chomp on it.

Cheers, Jon.
RE: ocamlopt LLVM support (Was: [Caml-list] OCamlJIT2 vs. OCamlJIT)
Benedikt wrote: So, let's take that LLVM thing to a somewhat more serious level (which means no more arguing with toy projects that simply ignore the difficult stuff).

You can learn a lot from prototypes.

3. A better LLVM. If we do it right, others implementing functional languages using LLVM will not be faced with the same problems again. Dunno if that's possible.

What exactly are you thinking of that goes beyond what the other functional languages that already target LLVM have had to implement?

Cheers, Jon.
RE: [Caml-list] OCamlJIT2 vs. OCamlJIT
Erik wrote: Jon wrote: LLVM is also much better documented than ocamlopt's internals.

LLVM has well over 20 full-time programmers working on it. The OCaml compiler has how many?

Another reason why LLVM deserves its hype.

Cheers, Jon.
RE: [Caml-list] OCamlJIT2 vs. OCamlJIT
Benedikt wrote: This has nothing to do with LLVM, but is simply due to the fact that your code does not box the float parameters/results.

Exactly, yes.

The following piece of C code is most probably even faster than your code, so what?

double fib(double x) {
  if (x < 1.5) return x;
  else return fib(x - 1) + fib(x - 2);
}

FWIW, gcc -O2 takes longer to compile and produces slower code than HLVM anyway:

$ time gcc -O2 fib.c -o fib
real    0m0.161s
user    0m0.124s
sys     0m0.036s

$ time ./fib
102334155.00
real    0m2.792s
user    0m2.792s
sys     0m0.000s

If you're asking what the advantages of using LLVM over generating C code are, I'd say functionality like more numeric types, tail calls and JIT compilation come top, but also the fact that LLVM bundles nice OCaml bindings and makes it easy to generate fast code. Also, I have other examples (e.g. the random number generator from the SciMark2 benchmark) where LLVM can generate code that runs 2x faster than anything I've been able to get out of GCC.

So this is about data representation, not code generation (using LLVM with boxed floats would result in the same or lower performance);

Yes. LLVM lets you choose your own data representation, and applications like numerics can benefit enormously from that. All the more reason to use LLVM.

HLVM ignores complex stuff like polymorphism, modules, etc. (at least last time I checked), which makes code generation almost trivial.

OCaml ignores complex stuff like parallelism and value types (at least last time I checked ;-). This is precisely why OCaml and HLVM complement each other. Ideally, we would want all of this at the same time but, for now, we'll have to settle for separate solutions.

The literature includes various solutions to implement stuff like ML polymorphism: tagged integers/boxed floats/objects is just one solution, not necessarily the best; but examples that simply ignore the complex stuff, and therefore deliver better performance, don't really help to make progress.

Right.
Reified generics can be a much better solution for performance, particularly when combined with value types, and they are something that ocamlopt cannot express but HLVM can.

A possible solution to improve ocamlopt's performance in this case would be to compile the recursive fib calls in a way that the parameter/result is passed in a floating-point register (not boxed), and provide a wrapper for calls from outside, which unboxes the parameter, invokes the actual function code, and boxes the result. This should be doable at the Ulambda/Cmm level, so it is actually quite portable and completely independent of the low-level code generation (which is where LLVM comes into play). That way ocamlopt code will be as fast as the C code for this example.

Wow, I'd love to see your OCaml version that is as fast as C.

I have microbenchmarks where LLVM generates code over 10x faster than ocamlopt (specifically, floating-point code on x86) and larger numerical programs that also wipe the floor with OCaml.

ocamlopt's x86 floating-point backend is easy to beat, as demonstrated in the original post. Even a simple byte-code JIT engine (which still boxes floating-point results, etc.) is able to beat it.

Yes. Another reason to consider alternatives to ocamlopt for such applications.

Your benchmarks do not prove that LLVM generates faster code than ocamlopt.

My benchmarks show LLVM generating faster code than ocamlopt.

Instead you proved that OCaml's floating-point representation comes at a cost for number-crunching applications (which is obvious). Use the same data representation with LLVM (or C) and you'll notice that the performance is the same (most likely worse) compared to ocamlopt.

You are saying that LLVM might not be faster if it were also afflicted with ocamlopt's design, which is unproductive speculation. The point is that LLVM and HLVM are not constrained by those design decisions as OCaml is, so you can use them to generate much faster code.
LLVM is also much better documented than ocamlopt's internals.

Definitely, but LLVM replaces just the low-level stuff of ocamlopt (most probably starting at the Cmm level), so a lot of (undocumented) ocamlopt internals remain.

Sure. OCaml has a good pattern match compiler, for example.

Cheers, Jon.
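For reference, the C fib microbenchmark quoted earlier in this message transliterates directly to OCaml. This is a sketch of the obvious version, the one whose recursive calls ocamlopt compiles with boxed float parameters/results on x86, which is exactly the overhead Benedikt's proposed Ulambda/Cmm-level unboxing wrapper would remove:

```ocaml
(* Fibonacci on floats, same recurrence as the C version above.
   All intermediate values are exact integers well below 2^53, so
   float arithmetic gives exact results; fib 40.0 = 102334155.0. *)
let rec fib x =
  if x < 1.5 then x
  else fib (x -. 1.0) +. fib (x -. 2.0)
```

Timing `fib 40.0` compiled with ocamlopt against the gcc-compiled C version reproduces the gap under discussion.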
RE: [Caml-list] OCamlJIT2 vs. OCamlJIT
Benedikt wrote: On Dec 1, 2010, at 15:11, Jon Harrop wrote:

If you're asking what the advantages of using LLVM over generating C code are, I'd say functionality like more numeric types, tail calls and JIT compilation come top, but also the fact that LLVM bundles nice OCaml bindings and makes it easy to generate fast code. Also, I have other examples (e.g. the random number generator from the SciMark2 benchmark) where LLVM can generate code that runs 2x faster than anything I've been able to get out of GCC.

LLVM sure does better as an intermediate language than C does, no question. But that wasn't my point in this example.

I think we are talking at cross purposes. My post was to explain some of the many benefits of LLVM in response to Yoann's statement: "I don't understand why there is so much hype around LLVM. Why would you think something written in C++ would be far better than the OCaml code we have in the ocamlopt compiler?" I was not just talking about porting ocamlopt to LLVM but more generally about the advantages of using LLVM from OCaml in any way.

So this is about data representation, not code generation (using LLVM with boxed floats would result in the same or lower performance);

Yes. LLVM lets you choose your own data representation, and applications like numerics can benefit enormously from that. All the more reason to use LLVM.

As does assembler, so even more reasons to emit assembler?

LLVM makes it a *lot* easier to generate efficient code, of course.

The literature includes various solutions to implement stuff like ML polymorphism: tagged integers/boxed floats/objects is just one solution, not necessarily the best; but examples that simply ignore the complex stuff, and therefore deliver better performance, don't really help to make progress.

Right. Reified generics can be a much better solution for performance, particularly when combined with value types, and they are something that ocamlopt cannot express but HLVM can.

So this is an area to work on within ocamlopt.
If you can add value types and reified generics to ocamlopt and get them accepted, that would be great.

You are saying that LLVM might not be faster if it were also afflicted with ocamlopt's design, which is unproductive speculation. The point is that LLVM and HLVM are not constrained by those design decisions as OCaml is, so you can use them to generate much faster code.

We are talking about using LLVM within the OCaml compiler, so we have to talk about OCaml, not HLVM or anything else.

I was talking more generally about the benefits of LLVM in response to Yoann's statement about the hype surrounding LLVM. HLVM demonstrates some of the things a next-generation ocamlopt might be capable of.

Don't get me wrong, I'm curious to see an LLVM backend for ocamlopt, especially since that way we would get a native top-level for free (at least for the JIT-capable LLVM targets). But I doubt that LLVM will make a noticeable difference (except probably a reduced amount of target-specific code), since it will be faced with the same data representation used right now, and LLVM will not be able to optimize much (assuming the ocamlopt higher-level optimizations were already applied).

Absolutely.

What you are talking about is replacing several design decisions within OCaml (which have nothing to do with the low-level code generator);

Except that those design decisions have a huge effect on the optimizations LLVM will do (if it is your code gen).

this might be a good idea, and may indeed improve generated code.

HLVM's benchmark results would certainly seem to indicate that, yes.

I don't know the opinion of the OCaml core developers, but from my POV, this sounds like an interesting challenge; however, getting beyond a simplified proof-of-concept requires a lot of hard work.

Yes. At the end of the day, the performance gains that can be obtained by bringing the GC up to date already dwarf all of these other effects, so a new GC is surely the highest priority.
This raises a couple of questions:

* How dependent is the OCaml bytecode on the data representation and GC?
* Could you write an OCaml bytecode interpreter in OCaml?

Perhaps we could write a multicore-friendly OCamlJIT3. :-)

Cheers, Jon.
RE: Threading and SharedMem (Re: [Caml-list] Re: Is OCaml fast?)
What would be responsible for collecting the shared heap?

Cheers, Jon.

Eray wrote: Seconded, why is this not possible? That is to say, why cannot each thread maintain a separate GC, if so desired?
RE: [Caml-list] Re: Is OCaml fast?
I assume he means one thread has one behaviour and another has the other behaviour, in which case there certainly is a problem!

Cheers, Jon.

-----Original Message-----
From: Christophe Raffalli
Sent: 29 November 2010 07:33
To: oli...@first.in-berlin.de
Cc: Caml List
Subject: Re: [Caml-list] Re: Is OCaml fast?

Le 28/11/2010 19:17, oli...@first.in-berlin.de a écrit : On Sat, Nov 27, 2010 at 04:58:55PM +0100, Christophe Raffalli wrote: Hello, To the extent that this rule is the same for all languages, and that most languages on the shootout are also garbage collected, I think OCaml's problems with this benchmark do point at a weakness of the current GC code.

This is untrue ... the bintree example is just bad in OCaml because the default value of the minor heap size is the correct value for reactive programs, where you want a fast minor GC slice, because they interrupt the program ... [...]

And if your program contains both kinds of functionality? What possible solution would you recommend?

Changing the value of the minor heap size at runtime ... There is no problem with this ...

Ciao, Oliver

-- Christophe Raffalli, Université de Savoie
RE: [Caml-list] OCaml GC [was Is OCaml fast?]
I don't understand why this would help here, though. Wouldn't that help when a long-lived structure was a single large block? But, in this case, the long-lived structure is a tree composed of many small heap-allocated blocks and, therefore, they would undergo the same wasteful "allocate in young only to be copied to old" pathological behaviour.

Surely what you really want is the ability to spot when a full minor heap contains mostly live objects and, therefore, make the whole minor heap part of the major heap, allocate yourself a new minor heap and clear the remembered sets. IIRC, G1 does something like this.

On a related note, when designing HLVM I thought that a "mallocs" function that allocated many small blocks of the same size, such that they could be freed individually, would be helpful. Another option might be to preallocate larger numbers of blocks at the same time, under the assumption that a thread allocating many small surviving blocks would continue to do so, as a kind of pool hybrid.

Cheers, Jon.

From: Eray Ozkural
Sent: 28 November 2010 18:57
To: Benedikt Meurer
Cc: caml-list@yquem.inria.fr
Subject: Re: [Caml-list] OCaml GC [was Is OCaml fast?]

On Sun, Nov 28, 2010 at 4:29 PM, Benedikt Meurer benedikt.meu...@googlemail.com wrote:

Speaking of the OCaml GC in general, wouldn't it make sense to replace the current generational collector with a collector framework that requires less copying in the common case? For example, dividing the heap into two parts, large object space and small object space, where LOS is managed by mark-sweep/mark-compact (could use the current major heap) and SOS is managed by a recent mark-copy/mark-region collector (Immix [1] comes to mind here). That way objects would no longer need to be copied between generations, and using Immix may also reduce cache misses and improve locality of small/medium-sized objects (with proper evacuation heuristics).
This should be doable with a moderate amount of work, and should require no incompatible changes, while opening the door for concurrent/parallel garbage collection (that would then require incompatible changes, replacing caml_young_ptr/caml_young_limit with TLS vars, etc.).

Benedikt

[1] http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.144.3640

I was suggesting dealing with small fixed-size objects differently in another post. This doesn't sound like a bad combination; nice idea :)

Best, Eray Ozkural
RE: [Caml-list] OCaml GC [was Is OCaml fast?]
I see. Yes, that sounds like a great idea. How well does Immix cope with high allocation rates of short-lived objects? It's been a while since I read the Immix paper...

Cheers, Jon.

-----Original Message-----
From: Benedikt Meurer
Sent: 28 November 2010 20:00
To: caml-list@yquem.inria.fr
Subject: Re: [Caml-list] OCaml GC [was Is OCaml fast?]

On Nov 28, 2010, at 20:40, Jon Harrop wrote:

I don't understand why this would help here though. Wouldn't that help when a long-lived structure was a single large block but, in this case, the long-lived structure is a tree composed of many small heap-allocated blocks and, therefore, they would undergo the same wasteful "allocate in young only to be copied to old" pathological behaviour?

There is no young and no old in this scheme. There are two different heaps, one for large objects, one for small (probably 2-8k max object size, so not really small in the sense of OCaml). The small area is managed by Immix, which avoids copying of long-lived objects if they are allocated together with other long-lived objects (likely), or if not, evacuates a set of probably related objects to a single chunk (this is somewhat dependent on the evacuation strategy, which will be different for OCaml compared to Java), further improving locality. There are simple heuristics to ensure that an object is not evacuated too often (it is already unlikely with the base algorithm), so there will be a lot less copying.

One difficulty remains, however: the pause times. It would probably be necessary to adapt Immix to a (semi-)incremental style to get the same pause times as with the current GC.

Benedikt
RE: [Caml-list] Is OCaml fast?
Boxing, and generics implemented with type erasure, make it difficult or impossible to reduce the density of references in the heap. Structs and reified generics can give huge performance improvements. Hash tables and complex arithmetic are some examples where I often see F# running 5-20× faster than idiomatic OCaml/Java code.

Cheers, Jon.

-----Original Message-----
From: Erik de Castro Lopo
Sent: 24 November 2010 01:24
To: caml-l...@inria.fr
Subject: Re: [Caml-list] Is OCaml fast?

Jon Harrop wrote: Many of the things that can make OCaml and Java slow were inherited from Lisp. They are, in effect, designs that still bear the burden of dynamic typing despite being statically typed.

I'm curious: what in your opinion are those things?

Erik
-- Erik de Castro Lopo http://www.mega-nerd.com/
RE: [Caml-list] Re: Is OCaml fast?
And almost all of the Haskell solutions (reverse-complement, spectral-norm, Mandelbrot, n-body, fannkuch-redux, k-nucleotide, regex-dna) abuse GHC's FFI in order to work around Haskell. The k-nucleotide benchmark in Haskell even uses malloc! Ketil Malde crafted a much better solution but noted: "This is considered cheating, since it is the easy and natural way to do it." - http://www.haskell.org/haskellwiki/Shootout/Knucleotide

Cheers, Jon.

-----Original Message-----
From: Christophe TROESTLER
Sent: 23 November 2010 10:38
To: igo...@yahoo.com
Cc: caml-l...@inria.fr
Subject: Re: [Caml-list] Re: Is OCaml fast?

On Tue, 23 Nov 2010 02:03:48 +, Isaac Gouy wrote:

C version: 12.11 secs
OCaml version: 47.22 secs
OCaml version with GC parameters tuned (interesting alternative section): 12.67 secs

And of course you know, because that GC-tuned OCaml program is shown on the benchmarks game website ;-) http://shootout.alioth.debian.org/u32/program.php?test=binarytrees&lang=ocaml&id=1

Since you are here, please explain why C can use memory pools and vector instructions but tuning the GC of OCaml ― although it is part of the standard library ― is considered an "alternative".

Best, C.
RE: [Caml-list] Re: Optimizing garbage collection
I would increase the default minor heap size to something closer to today's common L2 cache sizes. But I see that Damien already did this for us: http://caml.inria.fr/cgi-bin/viewcvs.cgi/ocaml/version/3.12/byterun/config.h?rev=10787&r1=10496&r2=10787 Cheers, Jon. From: Eray Ozkural [mailto:examach...@gmail.com] Sent: 22 November 2010 23:14 To: Jon Harrop Cc: Sylvain Le Gall; caml-l...@inria.fr Subject: Re: [Caml-list] Re: Optimizing garbage collection On Mon, Nov 22, 2010 at 11:14 PM, Jon Harrop jonathandeanhar...@googlemail.com wrote: What happens if you just increase the default size? Well we don't want to be a memory hog like Java do we? It's something that kind of depends on the app, what would you set it to? Cheers, -- Eray Ozkural, PhD candidate. Comp. Sci. Dept., Bilkent University, Ankara http://groups.yahoo.com/group/ai-philosophy http://myspace.com/arizanesil http://myspace.com/malfunct ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
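Since the minor heap size depends on the app, it can also be raised per program through the standard Gc module rather than waiting for a new default; the figure below (2^21 words, i.e. 8 MB on a 32-bit system) is purely illustrative, not a recommendation.

```ocaml
(* Raise the minor heap size at startup via Gc.set; the value is in
   words.  2 * 1024 * 1024 words is an illustrative choice only. *)
let () =
  let c = Gc.get () in
  Gc.set { c with Gc.minor_heap_size = 2 * 1024 * 1024 }

(* Gc.get reflects the running configuration (the runtime may round
   the requested size up). *)
let configured = (Gc.get ()).Gc.minor_heap_size
```

The same parameter can be set without recompiling via the OCAMLRUNPARAM environment variable.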
RE: [Caml-list] Re: Is OCaml fast?
Note that the regex-dna solution for Haskell tweaks its GC parameters via the -H command-line parameter: http://shootout.alioth.debian.org/u64q/program.php?test=regexdna&lang=ghc&id=2 -Original Message- From: caml-list-boun...@yquem.inria.fr [mailto:caml-list- boun...@yquem.inria.fr] On Behalf Of Christophe TROESTLER Sent: 23 November 2010 10:38 To: igo...@yahoo.com Cc: caml-l...@inria.fr Subject: Re: [Caml-list] Re: Is OCaml fast? On Tue, 23 Nov 2010 02:03:48 +, Isaac Gouy wrote: C version : 12.11 secs OCaml version : 47.22 secs OCaml version with GC parameters tuned (interesting alternative section) : 12.67 secs And of course you know because that GC tuned OCaml program is shown on the benchmarks game website ;-) http://shootout.alioth.debian.org/u32/program.php?test=binarytrees&lang=ocaml&id=1 Since you are here, please explain why C can use memory pools and vector instructions but tuning the GC of OCaml ― although it is part of the standard library ― is considered an alternative. Best, C. ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
RE: [Caml-list] Re: Is OCaml fast?
Yes, an answer to a better question. -Original Message- From: caml-list-boun...@yquem.inria.fr [mailto:caml-list- boun...@yquem.inria.fr] On Behalf Of Isaac Gouy Sent: 23 November 2010 18:07 To: caml-l...@inria.fr Subject: [Caml-list] Re: Is OCaml fast? Jon Harrop jonathandeanharrop at googlemail.com writes: Ketil Malde crafted a much better solution but noted: This is considered cheating, since it is the easy and natural way to do it. - http://www.haskell.org/haskellwiki/Shootout/Knucleotide Not even cheating - just an answer to a different question. ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
RE: [Caml-list] Re: Optimizing garbage collection
What happens if you just increase the default size?
RE: [Caml-list] Native toplevel? (was: OCamlJit 2.0)
Also for interactive data massaging and visualization. From: caml-list-boun...@yquem.inria.fr [mailto:caml-list-boun...@yquem.inria.fr] On Behalf Of Ashish Agarwal Sent: 18 November 2010 16:50 To: Alain Frisch Cc: caml-list@yquem.inria.fr Subject: Re: [Caml-list] Native toplevel? (was: OCamlJit 2.0) On Wed, Nov 17, 2010 at 3:44 AM, Alain Frisch al...@frisch.fr wrote: Does performance really matter that much for rapid prototyping/development? Rapid prototyping for me often involves a couple of lines of code that read in a very large file and do something with it. I have to keep compiling these small programs to native code because the performance of the toplevel is too slow. Then, I have to recompile and re-read the whole file for every little additional thing I want to compute. A high-performance toplevel would help in this kind of work. ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
RE: [Caml-list] SMP multithreading
This is actually a quick way to use multiple cores with ocaml. Find an often-called function that takes considerable time and offload it to C. Or HLVM, F#, Scala, Clojure or any of the other languages that permit shared memory parallelism. C is particularly poor in this regard, so I would not restrict myself to just C... Cheers, Jon. ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
RE: [Caml-list] [Was: OCamlJit 2.0]
Do we have examples of big companies porting their whole codebase to another language? Yes, of course. Companies modernise all the time. We have a client who just started porting 1MLOC of C++ to F#. Flying Frog have ported hundreds of thousands of lines of OCaml to F#. It happens all the time but it is even more likely to happen as a consequence of multicore. Cheers, Jon. ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
RE: [Caml-list] [Was: OCamlJit 2.0]
Yoann Padioleau wrote: On Nov 20, 2010, at 9:08 AM, Jon Harrop wrote: Do we have examples of big companies porting their whole codebase to another language? Yes, of course. Companies modernise all the time. We have a client who just started porting 1MLOC of C++ to F#. How do they do that? Are they using compiler frontends to assist them in automatically translating part of the code to F#? We do it for them for a fee, using magic. We tried automated translation (from Fortran) but it is usually easier just to knuckle down by hand. Old code bases are usually 90% cruft anyway. That 1MLOC of C++ could have been 100kLOC of C++ but now it will be 10kLOC of F# instead. Flying Frog have ported hundreds of thousands of lines of OCaml to F#. OCaml and F# are very similar languages, so such porting does not look that hard. Yes. The main thing is that OCaml source is nice and readable so it is easy to translate even if it makes heavy use of features that don't port (polymorphic variants, macros). C++ source can be a nightmare so you really need to pin down a precise specification for what it does using advanced techniques like poking it with a stick. Translating from C++ to F#, or from PHP to Java seems more complicated. I wonder what kind of techniques people have developed to help such language-to-language translation. The only viable option seems to be to do it by hand. You can churn through an enormous amount of code when you put your mind to it... Cheers, Jon. ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
RE: [Caml-list] Re: [Was: OCamlJit 2.0]
Sylvain Le Gall: I doubt an old code base, not written with multicore in mind, is easily portable to multicore. So basically, the migration you are talking about is starting a new project that will replace one software/library by another. Yes, the systems are kept loosely coupled during the transition so it is more like a new project to supersede the old one than a gradual transition between code bases. However, only the developers need be aware of this distinction. From the point of view of everyone else in the company, the implementation of a product is being modernized. Management get improved productivity/maintainability. Sales get shiny new things. HR get to fire any developers who resist (but most are assimilated ;-). The avalanche happens when the cost of rewriting falls below the cost of adding essential features to a legacy code base. Cheers, Jon. ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
RE: [Caml-list] SMP multithreading
Can you cite any papers from this century? ;-) Cheers, Jon. From: Eray Ozkural [mailto:examach...@gmail.com] Sent: 17 November 2010 13:41 To: Eray Ozkural; Jon Harrop; caml-list@yquem.inria.fr Subject: Re: [Caml-list] SMP multithreading On Wed, Nov 17, 2010 at 8:50 AM, Gabriel Kerneis kern...@pps.jussieu.fr wrote: On Wed, Nov 17, 2010 at 06:27:14AM +0200, Eray Ozkural wrote: As I said even in C good results can be achieved, I've seen that, so I know it's doable with ocaml, just a difficult kind of compiler. The functional features would expose more concurrency. Could you share a pointer to a paper describing this compiler? I can't reveal much, but just to point out that there are indeed more sophisticated compilers than gcc: http://www.research.ibm.com/vliw/compiler.html So, uh, there are compilers that turn loops into threads, and also parallelize independent blocks. Both coarse-grain and fine-grain parallelization strategies in existing compiler research can be effectively applied to the multi-core architectures. In fact, some of the more advanced compilers (like that of the RAW architecture) must be able to target it already, but who knows. :) Just consider that most of the parallelization technology is language independent: it can be applied to any imperative language. So, would such a thing be able to work on ocaml generated binaries? Most definitely, I believe, it is in principle possible to start from the sequential binary and emit parallel code! Best, -- Eray Ozkural, PhD candidate. Comp. Sci. Dept., Bilkent University, Ankara http://groups.yahoo.com/group/ai-philosophy http://myspace.com/arizanesil http://myspace.com/malfunct ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
RE: [Caml-list] SMP multithreading
Wolfgang wrote: I'd like to point out how the big competitor to OCaml deals with it. The GHC Haskell system has had SMP parallelization built in for some time, and it does it quite well. I beg to differ. Upon trying to reproduce many of the Haskell community's results, I found that even their own parallel Haskell programs often exhibit huge slowdowns. This is because Haskell's unpredictable performance leads to unpredictable granularity and, consequently, more time can be spent administering the tiny parallel computations than is gained by doing so. The results I found here are typical: http://flyingfrogblog.blogspot.com/2010/01/naive-parallelism-with-hlvm.html Note that the absolute performance peaks at an unpredictable number of cores only in the case of Haskell. This is because the GC does not scale beyond about 4 cores for any Haskell programs doing significant amounts of allocation, which is basically all Haskell programs because allocations are everywhere in Haskell. Ultimately, running on all cores attains no speedup at all with Haskell in that case. This was branded the "last core slowdown" but the slowdown clearly started well before all 8 cores. There was a significant development towards improving this situation but it won't fix the granularity problem: http://hackage.haskell.org/trac/ghc/blog/new-gc-preview The paper "Regular, shape-polymorphic, parallel arrays in Haskell" cites 2.5x speedups when existing techniques were not only already getting 7x speedups but better absolute performance as well. Cache complexity is the problem, as I explained here: http://flyingfrogblog.blogspot.com/2010/06/regular-shape-polymorphic-parallel.html Probably the best solution for multicore programming is Cilk. This technique has already been adopted both in Intel's TBB and Microsoft's .NET 4 but, AFAIK, the only functional language with access to it is F#.
There are some great papers on multicore-friendly cache oblivious algorithms written in Cilk: http://www.fftw.org/~athena/papers/tocs08.pdf Note, in particular, that Cilk is not only much faster but also much easier to use than explicit message passing. To do something like this, threads need to be able to run in parallel and mutate the same shared heap. Although that is objectively easy (I did it in HLVM), OCaml's reliance upon very high allocation rates, efficient collection of young garbage and a ridiculous density of pointers in the heap make it a *lot* harder. Cheers, Jon. ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
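The granularity point can be made concrete with a Cilk-style divide-and-conquer skeleton: recurse in parallel only above a cutoff and fall back to a tight sequential loop below it. `spawn` here is a hypothetical stand-in for whatever parallel primitive is available; passing a sequential `spawn` lets the same skeleton run serially.

```ocaml
(* Cilk-style sum with granularity control: below the cutoff, run a
   cheap sequential loop instead of spawning tiny tasks. *)
let rec psum spawn (a : int array) lo hi =
  if hi - lo <= 4096 then begin
    let s = ref 0 in
    for i = lo to hi - 1 do
      s := !s + a.(i)
    done;
    !s
  end else begin
    let mid = lo + (hi - lo) / 2 in
    (* [spawn] starts a task and returns a thunk that joins it. *)
    let left = spawn (fun () -> psum spawn a lo mid) in
    let right = psum spawn a mid hi in
    left () + right
  end

(* Sequential "spawn" for testing: run immediately, return the result. *)
let seq f = let v = f () in fun () -> v
let total = psum seq (Array.init 10_000 (fun i -> i)) 0 10_000
```

Without the cutoff, the recursion would create one task per element and the administration cost would dominate, which is exactly the slowdown described above.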
RE: [Caml-list] SMP multithreading
Granularity and cache complexity are the reasons why not. If you find anything and everything that can be done in parallel and parallelize it then you generally obtain only slowdowns. An essential trick is to exploit locality via mutation but, of course, purely functional programming sucks at that by design not least because it is striving to abstract that concept away. I share your dream but I doubt it will ever be realized. Cheers, Jon. From: caml-list-boun...@yquem.inria.fr [mailto:caml-list-boun...@yquem.inria.fr] On Behalf Of Eray Ozkural Sent: 16 November 2010 23:05 To: Gerd Stolpmann Cc: caml-list@yquem.inria.fr Subject: Re: [Caml-list] SMP multithreading On Wed, Nov 17, 2010 at 12:13 AM, Gerd Stolpmann i...@gerd-stolpmann.de wrote: Am Dienstag, den 16.11.2010, 22:35 +0200 schrieb Eray Ozkural: On Tue, Nov 16, 2010 at 7:04 PM, Gerd Stolpmann i...@gerd-stolpmann.de wrote: Am Montag, den 15.11.2010, 22:46 -0800 schrieb Edgar Friendly: * As somebody mentioned implicit parallelization: Don't expect anything from this. Even if a good compiler finds ways to parallelize 20% of the code (which would be a lot), the runtime effect would be marginal. 80% of the code is run at normal speed (hopefully) and dominates the runtime behavior. The point is that such compiler-driven code improvements are only local optimizations. For getting good parallelization results you need to restructure the design of the program - well, maybe compiler2.0 can do this at some time, but this is not in sight. I think you are underestimating parallelizing compilers. I was more citing Amdahl's law, and did not want to criticize any effort in this area. It's more the usefulness for the majority of problems. How useful is the best parallelizing compiler if only a small part of the program _can_ actually benefit from it? Think about it. If you are not working in an area where many subroutines can be sped up, you consider this way of parallelizing as a waste of time. 
And this is still true for the majority of problems. Also, for many problems that can be tackled, the scalability is very limited. What makes you think only 20% of the code can be parallelized? I've worked in such a compiler project, and there were way too many opportunities for parallelization in ordinary C code, let alone a functional language; implicit parallelism would work wonders there. Of course you may think whatever you were thinking, but I know that a high degree of parallelism can be achieved through a functional language. I can't tell you much more though. If you think that a very small portion of the code can be parallelized you probably do not appreciate the kinds of static and dynamic analysis those compilers perform. Of course if you are thinking of applying it to some office or e-mail application, this might not be the case, but automatic parallelization strategies would work best when you apply them to a computationally intensive program. The really limiting factor for current functional languages would be their reliance on inherently sequential primitives like list processing, which may in some cases limit the compiler to only pipeline parallelism (something non-trivial to do by hand actually). Instead they would have to get their basic forms from high-level parallel PLs (which might mean rewriting a lot of things). The programs could look like something out of a category theory textbook then. But I think even without modification you could get a lot of speedup from ocaml code by applying state-of-the-art automatic parallelization techniques. By the way, how much of the serial work is parallelized that's what matters rather than the code length that is. Even if the parallelism is not so obvious, the analysis can find out which iterations are parallelizable, which variables have which kinds of dependencies, memory dependencies, etc. etc. 
In the public, there is this perception that automatic parallelization does not work, which is wrong. While it is true that the popular compiler vendors do not understand much in this regard, all I've seen (in popular products) are lame attempts to generate vector instructions. That is not to say that implicit parallelization of current code can replace parallel algorithm design (which is AI-complete). Rather, I think, implicit parallelization is one of the things that will help parallel computing people: by having them work at a more abstract level, exposing more concurrency through functional forms, yet avoiding writing low-level comms code. High-level explicit parallelism is also quite favorable, as there may be many situations where, say, dynamic load-balancing approaches are suitable. The approach of HPF might be relevant here, with the programmer
RE: [Caml-list] Infix function composition operator
A pipeline operator is usually preferred over function composition in impure languages like OCaml and F# due to the value restriction. For example, your example would be written in F# as: x |> op1 |> op2 |> op3 |> op4 |> op5 This style is very common in F#, particularly when dealing with collections. Cheers, Jon. -Original Message- From: caml-list-boun...@yquem.inria.fr [mailto:caml-list- boun...@yquem.inria.fr] On Behalf Of m...@proof-technologies.com Sent: 10 November 2010 07:00 To: ymin...@gmail.com; ar...@noblesamurai.com Cc: caml-l...@inria.fr Subject: Re: [Caml-list] Infix function composition operator on 10/11/10 3:45 AM, ymin...@gmail.com wrote: This is probably a minority opinion, but I have written and read quite a lot of OCaml code over the years, and I've seen surprisingly few effective uses of the composition operator. Somehow, I usually find that code that avoids it is simpler and easier to read. I agree that using a composition operator can make the code obtuse, and so should not be overused. But it's incredibly useful for certain situations: 1) If you are performing a long chain of composed operations, it avoids nested bracketing piling up. For example: (op5 << op4 << op3 << op2 << op1) x Instead of: op5 (op4 (op3 (op2 (op1 x)))) This sort of thing happens quite a lot in certain applications, e.g. in language processing, to get at subexpressions. 2) Creating an anonymous function to be passed as an argument, it avoids explicitly mentioning arguments of that function. This sort of thing can happen a lot in functional programming generally. For example: List.map (op2 << op1) xs Instead of: List.map (fun x -> op2 (op1 x)) xs Mark Adams ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs ___ Caml-list mailing list.
Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
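For readers following along in OCaml, the two styles compared in this thread can be defined and tried directly; the operator names `<<` and `|>` follow the F# convention (recent OCaml standard libraries also provide `|>`).

```ocaml
(* Backward composition: (f << g) x = f (g x). *)
let ( << ) f g x = f (g x)

(* Pipeline: x |> f = f x; left-associative, so stages read in order. *)
let ( |> ) x f = f x

let add1 x = x + 1
let double x = 2 * x

let composed = (double << add1) 3   (* double (add1 3) = 8 *)
let piped = 3 |> add1 |> double     (* same computation, left to right *)
```

Both expressions denote the same value; the pipeline merely reverses the reading order of the stages.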
RE: [Caml-list] Infix function composition operator
In OCaml, the value restriction leads to non-generalized type variables ('_a etc.) if you try to define functions like: # let ( << ) f g x = f (g x);; val ( << ) : ('a -> 'b) -> ('c -> 'a) -> 'c -> 'b = <fun> # let cons h t = h::t;; val cons : 'a -> 'a list -> 'a list = <fun> # cons << cons;; - : '_a -> ('_a list -> '_a list) list -> ('_a list -> '_a list) list = <fun> This is a silly example but you are most likely to hit this problem in practice in the context of parser combinators. Due to JIT compilation, F# cannot relax the value restriction so that does not even compile. In MLs, you usually want the eta-expanded form: # let cons2 x = (cons << cons) x;; val cons2 : 'a -> ('a list -> 'a list) list -> ('a list -> 'a list) list = <fun> But a pipeline is prettier: # let ( |> ) x f = f x;; val ( |> ) : 'a -> ('a -> 'b) -> 'b = <fun> # let cons2 x = x |> cons |> cons;; val cons2 : 'a -> ('a list -> 'a list) list -> ('a list -> 'a list) list = <fun> This is one advantage of Haskell over OCaml/F#. However, I don't see it as a useful advantage in practice because parser combinators are so tedious during development (they require constant attention as types evolve): you want code generation like ocamlyacc or camlp4. OCaml is a very strong contender here, of course. Cheers, Jon. -Original Message- From: m...@proof-technologies.com [mailto:m...@proof-technologies.com] Sent: 10 November 2010 13:44 To: jonathandeanhar...@googlemail.com; ymin...@gmail.com; ar...@noblesamurai.com Cc: caml-l...@inria.fr Subject: Re: [Caml-list] Infix function composition operator So how does value restriction affect things here? (excuse my lack of knowledge) One thing about using a pipeline like this is that it relies on '|>' being left-associative (which it is due to OCaml's convention on operators that start with |). Mark. on 10/11/10 12:52 PM, Jon Harrop jonathandeanhar...@googlemail.com wrote: A pipeline operator is usually preferred over function composition in impure languages like OCaml and F# due to the value restriction.
For example, your example would be written in F# as: x |> op1 |> op2 |> op3 |> op4 |> op5 This style is very common in F#, particularly when dealing with collections. Cheers, Jon. -Original Message- From: caml-list-boun...@yquem.inria.fr [mailto:caml-list- boun...@yquem.inria.fr] On Behalf Of m...@proof-technologies.com Sent: 10 November 2010 07:00 To: ymin...@gmail.com; ar...@noblesamurai.com Cc: caml-l...@inria.fr Subject: Re: [Caml-list] Infix function composition operator on 10/11/10 3:45 AM, ymin...@gmail.com wrote: This is probably a minority opinion, but I have written and read quite a lot of OCaml code over the years, and I've seen surprisingly few effective uses of the composition operator. Somehow, I usually find that code that avoids it is simpler and easier to read. I agree that using a composition operator can make the code obtuse, and so should not be overused. But it's incredibly useful for certain situations: 1) If you are performing a long chain of composed operations, it avoids nested bracketing piling up. For example: (op5 << op4 << op3 << op2 << op1) x Instead of: op5 (op4 (op3 (op2 (op1 x)))) This sort of thing happens quite a lot in certain applications, e.g. in language processing, to get at subexpressions. 2) Creating an anonymous function to be passed as an argument, it avoids explicitly mentioning arguments of that function. This sort of thing can happen a lot in functional programming generally. For example: List.map (op2 << op1) xs Instead of: List.map (fun x -> op2 (op1 x)) xs Mark Adams ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs ___ Caml-list mailing list.
Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
RE: [Caml-list] Infix function composition operator
I am currently writing a concurrent server under contract in F#. Due to the poor performance of the default .NET serialization (over 170x slower than Marshal in OCaml!), I had to completely replace the serialization layer. Ideally, I would just write a script to compile my type definitions into serializers and deserializers over them. However, I haven't found time to do that yet so I am maintaining a set of combinators instead. They account for over 15% of the source code now. Every time I change a type (e.g. adding a field to a record type), I must update the combinators to reflect the changes. This is getting seriously tedious as the project is experimental and my types need to evolve rapidly. If I were using OCaml, Marshal would be easily fast enough and I wouldn't even need to update my type definitions by hand because I could use structural typing (polymorphic variants and objects). Unfortunately, this is not yet possible in F#. Another problem area has been the incrementality of garbage collection. OCaml is good in this respect among functional languages but still orders of magnitude behind the state-of-the-art. With enough care and attention, .NET can be almost as good as OCaml is ordinarily but .NET's worse cases are horrific: http://flyingfrogblog.blogspot.com/2010/10/can-you-repro-this-64-bit-net-gc-bug.html Cheers, Jon. -Original Message- From: caml-list-boun...@yquem.inria.fr [mailto:caml-list- boun...@yquem.inria.fr] On Behalf Of Stephan Tolksdorf Sent: 10 November 2010 16:11 To: caml-l...@inria.fr Subject: Re: [Caml-list] Infix function composition operator On Wed, Nov 10, 2010 at 14:13 -, Jon Harrop wrote: However, I don't see it as a useful advantage in practice because parser combinators are so tedious during development (they require constant attention as types evolve): you want code generation like ocamlyacc or camlp4. OCaml is a very strong contender here, of course. 
Could you maybe elaborate a bit on what you find tedious with regard to evolving types in the context of parser combinators? In my parser code (using FParsec in F#) most types get inferred by the compiler and in the remaining instances the type annotations can hardly be called tedious. Actually, I find the types and the Visual Studio tooltips with the inferred types rather helpful for development. - Stephan Cheers, Jon. -Original Message- From: m...@proof-technologies.com [mailto:m...@proof- technologies.com] Sent: 10 November 2010 13:44 To: jonathandeanhar...@googlemail.com; ymin...@gmail.com; ar...@noblesamurai.com Cc: caml-l...@inria.fr Subject: Re: [Caml-list] Infix function composition operator So how does value restriction affect things here? (excuse my lack of knowledge) One thing about using a pipeline like this is that it relies on '|>' being left-associative (which it is due to OCaml's convention on operators that start with |). Mark. on 10/11/10 12:52 PM, Jon Harrop jonathandeanhar...@googlemail.com wrote: A pipeline operator is usually preferred over function composition in impure languages like OCaml and F# due to the value restriction. For example, your example would be written in F# as: x |> op1 |> op2 |> op3 |> op4 |> op5 This style is very common in F#, particularly when dealing with collections. Cheers, Jon. -Original Message- From: caml-list-boun...@yquem.inria.fr [mailto:caml-list- boun...@yquem.inria.fr] On Behalf Of m...@proof-technologies.com Sent: 10 November 2010 07:00 To: ymin...@gmail.com; ar...@noblesamurai.com Cc: caml-l...@inria.fr Subject: Re: [Caml-list] Infix function composition operator on 10/11/10 3:45 AM, ymin...@gmail.com wrote: This is probably a minority opinion, but I have written and read quite a lot of OCaml code over the years, and I've seen surprisingly few effective uses of the composition operator. Somehow, I usually find that code that avoids it is simpler and easier to read.
I agree that using a composition operator can make the code obtuse, and so should not be overused. But it's incredibly useful for certain situations: 1) If you are performing a long chain of composed operations, it avoids nested bracketing piling up. For example: (op5 << op4 << op3 << op2 << op1) x Instead of: op5 (op4 (op3 (op2 (op1 x)))) This sort of thing happens quite a lot in certain applications, e.g. in language processing, to get at subexpressions. 2) Creating an anonymous function to be passed as an argument, it avoids explicitly mentioning arguments of that function. This sort of thing can happen a lot in functional programming generally. For example: List.map (op2 << op1) xs Instead of: List.map (fun x -> op2 (op1 x)) xs Mark Adams ___ Caml-list mailing
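The OCaml side of the serialization comparison in the thread above is just the standard Marshal module, which round-trips values of (almost) any type without per-type combinators. The `point` type here is illustrative, not one from the message.

```ocaml
(* Marshal serializes a value to a string and back with no per-type
   serializer/deserializer code to maintain as types evolve. *)
type point = { x : float; y : float }

let encode (v : point list) : string = Marshal.to_string v []
let decode (s : string) : point list = Marshal.from_string s 0

let pts = [ { x = 1.0; y = 2.0 }; { x = 3.0; y = 4.0 } ]
let round_trip = decode (encode pts)
```

Note that `from_string` is untyped on the wire: the type annotation on `decode` is a promise by the programmer, not something the runtime checks.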
RE: [Caml-list] How does OCaml update references when values are moved by the GC?
I was hoping for a little more detail, of course. :-) How is the mapping from old to new pointers stored? Does the GC rewrite all of the thread-local stacks in series before allowing any of them to continue? Does the write barrier record pointers written into the major heap so only those specific locations are rewritten at minor heap collections but the entire major heap is rewritten upon compaction? Can the GC distinguish between an array of ints and an array of pointers at run-time in order to avoid traversing all of the ints when trying to rewrite pointers? Also, any idea what the maximum proportion of the running time of a program is spent doing this rewriting? For example, if you fill a float->float hash table with values, does updating the pointers in the major heap account for a significant proportion of the total running time of the program? Cheers, Jon. -Original Message- From: caml-list-boun...@yquem.inria.fr [mailto:caml-list- boun...@yquem.inria.fr] On Behalf Of Damien Doligez Sent: 29 October 2010 08:48 To: caml users Subject: Re: [Caml-list] How does OCaml update references when values are moved by the GC? On 2010-10-28, at 23:48, Jon Harrop wrote: How does OCaml update references in the stacks and heap when values are moved by the GC? They are updated by the GC, of course. -- Damien ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
RE: [Caml-list] How does OCaml update references when values are moved by the GC?
I just found some interesting statements made by others on this post about optimizing OCaml's write barrier: http://eigenclass.org/R2/writings/optimizing-caml_modify In the comments, Mauricio said that caml_modify (the write barrier) accounted for over 30% of the total running time of the whole application. -Original Message- From: Jon Harrop [mailto:j...@ffconsultancy.com] Sent: 30 October 2010 18:16 To: 'Damien Doligez'; 'caml-l...@inria.fr' Subject: RE: [Caml-list] How does OCaml update references when values are moved by the GC? I was hoping for a little more detail, of course. :-) How is the mapping from old to new pointers stored? Does the GC rewrite all of the thread-local stacks in series before allowing any of them to continue? Does the write barrier record pointers written into the major heap so only those specific locations are rewritten at minor heap collections but the entire major heap is rewritten upon compaction? Can the GC distinguish between an array of ints and an array of pointers at run-time in order to avoid traversing all of the ints when trying to rewrite pointers? Also, any idea what the maximum proportion of the running time of a program is spent doing this rewriting? For example, if you fill a float-float hash table with values does updating the pointers in the major heap account for a significant proportion of the total running time of the program? Cheers, Jon. -Original Message- From: caml-list-boun...@yquem.inria.fr [mailto:caml-list- boun...@yquem.inria.fr] On Behalf Of Damien Doligez Sent: 29 October 2010 08:48 To: caml users Subject: Re: [Caml-list] How does OCaml update references when values are moved by the GC? On 2010-10-28, at 23:48, Jon Harrop wrote: How does OCaml update references in the stacks and heap when values are moved by the GC? They are updated by the GC, of course. -- Damien ___ Caml-list mailing list. 
Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
RE: [Caml-list] How does OCaml update references when values are moved by the GC?
Hi Michael, Thanks for the info. I stumbled upon Richard's excellent web page describing the internals of OCaml and the write barrier but it does not describe how the pointers actually get rewritten. Cheers, Jon. -Original Message- From: caml-list-boun...@yquem.inria.fr [mailto:caml-list- boun...@yquem.inria.fr] On Behalf Of Michael Ekstrand Sent: 30 October 2010 18:38 To: caml-list@yquem.inria.fr Subject: Re: [Caml-list] How does OCaml update references when values are moved by the GC? On 10/30/2010 12:15 PM, Jon Harrop wrote: I was hoping for a little more detail, of course. :-) How is the mapping from old to new pointers stored? Does the GC rewrite all of the thread-local stacks in series before allowing any of them to continue? I imagine so. Does the write barrier record pointers written into the major heap so only those specific locations are rewritten at minor heap collections but the entire major heap is rewritten upon compaction? Yes. The runtime maintains a 'remembered set', a list of pointers from the major heap back to the minor heap. Maintaining this set is why mutable data can be expensive in OCaml - any time you store a pointer into a mutable field, the runtime must check whether the new link is from the major to the minor heap and update the refs list accordingly. Richard WM Jones has details here: http://rwmj.wordpress.com/2009/08/08/ocaml-internals-part-5-garbage- collection/ Can the GC distinguish between an array of ints and an array of pointers at run-time in order to avoid traversing all of the ints when trying to rewrite pointers? Not that I know of. The tag block does not have a documented reserved value to indicate that - there are values to indicate an unboxed float array, a string, and an array of opaque values, but not an integer array (unless the opaque value flag is set for integer arrays). Also, any idea what the maximum proportion of the running time of a program is spent doing this rewriting? 
For example, if you fill a float -> float hash table with values, does updating the pointers in the major heap account for a significant proportion of the total running time of the program? In my data analysis jobs (which wind up allocating quite large heaps), the compactor almost never (detectably) runs. Minor cycles and major slices are a much larger concern in my experience. I work around that by increasing the minor heap size to decrease minor heap thrashing (my general rule of thumb is that I want each work unit, whatever that may be, to fit in the minor heap). It could well be that other applications will have different characteristics that trigger more compactions. I cannot speak for those applications. Further, when I have huge floating-point data structures, I'm usually using bigarrays (not because I choose them over arrays, typically, but because such code in my work frequently has to interact with BLAS or SPARSKIT at some point). - Michael ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
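The remembered-set mechanism Michael describes (record major-to-minor pointers at write time so a minor collection only rewrites those specific locations) can be sketched abstractly. This is a toy model of the idea with addresses modelled as plain ints; all names are mine, and it is not OCaml's actual caml_modify:

```ocaml
(* Toy model of a generational write barrier with a remembered set.
   "Minor" is a low address range; everything else counts as major heap. *)
type heap = {
  minor_lo : int;
  minor_hi : int;
  mutable remembered : int list;  (* major-heap fields pointing into minor *)
}

let in_minor h addr = addr >= h.minor_lo && addr < h.minor_hi

(* On every store of a pointer into a mutable field: if a major-heap
   field is made to point at a minor-heap value, record the field's
   address so a minor GC can rewrite just that location. *)
let write_barrier h ~field_addr ~new_target =
  if (not (in_minor h field_addr)) && in_minor h new_target then
    h.remembered <- field_addr :: h.remembered
```

This check on every mutable store is why mutation of pointer fields carries a cost in a generational collector.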
[Caml-list] How does OCaml update references when values are moved by the GC?
How does OCaml update references in the stacks and heap when values are moved by the GC? -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
[Caml-list] Asynchronous IO programming in OCaml
Is there a tutorial on using something like LWT for asynchronous programming in OCaml? I'm looking for an example like an echo server that handles clients concurrently without blocking threads, so it can handle thousands of clients without significant performance degradation. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
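For reference, the kind of server I mean looks roughly like this in Lwt: each client is handled by a lightweight cooperative thread, so thousands of connections do not need thousands of OS threads. This is a hedged sketch against my reading of the Lwt_io API (function names such as `establish_server_with_client_address` vary between Lwt versions), assuming the lwt and lwt.unix packages are installed:

```ocaml
(* Sketch of a non-blocking Lwt echo server.  Each connection runs in a
   cooperative Lwt thread; reads and writes yield instead of blocking. *)
let handle_client _addr (ic, oc) =
  let rec loop () =
    Lwt.bind (Lwt_io.read_line_opt ic) (function
      | Some line -> Lwt.bind (Lwt_io.write_line oc line) loop
      | None -> Lwt.return_unit)   (* client closed the connection *)
  in
  loop ()

let () =
  let addr = Unix.ADDR_INET (Unix.inet_addr_loopback, 5555) in
  Lwt_main.run
    (Lwt.bind
       (Lwt_io.establish_server_with_client_address addr handle_client)
       (fun _server -> fst (Lwt.wait ())))  (* never resolves: run forever *)
```

A quick test is `telnet 127.0.0.1 5555` from another terminal; each line typed should come back unchanged.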
RE: [Caml-list] from if...else to pattern matching
Note that you are using physical equality (==) and not structural equality (=). Also, you can always rewrite: if pred then true else false more simply as: pred -Original Message- From: caml-list-boun...@yquem.inria.fr [mailto:caml-list- boun...@yquem.inria.fr] On Behalf Of ben kuin Sent: 02 October 2010 15:05 To: caml-l...@inria.fr Subject: [Caml-list] from if...else to pattern matching hi I am trying to transform an if...else clause into pattern matching. I think I've tried a lot of approaches, but apparently I'm doing something fundamentally wrong. ~ (** Define behaviors against a constant 'x = c' with if...else and pattern matching. I want to see how to match a value against a function **) (* defining a '=' operator *) let (=) a b = if a b then true else if a == b then true else false;; let if_test c = if ( 4 = c ) then true else false ;; (* and now use the operator in pattern matching *) let match_test c = match ( _ = c ) (* pseudocode *) | 4 -> true | _ -> false ;; ~~~ thanks in advance ben ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
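For what it's worth, the intent above can be expressed directly, with no redefinition of (=) at all; a minimal sketch of what the poster seems to be after:

```ocaml
(* if/else version: "if pred then true else false" is just "pred". *)
let if_test c = 4 = c

(* Pattern-matching version: match the value against the constant
   pattern 4 directly, rather than trying to match against a function. *)
let match_test c =
  match c with
  | 4 -> true
  | _ -> false
```

Both functions have type `int -> bool` and behave identically; the match form generalizes naturally when there are several constants to distinguish.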
RE: [Caml-list] Compiling Ocaml sources to c sources
* Hacking a C generator inside OCaml is non-trivial, because of the garbage collector, curried function calls, tail recursion, etc. BTW, I was always surprised nobody had converted the bytecode interpreter into a via-C compiler by unwinding the C code the interpreter goes through to execute a specific OCaml bytecode program. You could even use this with something like Clang to get JIT compilation to native code. Not very efficient but surely much faster than interpreted bytecode... Cheers, Jon. ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
RE: [Caml-list] Compiling Ocaml sources to c sources
Hmm, this would only optimize the bytecode fetch/decode step of the ocaml execution cycle. I am not sure that it will result in much real-world speedup. Would be interesting to try. I suspect the C compiler would then optimize the sequences of operations on the data as well. For example, something like vector addition. In fact, that seems to be the main problem with many of these so-called JIT interpreters, which in my opinion, do not seem to have learnt from the HAL architectures of IBM OS's etc. Was probably also the problem with Transmeta; cheap compilation entails cheap performance. Can you elaborate? Cheers, Jon. ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
RE: [Caml-list] Re: How to re-implement the GC?
Hi Eray, Retrofitting a new multicore-friendly GC onto OCaml is difficult for two main reasons: 1. You want plug-and-play GCs but the OCaml compiler is tightly coupled to the old GC (although OC4MC has decoupled them!). 2. Recovering similar performance whilst reusing the same data representation is extremely difficult because the current design relies so heavily on lightweight allocation. You really want to change the data representation to avoid unnecessary boxing (e.g. never box or tag int, floats or tuples) in order to reduce the allocation rate and, consequently, the stress on the garbage collector but OCaml cannot express value types and its ability to represent polymorphic recursion makes this extremely difficult to implement. As Sylvain has said, OC4MC is your best bet if you want to try to write a new GC for OCaml. However, making more extensive changes has the potential to address many more problems (e.g. convey run-time type information for generic printing) so you might consider alternatives like trying to compile OCaml's bytecode into HLVM's IR for JIT compilation because HLVM already has a multicore friendly GC and it is much easier to develop. Ah, that's interesting. I wonder if it provides any real speedup on new architectures compared to storing the pointer in RAM. For a multicore GC, using a register to refer to thread-local data is a huge win because accessing thread-local data is very slow. I made that mistake in HLVM's first multicore-capable GC and it was basically useless as a consequence because all of the time was spent looking up thread-local data. When I started passing the thread-local data around as an extra argument to every function (not necessarily consuming a register all the time because LLVM is free to spill it), performance improved enormously. Cheers, Jon. ___ Caml-list mailing list. 
Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
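The point above about passing thread-local GC data as an explicit argument (rather than looking it up through slow thread-local storage on every allocation) can be sketched at the source level. This toy model is mine, not HLVM's actual code; it shows the shape of the transformation:

```ocaml
(* Each mutator thread owns its allocation state.  Threading it through
   as an explicit argument lets the code generator keep it in a register
   (or spill it locally) instead of doing a TLS lookup per allocation. *)
type thread_state = {
  mutable alloc_ptr : int;  (* next free word in this thread's nursery *)
  alloc_limit : int;        (* end of this thread's nursery *)
}

(* Bump allocation against the thread's own nursery: no locks, no TLS. *)
let alloc state n_words =
  let p = state.alloc_ptr in
  if p + n_words <= state.alloc_limit then begin
    state.alloc_ptr <- p + n_words;
    Some p                  (* address of the freshly allocated block *)
  end else
    None                    (* nursery exhausted: caller must collect *)
```

Every generated function takes a `thread_state` as an extra parameter; allocation then costs a compare and an add on the fast path.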
RE: [Caml-list] Re: How to re-implement the GC?
Other GC algorithms for Java/C# often make the assumption of long-lived objects with mutation. This is not the case for OCaml. They do favour mutation and, consequently, have cheaper write barriers, but both the JVM and CLR use pointer-bumping minor heaps for the first (nursery) generation to collect short-lived objects efficiently, just like OCaml. Cheers, Jon. ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
RE: [Caml-list] Re: Question about float refs.
You might prefer to multiply by 1.0 because adding 0.0 changes -0.0. Cheers, Jon. Some black magic is needed: let r = ref 0.0 in for i = 0 to 1000_000_000 do r := float i done; Printf.printf "%f\n" (!r +. 0.); Printf.printf "words: %f\n" (Gc.stat ()).Gc.minor_words and the reference will be placed in a register (the Caml heap will not be used inside the loop) - Dmitry Bely ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
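The difference between adding 0.0 and multiplying by 1.0 is observable even though 0.0 and -0.0 compare equal under (=); a small demonstration using the sign of the reciprocal:

```ocaml
(* Under IEEE 754 round-to-nearest, (-0.0) +. 0.0 = +0.0, whereas
   (-0.0) *. 1.0 preserves the sign of zero. *)
let neg_zero = -0.0
let after_add = neg_zero +. 0.0   (* +0.0: sign lost *)
let after_mul = neg_zero *. 1.0   (* -0.0: sign kept *)

(* 1.0 /. +0.0 = infinity and 1.0 /. -0.0 = neg_infinity, so the
   reciprocal exposes the sign that (=) cannot see. *)
let sign_of_zero z = if 1.0 /. z > 0.0 then `Pos else `Neg
```

So `+. 0.0` is not a semantics-preserving no-op on floats, while `*. 1.0` is (for zeros; NaN payloads aside).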
RE: [Caml-list] Llama Light: a simple implementation of Caml
Try to remove all assumptions of uniform run-time representation from the compiler because avoiding boxing gives huge performance gains and makes it much easier to write a performant garbage collector. You'll need to sacrifice polymorphic recursion though, which you probably already have anyway. Cheers, Jon. From: caml-list-boun...@yquem.inria.fr [mailto:caml-list-boun...@yquem.inria.fr] On Behalf Of ivan chollet Sent: 30 August 2010 11:57 To: Jeremy Bem Cc: caml-list List Subject: Re: [Caml-list] Llama Light: a simple implementation of Caml OK. This looks nice and I would be pleased if you could put a few pointers or explanations on your webpage about your typechecker implementation and how it differs from the OCaml typechecker. I will get some free time this week to implement yet another runtime and bytecode compiler from scratch. Not sure if it will be completed at the end of the week, but I'll definitely be interested to know more about the theoretical motivations of works like yours! On Mon, Aug 30, 2010 at 2:37 AM, Jeremy Bem jere...@gmail.com wrote: bluestorm: Thank you for the bug report. The toplevel issue has been fixed in the version now posted. Do you see a nice way to add let-generalization without reintroducing type levels? I was pleased to remove those. Ivan: It was originally forked from Caml Light but now includes more code from OCaml. The typechecker is mostly original code at this point; the compiler is OCaml's with minimal changes to accommodate the new typechecker; the runtime is almost identical to OCaml's. -Jeremy On Sun, Aug 29, 2010 at 6:52 AM, bluestorm bluestorm.d...@gmail.com wrote: When using the toplevel, declaration phrases fail (looks like a linking problem), but expressions work as intended: $ llama Llama Light version 0.0828 # 1 + 1;; - : int = 2 # let x = 1 + 1;; Error: Reference to undefined global `Toploop' I made my tests using llamac -i foo.ml.
I found it startling that the most important differences to my eyes are buried, on the web page, under lines of relatively boring documentation: In Llama Light (and in contrast to other Caml implementations): - let does not generalize. - Phrases are generalized immediately. In particular, let foo = ref [] does not typecheck. - The value restriction is not relaxed. (This is similar to Caml Light.) These choices simplify the implementation while having relatively little impact on the user. You cite the Let Should Not Be Generalised paper. There is however a difference in your application of the maxim: in the paper, local lets that have polymorphic type annotations are generalised, while in your system it is not possible to force generalisation. I had a look at the typer, and it's indeed rather simple; it seems it would be quite simple to implement generalized let (when encountering annotations or with a different syntactic construct: letg .. = .. in ...), but the added complexity is equivalent to adding let generalization by default. Is the presence of let-generalization a real issue in your work? ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
RE: [Caml-list] Llama Light: a simple implementation of Caml
If you can keep it agnostic wrt boxing then you should be able to plug it into HLVM easily for much faster numerics and parallelism. Cheers, Jon. From: ivan chollet [mailto:ivan.chol...@gmail.com] Sent: 30 August 2010 18:10 To: Jon Harrop Cc: Jeremy Bem; caml-list List Subject: Re: [Caml-list] Llama Light: a simple implementation of Caml clearly didn't plan to support polymorphic recursion in any way. I don't even plan to support lexical scoping... As for data representation I'm actually boxing everything except ints and function pointers. I found out that it leads to the simplest design, which is appropriate for a project that I don't want to take me more than a few days. On Tue, Aug 31, 2010 at 1:57 AM, Jon Harrop jonathandeanhar...@googlemail.com wrote: Try to remove all assumptions of uniform run-time representation from the compiler because avoiding boxing gives huge performance gains and makes it much easier to write a performant garbage collector. You'll need to sacrifice polymorphic recursion though, which you probably already have anyway. Cheers, Jon. From: caml-list-boun...@yquem.inria.fr [mailto:caml-list-boun...@yquem.inria.fr] On Behalf Of ivan chollet Sent: 30 August 2010 11:57 To: Jeremy Bem Cc: caml-list List Subject: Re: [Caml-list] Llama Light: a simple implementation of Caml OK. This looks nice and I would be pleased if you could put a few pointers or explanations on your webpage about your typechecker implementation and how it differs with OCaml typechecker. I will get some free time this week and to implement yet another runtime and bytecode compiler from scratch. Not sure if it will be completed at the end of the week, but i'll be definitely interested to know more about the theoretical motivations of works like yours! On Mon, Aug 30, 2010 at 2:37 AM, Jeremy Bem jere...@gmail.com wrote: bluestorm: Thank you for the bug report. The toplevel issue has been fixed in the version now posted. 
Do you see a nice way to add let-generalization without reintroducing type levels? I was pleased to remove those. Ivan: It was originally forked from Caml Light but now includes more code from OCaml. The typechecker is mostly original code at this point; the compiler is OCaml's with minimal changes to accommodate the new typechecker; the runtime is almost identical to OCaml's. -Jeremy On Sun, Aug 29, 2010 at 6:52 AM, bluestorm bluestorm.d...@gmail.com wrote: When using the toplevel, declaration phrases fail (looks like a linking problem), but expressions work as intented : $ llama Llama Light version 0.0828 # 1 + 1;; - : int = 2 # let x = 1 + 1;; Error: Reference to undefined global `Toploop' I made my tests using llamac -i foo.ml. I found it startling that the most important difference to my eyes are buried, on the web page, under lines of relatively boring documentation : In Llama Light (and in contrast to other Caml implementations): - let does not generalize. - Phrases are generalized immediately. In particular, let foo = ref [] does not typecheck. - The value restriction is not relaxed. (This is similar to Caml Light.) These choices simplify the implementation while having relatively little impact on the user. You cite the Let Should Not Be Generalised paper. There is however a difference in your application of the maxim : in the paper, local let that have polymorphic type annotations are generalised, while in your system it is not possible to force generalisation. I had a look at the typer, and it's indeed rather simple; it seems it would be quite simple to implement generalized let (when encountering annotations or with a different syntaxic construct : letg .. = .. in ...), but the added complexity is equivalent to adding let generalization by default. Is the presence of let-generalization a real issue in your work ? ___ Caml-list mailing list. 
Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
RE: [Caml-list] interest in a much simpler, but modern, Caml?
Presumably, in general, that would need to be: If Bool.(Int.(x=42) || Int.(x=45)) . At which point things start to look hairy. To be honest, I see this delimited overloading as the best of a bad job in the absence of equality types or per-type functionality. I think what we really want is to be able to define custom equality, comparison and hashing on types when structural versions are not applicable. F#'s solution is pretty good. Type classes are a generalization but I do not see that they buy you much for the added complexity. I'd have thought this (at least equality types) would be worth putting in a minimalistic language because it is so useful. Cheers, Jon. From: Jeremy Bem [mailto:jere...@gmail.com] Sent: 12 August 2010 01:22 To: Jon Harrop Cc: bluestorm; caml-list List; Florian Weimer Subject: Re: [Caml-list] interest in a much simpler, but modern, Caml? On Wed, Aug 11, 2010 at 9:02 AM, Jon Harrop jonathandeanhar...@googlemail.com wrote: What happens when you do: if Int.(x = 42 || x = 45) then ... else ... Presumably it either barfs on the assumption that || refers to bitwise-or between ints, or we're back to inventing progressively more absurd operator names for each individual combination of types over which they might work. How so? I think this is a borderline case (even in C++, || does not refer to bitwise-or). But even if Int.(||) *were* defined as some sort of integer operation, one could simply write: if Int.(x = 42) || Int.(x = 45) Also, I think the discussion has shifted. For me, the local open is a reasonably appealing way to stop using OCaml's exotic polymorphic operators whose behavior depends on the runtime representation and which don't respect type abstraction. (For example, ocamllex uses Pervasives.(=) to test whether Map's are empty, but this breaks if the Map representation changes.) Moreover the syntax even maintains OCaml compatibility thanks to the recent update. 
But now we seem to be talking about operator overloading, and I'm just not convinced it's necessary at all in a system with a minimalist aesthetic. Back to the local opens, I find that I'm hesitant to add so many of them, especially for equality. Polymorphic equality is hardly unnatural, after all (cf. higher-order logic). I wonder, do any practical languages use quotient types to implement custom equality predicates? In principle, Pervasives.(=) ought to be the finest reasonable equivalence relation on a type, which could then be coarsened: type foo = Foo of int | Goo of string let _ = assert (Foo 3 <> Goo "3") (* duh *) let foo_equiv x y = match x, y with Foo a, Foo b -> a = b | Goo a, Goo b -> a = b | Foo a, Goo b | Goo b, Foo a -> string_of_int a = b type goo = foo / foo_equiv (* automatically creates goo_of_foo *) let _ = assert (goo_of_foo (Foo 3) = goo_of_foo (Goo "3")) This would require runtime support. I envision that every goo is a block whose tag is Quotient_tag and which stores a foo_equiv closure in its first Obj field. As it happens, this approach would dovetail with my plans for an integrated proof assistant. Of course it lacks the conservatism I've been promoting :) -Jeremy ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
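The coarsened equivalence in the example can already be written in plain OCaml today; only the quotient type itself (`type goo = foo / foo_equiv`) is hypothetical syntax needing new runtime support. A runnable sketch of the equivalence part, reconstructed from the garbled code above:

```ocaml
type foo = Foo of int | Goo of string

(* A coarser equivalence than structural equality: Foo n and Goo s are
   identified when s is the decimal rendering of n. *)
let foo_equiv x y =
  match x, y with
  | Foo a, Foo b -> a = b
  | Goo a, Goo b -> a = b
  | Foo a, Goo b | Goo b, Foo a -> string_of_int a = b
```

Structural (=) still distinguishes `Foo 3` from `Goo "3"`; `foo_equiv` identifies them, which is exactly the coarsening a quotient type would make the default.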
RE: [Caml-list] about OcamIL
Xavier Clerc wrote: Jon Harrop jonathandeanhar...@googlemail.com a écrit : I don't think this is heated at all. We were talking about high performance languages and you cited a bunch of languages that get whipped by Python on this benchmark: http://flyingfrogblog.blogspot.com/2009/04/f-vs-ocaml-vs-haskell-hash-table.html Acknowledged. "Whipped" here means 2 times slower on that particular benchmark, while Python is rarely within an order of magnitude of OCaml code (cf. the language shootout). Moreover, hashtables are ubiquitous in Python (and hence probably particularly optimized), while they are not so common in Haskell or Caml. I greatly value efficient generic collections. and references to benchmarks that back up your claims in this thread. http://fsharpnews.blogspot.com/2010/05/java-vs-f.html Point taken. Just notice that the 17x factor is observed on the micro-benchmark, while on the larger one the two platforms seem on par. Sure, but those two benchmarks are testing completely different things. The shootout's knucleotide is a larger benchmark that still uses hash tables and Java is still 4x slower because the JVM cannot express a generic hash table efficiently. There are many such problems where the lack of value types on the JVM is a serious impediment to performance. Here is a question about the micro-benchmark: do you know whether F# monomorphizes the collection in this example? If it turns out to be done, one may probably argue that the problem is more related to the language than to the platform (just recycling an objection made on the page you pointed out). I'm not sure what you mean here. Both of the programs are just using the generic hash table implementation native to their platform. Neither language does anything of relevance to optimize the code.
.NET is a lot faster and uses a lot less memory than the JVM because it stores keys and values unboxed in the spine of the hash table (because they are value types, whereas the JVM boxes them), and because the insertion function is monomorphized at JIT time. Scala on the JVM is 7x slower than C++ on this benchmark: http://shootout.alioth.debian.org/u64q/benchmark.php?test=all&lang=scala&lang2=gpp Agreed, but it seems that if you aggregate the results of the different benchmarks, Scala is on average only 1.5x from C++ (but far away in terms of memory consumption). The 7x factor is the worst observed result, the median being 2x. Sure. This seems to be a difference in our interpretation of high performance. If a language or platform can be 17x slower than others then I would not call it high performance even if it is competitively performant elsewhere. Indeed, I would completely disregard averages on benchmark suites and focus on outliers because they convey more useful information. Granted that was a microbenchmark but the effect is severe enough to afflict many real programs. It may just end up that we have different perceptions of high performance, and of the trade-offs we are going to make in our language / platform choices. Probably. What languages do you not consider to be high performance? I am not sure it is that easy to compare languages, but measuring compiler performances: any compiler that produces code that runs within -let's say- 5x of the fastest one around, on a bunch of wide-spectrum benchmarks (e.g. numerical code *plus* string matching *plus* tree walking, etc). Maybe it should also be mentioned that I am more versed in symbolic computations. Then you're probably more interested in OCaml's GC vs parallelism when performance is important. Regarding trade-offs, I am also inclined to favor Open Source solutions and higher-level languages (the trade-off being here execution time vs programming/debugging time).
I agree in general, of course, but I'm not sure higher-level languages means much in this context. Is C# higher level than Java? Maybe, but I'm interested in the value types and approach to generics here. Monomorphization and unboxed tuples get you a long way towards decent performance without a top-notch GC tuned for functional languages. That makes it feasible to implement with a naïve multicore-capable GC, which is precisely the current direction of HLVM. PS: as an aside, I used the word references for academic publications that went through a reviewing process, not blog entries. I see. I value reproducible results far higher than peer reviewed academic papers, particularly in this context. Cheers, Jon. ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
RE: [Caml-list] about OcamIL
Xavier Clerc wrote: Jon Harrop jonathandeanhar...@googlemail.com a écrit : Xavier Clerc: Le 14 mai 2010 à 12:40, Jon Harrop a écrit : Xavier Clerc wrote: Limiting myself to the JVM... Moreover, at least Scala and Bigloo deliver excellent performances. I have benchmarks where the JVM is well over 10x slower than .NET. So I do not regard any JVM-based language as high performance. Quite ironically, by scratching the surface, one would discover that both quoted projects can also target .NET (not tested that though). Does Bigloo.NET support value types? Does Scala.NET use .NET (2.0) generics? Not AFAICT. Name dropping them in the context of high performance language implementations is more than a little bit silly... First off, public insult seems quite superfluous. I was not trying to insult you. Your examples are silly because they are incomplete and untested. Do you even have either of them working right now? AFAICT, Scala.NET is known not to work and Bigloo.NET is still having dozens of core bugs fixed. We should be able to handle a heated debate without resorting to that. I don't think this is heated at all. We were talking about high performance languages and you cited a bunch of languages that get whipped by Python on this benchmark: http://flyingfrogblog.blogspot.com/2009/04/f-vs-ocaml-vs-haskell-hash-table.html And I still wait for a clear statement of your level for high performance, Within 2x of ANSI C compiled with gcc on all practically-relevant benchmarks without dropping to low-level code, e.g. GHC's FFI in Haskell. and references to benchmarks that back up your claims in this thread. http://fsharpnews.blogspot.com/2010/05/java-vs-f.html As you seem to come from an academic background, I expect facts and references, and not ad hominem attacks and fuzzy unbacked claims. An ad-hominem attack is an attack against a person. I attacked your examples, not you.
Unless you show that neither Bigloo nor Scala meet your (to be defined) criteria for high performance, my counterexamples still stand. Are you talking about Bigloo.NET and Scala.NET or have you gone back to the original discussion about JVM-based languages? Scala on the JVM is 7x slower than C++ on this benchmark: http://shootout.alioth.debian.org/u64q/benchmark.php?test=all&lang=scala&lang2=gpp The JVM's hash table is 17x slower than .NET's on this benchmark: http://fsharpnews.blogspot.com/2010/05/java-vs-f.html I think that is not high performance by any reasonable definition and this reflects fundamental deficiencies in the VM itself, so there is no hope of working around it in general. I have not been able to get Bigloo to run: it was deleted from Debian and Ubuntu (and the shootout) and the source distribution barfs during configuration with ./install-gc-7.1: 39: patch: not found. It may just end up that we have different perceptions of high performance, and of the trade-offs we are going to make in our language / platform choices. Probably. What languages do you not consider to be high performance? Cheers, Jon. ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
RE: [Caml-list] about OcamIL
Xavier Clerc: On 14 May 2010, at 12:40, Jon Harrop wrote: Xavier Clerc wrote: Limiting myself to the JVM... Moreover, at least Scala and Bigloo deliver excellent performances. I have benchmarks where the JVM is well over 10x slower than .NET. So I do not regard any JVM-based language as high performance. Quite ironically, by scratching the surface, one would discover that both quoted projects can also target .NET (not tested that though). Does Bigloo.NET support value types? Does Scala.NET use .NET (2.0) generics? Not AFAICT. Name dropping them in the context of high performance language implementations is more than a little bit silly... Cheers, Jon. ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
RE: [Caml-list] about OcamIL
Goswin wrote: Hardly any business today has an inhomogeneous environment. And if the environment is homogeneous then the vm gives you 0 advantage. It just costs you overhead to emulate. A Common Language Runtime (CLR) is an obvious counter example: the shared VM gives you safe and high-level interoperability between languages. For example, you can pass garbage collected data structures between languages by reference because they share a single GC. Without a CLR, you generally resort to painstakingly copying everything for no reason, give up for anything non-trivial (like closures) and often end up with segmentation faults and no working code. I was once asked how someone might interoperate between Standard ML and OCaml (two very closely related languages) and my answer was XMLRPC. Contrast that with F# development where almost every program in existence relies entirely upon seamless two-way interoperability with C# libraries like WPF. Cheers, Jon. ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
RE: [Caml-list] about OcamIL
Raoul Duke wrote: On Fri, May 14, 2010 at 11:59 AM, ben kuin benk...@gmail.com wrote: but that would be the big benefit of a clr like vm: It doesn't matter how messed up, chaotic or just heterogen the environment is as long as you can count on a regular execution of your portable bytecode. of course it matters: there must be the resources to get the vm ported across all the fubar variations of the ecosystem. the combinatorics has to be dealt with somewhere. that kind of complexity is less in the hegemonic windows os world, i hypothesize. Not really. Windows supports a far wider variety of hardware than Linux and Apple supports even less. Providing consistency was one of the major advantages of .NET that had people building on it originally. If you want robustly-deployable hardware-accelerated GUI apps then WPF and, therefore, .NET is your only choice on Windows and you have zero choices on Linux. If you wanted to build something comparable on Linux you would use OpenGL and must then test all software/hardware combinations to see which were unreliable (OpenGL drivers are often very unreliable on Linux). I saw this first hand when I productized OpenGL-based presentation software written in OCaml. It was a catastrophe: with segfaults on 80% of customers machines that we could not reproduce. We canned it and never tried to sell Linux-based software again. Cheers, Jon. ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
RE: [Caml-list] about OcamIL
Xavier Clerc wrote: Limiting myself to the JVM... Moreover, at least Scala and Bigloo deliver excellent performances. I have benchmarks where the JVM is well over 10x slower than .NET. So I do not regard any JVM-based language as high performance. Cheers, Jon. ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
RE: [Caml-list] about OcamIL
Ben Kuin wrote: technical: - in .NET everything is easy (from the surface): you have your source file (hello.cs), you take your compiler (cs.exe) and compile it to a msil bytecode file (hello.dll). You can run reflection tools on hello.dll or link it to an exe or generate back to source. This bytecode file is your friend. You can run it on a bunch of runtimes like .net clr, or on mono, or rotor, or dotgnu. You can register libraries in a container to prevent versioning problems with future releases. I couldn't figure out those equivalents in hlvm or llvm. They haven't been written yet. :-) - with HLVM things are complex (for me): What is the role of the big underlying llvm infrastructure? LLVM provides the compilation to native code, both static and JIT. Why do we even need hlvm if we have llvm plus ocaml bindings? LLVM's intermediate representation is a low-level assembly language with no memory safety and no automatic memory management. HLVM is a layer on top of LLVM that provides a higher-level intermediate language that is memory safe and garbage collected. Does hlvm have its own bytecode? If yes, where is it specified? HLVM's IR is evolving much too quickly to be worth specifying yet. Is hlvm an ocaml library or is it a free standing vm? Today, HLVM is an OCaml library. In the future, HLVM will be a free standing VM. Maybe for you these are trivial questions, but I still don't get it, while with .net I never had a problem getting the idea. Ultimately, HLVM should be just as easy to understand. The only reason it appears complicated today is that HLVM is a set of evolving fragments that are gradually stabilizing and being pulled together to form a progressively more capable core VM. What if hlvm would really take off, could they set it free and move the homepage to sourceforge? HLVM is free. I would always keep the core (VM and standard library) open source. I might write an HLVM for Scientists book, sell libraries or implement features for a fee but nothing evil. 
:-) In fact, one dream I have is to create a commerce-friendly platform for tools and libraries. I think Linux has really suffered from being anti-commerce rather than truly free. Cheers, Jon. ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
RE: [Caml-list] about OcamIL
Ben Kuin wrote: A little off topic, but how is Mono/Unix these days? Still leaks memory, you refer to your examinations? (http://flyingfrogblog.blogspot.com/2009/01/mono-22-still-leaks-memory.html?showComment=1233522107493#c7872630239059031867) where you say yes and the mono devs say no to memory leaking? Yes. has broken TCO Again, I think other people do not have the same opinion on this (http://flyingfrogblog.blogspot.com/2009/01/mono-does-not-support-tail-calls.html) Yes. They are wrong. I've introduced the post with some license related concerns, maybe I should take a step back and think about what I want: 1. - programming in a ML like language (especially the caml family since I really like those lightweight type definitions and the pattern matching that can be applied on those) 2. - high performance runtime, preferably a jit based vm, no problems with TCO 3. - a true open source license (approved by Open Source Initiative or by Free Software Foundation) I think these 3 points are REASONABLE but the combination of those 3 items is NONEXISTENT. I'm afraid the situation is far worse. Even if you relax your conditions from ML-like to any functional language and even allow broken TCO, there are not only no language implementations satisfying those weaker conditions but nobody is even trying to create such a language implementation. Ocaml on HLVM: I would appreciate it if Jon could make a clear statement whether this is something serious or just a little toy. HLVM is not yet ready for serious use and it may well never compile OCaml but at least it is now compiling code like this: http://flyingfrogblog.blogspot.com/2010/04/variant-types-and-pattern-matching-in.html A last idea: What do you think about libjit? They claim that a jvm/clr like runtime could be built in weeks. Wouldn't it be nice to have a fast vm for Ocaml (ocamljit)? Does someone have experience with this? I think writing a fast vm is hard, but a fast vm for a functional language is nearly impossible? 
Maybe OcamIL could then be used as a model for a jit backend, since its MSIL output already runs on libjit (DotGnu, alias pnet) I think LLVM is a much better choice than libjit. Once you've got that kind of solid foundation to build upon and a decent language like OCaml to write in, you can write a decent FPL implementation in a few man-months. The problem is finding the time... Cheers, Jon. ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
RE: [Caml-list] about OcamIL
A little off topic, but how is Mono/Unix these days? Last I checked (2 years ago) it implemented the basic libraries and runtimes but had terrible performance. Is it now on par with Windows? Still leaks memory, has broken TCO and runs like a dog. Mono has also fallen even farther behind now that .NET 4 is out. However, they have at least stated that they intend to start trying to support F# on Mono. Then again, they stated about 6 years ago they were going to replace their crappy conservative GC with one that might actually work but they never managed to do that either... Cheers, Jon. ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: WAS Re: [Caml-list] Re: The need to specify 'rec' in a recursive function defintion
On Tuesday 16 February 2010 16:47:03 Grant Rettke wrote: On Tue, Feb 16, 2010 at 10:21 AM, Ashish Agarwal agarwal1...@gmail.com wrote: let rec Do OCaml'er look at let rec more as being a message to the programmer, rather than the compiler, that the way I want to define this function is recursively so even if 'f' was previously bound you know which one I mean? I see it as resolving an ambiguity for both the programmer and compiler. There are alternatives as others have mentioned but none seem particularly good or bad to me. Moreover, the burden of rec is tiny so I don't think it is worth discussing in such detail. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] Re: The need to specify 'rec' in a recursive function defintion
On Monday 15 February 2010 15:46:58 Stefan Monnier wrote: Till Varoquaux had written: Let's make things clear here: the rec *really* is a feature; Nobody said otherwise. Eliminating the rec is also a feature. Those two features are mostly incompatible, and many reasonable people disagree on which one of the two is more important. Stefan who extensively used that feature in SML, but happens to prefer the other feature nevertheless Standard ML doesn't have the feature that Till described. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
[Caml-list] deriving show
IIRC, there is an OCaml macro that will autogenerate the code to print values of a variant type from the type definition. Does it handle polymorphic variants? -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] An extended comparative study of language support for generic programming
On Wednesday 10 February 2010 23:00:44 Raoul Duke wrote: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.110.122&rep=rep1&type=pdf :-) http://groups.google.com/group/comp.lang.functional/browse_thread/thread/d654fedd6efdf753/3ee82770d5e79402#3ee82770d5e79402 See my response there: http://groups.google.com/group/comp.lang.functional/msg/2cb15a6281087b04 :-) I was wondering if anyone here was familiar with this work and/or had anything to say about their OCaml solutions and discussion? -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] llvm?
On Wednesday 10 February 2010 22:51:33 Raoul Duke wrote: hi, any news about / anybody working on ocaml-on-llvm? I don't believe anyone is working on porting OCaml to LLVM. The nearest work is probably my own HLVM project which reached a major milestone recently and is now capable of high-performance parallel programming: http://flyingfrogblog.blogspot.com/2010/01/naive-parallelism-with-hlvm.html From the point of view of an OCaml programmer, HLVM is a DSL that drops various features from OCaml (e.g. polymorphic recursion) in order to provide easily-obtained, predictable and high performance. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] Re: The need to specify 'rec' in a recursive function defintion
On Wednesday 10 February 2010 22:25:44 Till Varoquaux wrote: Some (including me) would even argue that it is sad that type definitions don't use rec. Agreed. Less useful than rec on function definitions but that would still be useful sometimes. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] The need to specify 'rec' in a recursive function defintion
On Tuesday 09 February 2010 22:01:03 Gerd Stolpmann wrote: In the ML community it is consensus that a recursive function is a total different thing than a non-recursive function... Note that Standard ML does this differently to CAML/OCaml. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] The need to specify 'rec' in a recursive function defintion
On Tuesday 09 February 2010 20:50:33 Saptarshi Guha wrote: Hello, I was wondering why recursive functions need to be specified with rec. According to Practical Ocaml, to inform the compiler that the function exists. But when entering the function definition, can't the compiler note that the function is being defined so that when it sees the function calling itself, it won't say Unbound value f? How is the knowledge of a function being rec taken advantage of (in ocaml) as opposed to other languages (leaving aside tail call optimization)? Wouldn't one way of detecting a recursive function be to see if the function indeed calls itself? let f x = x + 1 let f x = 2 * f x Is the latter f recursive or not? See my answer to the same question on stack overflow: http://stackoverflow.com/questions/900585/why-are-functions-in-ocaml-f-not-recursive-by-default/1891573 -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
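The distinction in the two-line example above can be seen in a complete (if contrived) program: without rec, the f in the body refers to the previous binding, so the second definition is shadowing, not recursion.

```ocaml
(* Without [rec], the [f] in the body refers to the earlier binding,
   so the second [f] shadows the first rather than recursing. *)
let f x = x + 1
let f x = 2 * f x        (* computes 2 * (x + 1) *)
let () = assert (f 3 = 8)
```

With `let rec f x = 2 * f x` the same body would instead loop forever, which is exactly the ambiguity the rec keyword resolves.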
Re: [Caml-list] The need to specify 'rec' in a recursive function defintion
On Tuesday 09 February 2010 22:31:49 Saptarshi Guha wrote: Yes, I see that f isn't recursive, because it simplifies down to 2*(x+1) but for a reader (at least myself) it can be a bit tricky to consume. But for a writer it can be useful to produce. For example, imagine a function in an interpreter that evaluates an expression to a value in the context of a set of variable bindings: let rec eval vars = function ... | Add(f, g) -> eval vars f + eval vars g | Mul(f, g) -> eval vars f * eval vars g ... Such functions make a *lot* of recursive calls. Now imagine you want to inject some code around every external call into that eval function but not recursive calls, e.g. logging exceptional returns. Do you want to change the name of that eval function not only in its definition but at every recursive call in its body? No. Fortunately, in OCaml you can just append a definition to supersede eval without touching the original definition at all: let eval vars expr = ... let result = eval vars expr in ... result This is a useful and common OCaml idiom. My experience of reading the /definition/ of a function which includes a call to itself is that it is recursive. Definitions are often superseded in ML. On the stackoverflow post, you mentioned that the British ML branch forces different names (since recursion is by default), and though it does pollute the namespace, personally I find it easier to read. You can do that in OCaml as well, if you choose. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
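To make the interpreter idiom above self-contained, here is a minimal sketch; the Num and Var cases and the association-list environment are assumptions added for illustration, not from the original message.

```ocaml
(* A minimal expression interpreter, sketching the idiom above. *)
type expr =
  | Num of int
  | Var of string
  | Add of expr * expr
  | Mul of expr * expr

let rec eval vars = function
  | Num n -> n
  | Var x -> List.assoc x vars
  | Add (f, g) -> eval vars f + eval vars g
  | Mul (f, g) -> eval vars f * eval vars g

(* Superseding definition: wraps every *external* call without
   renaming the many recursive calls inside the original [eval]. *)
let eval vars expr =
  (* logging of arguments or exceptional returns would go here *)
  let result = eval vars expr in
  result

let () = assert (eval [ "x", 3 ] (Add (Var "x", Mul (Num 2, Num 4))) = 11)
```

The non-rec wrapper sees the recursive eval because shadowing only takes effect after the whole definition, which is precisely the behavior the rec rule buys.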
[Caml-list] Power serious
I stumbled upon the following article that describes a remarkably simple implementation of arithmetic over power series in Haskell: http://www.cs.dartmouth.edu/~doug/powser.html This is the only compelling example of Haskell I have ever seen and I'd like to see this rewritten in other languages for comparison. Has anyone tried to translate this into OCaml? -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
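For the curious, a hedged sketch of what the first operations of McIlroy's power-series code (sum and product) might look like in OCaml using lazy streams; the names and representation here are my own, not taken from the article.

```ocaml
(* Power series as infinite lazy streams of coefficients. *)
type series = Cons of float * series Lazy.t

let rec add (Cons (f, fs)) (Cons (g, gs)) =
  Cons (f +. g, lazy (add (Lazy.force fs) (Lazy.force gs)))

let rec scale c (Cons (f, fs)) =
  Cons (c *. f, lazy (scale c (Lazy.force fs)))

(* (f0 + x F)(g0 + x G) = f0*g0 + x (f0*G + F*(g0 + x G)) *)
let rec mul (Cons (f, fs)) (Cons (g, gs) as gser) =
  Cons (f *. g,
        lazy (add (scale f (Lazy.force gs))
                  (mul (Lazy.force fs) gser)))

let rec take n (Cons (x, xs)) =
  if n = 0 then [] else x :: take (n - 1) (Lazy.force xs)

(* The geometric series 1/(1 - x) = 1 + x + x^2 + ... ; its square
   has coefficients 1, 2, 3, 4, ... *)
let rec ones = Cons (1., lazy ones)
let () = assert (take 4 (mul ones ones) = [ 1.; 2.; 3.; 4. ])
```

Haskell's pervasive laziness makes the original shorter; in OCaml the forcing is explicit, which is most of the translation cost.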
Re: [Caml-list] memory profiling
On Tuesday 05 May 2009 13:45:18 dmitry grebeniuk wrote: 2009/5/5 Christoph Bauer christoph.ba...@lmsintl.com: what is the best option to do memory profiling with ocaml? Is there a patch of ocaml-memprof for 3.11.0? What about objsize? If you want to use objsize with ocaml 3.11, you should get the new version of objsize -- 0.12: http://forge.ocamlcore.org/frs/?group_id=3 OCaml has new heap since 3.11... Can anyone elaborate on this? -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
[Caml-list] HLVM ray tracer performance
I just published results for the ray tracer benchmark written in HLVM and compared to other languages including OCaml: http://flyingfrogblog.blogspot.com/2010/01/hlvm-on-ray-tracer-language-comparison.html Note that these results were obtained with HLVM's multicore garbage collector enabled. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] ocaml as editor extension language
On Tuesday 05 January 2010 07:24:42 Joel Reymont wrote: You cannot embed OCaml and use it as an editor extension language unless 1) your editor is open source, or I believe there are several issues surrounding the QPL license that much of the OCaml distribution is under: 1. You do not have the right to distribute modified source code freely and must distribute modifications independently (e.g. as patches only). 2. The initial developer is given rights to your derivative work. 3. The QPL is not GPL compatible so your derivative work cannot use GPL libraries. See: http://www.gnu.org/licenses/license-list.html Note that OCaml's tools are under the QPL, including ocamllex and ocamlyacc. So you cannot even use that code to write a syntax highlighter for your editor without being subject to these requirements. 2) you are a member of the consortium and pay 2K EUR/year Is that correct? You could always ask nicely if they mind you violating their license. :-) -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] Favorite OCaml editor?
On Tuesday 05 January 2010 08:45:04 Maxence Guesdon wrote: My favorite editor is Chamo: http://home.gna.org/cameleon/chamo.en.html Nice! Why do you prefer it? -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] Favorite OCaml editor?
On Tuesday 05 January 2010 07:31:45 Daniel Bünzli wrote: Your favorite is key here; I appreciate your human input as I can use a search engine to find any old OCaml editor easily. Then I think a more interesting question is, what features do you absolutely need to be productive? I'm rather low tech and not the power user type but still I couldn't do it without (keyboard access to): 1) Syntax highlighting and reasonably automatic indentation following ocaml's programming guidelines [1] Yes. 2) Ability to invoke a build tool so that reported errors allow me to automatically jump to the offending lines. Yes but I'd rather have an IDE constantly recompiling and automatically flagging errors such that I can jump directly to them using the GUI. 3) Ability to invoke built programs so that reported stack traces allow me to automatically jump to the offending lines. I don't use that so often, but yes. 4) Ability to read annot files so that I can query the type of the symbol under my cursor. Absolutely essential. 5) Ability to switch rapidly between an ml file and its corresponding mli. Interesting. 6) Ability to edit C sources. Bah. Real men use LLVM. I guess many people would add 7) Ability to access the documentation of the symbol under my cursor. That should go in with the type throwback. Also, it should support typeset math and vector graphics. And the source should be unicode with easy access to common alphabets and symbols. Regarding 7) I have a low tech approach which is to use gnome do (on linux) or quicksilver (on osx) to index the documentation generated by ocamldoc. Since the latter intelligently produces an html file Module.html for a module named Module I can quickly access its documentation by invoking gnome do with its hot key, typing an abbreviation of Module and hitting return. This opens the document in my browser where I scroll or search in the page to get to the symbol. I tend to use ocamlbrowser. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. 
http://www.ffconsultancy.com/?e ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] Favorite OCaml editor?
On Tuesday 05 January 2010 06:03:39 Grant Rettke wrote: Hi, What is your favorite editor for hacking with OCaml? I use Emacs+Tuareg but I wouldn't call it my favorite because I loathe it. Two good reasons to use it are autoindentation of OCaml code with ALT+Q and type throwback on the currently selected expression with CTRL+C CTRL+T. OCaIDE looked good for something Eclipse-based when I tried it: http://www.algo-prog.info/ocaide/ Eclipse is grindingly slow, cumbersome and remarkably confusing for a GUI app but at least OCaIDE is still under active development. There is also the OCaml Development Tools (ODT) project: http://ocamldt.free.fr/ but it hasn't seen an update in 11 months. Your favorite is key here; I appreciate your human input as I can use a search engine to find any old OCaml editor easily. I think the best way to write a decent editor for OCaml would be to write one using LablGTK for the GUI and camlp4 to parse OCaml code. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] Re: general question, was Re: OCaml is broken
On Sunday 03 January 2010 10:49:38 Sylvain Le Gall wrote: The only point of the whole discussion -- which is a recurring point by some of those who participate -- is the lack of shared-memory parallelism in the core language. I solved the problem: the latest version of HLVM now facilitates high-performance shared-memory parallelism. The remaining challenges to making this more user friendly are: 1. High-level constructs for parallelism in HLVM (task queues). 2. OCaml-HLVM interop, probably by destructuring values passed from the OCaml world so that HLVM programs can use them directly and return results by mutating values on the OCaml side. 3. Camlp4 macros so users can write their HLVM code in an OCaml-like DSL. I believe this is basically an optimal solution for OCaml's multicore problem given the practical constraints. The future's looking bright again. :-) -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
[Caml-list] Prevalence of duplicate references in the heap
I took an odd design decision with HLVM and made references a struct of run-time type, metadata (e.g. array length), pointer to mark state and pointer to data. So every reference consumes 4x32=128 bits rather than the usual 32 bits but heap-allocated values no longer require a header. My performance results really surprised me. For example, the gc benchmark in the HLVM test suite fills a hash table that is represented by an array spine containing references to array buckets. Despite having (fat) references everywhere, HLVM is 2.2x faster than OCaml on x86. The main disadvantage of HLVM's approach is probably that every duplicate reference now duplicates the header information, wasting 96 bits. However, I do not believe references are duplicated in the heap very often. Both trees and hash tables contain many references but none are duplicated. So I'm wondering if anyone has studied the makeup of heaps in idiomatic OCaml code and could point me to data on the proportion of the heap typically consumed by duplicate references, i.e. how much space is HLVM likely to waste? Many thanks, -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
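For concreteness, the fat-reference layout described above can be modelled as a record; the field names here are my own guesses for illustration, not HLVM's actual names.

```ocaml
(* Sketch of a fat reference: run-time type, metadata (e.g. array
   length), pointer to mark state, and pointer to the data. *)
type 'a fat_ref = {
  rtti : int;             (* run-time type tag *)
  metadata : int;         (* e.g. array length *)
  mark_state : bool ref;  (* stands in for the pointer to mark state *)
  data : 'a;              (* stands in for the pointer to the data *)
}

(* Space accounting in 32-bit words: a fat reference is 4 words where
   a conventional pointer is 1, so each duplicate of a reference
   carries 3 extra words (96 bits) of header. *)
let words_per_fat_ref = 4
let wasted_bits_per_duplicate = 3 * 32

let r = { rtti = 1; metadata = 3; mark_state = ref false; data = [| 1; 2; 3 |] }
let () = assert (Array.length r.data = r.metadata)
```

The upside, as the benchmark suggests, is that the heap value itself needs no header, so allocation and scanning touch less memory when references are rarely duplicated.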
Re: [Caml-list] ocaml, llvm and generating code at runtime
On Friday 01 January 2010 17:39:15 Basile STARYNKEVITCH wrote: LLVM is rumored to be a bit faster, but is also rumored to be slow as a pure JIT (just in time) code generator (w.r.t. other non-Ocaml implementations - eg SBCL or CLISP common lisp). Are you saying that LLVM's JIT is slow to generate code or that the code it generates runs slow? Polyml http://polyml.org/ is also supposed to be a JIT-ed SML implementation (it is SML not Ocaml). A few years ago, metaocaml existed, but seems dead today. Walid said that MetaOCaml was going to be revived back in August 2008: http://ocamlnews.blogspot.com/2008/07/metaprogramming-with-metaocaml.html?showComment=121761762#c4695232398067055037 Happy new year to everyone! Happy New Year! -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] ocaml, llvm and generating code at runtime
On Friday 01 January 2010 16:45:34 Joel Reymont wrote: Does anybody have example code that shows how to generate OCaml bindings at runtime with LLVM? The OCaml Journal article Calling C code from OCaml (10th July 2008) described an OCaml program that invoked a C function both via a stub written in C and via JIT compiled interface code using LLVM (no more stinking C stubs!). I would like to write more on JIT-compiled interface code in the future and perhaps write a library to make it easy. My goal is to compile an AST into code that uses OCaml functions within the same binary that's doing the compiling. I don't think it can be done with OCaml since it requires a standalone assembler, linker, etc. Correct me if I'm wrong, though. Mine is a web-based compiler with potentially many concurrent sessions. Running gas, ld, etc. seems a much heavier and less scalable approach that generating code at runtime. Sounds like you really want HLVM. :-) Thanks and Happy New Year, Joel Happy New Year! -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] ocaml, llvm and generating code at runtime
On Friday 01 January 2010 20:23:35 Basile STARYNKEVITCH wrote: I heard that LLVM code generation time is significantly higher (i.e. slower) than other JIT technologies. So machine code generation time is apparently significant (which might be an issue inside a web server) but performance of the generated code is supposedly good (inside a web server this is important only if the generated code runs a lot, in particular more than in a single session). I don't have enough personal experience to validate that claim. However, both the Mono and Parrot sites are saying something similar: http://www.mono-project.com/Mono_LLVM http://trac.parrot.org/parrot/wiki/JITRewrite http://cliffhacks.blogspot.com/2007/03/experimenting-with-llvm.html But again, I may be wrong. Only real benchmarks on real applications can tell. I believe that libjit and GNU lightning should probably both generate machine code quicker than LLVM does, but the performance of the generated code (by libjit or by lightning) is worse than when using LLVM. And some benchmarks on http://www.phoronix.com/scan.php?page=article&item=apple_llvm_gcc&num=1 suggest that LLVM generated machine code is less efficient than GCC generated machine code. Again, take all this with a grain of salt... That's a fair assessment but LLVM is only about 2x slower than the fastest compilers and it generates code that runs 2-10x faster. For example, compiling the 155kLOC of LLVM IR generated by HLVM's test suite takes 9.65s. I think it was a big mistake for the Go developers at Google and the Mono developers at Novell to not build upon LLVM. My main concern about other JITs is maturity: AFAIK LLVM has orders of magnitude more users than all of the others combined. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e
Re: [Caml-list] ocaml, llvm and generating code at runtime
On Friday 01 January 2010 22:33:42 Yoann Padioleau wrote: On Jan 1, 2010, at 2:29 PM, Jon Harrop wrote: I think it was a big mistake for the Go developers at Google and the Mono developers at Novell to not build upon LLVM. My main concern about other JITs is maturity: AFAIK LLVM has orders of magnitude more users than all of the others combined. And ? That maturity gives me confidence in LLVM's reliability and on-going development. C has many orders of magnitude more users than ocaml ... Indeed. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e
Re: [Caml-list] Re: multicore wish
On Thursday 24 December 2009 12:58:18 Goswin von Brederlow wrote: Jon Harrop j...@ffconsultancy.com writes: On Tuesday 22 December 2009 18:02:32 Edgar Friendly wrote: On 12/22/2009 01:12 PM, Jon Harrop wrote: On Tuesday 22 December 2009 13:09:27 Goswin von Brederlow wrote: The advantage with ocaml though is that you never have pointers into a structure. Makes things a lot simpler for the GC and avoids large overheads in memory. I don't understand what you mean by OCaml never has pointers into a structure. Half the problem with OCaml is that OCaml almost always uses pointers and the programmer has no choice, e.g. for complex numbers. I think he means that ocaml structs (records, tuples) will only ever have pointers pointing to their beginning - you can't have a pointer to somewhere in the middle of a structure. Exactly. If so then I do not understand the relevance. You cannot have pointers into a structure in F# or HLVM either... If you have an array a of (int * int) then in ocaml a is an array of pointers to tuples. a.(5) is a pointer to the 6th tuple. Yes. You said that in F# the array will be really an array of tuples. Then a.(5) will be a pointer into the array at the position where the 6th tuple is. No. The expression a.(5) loads all fields of the value type. For example, if it were an array of complex numbers then a.(5) would load the real and imaginary parts. That means as long as a.(5) is reachable the full array has to remain allocated or the GC has to recognise that only one (a few) items of an array are reachable and copy them before freeing the array. The GC also needs a way to find the beginning of an allocated block from a pointer into the block. Which means extra overhead in both memory and time. Another thing is that tuples are immutable but arrays are mutable. 
In ocaml you get this nice behaviour: # let a = Array.init 5 (fun x -> (x,x));; val a : (int * int) array = [|(0, 0); (1, 1); (2, 2); (3, 3); (4, 4)|] # let x = a.(1);; val x : int * int = (1, 1) # a.(1) <- (2,2);; - : unit = () # x;; - : int * int = (1, 1) In F# you would get (2,2) for your fast sortable array. A different semantic. No. Even if F# represented tuples as structs (which it does not, unfortunately) you would still get the same result as you do in OCaml. HLVM already implements tuples as structs and your example works exactly as the OCaml one does: # let xs = create(6, (1, 1)) in let x = xs.(1) in xs.(1) <- (2, 2); x;; - : `Struct[`Int; `Int] = (1, 1) The only difference is that HLVM unboxes the pairs in the array so the heap contains only a single pointer and accessing a pair requires only one indirection. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e
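The boxed-tuple behaviour described above can be checked mechanically in plain OCaml: the array slot holds a pointer to the pair, so reading the slot shares the pair and overwriting the slot never mutates it. A minimal sketch (standard library only):

```ocaml
let () =
  (* An (int * int) array in OCaml is an array of pointers to boxed pairs. *)
  let a = Array.init 5 (fun x -> (x, x)) in
  (* Reading a slot copies the pointer, so x and a.(1) share one pair... *)
  let x = a.(1) in
  assert (x == a.(1));
  (* ...and overwriting the slot installs a pointer to a fresh pair, *)
  a.(1) <- (2, 2);
  (* leaving the pair that x refers to untouched. *)
  assert (x = (1, 1));
  assert (a.(1) = (2, 2))
```

The physical-equality check (==) is what distinguishes the boxed representation: an unboxing implementation would copy the fields on every read rather than share a pointer, yet the observable structural results above would be the same, which is the point being made.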
Re: [Caml-list] Re: multicore wish
On Thursday 24 December 2009 13:19:52 Goswin von Brederlow wrote: Jon Harrop j...@ffconsultancy.com writes: No, in OCaml I fork every child. That is the only transparent way to give the child a coherent view of the heap but it is extremely slow (~1ms): So if you add a (sleep 60) to the ocaml code then ocaml gets even worse than F#? You are comparing an implementation that is specifically optimized to use things F# is good at and ocaml is bad at. To give a fair comparison you need to optimize the implementation to the language. Post a better OCaml solution if you can. Each process then has a work queue which it works through. So they will always use the local data. Only when a queue runs dry they steal from another process and ruin locality. You are correctly describing the efficient solution based upon work-stealing queues that F# uses but OCaml cannot express it. You mean that you didn't implement it that way. No, I mean OCaml cannot express it. I can easily express that with closures and pre-forked worker threads. OCaml's threads do not run in parallel so that will not work. So I don't see where your argument fits. You are not creating children on the fly. Only at the start and they run till all the work is done. At least in this example. No, every recursive invocation of the parallel quicksort spawns another child on-the-fly. That's precisely why it parallelizes so efficiently when you have wait-free work-stealing task deques and a shared heap. In general, you rewrite algorithms into this recursive divide and conquer form and parallelize when possible. You can parallelize a *lot* of problems efficiently that way. That seems awfully inefficient. Then you have a million children running on maybe 4 cores and each job queue holds no job. I think we mean different things by children. By children I mean what the kernel sees as children. Newly forked threads. I think you mean jobs that get put into the queues. 
There is no reason to fork a new system thread every time the parallel quicksort splits an array in two. The children are lightweight tasks, not threads or processes. Why would you fork in invoke? Fork is currently the only transparent way to implement invoke but it is extremely slow and unreliable. No, it is the insane way. Use closures. Please demonstrate. parallel programs that leverage multicores and run a *lot* faster than anything that can be written in OCaml. You can even make this accessible to OCaml programmers as a DSL with automatic interop to make high performance parallel programming as easy as possible from OCaml. Or just implement it properly in ocaml. The problem part is the GC. The rest is easy. No kidding. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e
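For concreteness, the fork-based invoke under discussion can be sketched in a few lines of OCaml. This is a sketch assuming the unix library; the pipe-plus-Marshal protocol and the variant tags are illustrative, not the exact code from the benchmarks. The per-call fork and marshalling round-trip is what makes each invocation cost on the order of a millisecond:

```ocaml
(* Fork-based invoke: run f x in a child process and return a closure
   that blocks until the child's marshalled result arrives on a pipe.
   If fork fails, fall back to sequential evaluation. *)
let invoke (f : 'a -> 'b) (x : 'a) : unit -> 'b =
  let input, output = Unix.pipe () in
  match Unix.fork () with
  | -1 ->
      (* Fork failed: degrade gracefully to running f in this process. *)
      Unix.close input; Unix.close output;
      let v = f x in
      (fun () -> v)
  | 0 ->
      (* Child: compute, marshal the result (or exception) back, exit. *)
      Unix.close input;
      let oc = Unix.out_channel_of_descr output in
      Marshal.to_channel oc (try `Res (f x) with e -> `Exn e) [];
      close_out oc;
      exit 0
  | pid ->
      (* Parent: defer the read until the returned closure is applied. *)
      Unix.close output;
      let ic = Unix.in_channel_of_descr input in
      fun () ->
        let v = Marshal.from_channel ic in
        ignore (Unix.waitpid [] pid);
        close_in ic;
        match v with `Res r -> r | `Exn e -> raise e
```

Usage is `let g = invoke (fun n -> n * n) 7 in ... g ()`. Note the caveats the thread raises: the child gets a snapshot of the heap (no shared mutation), marshalled exceptions lose their identity across the process boundary, and processes are a scarce enough resource that the fallback branch matters.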
Re: [Caml-list] Re: multicore wish
On Tuesday 22 December 2009 13:09:27 Goswin von Brederlow wrote: Jon Harrop j...@ffconsultancy.com writes: 1. The array a is just an ordinary array of any type of values on the shared heap in F# but, for generality in OCaml, this must be both the underlying ordinary data and a manually-managed shared big array of indices into the ordinary data where the indices get sorted while the original data remain in place until they are permuted at the end. Unless you have a primitive type that isn't a pointer. In OCaml, you would need to write a custom quicksort optimized for that particular type. In F#, the generic version just works and works efficiently. The advantage with ocaml though is that you never have pointers into a structure. Makes things a lot simpler for the GC and avoids large overheads in memory. I don't understand what you mean by OCaml never has pointers into a structure. Half the problem with OCaml is that OCaml almost always uses pointers and the programmer has no choice, e.g. for complex numbers. 2. The sorted subarrays are contiguous in memory and, at some subdivision, will fit into L2 cache. So F# offers optimal locality. In contrast, there is no locality whatsoever in the OCaml code and most accesses into the unsorted original array will incur cache misses right up to main memory. So the OCaml approach does not scale as well and will never see superlinear speedup because it cannot be cache friendly. On the other hand swapping two elements in the array has a constant cost no matter what size they have. At some size there will be a break even point where copying the data costs more than the cache misses and with increasing size the cache won't help F# so much either. In theory, yes. In practice, that threshold is far larger than any value type that a real program would use so it is of no practical concern. Moreover, F# gives the programmer control over whether data are unboxed (value types) or boxed (reference types) anyway. 
In contrast, OCaml is tied to a few value types that are hard-coded into the GC. 3. Child tasks are likely to be executed on the same core as their parent and use a subset of their parent's data in F#, so they offer the best possible locality. In contrast, child processes are likely to be executed on another core in OCaml and offer the worst possible locality. But, if I understood you right, you first fork one process per core. No, in OCaml I fork every child. That is the only transparent way to give the child a coherent view of the heap but it is extremely slow (~1ms): F# can do 60MFLOPS of computation in the time it takes OCaml to fork a single process - http://caml.inria.fr/pub/ml-archives/caml-list/2009/06/542b8bed77022b4a4824de2da5b7f714.en.html Those should then also each pin themselves to one core. You have no idea which core a forked child process will run on. Each process then has a work queue which it works through. So they will always use the local data. Only when a queue runs dry they steal from another process and ruin locality. You are correctly describing the efficient solution based upon work-stealing queues that F# uses but OCaml cannot express it. So I don't see where your argument fits. You are not creating children on the fly. Only at the start and they run till all the work is done. At least in this example. No, every recursive invocation of the parallel quicksort spawns another child on-the-fly. That's precisely why it parallelizes so efficiently when you have wait-free work-stealing task deques and a shared heap. In general, you rewrite algorithms into this recursive divide and conquer form and parallelize when possible. You can parallelize a *lot* of problems efficiently that way. 4. Task deques can handle an arbitrary number of tasks limited only by memory whereas processes are a scarce resource and forking is likely to fail, whereupon the invoke combinator will simply execute sequentially. 
So it is much easier to write reliable and performant code in F# than OCaml. Why would you fork in invoke? Fork is currently the only transparent way to implement invoke but it is extremely slow and unreliable. 5. OCaml's fork-based invoke combinator is many orders of magnitude slower than pushing a closure onto a concurrent task deque in F#. 6. The threshold value is a magic number derived from measurements on a given machine in my OCaml code but is dynamically adjusted in a machine-independent way by the invoke combinator in my F# code using atomic operations and real time profiling of the proportion of time spent spawning tasks vs doing actual work. 5+6 seem to be an implementation detail of some specific implementation you are talking about. Yes. I'm talking about today's OCaml and F# implementations. I don't see anything in the theory that would require that. If by in theory you mean that a new performant concurrent GC for OCaml would solve
Re: [Caml-list] Re: multicore wish
On Tuesday 22 December 2009 18:02:32 Edgar Friendly wrote: On 12/22/2009 01:12 PM, Jon Harrop wrote: On Tuesday 22 December 2009 13:09:27 Goswin von Brederlow wrote: The advantage with ocaml though is that you never have pointers into a structure. Makes things a lot simpler for the GC and avoids large overheads in memory. I don't understand what you mean by OCaml never has pointers into a structure. Half the problem with OCaml is that OCaml almost always uses pointers and the programmer has no choice, e.g. for complex numbers. I think he means that ocaml structs (records, tuples) will only ever have pointers pointing to their beginning - you can't have a pointer to somewhere in the middle of a structure. If so then I do not understand the relevance. You cannot have pointers into a structure in F# or HLVM either... -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e
Re: [Caml-list] Looking for information regarding use of OCaml in scientific computing and simulation
On Tuesday 22 December 2009 13:11:58 Eray Ozkural wrote: On Tue, Dec 22, 2009 at 6:40 AM, Linas Vepstas linasveps...@gmail.com wrote: However, if you are interested in merely using the system to do your real work, then writing message-passing code is an utter waste of time -- it's difficult, time-consuming, error prone, hard to balance and optimize/tune, works well only for embarrassingly parallel code, etc. Even the evil slow-down of NUMA is often better than trying to performance-tune a message-passing system. Message passing doesn't work well only for embarrassingly parallel code. Message passing doesn't necessarily work well for embarrassingly-parallel problems either because you cannot use in-place algorithms and scatter and gather are O(n). For instance, you can implement the aforementioned parallel quicksort rather easily, But you cannot improve performance easily and performance is the *only* motivation for parallelism. So the fact that you can make naive use of message passing easily from OCaml is useless in practice. What message passing really is, it is the perfect match to a distributed memory architecture. Since, as you suggest, multicore chips have more or less a shared memory architecture, message passing is indeed not a good match. Yes. Conversely, shared memory is effectively a hardware accelerated form of message passing. Let me put it this way: suggesting that programmers can write their own message-passing system is kind of like telling them that they can write their own garbage-collection system, or design their own closures, or they can go create their own type system. Of course they can ... and if they wanted to do that, they would be programming in C or assembly, and would probably be designing new languages. Cause by the time you get done with message passing, you've created a significant and rich programming system that resembles a poorly-designed language... been there, done that. 
For a functional language, am I right in expecting a high-level and clean interface for explicit parallelism? I think that is a perfectly reasonable thing to expect but you still need to understand its characteristics and how to leverage them in order to make good use of the feature. I suppose a spawn directive would not be very hard to implement. You cannot implement it with useful efficiency in OCaml. Message Passing/Distributed Memory can also be accommodated I suppose. Sure but it is worth remembering that distributed parallelism across clusters is a tiny niche compared to multicores. OcamlP3l looks pretty cool. Parallel combinators? Definitely what I'm talking about, as usual the future is here with ocaml ;) http://ocamlp3l.inria.fr/eng.htm Try solving some real problems with OCamlP3L and F#. I'm sure you'll agree that the OCaml approach is certainly not the future. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e
Re: [Caml-list] Re: OCaml is broken
On Monday 21 December 2009 05:32:38 Markus Mottl wrote: On Sun, Dec 20, 2009 at 23:30, Jon Harrop j...@ffconsultancy.com wrote: Traffic here: 2007: 5814 2008: 4051 2009: 3071 That's because I don't have much time to post here nowadays. I'm sure if Jon followed my example, we would have a parallel GC for OCaml by the end of the year. HLVM already has a parallel GC. :-) -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e
Re: [Caml-list] Re: multicore wish
On Monday 21 December 2009 14:19:36 Mihamina Rakotomandimby wrote: Ok, so for the beginner I am (must I ask on the beginners ML?): is multicore support just useless or not? I have found a great many uses for multicores but you need a decent foundation to make effective use of it. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e
Re: [Caml-list] general question, was Re: OCaml is broken
On Monday 21 December 2009 14:19:05 Keyan wrote: Hi, i have a large project written in C++, for which i am planning to write add-ons and tools in ocaml, e.g. different tools to analyse my code (dependency stuff), an interpreter for a script-language i plan to include, etc, etc. from my time at the uni i remembered that ocaml allows to compile libraries which can be included in a c/c++ program, and i know people who use it extensively in other projects. therefore, i decided to give ocaml a try. i like functional programming, and my first steps with ocaml are very promising. following this discussion, i am not so sure anymore, if ocaml is a good decision. may be i got this discussion wrong, but if ocaml is dying out, i might have to look for another functional programming language to use with my project. OCaml isn't dying out and it is still ideal for the applications you listed because they are not performance critical and do rely heavily upon an efficient GC. This discussion was essentially about the exact opposite end of the spectrum of applications, where performance is critical but GC is largely irrelevant. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e
Re: multicore wish [Was: Re: [Caml-list] Re: OCaml is broken]
On Monday 21 December 2009 13:31:10 Goswin von Brederlow wrote: Jon Harrop j...@ffconsultancy.com writes: We've discussed the problems with that before. Writing a parallel generic quicksort seems to be a good test of a decent multicore capable language implementation. Currently, F# is a *long* way ahead of everything open source. How do you implement it? 1) divide at the top and then let each core sort its smaller array Say you have 2^n cores then the first n splits in merge sort would first sort the values into the 2 regions and then start a thread for each region (start a new one, do the other part in this thread). After n splits you would switch to a uni-core quicksort. For this you need to split well so each core ends up with a roughly equal sized chunk. That is basically what is now being called nested data parallelism. You precompute a parallel strategy, partition the work upfront and then execute embarassingly parallel tasks. This worked well decades ago for sparse linear algebra on supercomputers but it sucks on multicores because load is variable and there is no load balancing. So one of those equal-sized partitions is likely to take 10x longer than the others due to competing processes, scheduling or cache issues and all cores end up twiddling their thumbs waiting for that one to complete. 2) factory/worker approach Each core runs a factory/worker thread waiting on a job queue. You start by dumping the full array into the job queue. Some random core picks it up, splits it into 2 regions and dumps one into the job queue. If the job gets too small (PAGE_SIZE? cache line size? total size / cores^2?) the factory/worker switches to normal uni-core quicksort and sorts the whole chunk. The job queue should probably be priority based so larger chunks are sorted before smaller. Here the quality of each split is not so important. If a chunk is smaller, and therefore faster, the core just picks up the next job in the queue. 
But you need more synchronization between the cores for the job queue. On the other hand you aren't limited to 2^n cores. Any number will do. This is a work sharing queue which is better because it is dynamically load balanced but still bad because you have global contention for a shared resource: the shared queue. Cilk pioneered wait-free work-stealing task deques and Microsoft's Task Parallel Library (which will be part of .NET 4 in March 2010) copied the idea. You have a separate deque of tasks for each core. A core tries to pop a task off its deque. If there are no tasks on its deque then it tries to pop off a randomly-chosen other deque. If there is a task then the core begins executing that task and any child tasks are pushed back onto the local deque (and might get stolen before they are executed). An important characteristic of this design is that child tasks are likely to be executed on the same core as their parent and, therefore, locality between related tasks is valuable. Consequently, many algorithms that recursively subdivide a large dataset (like quicksort) attain superlinear speedups because they make effective use of the sum of all L2 caches rather than just a single L2 cache. So this works extremely well with the multi-level hierarchical caches with interconnects that are the hallmark of multicore architecture. You can then implement quicksort (and many other parallel divide and conquer algorithms) something like this:

let rec serial_qsort cmp a i0 i3 =
  if i3 - i0 > 1 then
    let i1, i2 = partition cmp a i0 i3 in
    serial_qsort cmp a i0 i1;
    serial_qsort cmp a i2 i3

let rec parallel_qsort cmp a i0 i3 =
  if i3 - i0 > 1 then
    if i3 - i0 < threshold then
      serial_qsort cmp a i0 i3
    else
      let i1, i2 = partition cmp a i0 i3 in
      let future = invoke (parallel_qsort cmp a i0) i1 in
      parallel_qsort cmp a i2 i3;
      future ()

where the invoke combinator pushes a task onto the local task deque and returns a closure that, when applied, waits for that task to complete and returns its (unit) result. 
The threshold value protects against the function spending more time spawning tasks than doing actual work. You can implement this in OCaml and F# but there are some important differences: 1. The array a is just an ordinary array of any type of values on the shared heap in F# but, for generality in OCaml, this must be both the underlying ordinary data and a manually-managed shared big array of indices into the ordinary data where the indices get sorted while the original data remain in place until they are permuted at the end. 2. The sorted subarrays are contiguous in memory and, at some subdivision, will fit into L2 cache. So F# offers optimal locality. In contrast, there is no locality whatsoever in the OCaml code and most accesses into the unsorted original array will incur cache misses right up to main memory. So the OCaml approach does not scale as well
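The serial half of that skeleton can be completed into a runnable OCaml version. The post leaves partition undefined, so the one below is an assumption: a simple Lomuto-style partition over the half-open range [i0, i3) that returns the bounds of the pivot cell:

```ocaml
(* Sequential instance of the quicksort skeleton above. partition is a
   hypothetical helper: it splits [i0, i3) around a pivot and returns
   (i1, i2), the bounds of the pivot region, so the recursive calls
   sort [i0, i1) and [i2, i3). *)
let swap a i j =
  let t = a.(i) in
  a.(i) <- a.(j);
  a.(j) <- t

let partition cmp a i0 i3 =
  let pivot = a.(i3 - 1) in
  let i = ref i0 in
  for j = i0 to i3 - 2 do
    if cmp a.(j) pivot <= 0 then begin swap a !i j; incr i end
  done;
  swap a !i (i3 - 1);
  (!i, !i + 1)

let rec serial_qsort cmp a i0 i3 =
  if i3 - i0 > 1 then begin
    let i1, i2 = partition cmp a i0 i3 in
    serial_qsort cmp a i0 i1;
    serial_qsort cmp a i2 i3
  end
```

Calling serial_qsort compare a 0 (Array.length a) sorts a in place; the parallel variant differs only in handing one recursive call to the task scheduler once the range exceeds the threshold.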
Re: [Caml-list] Re: OCaml is broken
On Sunday 20 December 2009 14:27:00 Dario Teixeira wrote: Hi, It's too bad that INRIA is not interested in fixing this bug. No matter what people say I consider this a bug. Two cores is standard by now, I'm used to 8, next year 32 and so on. OCaml will only become more and more irrelevant. I hate to see that happening. This is a perennial topic in this list. Without meaning to dwell too long on old arguments, I simply ask you to consider the following: - Do you really think a concurrent GC with shared memory will scale neatly to those 32 cores? - Will memory access remain homogeneous for all cores as soon as we get into the dozens of cores? The following web page describes a commercial machine sold by Azul Systems that has up to 16 54-core CPUs (=864 cores) and 768 GB of memory in a flat SMP configuration: http://www.azulsystems.com/products/compute_appliance.htm As you can see, a GC with shared memory can already scale across dozens of cores and memory access is no more heterogeneous than it was 20 years ago. Also, note that homogeneous memory access is a red herring in this context because it does not undermine the utility of a shared heap on a multicore. - Have you considered that many Ocaml users prefer a GC that offers maximum single core performance, OCaml's GC is nowhere near offering maximum single core performance. Its uniform data representation renders OCaml many times slower than its competitors for many tasks. For example, filling a 10M float-float hash table is over 18x slower with OCaml than with F#. FFT with a complex number type is 5.5x slower with OCaml than F#. Fibonacci with floats is 3.3x slower with OCaml than my own HLVM project (!). because their application is parallelised via multiple processes communicating via message passing? A circular argument based upon the self-selected group of remaining OCaml users. Today's OCaml users use OCaml despite its shortcomings. 
If you want to see the impact of OCaml's multicore unfriendliness, consider why the OCaml community has haemorrhaged 50% of its users in only 2 years. In this context, your bug is actually a feature. I'm not even sure you can substantiate that in the very specific context of distributed parallel theorem provers because other languages are so much more efficient at handling common abstractions like parametric polymorphism. Got any benchmarks? -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e
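The float hash table comparison above is easy to turn into a microbenchmark. The following sketch is illustrative only: the table size is cut from 10M so it runs quickly, and the exact workload of the original comparison is not specified here:

```ocaml
(* Fill a float-keyed, float-valued Hashtbl. Under OCaml's uniform
   value representation both the keys and the values are boxed, so
   every insertion allocates; that boxing is the overhead at issue. *)
let fill n =
  let h = Hashtbl.create n in
  for i = 1 to n do
    Hashtbl.replace h (float_of_int i) (float_of_int (2 * i))
  done;
  h

let () =
  let h = fill 100_000 in
  assert (Hashtbl.length h = 100_000);
  assert (Hashtbl.find h 7.0 = 14.0)
```

Wrapping the call to fill in Sys.time (or Unix.gettimeofday) and scaling n back up to 10M reproduces the kind of measurement being quoted; the absolute ratio against other runtimes will of course depend on the machine and compiler versions.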
Re: [Caml-list] Re: OCaml is broken
is not beneficial. You can consider multicore as a marketing trick of the chip industry to let the ordinary desktop user pay for a feature that is mostly interesting for datacenters. Ordinary desktop users have been paying top dollar for parallel computers in the form of GPUs for some time now. The use of GPUs for more general programming has been a really hot topic for years and just became viable. Even games consoles have multicores. ARM are making quadcores for your phone and netbook! If I can get HLVM to make parallel OCaml-style programming easy, I think a lot of people would love it. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e
Re: [Caml-list] Re: OCaml is broken
On Saturday 19 December 2009 19:38:41 Jeff Shaw wrote: My understanding is that since jocaml uses the regular ocaml runtime, it is also not multicore enabled. Haskell is a functional language that has good performance GHC and the Haskell language itself have serious performance problems. that can use multiple processors, but the learning curve is steeper and higher. And Haskell lacks many of the features OCaml programmers take for granted. OCaml is a close relative of Standard ML, so there might be some implementation of SML that you like. MLTon might allow multicore use, but I'm not sure how mature it is. SML/NJ has a library or language extension called Concurrent ML, but I think SML/NJ might not use multiple processors. MLton and SML/NJ are both multicore incapable. The PolyML implementation of SML is multicore friendly but last time I looked (many years ago) it was 100x slower than OCaml for floating point. As long as you're looking at OCaml's close relatives with multicore support, F# is your only viable option. Soon, HLVM will provide a cross-platform open source solution. If you look further you will also find Scala and Clojure. Note that if you're not using a lot of threads, you can use Unix.fork to do true multithreaded programming in ocaml. We've discussed the problems with that before. Writing a parallel generic quicksort seems to be a good test of a decent multicore capable language implementation. Currently, F# is a *long* way ahead of everything open source. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e
Re: [Caml-list] How to write a CUDA kernel in ocaml?
On Tuesday 15 December 2009 23:18:57 David Allsopp wrote: Basile Starynkevitch wrote: Eray Ozkural wrote: Compiling Ocaml to efficient C is not easy and probably impossible (or extremely difficult) in the general case. In particular, tail recursive calls are essential in Ocaml, and are not available in C in most compilers. What's this based on (out of interest)? Most C compilers don't necessarily identify tail recursion in *C* code but if you're emitting C as an OCaml backend then there's no reason not to convert tail recursive *OCaml* functions to C code based on goto or similar looping constructs (yes, you'd almost-always-virtually-never use goto in a hand-crafted C program without apologising profusely at Dijkstra's grave but if you're using C as an intermediate language then that's a different matter). If I recall correctly from an old post on this list, this is how Felix handles tail recursion when translating Felix to C++ And trampolines to eliminate tail calls that cannot be eliminated using goto. However, trampolines are ~10x slower than TCO in the code gen. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e
Re: [Caml-list] Looking for information regarding use of OCaml in scientific computing and simulation
% from 2007-2008 and another 50% from 2008-2009). I believe this is because the OCaml community was ~80% technical users in 2006 and most of them have since left for languages that address the disadvantages I just listed, primarily parallelism. The main problem as I see it is that all of the other solutions for technical users outside Windows (i.e. except F#) introduce serious problems of their own (e.g. no TCO in Clojure and Scala, poor inference in Scala and Haskell, unpredictable time and space in Haskell, conservative GC on Mono and ATS). The good news is that I've been working on a project (HLVM) specifically designed to address these problems, with high-performance parallel numerics as well as a C-like data representation with an easy-to-use FFI and many other advantages including an optimizing native-code REPL, generic printing and serialization. This is just a hobby but I'm amazed at how easily I've been able to progress thanks to the awesome OCaml+LLVM combo. - My colleagues are working a lot with Matlab, is there any synergy there (bindings, ways to integrate Matlab within OCaml code or vice versa, ...)? You can reply to me on this list or off list. In case of personal reply, let me know if I can reuse your name and affiliation. Please do. :-) If this presentation is done, I'll make the slides available under a free license. Great, thanks. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
[Caml-list] Parallel programming in HLVM
I managed to find the time to do a little more work on my HLVM project this week and created a first working parallel version with threads that run simultaneously using a stop-the-world GC. The current implementation is extremely inefficient but it seems to work correctly and should be easy to optimize. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] MetaOcaml and high-performance [was: AST versus Ocaml]
On Monday 09 November 2009 04:23:28 o...@okmij.org wrote: Because offshoring produces a portable C or Fortran code file, you can use the code on 32 or 64-bit platform. The reason the native MetaOCaml without offshoring does not work on amd64 is because at that time OCaml didn't emit PIC code for amd64. So, dynamic linking was impossible. That problem has long been fixed in later versions of OCaml... Has the problem been fixed in MetaOCaml? Fortunately, some people have considered MetaOCaml to be a viable option for performance users and have reported good results. For example, Tuning MetaOCaml Programs for High Performance Diploma Thesis of Tobias Langhammer. http://www.infosun.fmi.uni-passau.de/cl/arbeiten/Langhammer.pdf Here is a good quotation from the Introduction: ``This thesis proposes MetaOCaml for enriching the domain of high-performance computing by multi-staged programming. MetaOCaml extends the OCaml language. ... Benchmarks for all presented implementations confirm that the execution time can be reduced significantly by high-level optimizations. Some MetaOCaml programs even run as fast as respective C implementations. Furthermore, in situations where optimizations in pure MetaOCaml are limited, computation hotspots can be explicitly or implicitly exported to C. This combination of high-level and low-level techniques allows optimizations which cannot be obtained in pure C without enormous effort.'' That thesis contains three benchmarks: 1. Dense float matrix-matrix multiply. 2. Blur of an int image matrix as convolution with a 3x3 stencil matrix. 3. Polynomial multiplication with distributed parallelism. I don't know about polynomial multiplication (suffice to say that it is not leveraging shared-memory parallelism which is what performance users value in today's multicore era) but the code for the first two benchmarks is probably 10-100x slower than any decent implementation. 
For example, his fastest 2048x2048 matrix multiply takes 167s whereas Matlab takes only 3.6s here. In essence, the performance gain (if any) from offshoring to C or Fortran is dwarfed by the lack of shared-memory parallelism. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] Unboxed float tuples
On Sunday 08 November 2009 16:01:48 Daniel Bünzli wrote: Using tuples (vs records) to handles points and vectors is syntactically more lightweight (IMHO). If you want to handle points and vectors then you probably want structs (value types) and not boxed representations like tuples and records anyway. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] OC4MC : OCaml for Multicore architectures
On Friday 25 September 2009 00:28:57 Jon Harrop wrote: Just to quantify this with a data point: the fastest (serial) version of my ray tracer benchmark is 10x slower with the new GC. However, this is anomalous with respect to complexity and the relative performance is much better for simpler renderings. For example, the new GC is only 1.7x slower with n=6 instead of n=9. The new SmartPumpkin release of OC4MC does a lot better. Specifically, the version compiled with partial collections is now only 3.9x slower on a serial ray tracer with n=9 (compared to 10x slower before). I'll try it in more detail... -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] Unboxed float tuples
On Sunday 08 November 2009 19:39:03 Brian Hurt wrote: Fixing this would require a major rearchitecting and rewrite of not only the compiler, but also the garbage collector, the run time, and all the C bindings that have been written. If you're willing to endure a whole program optimization pass then you can avoid having to touch the GC and run-time by monomorphizing the code, e.g. converting all tuples into monomorphic record types. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
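The monomorphization pass described above can be sketched in OCaml itself (a hypothetical example): a whole-program transform replaces each concrete instantiation of a polymorphic tuple with its own record type, which the compiler can then represent unboxed.

```ocaml
(* Before: a polymorphic pair forces the uniform boxed representation,
   so each float is itself boxed inside the tuple. *)
let swap (x, y) = (y, x)

(* After monomorphization: the float * float instantiation becomes a
   dedicated record type. OCaml already stores all-float records with
   unboxed fields, so this instantiation needs no GC/runtime changes. *)
type float_pair = { fst : float; snd : float }

let swap_float { fst; snd } = { fst = snd; snd = fst }

let () =
  assert (swap (1, 2) = (2, 1));
  let p = swap_float { fst = 1.0; snd = 2.0 } in
  assert (p.fst = 2.0 && p.snd = 1.0)
```

The cost, as noted, is that this only works as a whole-program pass: separate compilation cannot know which instantiations exist.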
Re: [Caml-list] Re: ATS versus Ocaml
On Friday 06 November 2009 15:38:27 Jan Kybic wrote: I notice that in ATS you have to give the type of the array explicitly fn bubble_sort (a : array0 double ) : void = so you should also do so in the OCaml code, using You are right, I am sorry for this omission. Having done that, the ratio between OCaml and ATS times drops to 3:1 (OCaml being slower). On x86, I get: ATS: 0.189s HLVM: 0.486s OCaml: 0.552s On x64, I get: OCaml: 0.299s -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] tip for tail recursive map
On Monday 02 November 2009 20:00:31 Christophe Raffalli wrote: List of size 1, 1 times with standard map : 7.564473s List of size 1, 1 times with map with rev : 15.452965s List of size 1, 1 times with map with prelist : 12.672792s List of size 1, 1 times with map with obj : 11.572724s Note that standard map is still very fast on this list length. List of size 10, 1000 times with standard map : 33.018063s List of size 10, 1000 times with map with rev : 42.142634s List of size 10, 1000 times with map with prelist : 22.161385s List of size 10, 1000 times with map with obj : 20.801299s Standard map is now relatively slower but only because it is O(n^2). Look at page 152 figure 7.4 of OCaml for Scientists to see this effect clearly. It is caused by the periodic traversal of the O(n) deep stack by the GC and it slows everything down (you get a similar effect with hash tables because the GC traverses arrays of pointers, like the spine, atomically). standard map with size 100 segfaults on my machine List of size 100, 100 times with map with rev : 55.211450s List of size 100, 100 times with map with prelist : 23.549472s List of size 100, 100 times with map with obj : 21.777361s You can use ulimit to get a bigger function call stack and keep running the ordinary map as far as you want. Conclusion : dirty map wins for large lists, Standard map wins for small lists... I think you can do a lot better than this and I think Xavier's recommendation stands (Obj is a horrifically bad idea unless you wrote the compiler and run-time *and* have the memory of an elephant ;-). Specifically, you just need to get rid of the O(n^2) behaviour by bounding the stack depth, perhaps using a trampoline. IIRC, this was discussed on this list many years ago. One notable observation was that adding a depth accumulator does not degrade performance. 
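The depth-bounded approach suggested above can be sketched as follows — a hypothetical implementation that recurses normally up to a depth limit (fast, non-tail, but with a bounded stack) and falls back to the tail-recursive rev-based map for whatever remains:

```ocaml
(* Stack-safe map: ordinary recursion for the first [limit] elements,
   then the constant-stack List.rev_map version for the rest. The
   depth accumulator costs essentially nothing, per the observation
   above. *)
let map f l =
  let limit = 10_000 in
  let rec go depth = function
    | [] -> []
    | x :: xs when depth < limit -> f x :: go (depth + 1) xs
    | rest -> List.rev (List.rev_map f rest)
  in
  go 0 l

let () =
  (* Works on a list far deeper than the default call stack allows. *)
  let big = List.init 1_000_000 (fun i -> i) in
  assert (List.length (map succ big) = 1_000_000);
  assert (map succ [1; 2; 3] = [2; 3; 4])
```

Short lists never reach the limit, so they keep the speed of the standard map; only the tail beyond 10,000 elements pays for the reversal.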
Another alternative is to convert the list into an array rather than reversing it and use the array as a kind of alternative to the function call stack (I think F# does this). -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] OC4MC : OCaml for Multicore architectures
On Saturday 26 September 2009 00:26:50 Benjamin Canou wrote: On the maintenance side, as Philippe said, we already have some half working version with ocaml 3.11.x, but partly because of the changes made to the native runtime in this release and partly because of [1], porting the patch is not trivial. OC4MC seems to work very well for numerical problems that do not allocate at all but introducing even the slightest mutation (not even in the inner loop) completely destroys performance and scaling. I'm guessing the reason is that any allocations eventually trigger collections and those are copying the entire heap which, in this case, consists almost entirely of float array arrays. My guess was that using big arrays would alleviate this problem by placing most of the data outside the OCaml heap (I'm guessing that oc4mc leaves the element data of a big array alone and copies only the small reference to it?). However, it does not seem to handle bigarrays: ../out/lib/ocaml//libbigarray.a(bigarray_stubs.o): In function `caml_ba_compare': bigarray_stubs.c:(.text+0x1e5): undefined reference to `caml_compare_unordered' bigarray_stubs.c:(.text+0x28d): undefined reference to `caml_compare_unordered' collect2: ld returned 1 exit status Error during linking If I am correct then I would value functioning bigarrays above OCaml 3.11 support. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] Constructors are not functions
On Tuesday 06 October 2009 13:19:58 Philippe Wang wrote: Hello, I don't know the actual reason, but I guess it's simply a choice of semantics. It would be weird to be able to deconstruct (Bar 42) while not being able to deconstruct (Bar) as it's not constructed. I mean we can write (match x with Bar y -> y). If partial construction were accepted we may like to write (match x with Bar -> x) but we couldn't because Bar is like a function then. With type t = A | B of int what would be the error or warning for (match x with A -> 42 | B -> 43) ? Well, then I guess the reason is that it would be complicated to choose some sound enough semantics for partial application of constructors, since the solution of having to write (fun x -> Bar x) is much simpler. Can you not just say that Bar in an expression is a function (fun x -> Bar x)? -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
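The reading suggested at the end — a bare constructor in expression position behaving as a function — corresponds to the eta-expansion workaround OCaml requires today (a hypothetical illustration):

```ocaml
type t = Bar of int

(* OCaml constructors are not functions, so this is rejected:
     List.map Bar [1; 2; 3]
   The standard workaround is explicit eta-expansion: *)
let bars = List.map (fun x -> Bar x) [1; 2; 3]

let () = assert (bars = [Bar 1; Bar 2; Bar 3])
```

Treating `Bar` as sugar for `(fun x -> Bar x)` in expression (not pattern) position would make the first form legal without changing pattern-matching semantics.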