RE: Value types (Was: [Caml-list] ocamlopt LLVM support)
Benedikt wrote: Nice. But it would be even better to avoid the dummy, in your example let x = u in let y = v in if b then x - v; y - u This does not only avoid the dummy, but would also allow lowering to cmovcc instructions in the backend selector (atleast for x86-32/64). FWIW, when you're using LLVM (as a library!) to generate equivalent code then you can just represent the tuples as structs (value types) and LLVM will take care of all of this for you. Again, it generates really efficient code with minimal effort. The latest example front-end for HLVM represents tuples as structs so it gets this for free. However, polymorphic recursion is *much* harder to implement with that design choice. Also, there is the question of when boxed vs unboxed tuples are more efficient. Cheers, Jon. ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
RE: [Caml-list] Re: optimize div to right shift (NOT!)
Edwin wrote: http://www.hackersdelight.org/HDcode/magic.c.txt http://www.hackersdelight.org/HDcode/magicu.c.txt

Nice! :-)

Cheers, Jon.
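Those Hacker's Delight routines compute the "magic" multiplier that lets a compiler replace division by a constant with a multiply and a shift. As a minimal hand-worked instance of the idea (a sketch, not taken from the thread: the name div3 is mine, and it assumes a 64-bit OCaml build so the intermediate product fits in the 63-bit native int):

```ocaml
(* Unsigned division by the constant 3 as a multiply-and-shift.
   0xAAAAAAAB = (2^33 + 1) / 3, so (n * 0xAAAAAAAB) lsr 33 equals
   n / 3 plus an error term that vanishes for 0 <= n < 2^30, which
   also keeps the product within a 63-bit int. *)
let div3 n = (n * 0xAAAAAAAB) lsr 33

let () =
  for n = 0 to 100_000 do
    assert (div3 n = n / 3)   (* agrees with the division instruction *)
  done
```

A compiler does the same derivation for any constant divisor; the magic.c/magicu.c routines linked above compute the multiplier and shift amount for the signed and unsigned cases respectively.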
RE: Value types (Was: [Caml-list] ocamlopt LLVM support)
Török Edwin wrote: Do you really need to use Int64 for that though? Won't the 63-bit version do? I'm running 32-bit.

I am unable to reproduce your results. Here, the time falls from 24s to 19.5s (using ocamlopt 3.12.0 on Intel x86), which is still 26× slower than HLVM.

Sorry, I'm actually using an Opteron x86 (logged in from an Intel x86!).

Do you still have 'idiv' in the compiled code? See my attached assembly, and compare it with yours please. I was doing the test on 64-bit, with ocamlopt 3.11.2 and 3.12.0.

I get what appear to be calls to C code:

camlCollatz__collatzLen_1030:
        subl    $8, %esp
.L103:
        movl    %eax, 4(%esp)
        movl    %ebx, 0(%esp)
        pushl   $camlCollatz__10
        pushl   %ebx
        movl    $caml_equal, %eax
        call    caml_c_call
.L104:
        addl    $8, %esp
        cmpl    $1, %eax
        je      .L102
        movl    4(%esp), %eax
        addl    $8, %esp
        ret
        .align  16
.L102:
        pushl   $camlCollatz__8
        movl    4(%esp), %eax
        pushl   %eax
        movl    $caml_int64_and, %eax
        call    caml_c_call
.L105:
        addl    $8, %esp
        pushl   $camlCollatz__9
        pushl   %eax
        movl    $caml_equal, %eax
        call    caml_c_call
.L106:
        addl    $8, %esp
        cmpl    $1, %eax
        je      .L101
        pushl   $3
        movl    4(%esp), %eax
        pushl   %eax
        movl    $caml_int64_shift_right, %eax
        call    caml_c_call
.L107:
        addl    $8, %esp
        movl    %eax, %ebx
        jmp     .L100
        .align  16
.L101:
        movl    0(%esp), %eax
        pushl   %eax
        pushl   $camlCollatz__6
        movl    $caml_int64_mul, %eax
        call    caml_c_call
.L108:
        addl    $8, %esp
        pushl   $camlCollatz__7
        pushl   %eax
        movl    $caml_int64_add, %eax
        call    caml_c_call
.L109:
        addl    $8, %esp
        movl    %eax, %ebx
.L100:
        movl    4(%esp), %eax
        addl    $2, %eax
        jmp     .L103

FWIW the original code took 2.8 seconds here, so only 4x slower (this is an AMD Phenom II x6 1090T CPU). It probably depends how fast/slow the 'idiv' is on your CPU.

The performance of idiv is irrelevant here. The bottleneck may be those C calls, but I don't understand why they are being generated.

Cheers, Jon.
RE: Value types (Was: [Caml-list] ocamlopt LLVM support)
Brian Hurt wrote: On Sun, 12 Dec 2010, Jon Harrop wrote:

let rec collatzLen(c, n) : int =
  if n = 1L then c
  else collatzLen (c+1,
                   if Int64.rem n 2L = 0L then Int64.div n 2L
                   else Int64.add (Int64.mul 3L n) 1L);;

let rec loop(i, (nlen, n)) =
  if i = 1L then n
  else
    let ilen = collatzLen(1, i) in
    let nlen, n = if ilen > nlen then ilen, i else nlen, n in
    loop (Int64.sub i 1L, (nlen, n));;

Congratulations, Jon, you win today's Captain Obvious award. Using Int64s, which are forced to be boxed, really slows things down.

Apparently boxing isn't the issue here, as I had assumed. On 32-bit, OCaml compiles each arithmetic operation on the int64s to a C function call.

Also, uncurrying all your arguments also slows things down.

I see a 3% performance improvement from currying everything.

Running your original code on my 64-bit laptop, it took 6.35s to run the 1M example. The following alternate code only took 0.82s, for a speed up of almost 7.75x. According to Edwin, you should be able to get C-like performance by running the OCaml in 64-bit and replacing the div and mod operations with shifts and logical ANDs. Scaling your timings by a similar amount gives OCaml a running speed of 3.14s in your setup, or competitive with F#.

I'd be wary of scaling timings by measurements made across different architectures. OCaml seems to be doing completely different things on x86 and x64 here.

Cheers, Jon.
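Edwin's suggestion, replacing the boxed Int64 operations with native-int shifts and masks, can be sketched as follows (the names collatz_len and max_collatz are mine; on a 64-bit system OCaml's native int is 63 bits, which is ample for this range):

```ocaml
(* Collatz chain length using OCaml's native (unboxed) int instead of
   boxed Int64: "land 1" replaces "rem n 2" and "asr 1" replaces
   "div n 2", which is safe here because n is always positive. *)
let rec collatz_len c n =
  if n = 1 then c
  else if n land 1 = 0 then collatz_len (c + 1) (n asr 1)
  else collatz_len (c + 1) (3 * n + 1)

(* Starting value in 2..i with the longest chain. *)
let max_collatz i =
  let rec loop i nlen n =
    if i = 1 then n
    else
      let ilen = collatz_len 1 i in
      let nlen, n = if ilen > nlen then ilen, i else nlen, n in
      loop (i - 1) nlen n
  in
  loop i 0 0
```

On a 64-bit build every operation here compiles to inline integer instructions; none of the caml_c_call sequences from the 32-bit Int64 assembly above remain.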
RE: Value types (Was: [Caml-list] ocamlopt LLVM support)
Benedikt wrote: A C compiler would optimize this to a right shift.

Changing that to 'Int64.shift_right n 1' speeds up the code. This is easy to fix in ocamlopt (see attached patch ocamlopt-natint.patch), by applying the same optimizations already used for constant ints to constant natints (Int64 is Nativeint on 64-bit). Note, however, that mod 2 is not really "and 1", nor is div 2 equivalent to "lsr 1"; that would be the case for unsigned arithmetic (doesn't matter in your example, though).

That's great. Does it optimize div and mod by any constant integer, as C compilers do?

1. Unboxing can give huge performance improvements on serial code,

s/Unboxing/arithmetic optimizations/. Please find an example where the performance benefit is due to unboxing, and not due to arithmetic optimizations performed on the unboxed code. The boxing involved is relevant, but boxing in general is not the issue. In this special case, the "let nlen, n = if ..." code requires heap allocation, because of the way the pattern is compiled. This could be fixed by moving the condition out of the code and using two ifs to select n/nlen separately (doesn't speed things up that much). Fixing the pattern compiler to handle these cases might be interesting for general benefit. I have already mentioned this multiple times, but here we go again: unboxing optimizations may indeed prove useful if applied wisely (cmmgen.ml is of special interest here; the unboxing optimizations are more or less special cases, and could be extended to include interesting cases like moving boxing out of if-then-else in return position, etc.). But (here comes the special Harrop note) this has absolutely nothing to do with LLVM (and of course really, really, really nothing to do with HLVM). Using a different data representation for the heap requires a nearly complete rewrite of the OCaml system (you would probably need to start at the Typing level); if one wants to do this, enjoy and come back with code.
But even then, data representation issues will have to be considered long before it comes to actual code generation (if you are serious, you'll have to think about the runtime first, prior to talking about code generation for a non-existent runtime), so even then it has nothing to do with LLVM (or C-- or C or whatever IR you can think of).

OCaml programmers can benefit from more appropriate data representations by using LLVM as a library to generate code from OCaml. HLVM is an example of this that anyone can play with.

Combining allocs across if-then-else constructs further reduces code size in your example (and probably other programs as well); see the attached file ocamlopt-comballoc-ifthenelse.patch. It's quick and dirty, but it should illustrate the idea.

I think that is an even more valuable improvement to ocamlopt than the int optimization.

This doesn't mean that LLVM wouldn't be useful (in fact, I've just started an LLVM backend for ocamlopt). But it is important to note that LLVM is not the solution to everything. As the name implies, it's low level; it does a few higher-level optimizations for C, but these are special cases (and somewhat ugly if you take the time to read the code). It won't make a faster OCaml magically, just like it didn't make a faster Haskell by itself.

Absolutely.

I could go on by quoting common Harrop jokes like "you need types in the low-level IR", etc., trying to tell him that this is simply wrong; but after reading through the Caml/LISP mailing list archives (thanks for the pointers by several readers), I'm pretty much convinced that Jon simply decided to end his war against LISP just to start a new one against ocamlopt.

Suggesting that OCaml programmers use LLVM as a library because you can get huge performance gains is hardly a war against ocamlopt.

Cheers, Jon.
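Benedikt's caveat above, that mod 2 is not "and 1" and div 2 is not "lsr 1" for signed arithmetic, is easy to demonstrate in OCaml: division truncates toward zero while an arithmetic shift floors, so the two disagree on negative operands. A minimal check:

```ocaml
(* For negative n, n / 2 and n asr 1 differ: "/" truncates toward
   zero, "asr" rounds toward negative infinity; likewise mod vs land. *)
let () =
  assert (7 / 2 = 3);
  assert (7 asr 1 = 3);       (* equal for non-negative n *)
  assert ((-7) / 2 = -3);     (* truncation toward zero *)
  assert ((-7) asr 1 = -4);   (* flooring shift *)
  assert ((-7) mod 2 = -1);   (* sign follows the dividend *)
  assert ((-7) land 1 = 1)    (* two's-complement low bit *)
```

This is why the shift/mask rewrite is only valid when the operand is known to be non-negative, as it is in the Collatz example.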
RE: Value types (Was: [Caml-list] ocamlopt LLVM support)
Edwin wrote: On Sun, 12 Dec 2010 22:05:34 - Jon Harrop jonathandeanhar...@googlemail.com wrote:

AFAICT LLVM's OCaml bindings are only good for generating LLVM IR from OCaml, not for actually performing transformations on it (there is no binding to retrieve the type of a value, for example). I'll probably be looking into fixing that in the near future, and this may indirectly help your LLVM backend (if you intend to write OCaml-specific transformations on the LLVM IR).

That's a lot of work. Wouldn't it be preferable to do the passes on the OCaml side and focus on generating high-quality LLVM IR?

Yes, that is probably a better approach (generating the optimized IR in the first place).

FWIW, just the basic existing bindings were still buggy last I looked. For example, HLVM had to use a workaround to add a dummy argument to a function with no arguments because the bindings didn't handle that case correctly. Ironing the bugs out of the existing bindings would be more useful than making them even bigger, IMHO. Going back to Richard's idea, writing an OCaml library that uses LLVM to JIT interface code on-the-fly as a replacement for writing separate C stubs would be incredibly useful, and you could even use it to interface to LLVM itself more easily!

Cheers, Jon.
RE: ocamlopt LLVM support (Was: [Caml-list] OCamlJIT2 vs. OCamlJIT)
Richard Jones wrote: It might kick-start efforts to automatically generate bindings, at least to C code, which wouldn't be a bad thing. camlidl is clumsy and doesn't handle a very important case (pointers to handles), so it's not really useful for this.

Yes, LLVM is ideal for this. Using LLVM to JIT compile OCaml-C interop stubs should be quite easy. I did a bit of this in some of the OCaml Journal articles that were on LLVM. You could also kill two birds with one stone by creating an OCaml-HLVM bridge with it. That would make it easier to create a bigarray from OCaml and call into high-performance parallel numerical code to chomp on it.

Cheers, Jon.
RE: ocamlopt LLVM support (Was: [Caml-list] OCamlJIT2 vs. OCamlJIT)
Benedikt wrote: So, let's take that LLVM thing to a somewhat more serious level (which means no more arguing with toy projects that simply ignore the difficult stuff).

You can learn a lot from prototypes.

3. A better LLVM. If we do it right, others implementing functional languages using LLVM will not be faced with the same problems again. Dunno if that's possible.

What exactly are you thinking of that goes beyond what the other functional languages that already target LLVM have had to implement?

Cheers, Jon.
RE: [Caml-list] OCamlJIT2 vs. OCamlJIT
Erik wrote: Jon wrote: LLVM is also much better documented than ocamlopt's internals.

LLVM has well over 20 full-time programmers working on it. The OCaml compiler has how many?

Another reason why LLVM deserves its hype.

Cheers, Jon.
RE: [Caml-list] OCamlJIT2 vs. OCamlJIT
Benedikt wrote: This has nothing to do with LLVM, but is simply due to the fact that your code does not box the float parameters/results.

Exactly, yes.

The following piece of C code is most probably even faster than your code, so what?

double fib(double x) {
  if (x < 1.5) return x;
  else return fib(x - 1) + fib(x - 2);
}

FWIW, gcc -O2 takes longer to compile and produces slower code than HLVM anyway:

$ time gcc -O2 fib.c -o fib
real    0m0.161s
user    0m0.124s
sys     0m0.036s

$ time ./fib
102334155.00
real    0m2.792s
user    0m2.792s
sys     0m0.000s

If you're asking what the advantages of using LLVM over generating C code are, I'd say functionality like more numeric types, tail calls and JIT compilation come top, but also the fact that LLVM bundles nice OCaml bindings and makes it easy to generate fast code. Also, I have other examples (e.g. the random number generator from the SciMark2 benchmark) where LLVM can generate code that runs 2x faster than anything I've been able to get out of GCC.

So this is about data representation, not code generation (using LLVM with boxed floats would result in the same or lower performance);

Yes. LLVM lets you choose your own data representation, and applications like numerics can benefit enormously from that. All the more reason to use LLVM.

HLVM ignores complex stuff like polymorphism, modules, etc. (at least last time I checked), which makes code generation almost trivial.

OCaml ignores complex stuff like parallelism and value types (at least last time I checked ;-). This is precisely why OCaml and HLVM complement each other. Ideally, we would want all of this at the same time but, for now, we'll have to settle for separate solutions.

The literature includes various solutions to implement stuff like ML polymorphism: tagged integers/boxed floats/objects is just one solution, not necessarily the best; but examples that simply ignore the complex stuff, and therefore deliver better performance, don't really help to make progress.

Right.
Reified generics can be a much better solution for performance, particularly when combined with value types, and they are something that ocamlopt cannot express but HLVM can.

A possible solution to improve ocamlopt's performance in this case would be to compile the recursive fib calls in a way that the parameter/result is passed in a floating-point register (not boxed), and provide a wrapper for calls from outside, which unboxes the parameter, invokes the actual function code, and boxes the result. This should be doable at the Ulambda/Cmm level, so it is actually quite portable and completely independent of the low-level code generation (which is where LLVM comes into play). That way ocamlopt code will be as fast as the C code for this example.

Wow, I'd love to see your OCaml version that is as fast as C.

I have microbenchmarks where LLVM generates code over 10x faster than ocamlopt (specifically, floating-point code on x86) and larger numerical programs that also wipe the floor with OCaml.

ocamlopt's x86 floating-point backend is easy to beat, as demonstrated in the original post. Even a simple byte-code JIT engine (which still boxes floating-point results, etc.) is able to beat it.

Yes. Another reason to consider alternatives to ocamlopt for such applications.

Your benchmarks do not prove that LLVM generates faster code than ocamlopt.

My benchmarks show LLVM generating faster code than ocamlopt.

Instead you proved that OCaml's floating-point representation comes at a cost for number-crunching applications (which is obvious). Use the same data representation with LLVM (or C) and you'll notice that the performance is the same (most likely worse) compared to ocamlopt.

You are saying that LLVM might not be faster if it were also afflicted with ocamlopt's design, which is unproductive speculation. The point is that LLVM and HLVM are not constrained by those design decisions as OCaml is, so you can use them to generate much faster code.
LLVM is also much better documented than ocamlopt's internals.

Definitely, but LLVM replaces just the low-level stuff of ocamlopt (most probably starting at the Cmm level), so a lot of (undocumented) ocamlopt internals remain.

Sure. OCaml has a good pattern match compiler, for example.

Cheers, Jon.
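For reference, the C fib microbenchmark quoted earlier in this message transliterates directly to OCaml. This is a sketch of the obvious version, the one whose recursive calls ocamlopt compiles with boxed float parameters/results on x86, which is exactly the overhead Benedikt's proposed Ulambda/Cmm-level unboxing wrapper would remove:

```ocaml
(* Fibonacci on floats, same recurrence as the C version above.
   All intermediate values are exact integers well below 2^53, so
   float arithmetic gives exact results; fib 40.0 = 102334155.0. *)
let rec fib x =
  if x < 1.5 then x
  else fib (x -. 1.0) +. fib (x -. 2.0)
```

Timing `fib 40.0` compiled with ocamlopt against the gcc-compiled C version reproduces the gap under discussion.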
RE: [Caml-list] OCamlJIT2 vs. OCamlJIT
Benedikt wrote: On Dec 1, 2010, at 15:11, Jon Harrop wrote:

If you're asking what the advantages of using LLVM over generating C code are, I'd say functionality like more numeric types, tail calls and JIT compilation come top, but also the fact that LLVM bundles nice OCaml bindings and makes it easy to generate fast code. Also, I have other examples (e.g. the random number generator from the SciMark2 benchmark) where LLVM can generate code that runs 2x faster than anything I've been able to get out of GCC.

LLVM sure does better as an intermediate language than C does, no question. But that wasn't my point in this example.

I think we are talking at cross purposes. My post was to explain some of the many benefits of LLVM in response to Yoann's statement: "I don't understand why there is so much hype around LLVM. Why would you think something written in C++ would be far better than the OCaml code we have in the ocamlopt compiler?" I was not just talking about porting ocamlopt to LLVM but more generally about the advantages of using LLVM from OCaml in any way.

So this is about data representation, not code generation (using LLVM with boxed floats would result in the same or lower performance);

Yes. LLVM lets you choose your own data representation, and applications like numerics can benefit enormously from that. All the more reason to use LLVM.

As does assembler, so even more reasons to emit assembler?

LLVM makes it a *lot* easier to generate efficient code, of course.

The literature includes various solutions to implement stuff like ML polymorphism: tagged integers/boxed floats/objects is just one solution, not necessarily the best; but examples that simply ignore the complex stuff, and therefore deliver better performance, don't really help to make progress.

Right. Reified generics can be a much better solution for performance, particularly when combined with value types, and they are something that ocamlopt cannot express but HLVM can.

So this is an area to work on within ocamlopt.
If you can add value types and reified generics to ocamlopt and get them accepted, that would be great.

You are saying that LLVM might not be faster if it were also afflicted with ocamlopt's design, which is unproductive speculation. The point is that LLVM and HLVM are not constrained by those design decisions as OCaml is, so you can use them to generate much faster code.

We are talking about using LLVM within the OCaml compiler, so we have to talk about OCaml, not HLVM or anything else.

I was talking more generally about the benefits of LLVM in response to Yoann's statement about the hype surrounding LLVM. HLVM demonstrates some of the things a next-generation ocamlopt might be capable of.

Don't get me wrong, I'm curious to see an LLVM backend for ocamlopt, especially since that way we would get a native top-level for free (at least for the JIT-capable LLVM targets). But I doubt that LLVM will make a noticeable difference (except probably a reduced amount of target-specific code), since it will be faced with the same data representation used right now, and LLVM will not be able to optimize much (assuming the ocamlopt higher-level optimizations were already applied).

Absolutely.

What you are talking about is replacing several design decisions within OCaml (which have nothing to do with the low-level code generator);

Except that those design decisions have a huge effect on the optimizations LLVM will do (if it is your code gen).

this might be a good idea, and may indeed improve generated code.

HLVM's benchmark results would certainly seem to indicate that, yes.

I don't know the opinion of the OCaml core developers, but from my POV, this sounds like an interesting challenge; however, getting beyond a simplified proof-of-concept requires a lot of hard work.

Yes. At the end of the day, the performance gains that can be obtained by bringing the GC up to date already dwarf all of these other effects, so a new GC is surely the highest priority.
This raises a couple of questions:

* How dependent is the OCaml bytecode on the data representation and GC?
* Could you write an OCaml bytecode interpreter in OCaml?

Perhaps we could write a multicore-friendly OCamlJIT3. :-)

Cheers, Jon.
RE: Threading and SharedMem (Re: [Caml-list] Re: Is OCaml fast?)
What would be responsible for collecting the shared heap?

Cheers, Jon.

Eray wrote: Seconded, why is this not possible? That is to say, why cannot each thread maintain a separate GC, if so desired?
RE: [Caml-list] Re: Is OCaml fast?
I assume he means one thread has one behaviour and another has the other behaviour, in which case there certainly is a problem!

Cheers, Jon.

-----Original Message-----
From: Christophe Raffalli
Sent: 29 November 2010 07:33
To: oli...@first.in-berlin.de
Cc: Caml List
Subject: Re: [Caml-list] Re: Is OCaml fast?

Le 28/11/2010 19:17, oli...@first.in-berlin.de a écrit : On Sat, Nov 27, 2010 at 04:58:55PM +0100, Christophe Raffalli wrote: Hello, To the extent that this rule is the same for all languages, and that most languages on the shootout are also garbage collected, I think OCaml's problems with this benchmark do point at a weakness of the current GC code.

This is untrue ... the bintree example is just bad in OCaml because the default value of the minor heap size is the correct value for reactive programs, where you want a fast minor GC slice, because they interrupt the program ... [...]

And if your program contains both kinds of functionality? What possible solution would you recommend?

Changing the value of the minor heap size at runtime ... There is no problem with this ...

Ciao, Oliver

-- Christophe Raffalli, Université de Savoie
RE: [Caml-list] OCaml GC [was Is OCaml fast?]
I don't understand why this would help here, though. Wouldn't that help when a long-lived structure was a single large block? But, in this case, the long-lived structure is a tree composed of many small heap-allocated blocks and, therefore, they would undergo the same wasteful "allocate in young only to be copied to old" pathological behaviour.

Surely what you really want is the ability to spot when a full minor heap contains mostly live objects and, therefore, make the whole minor heap part of the major heap, allocate yourself a new minor heap and clear the remembered sets. IIRC, G1 does something like this.

On a related note, when designing HLVM I thought that a "mallocs" function that allocated many small blocks of the same size, such that they could be freed individually, would be helpful. Another option might be to preallocate larger numbers of blocks at the same time, under the assumption that a thread allocating many small surviving blocks would continue to do so, as a kind of pool hybrid.

Cheers, Jon.

From: Eray Ozkural
Sent: 28 November 2010 18:57
To: Benedikt Meurer
Cc: caml-list@yquem.inria.fr
Subject: Re: [Caml-list] OCaml GC [was Is OCaml fast?]

On Sun, Nov 28, 2010 at 4:29 PM, Benedikt Meurer benedikt.meu...@googlemail.com wrote:

Speaking of the OCaml GC in general, wouldn't it make sense to replace the current generational collector with a collector framework that requires less copying in the common case? For example, dividing the heap into two parts, large object space and small object space, where LOS is managed by mark-sweep/mark-compact (could use the current major heap) and SOS is managed by a recent mark-copy/mark-region collector (Immix [1] comes to mind here). That way objects would no longer need to be copied between generations, and using Immix may also reduce cache misses and improve locality of small/medium-sized objects (with proper evacuation heuristics).
This should be doable with a moderate amount of work, and should require no incompatible changes, while opening the door for concurrent/parallel garbage collection (that would then require incompatible changes, replacing caml_young_ptr/caml_young_limit with TLS vars, etc.).

Benedikt

[1] http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.144.3640

I was suggesting dealing with small fixed-size objects differently in another post. This doesn't sound like a bad combination; nice idea :)

Best, Eray Ozkural
RE: [Caml-list] OCaml GC [was Is OCaml fast?]
I see. Yes, that sounds like a great idea. How well does Immix cope with high allocation rates of short-lived objects? It's been a while since I read the Immix paper...

Cheers, Jon.

-----Original Message-----
From: Benedikt Meurer
Sent: 28 November 2010 20:00
To: caml-list@yquem.inria.fr
Subject: Re: [Caml-list] OCaml GC [was Is OCaml fast?]

On Nov 28, 2010, at 20:40, Jon Harrop wrote:

I don't understand why this would help here though. Wouldn't that help when a long-lived structure was a single large block but, in this case, the long-lived structure is a tree composed of many small heap-allocated blocks and, therefore, they would undergo the same wasteful "allocate in young only to be copied to old" pathological behaviour?

There is no young and no old in this scheme. There are two different heaps, one for large objects, one for small (probably 2-8k max object size, so not really small in the sense of OCaml). The small area is managed by Immix, which avoids copying of long-lived objects if they are allocated together with other long-lived objects (likely), or if not, evacuates a set of probably related objects to a single chunk (this is somewhat dependent on the evacuation strategy, which will be different for OCaml compared to Java), further improving locality. There are simple heuristics to ensure that an object is not evacuated too often (it is already unlikely with the base algorithm), so there will be a lot less copying.

One difficulty remains, however: the pause times. It would probably be necessary to adapt Immix to a (semi-)incremental style to get the same pause times as with the current GC.

Benedikt
RE: [Caml-list] Is OCaml fast?
Boxing, and generics implemented with type erasure, make it difficult or impossible to reduce the density of references in the heap. Structs and reified generics can give huge performance improvements. Hash tables and complex arithmetic are some examples where I often see F# running 5-20× faster than idiomatic OCaml/Java code.

Cheers, Jon.

-----Original Message-----
From: Erik de Castro Lopo
Sent: 24 November 2010 01:24
To: caml-l...@inria.fr
Subject: Re: [Caml-list] Is OCaml fast?

Jon Harrop wrote: Many of the things that can make OCaml and Java slow were inherited from Lisp. They are, in effect, designs that still bear the burden of dynamic typing despite being statically typed.

I'm curious: what in your opinion are those things?

Erik
-- Erik de Castro Lopo http://www.mega-nerd.com/
RE: [Caml-list] Re: Is OCaml fast?
And almost all of the Haskell solutions (reverse-complement, spectral-norm, Mandelbrot, n-body, fannkuch-redux, k-nucleotide, regex-dna) abuse GHC's FFI in order to work around Haskell. The k-nucleotide benchmark in Haskell even uses malloc! Ketil Malde crafted a much better solution but noted: "This is considered cheating, since it is the easy and natural way to do it." - http://www.haskell.org/haskellwiki/Shootout/Knucleotide

Cheers, Jon.

-----Original Message-----
From: Christophe TROESTLER
Sent: 23 November 2010 10:38
To: igo...@yahoo.com
Cc: caml-l...@inria.fr
Subject: Re: [Caml-list] Re: Is OCaml fast?

On Tue, 23 Nov 2010 02:03:48 +, Isaac Gouy wrote:

C version: 12.11 secs
OCaml version: 47.22 secs
OCaml version with GC parameters tuned (interesting alternative section): 12.67 secs

And of course you know, because that GC-tuned OCaml program is shown on the benchmarks game website ;-) http://shootout.alioth.debian.org/u32/program.php?test=binarytrees&lang=ocaml&id=1

Since you are here, please explain why C can use memory pools and vector instructions but tuning the GC of OCaml ― although it is part of the standard library ― is considered an "alternative".

Best, C.
RE: [Caml-list] Re: Optimizing garbage collection
I would increase the default minor heap size to something closer to today's common L2 cache sizes. But I see that Damien already did this for us: http://caml.inria.fr/cgi-bin/viewcvs.cgi/ocaml/version/3.12/byterun/config.h?rev=10787&r1=10496&r2=10787 Cheers, Jon. From: Eray Ozkural [mailto:examach...@gmail.com] Sent: 22 November 2010 23:14 To: Jon Harrop Cc: Sylvain Le Gall; caml-l...@inria.fr Subject: Re: [Caml-list] Re: Optimizing garbage collection On Mon, Nov 22, 2010 at 11:14 PM, Jon Harrop jonathandeanhar...@googlemail.com wrote: What happens if you just increase the default size? Well we don't want to be a memory hog like Java do we? It's something that kind of depends on the app, what would you set it to? Cheers, -- Eray Ozkural, PhD candidate. Comp. Sci. Dept., Bilkent University, Ankara http://groups.yahoo.com/group/ai-philosophy http://myspace.com/arizanesil http://myspace.com/malfunct ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
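Since the minor heap size depends on the app, it can also be raised per program through the standard Gc module rather than waiting for a new default; the figure below (2^21 words, i.e. 8 MB on a 32-bit system) is purely illustrative, not a recommendation.

```ocaml
(* Raise the minor heap size at startup via Gc.set; the value is in
   words.  2 * 1024 * 1024 words is an illustrative choice only. *)
let () =
  let c = Gc.get () in
  Gc.set { c with Gc.minor_heap_size = 2 * 1024 * 1024 }

(* Gc.get reflects the running configuration (the runtime may round
   the requested size up). *)
let configured = (Gc.get ()).Gc.minor_heap_size
```

The same parameter can be set without recompiling via the OCAMLRUNPARAM environment variable.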
RE: [Caml-list] Re: Is OCaml fast?
Note that the regex-dna solution for Haskell tweaks its GC parameters via the -H command-line parameter: http://shootout.alioth.debian.org/u64q/program.php?test=regexdna&lang=ghc&id=2 -Original Message- From: caml-list-boun...@yquem.inria.fr [mailto:caml-list- boun...@yquem.inria.fr] On Behalf Of Christophe TROESTLER Sent: 23 November 2010 10:38 To: igo...@yahoo.com Cc: caml-l...@inria.fr Subject: Re: [Caml-list] Re: Is OCaml fast? On Tue, 23 Nov 2010 02:03:48 +, Isaac Gouy wrote: C version : 12.11 secs OCaml version : 47.22 secs OCaml version with GC parameters tuned (interesting alternative section) : 12.67 secs And of course you know because that GC tuned OCaml program is shown on the benchmarks game website ;-) http://shootout.alioth.debian.org/u32/program.php?test=binarytrees&lang=ocaml&id=1 Since you are here, please explain why C can use memory pools and vector instructions but tuning the GC of OCaml ― although it is part of the standard library ― is considered an alternative. Best, C. ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
RE: [Caml-list] Re: Is OCaml fast?
Yes, an answer to a better question. -Original Message- From: caml-list-boun...@yquem.inria.fr [mailto:caml-list- boun...@yquem.inria.fr] On Behalf Of Isaac Gouy Sent: 23 November 2010 18:07 To: caml-l...@inria.fr Subject: [Caml-list] Re: Is OCaml fast? Jon Harrop jonathandeanharrop at googlemail.com writes: Ketil Malde crafted a much better solution but noted: This is considered cheating, since it is the easy and natural way to do it. - http://www.haskell.org/haskellwiki/Shootout/Knucleotide Not even cheating - just an answer to a different question. ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
RE: [Caml-list] Re: Optimizing garbage collection
What happens if you just increase the default size?
RE: [Caml-list] Native toplevel? (was: OCamlJit 2.0)
Also for interactive data massaging and visualization. From: caml-list-boun...@yquem.inria.fr [mailto:caml-list-boun...@yquem.inria.fr] On Behalf Of Ashish Agarwal Sent: 18 November 2010 16:50 To: Alain Frisch Cc: caml-list@yquem.inria.fr Subject: Re: [Caml-list] Native toplevel? (was: OCamlJit 2.0) On Wed, Nov 17, 2010 at 3:44 AM, Alain Frisch al...@frisch.fr wrote: Does performance really matter that much for rapid prototyping/development? Rapid prototyping for me often involves a couple of lines of code that read in a very large file and do something with it. I have to keep compiling these small programs to native code because the performance of the toplevel is too slow. Then, I have to recompile and re-read the whole file for every little additional thing I want to compute. A high-performance toplevel would help in this kind of work. ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
RE: [Caml-list] SMP multithreading
This is actually a quick way to use multiple cores with ocaml. Find an often-called function that takes considerable time and offload it to C. Or HLVM, F#, Scala, Clojure or any of the other languages that permit shared memory parallelism. C is particularly poor in this regard, so I would not restrict myself to just C... Cheers, Jon. ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
RE: [Caml-list] [Was: OCamlJit 2.0]
Do we have examples of big companies porting their whole codebase to another language? Yes, of course. Companies modernise all the time. We have a client who just started porting 1MLOC of C++ to F#. Flying Frog have ported hundreds of thousands of lines of OCaml to F#. It happens all the time but it is even more likely to happen as a consequence of multicore. Cheers, Jon. ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
RE: [Caml-list] [Was: OCamlJit 2.0]
Yoann Padioleau wrote: On Nov 20, 2010, at 9:08 AM, Jon Harrop wrote: Do we have examples of big companies porting their whole codebase to another language? Yes, of course. Companies modernise all the time. We have a client who just started porting 1MLOC of C++ to F#. How do they do that? Are they using compiler frontends to assist them in automatically translating part of the code to F#? We do it for them for a fee, using magic. We tried automated translation (from Fortran) but it is usually easier just to knuckle down by hand. Old code bases are usually 90% cruft anyway. That 1MLOC of C++ could have been 100kLOC of C++ but now it will be 10kLOC of F# instead. Flying Frog have ported hundreds of thousands of lines of OCaml to F#. OCaml and F# are very similar languages, so such porting does not look that hard. Yes. The main thing is that OCaml source is nice and readable so it is easy to translate even if it makes heavy use of features that don't port (polymorphic variants, macros). C++ source can be a nightmare so you really need to pin down a precise specification for what it does using advanced techniques like poking it with a stick. Translating from C++ to F#, or from PHP to Java seems more complicated. I wonder what kind of techniques people have developed to help such language-to-language translation. The only viable option seems to be to do it by hand. You can churn through an enormous amount of code when you put your mind to it... Cheers, Jon. ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
RE: [Caml-list] Re: [Was: OCamlJit 2.0]
Sylvain Le Gall: I doubt an old code base, not written with multicore in mind, is easily portable to multicore. So basically, the migration you are talking about is starting a new project that will replace one software/library by another. Yes, the systems are kept loosely coupled during the transition so it is more like a new project to supersede the old one than a gradual transition between code bases. However, only the developers need be aware of this distinction. From the point of view of everyone else in the company, the implementation of a product is being modernized. Management get improved productivity/maintainability. Sales get shiny new things. HR get to fire any developers who resist (but most are assimilated ;-). The avalanche happens when the cost of rewriting falls below the cost of adding essential features to a legacy code base. Cheers, Jon. ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
RE: [Caml-list] SMP multithreading
Can you cite any papers from this century? ;-) Cheers, Jon. From: Eray Ozkural [mailto:examach...@gmail.com] Sent: 17 November 2010 13:41 To: Eray Ozkural; Jon Harrop; caml-list@yquem.inria.fr Subject: Re: [Caml-list] SMP multithreading On Wed, Nov 17, 2010 at 8:50 AM, Gabriel Kerneis kern...@pps.jussieu.fr wrote: On Wed, Nov 17, 2010 at 06:27:14AM +0200, Eray Ozkural wrote: As I said even in C good results can be achieved, I've seen that, so I know it's doable with ocaml, just a difficult kind of compiler. The functional features would expose more concurrency. Could you share a pointer to a paper describing this compiler? I can't reveal much, but just to point out that there are indeed more sophisticated compilers than gcc: http://www.research.ibm.com/vliw/compiler.html So, uh, there are compilers that turn loops into threads, and also parallelize independent blocks. Both coarse-grain and fine-grain parallelization strategies in existing compiler research can be effectively applied to the multi-core architectures. In fact, some of the more advanced compilers (like that of the RAW architecture) must be able to target it already, but who knows. :) Just consider that most of the parallelization technology is language independent: it can be applied to any imperative language. So, would such a thing be able to work on ocaml generated binaries? Most definitely, I believe, it is in principle possible to start from the sequential binary and emit parallel code! Best, -- Eray Ozkural, PhD candidate. Comp. Sci. Dept., Bilkent University, Ankara http://groups.yahoo.com/group/ai-philosophy http://myspace.com/arizanesil http://myspace.com/malfunct ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
RE: [Caml-list] SMP multithreading
Wolfgang wrote: I'd like to point out how the big competitor to OCaml deals with it. The GHC Haskell system has had SMP parallelization built in for some time, and it does it quite well. I beg to differ. Upon trying to reproduce many of the Haskell community's results, I found that even their own parallel Haskell programs often exhibit huge slowdowns. This is because Haskell's unpredictable performance leads to unpredictable granularity and, consequently, more time can be spent administering the tiny parallel computations than is gained by doing so. The results I found here are typical: http://flyingfrogblog.blogspot.com/2010/01/naive-parallelism-with-hlvm.html Note that the absolute performance peaks at an unpredictable number of cores only in the case of Haskell. This is because the GC does not scale beyond about 4 cores for any Haskell programs doing significant amounts of allocation, which is basically all Haskell programs because allocations are everywhere in Haskell. Ultimately, running on all cores attains no speedup at all with Haskell in that case. This was branded the "last core slowdown" but the slowdown clearly started well before all 8 cores. There was a significant development towards improving this situation but it won't fix the granularity problem: http://hackage.haskell.org/trac/ghc/blog/new-gc-preview The paper "Regular, shape-polymorphic, parallel arrays in Haskell" cites 2.5x speedups when existing techniques were not only already getting 7x speedups but better absolute performance as well. Cache complexity is the problem, as I explained here: http://flyingfrogblog.blogspot.com/2010/06/regular-shape-polymorphic-parallel.html Probably the best solution for multicore programming is Cilk. This technique has already been adopted both in Intel's TBB and Microsoft's .NET 4 but, AFAIK, the only functional language with access to it is F#.
There are some great papers on multicore-friendly cache oblivious algorithms written in Cilk: http://www.fftw.org/~athena/papers/tocs08.pdf Note, in particular, that Cilk is not only much faster but also much easier to use than explicit message passing. To do something like this, threads need to be able to run in parallel and mutate the same shared heap. Although that is objectively easy (I did it in HLVM), OCaml's reliance upon very high allocation rates, efficient collection of young garbage and a ridiculous density of pointers in the heap make it a *lot* harder. Cheers, Jon. ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
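The granularity point can be made concrete with a Cilk-style divide-and-conquer skeleton: recurse in parallel only above a cutoff and fall back to a tight sequential loop below it. `spawn` here is a hypothetical stand-in for whatever parallel primitive is available; passing a sequential `spawn` lets the same skeleton run serially.

```ocaml
(* Cilk-style sum with granularity control: below the cutoff, run a
   cheap sequential loop instead of spawning tiny tasks. *)
let rec psum spawn (a : int array) lo hi =
  if hi - lo <= 4096 then begin
    let s = ref 0 in
    for i = lo to hi - 1 do
      s := !s + a.(i)
    done;
    !s
  end else begin
    let mid = lo + (hi - lo) / 2 in
    (* [spawn] starts a task and returns a thunk that joins it. *)
    let left = spawn (fun () -> psum spawn a lo mid) in
    let right = psum spawn a mid hi in
    left () + right
  end

(* Sequential "spawn" for testing: run immediately, return the result. *)
let seq f = let v = f () in fun () -> v
let total = psum seq (Array.init 10_000 (fun i -> i)) 0 10_000
```

Without the cutoff, the recursion would create one task per element and the administration cost would dominate, which is exactly the slowdown described above.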
RE: [Caml-list] SMP multithreading
Granularity and cache complexity are the reasons why not. If you find anything and everything that can be done in parallel and parallelize it then you generally obtain only slowdowns. An essential trick is to exploit locality via mutation but, of course, purely functional programming sucks at that by design not least because it is striving to abstract that concept away. I share your dream but I doubt it will ever be realized. Cheers, Jon. From: caml-list-boun...@yquem.inria.fr [mailto:caml-list-boun...@yquem.inria.fr] On Behalf Of Eray Ozkural Sent: 16 November 2010 23:05 To: Gerd Stolpmann Cc: caml-list@yquem.inria.fr Subject: Re: [Caml-list] SMP multithreading On Wed, Nov 17, 2010 at 12:13 AM, Gerd Stolpmann i...@gerd-stolpmann.de wrote: Am Dienstag, den 16.11.2010, 22:35 +0200 schrieb Eray Ozkural: On Tue, Nov 16, 2010 at 7:04 PM, Gerd Stolpmann i...@gerd-stolpmann.de wrote: Am Montag, den 15.11.2010, 22:46 -0800 schrieb Edgar Friendly: * As somebody mentioned implicit parallelization: Don't expect anything from this. Even if a good compiler finds ways to parallelize 20% of the code (which would be a lot), the runtime effect would be marginal. 80% of the code is run at normal speed (hopefully) and dominates the runtime behavior. The point is that such compiler-driven code improvements are only local optimizations. For getting good parallelization results you need to restructure the design of the program - well, maybe compiler2.0 can do this at some time, but this is not in sight. I think you are underestimating parallelizing compilers. I was more citing Amdahl's law, and did not want to criticize any effort in this area. It's more the usefulness for the majority of problems. How useful is the best parallelizing compiler if only a small part of the program _can_ actually benefit from it? Think about it. If you are not working in an area where many subroutines can be sped up, you consider this way of parallelizing as a waste of time. 
And this is still true for the majority of problems. Also, for many problems that can be tackled, the scalability is very limited. What makes you think only 20% of the code can be parallelized? I've worked in such a compiler project, and there were way too many opportunities for parallelization in ordinary C code, let alone a functional language; implicit parallelism would work wonders there. Of course you may think whatever you were thinking, but I know that a high degree of parallelism can be achieved through a functional language. I can't tell you much more though. If you think that a very small portion of the code can be parallelized you probably do not appreciate the kinds of static and dynamic analysis those compilers perform. Of course if you are thinking of applying it to some office or e-mail application, this might not be the case, but automatic parallelization strategies would work best when you apply them to a computationally intensive program. The really limiting factor for current functional languages would be their reliance on inherently sequential primitives like list processing, which may in some cases limit the compiler to only pipeline parallelism (something non-trivial to do by hand actually). Instead they would have to get their basic forms from high-level parallel PLs (which might mean rewriting a lot of things). The programs could look like something out of a category theory textbook then. But I think even without modification you could get a lot of speedup from ocaml code by applying state-of-the-art automatic parallelization techniques. By the way, how much of the serial work is parallelized that's what matters rather than the code length that is. Even if the parallelism is not so obvious, the analysis can find out which iterations are parallelizable, which variables have which kinds of dependencies, memory dependencies, etc. etc. 
In the public, there is this perception that automatic parallelization does not work, which is wrong. While it is true that the popular compiler vendors do not understand much in this regard, all I've seen (in popular products) are lame attempts to generate vector instructions. That is not to say that implicit parallelization of current code can replace parallel algorithm design (which is AI-complete). Rather, I think, implicit parallelization is one of the things that will help parallel computing people: by having them work at a more abstract level, exposing more concurrency through functional forms, yet avoiding writing low-level comms code. High-level explicit parallelism is also quite favorable, as there may be many situations where, say, dynamic load-balancing approaches are suitable. The approach of HPF might be relevant here, with the programmer
RE: [Caml-list] Infix function composition operator
A pipeline operator is usually preferred over function composition in impure languages like OCaml and F# due to the value restriction. For example, your example would be written in F# as: x |> op1 |> op2 |> op3 |> op4 |> op5 This style is very common in F#, particularly when dealing with collections. Cheers, Jon. -Original Message- From: caml-list-boun...@yquem.inria.fr [mailto:caml-list- boun...@yquem.inria.fr] On Behalf Of m...@proof-technologies.com Sent: 10 November 2010 07:00 To: ymin...@gmail.com; ar...@noblesamurai.com Cc: caml-l...@inria.fr Subject: Re: [Caml-list] Infix function composition operator on 10/11/10 3:45 AM, ymin...@gmail.com wrote: This is probably a minority opinion, but I have written and read quite a lot of OCaml code over the years, and I've seen surprisingly few effective uses of the composition operator. Somehow, I usually find that code that avoids it is simpler and easier to read. I agree that using a composition operator can make the code obtuse, and so should not be overused. But it's incredibly useful for certain situations: 1) If you are performing a long chain of composed operations, it avoids nested bracketing piling up. For example: (op5 << op4 << op3 << op2 << op1) x Instead of: op5 (op4 (op3 (op2 (op1 x)))) This sort of thing happens quite a lot in certain applications, e.g. in language processing, to get at subexpressions. 2) Creating an anonymous function to be passed as an argument, it avoids explicitly mentioning arguments of that function. This sort of thing can happen a lot in functional programming generally. For example: List.map (op2 << op1) xs Instead of: List.map (fun x -> op2 (op1 x)) xs Mark Adams ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs ___ Caml-list mailing list.
Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
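For readers following along in OCaml, the two styles compared in this thread can be defined and tried directly; the operator names `<<` and `|>` follow the F# convention (recent OCaml standard libraries also provide `|>`).

```ocaml
(* Backward composition: (f << g) x = f (g x). *)
let ( << ) f g x = f (g x)

(* Pipeline: x |> f = f x; left-associative, so stages read in order. *)
let ( |> ) x f = f x

let add1 x = x + 1
let double x = 2 * x

let composed = (double << add1) 3   (* double (add1 3) = 8 *)
let piped = 3 |> add1 |> double     (* same computation, left to right *)
```

Both expressions denote the same value; the pipeline merely reverses the reading order of the stages.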
RE: [Caml-list] Infix function composition operator
In OCaml, the value restriction leads to non-generalized type variables ('_a etc.) if you try to define functions like: # let ( << ) f g x = f (g x);; val ( << ) : ('a -> 'b) -> ('c -> 'a) -> 'c -> 'b = <fun> # let cons h t = h::t;; val cons : 'a -> 'a list -> 'a list = <fun> # cons << cons;; - : '_a -> ('_a list -> '_a list) list -> ('_a list -> '_a list) list = <fun> This is a silly example but you are most likely to hit this problem in practice in the context of parser combinators. Due to JIT compilation, F# cannot relax the value restriction so that does not even compile. In MLs, you usually want the eta-expanded form: # let cons2 x = (cons << cons) x;; val cons2 : 'a -> ('a list -> 'a list) list -> ('a list -> 'a list) list = <fun> But a pipeline is prettier: # let ( |> ) x f = f x;; val ( |> ) : 'a -> ('a -> 'b) -> 'b = <fun> # let cons2 x = x |> cons |> cons;; val cons2 : 'a -> ('a list -> 'a list) list -> ('a list -> 'a list) list = <fun> This is one advantage of Haskell over OCaml/F#. However, I don't see it as a useful advantage in practice because parser combinators are so tedious during development (they require constant attention as types evolve): you want code generation like ocamlyacc or camlp4. OCaml is a very strong contender here, of course. Cheers, Jon. -Original Message- From: m...@proof-technologies.com [mailto:m...@proof-technologies.com] Sent: 10 November 2010 13:44 To: jonathandeanhar...@googlemail.com; ymin...@gmail.com; ar...@noblesamurai.com Cc: caml-l...@inria.fr Subject: Re: [Caml-list] Infix function composition operator So how does value restriction affect things here? (excuse my lack of knowledge) One thing about using a pipeline like this is that it relies on '|>' being left-associative (which it is due to OCaml's convention on operators that start with |). Mark. on 10/11/10 12:52 PM, Jon Harrop jonathandeanhar...@googlemail.com wrote: A pipeline operator is usually preferred over function composition in impure languages like OCaml and F# due to the value restriction.
For example, your example would be written in F# as: x |> op1 |> op2 |> op3 |> op4 |> op5 This style is very common in F#, particularly when dealing with collections. Cheers, Jon. -Original Message- From: caml-list-boun...@yquem.inria.fr [mailto:caml-list- boun...@yquem.inria.fr] On Behalf Of m...@proof-technologies.com Sent: 10 November 2010 07:00 To: ymin...@gmail.com; ar...@noblesamurai.com Cc: caml-l...@inria.fr Subject: Re: [Caml-list] Infix function composition operator on 10/11/10 3:45 AM, ymin...@gmail.com wrote: This is probably a minority opinion, but I have written and read quite a lot of OCaml code over the years, and I've seen surprisingly few effective uses of the composition operator. Somehow, I usually find that code that avoids it is simpler and easier to read. I agree that using a composition operator can make the code obtuse, and so should not be overused. But it's incredibly useful for certain situations: 1) If you are performing a long chain of composed operations, it avoids nested bracketing piling up. For example: (op5 << op4 << op3 << op2 << op1) x Instead of: op5 (op4 (op3 (op2 (op1 x)))) This sort of thing happens quite a lot in certain applications, e.g. in language processing, to get at subexpressions. 2) Creating an anonymous function to be passed as an argument, it avoids explicitly mentioning arguments of that function. This sort of thing can happen a lot in functional programming generally. For example: List.map (op2 << op1) xs Instead of: List.map (fun x -> op2 (op1 x)) xs Mark Adams ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs ___ Caml-list mailing list.
Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
RE: [Caml-list] Infix function composition operator
I am currently writing a concurrent server under contract in F#. Due to the poor performance of the default .NET serialization (over 170x slower than Marshal in OCaml!), I had to completely replace the serialization layer. Ideally, I would just write a script to compile my type definitions into serializers and deserializers over them. However, I haven't found time to do that yet so I am maintaining a set of combinators instead. They account for over 15% of the source code now. Every time I change a type (e.g. adding a field to a record type), I must update the combinators to reflect the changes. This is getting seriously tedious as the project is experimental and my types need to evolve rapidly. If I were using OCaml, Marshal would be easily fast enough and I wouldn't even need to update my type definitions by hand because I could use structural typing (polymorphic variants and objects). Unfortunately, this is not yet possible in F#. Another problem area has been the incrementality of garbage collection. OCaml is good in this respect among functional languages but still orders of magnitude behind the state-of-the-art. With enough care and attention, .NET can be almost as good as OCaml is ordinarily but .NET's worse cases are horrific: http://flyingfrogblog.blogspot.com/2010/10/can-you-repro-this-64-bit-net-gc-bug.html Cheers, Jon. -Original Message- From: caml-list-boun...@yquem.inria.fr [mailto:caml-list- boun...@yquem.inria.fr] On Behalf Of Stephan Tolksdorf Sent: 10 November 2010 16:11 To: caml-l...@inria.fr Subject: Re: [Caml-list] Infix function composition operator On Wed, Nov 10, 2010 at 14:13 -, Jon Harrop wrote: However, I don't see it as a useful advantage in practice because parser combinators are so tedious during development (they require constant attention as types evolve): you want code generation like ocamlyacc or camlp4. OCaml is a very strong contender here, of course. 
Could you maybe elaborate a bit on what you find tedious with regard to evolving types in the context of parser combinators? In my parser code (using FParsec in F#) most types get inferred by the compiler and in the remaining instances the type annotations can hardly be called tedious. Actually, I find the types and the Visual Studio tooltips with the inferred types rather helpful for development. - Stephan Cheers, Jon. -Original Message- From: m...@proof-technologies.com [mailto:m...@proof- technologies.com] Sent: 10 November 2010 13:44 To: jonathandeanhar...@googlemail.com; ymin...@gmail.com; ar...@noblesamurai.com Cc: caml-l...@inria.fr Subject: Re: [Caml-list] Infix function composition operator So how does value restriction affect things here? (excuse my lack of knowledge) One thing about using a pipeline like this is that it relies on '|>' being left-associative (which it is due to OCaml's convention on operators that start with |). Mark. on 10/11/10 12:52 PM, Jon Harrop jonathandeanhar...@googlemail.com wrote: A pipeline operator is usually preferred over function composition in impure languages like OCaml and F# due to the value restriction. For example, your example would be written in F# as: x |> op1 |> op2 |> op3 |> op4 |> op5 This style is very common in F#, particularly when dealing with collections. Cheers, Jon. -Original Message- From: caml-list-boun...@yquem.inria.fr [mailto:caml-list- boun...@yquem.inria.fr] On Behalf Of m...@proof-technologies.com Sent: 10 November 2010 07:00 To: ymin...@gmail.com; ar...@noblesamurai.com Cc: caml-l...@inria.fr Subject: Re: [Caml-list] Infix function composition operator on 10/11/10 3:45 AM, ymin...@gmail.com wrote: This is probably a minority opinion, but I have written and read quite a lot of OCaml code over the years, and I've seen surprisingly few effective uses of the composition operator. Somehow, I usually find that code that avoids it is simpler and easier to read.
I agree that using a composition operator can make the code obtuse, and so should not be overused. But it's incredibly useful for certain situations: 1) If you are performing a long chain of composed operations, it avoids nested bracketing piling up. For example: (op5 << op4 << op3 << op2 << op1) x Instead of: op5 (op4 (op3 (op2 (op1 x)))) This sort of thing happens quite a lot in certain applications, e.g. in language processing, to get at subexpressions. 2) Creating an anonymous function to be passed as an argument, it avoids explicitly mentioning arguments of that function. This sort of thing can happen a lot in functional programming generally. For example: List.map (op2 << op1) xs Instead of: List.map (fun x -> op2 (op1 x)) xs Mark Adams ___ Caml-list mailing
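The OCaml side of the serialization comparison in the thread above is just the standard Marshal module, which round-trips values of (almost) any type without per-type combinators. The `point` type here is illustrative, not one from the message.

```ocaml
(* Marshal serializes a value to a string and back with no per-type
   serializer/deserializer code to maintain as types evolve. *)
type point = { x : float; y : float }

let encode (v : point list) : string = Marshal.to_string v []
let decode (s : string) : point list = Marshal.from_string s 0

let pts = [ { x = 1.0; y = 2.0 }; { x = 3.0; y = 4.0 } ]
let round_trip = decode (encode pts)
```

Note that `from_string` is untyped on the wire: the type annotation on `decode` is a promise by the programmer, not something the runtime checks.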
RE: [Caml-list] How does OCaml update references when values are moved by the GC?
I was hoping for a little more detail, of course. :-) How is the mapping from old to new pointers stored? Does the GC rewrite all of the thread-local stacks in series before allowing any of them to continue? Does the write barrier record pointers written into the major heap so only those specific locations are rewritten at minor heap collections but the entire major heap is rewritten upon compaction? Can the GC distinguish between an array of ints and an array of pointers at run-time in order to avoid traversing all of the ints when trying to rewrite pointers? Also, any idea what the maximum proportion of the running time of a program is spent doing this rewriting? For example, if you fill a float->float hash table with values, does updating the pointers in the major heap account for a significant proportion of the total running time of the program? Cheers, Jon. -Original Message- From: caml-list-boun...@yquem.inria.fr [mailto:caml-list- boun...@yquem.inria.fr] On Behalf Of Damien Doligez Sent: 29 October 2010 08:48 To: caml users Subject: Re: [Caml-list] How does OCaml update references when values are moved by the GC? On 2010-10-28, at 23:48, Jon Harrop wrote: How does OCaml update references in the stacks and heap when values are moved by the GC? They are updated by the GC, of course. -- Damien ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
RE: [Caml-list] How does OCaml update references when values are moved by the GC?
I just found some interesting statements made by others on this post about optimizing OCaml's write barrier: http://eigenclass.org/R2/writings/optimizing-caml_modify In the comments, Mauricio said that caml_modify (the write barrier) accounted for over 30% of the total running time of the whole application. -Original Message- From: Jon Harrop [mailto:j...@ffconsultancy.com] Sent: 30 October 2010 18:16 To: 'Damien Doligez'; 'caml-l...@inria.fr' Subject: RE: [Caml-list] How does OCaml update references when values are moved by the GC? I was hoping for a little more detail, of course. :-) How is the mapping from old to new pointers stored? Does the GC rewrite all of the thread-local stacks in series before allowing any of them to continue? Does the write barrier record pointers written into the major heap so only those specific locations are rewritten at minor heap collections but the entire major heap is rewritten upon compaction? Can the GC distinguish between an array of ints and an array of pointers at run-time in order to avoid traversing all of the ints when trying to rewrite pointers? Also, any idea what the maximum proportion of the running time of a program is spent doing this rewriting? For example, if you fill a float-float hash table with values does updating the pointers in the major heap account for a significant proportion of the total running time of the program? Cheers, Jon. -Original Message- From: caml-list-boun...@yquem.inria.fr [mailto:caml-list- boun...@yquem.inria.fr] On Behalf Of Damien Doligez Sent: 29 October 2010 08:48 To: caml users Subject: Re: [Caml-list] How does OCaml update references when values are moved by the GC? On 2010-10-28, at 23:48, Jon Harrop wrote: How does OCaml update references in the stacks and heap when values are moved by the GC? They are updated by the GC, of course. -- Damien ___ Caml-list mailing list. 
Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
RE: [Caml-list] How does OCaml update references when values are moved by the GC?
Hi Michael, Thanks for the info. I stumbled upon Richard's excellent web page describing the internals of OCaml and the write barrier but it does not describe how the pointers actually get rewritten. Cheers, Jon. -Original Message- From: caml-list-boun...@yquem.inria.fr [mailto:caml-list- boun...@yquem.inria.fr] On Behalf Of Michael Ekstrand Sent: 30 October 2010 18:38 To: caml-list@yquem.inria.fr Subject: Re: [Caml-list] How does OCaml update references when values are moved by the GC? On 10/30/2010 12:15 PM, Jon Harrop wrote: I was hoping for a little more detail, of course. :-) How is the mapping from old to new pointers stored? Does the GC rewrite all of the thread-local stacks in series before allowing any of them to continue? I imagine so. Does the write barrier record pointers written into the major heap so only those specific locations are rewritten at minor heap collections but the entire major heap is rewritten upon compaction? Yes. The runtime maintains a 'remembered set', a list of pointers from the major heap back to the minor heap. Maintaining this set is why mutable data can be expensive in OCaml - any time you store a pointer into a mutable field, the runtime must check whether the new link is from the major to the minor heap and update the refs list accordingly. Richard WM Jones has details here: http://rwmj.wordpress.com/2009/08/08/ocaml-internals-part-5-garbage- collection/ Can the GC distinguish between an array of ints and an array of pointers at run-time in order to avoid traversing all of the ints when trying to rewrite pointers? Not that I know of. The tag block does not have a documented reserved value to indicate that - there are values to indicate an unboxed float array, a string, and an array of opaque values, but not an integer array (unless the opaque value flag is set for integer arrays). Also, any idea what the maximum proportion of the running time of a program is spent doing this rewriting? 
For example, if you fill a float -> float hash table with values, does updating the pointers in the major heap account for a significant proportion of the total running time of the program? In my data analysis jobs (which wind up allocating quite large heaps), the compactor almost never (detectably) runs. Minor cycles and major slices are a much larger concern in my experience. I work around that by increasing the minor heap size to decrease minor heap thrashing (my general rule of thumb is that I want each work unit, whatever that may be, to fit in the minor heap). It could well be that other applications will have different characteristics that trigger more compactions. I cannot speak for those applications. Further, when I have huge floating-point data structures, I'm usually using bigarrays (not because I choose them over arrays, typically, but because such code in my work frequently has to interact with BLAS or SPARSKIT at some point). - Michael ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
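The remembered-set mechanism Michael describes (record major-to-minor pointers at write time so a minor collection only rewrites those specific locations) can be sketched abstractly. This is a toy model of the idea with addresses modelled as plain ints; all names are mine, and it is not OCaml's actual caml_modify:

```ocaml
(* Toy model of a generational write barrier with a remembered set.
   "Minor" is a low address range; everything else counts as major heap. *)
type heap = {
  minor_lo : int;
  minor_hi : int;
  mutable remembered : int list;  (* major-heap fields pointing into minor *)
}

let in_minor h addr = addr >= h.minor_lo && addr < h.minor_hi

(* On every store of a pointer into a mutable field: if a major-heap
   field is made to point at a minor-heap value, record the field's
   address so a minor GC can rewrite just that location. *)
let write_barrier h ~field_addr ~new_target =
  if (not (in_minor h field_addr)) && in_minor h new_target then
    h.remembered <- field_addr :: h.remembered
```

This check on every mutable store is why mutation of pointer fields carries a cost in a generational collector.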
[Caml-list] How does OCaml update references when values are moved by the GC?
How does OCaml update references in the stacks and heap when values are moved by the GC? -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
[Caml-list] Asynchronous IO programming in OCaml
Is there a tutorial on using something like LWT for asynchronous programming in OCaml? I'm looking for an example like an echo server that handles clients concurrently without blocking threads, so it can handle thousands of clients without significant performance degradation. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
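For reference, the kind of server I mean looks roughly like this in Lwt: each client is handled by a lightweight cooperative thread, so thousands of connections do not need thousands of OS threads. This is a hedged sketch against my reading of the Lwt_io API (function names such as `establish_server_with_client_address` vary between Lwt versions), assuming the lwt and lwt.unix packages are installed:

```ocaml
(* Sketch of a non-blocking Lwt echo server.  Each connection runs in a
   cooperative Lwt thread; reads and writes yield instead of blocking. *)
let handle_client _addr (ic, oc) =
  let rec loop () =
    Lwt.bind (Lwt_io.read_line_opt ic) (function
      | Some line -> Lwt.bind (Lwt_io.write_line oc line) loop
      | None -> Lwt.return_unit)   (* client closed the connection *)
  in
  loop ()

let () =
  let addr = Unix.ADDR_INET (Unix.inet_addr_loopback, 5555) in
  Lwt_main.run
    (Lwt.bind
       (Lwt_io.establish_server_with_client_address addr handle_client)
       (fun _server -> fst (Lwt.wait ())))  (* never resolves: run forever *)
```

A quick test is `telnet 127.0.0.1 5555` from another terminal; each line typed should come back unchanged.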
RE: [Caml-list] from if...else to pattern matching
Note that you are using physical equality (==) and not structural equality (=). Also, you can always rewrite: if pred then true else false more simply as: pred -Original Message- From: caml-list-boun...@yquem.inria.fr [mailto:caml-list- boun...@yquem.inria.fr] On Behalf Of ben kuin Sent: 02 October 2010 15:05 To: caml-l...@inria.fr Subject: [Caml-list] from if...else to pattern matching hi I am trying to transform an if...else clause into pattern matching. I think I've tried a lot of approaches, but apparently I'm doing something fundamentally wrong. ~ (** Define behaviors against a constant 'x = c' with if...else and pattern matching. I want to see how to match a value against a function **) (* defining a '=' operator *) let (=) a b = if a b then true else if a == b then true else false;; let if_test c = if ( 4 = c ) then true else false ;; (* and now use the operator in pattern matching *) let match_test c = match ( _ = c ) (* pseudocode *) | 4 -> true | _ -> false ;; ~~~ thanks in advance ben ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
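For what it's worth, the intent above can be expressed directly, with no redefinition of (=) at all; a minimal sketch of what the poster seems to be after:

```ocaml
(* if/else version: "if pred then true else false" is just "pred". *)
let if_test c = 4 = c

(* Pattern-matching version: match the value against the constant
   pattern 4 directly, rather than trying to match against a function. *)
let match_test c =
  match c with
  | 4 -> true
  | _ -> false
```

Both functions have type `int -> bool` and behave identically; the match form generalizes naturally when there are several constants to distinguish.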
RE: [Caml-list] Compiling Ocaml sources to c sources
* Hacking a C generator inside OCaml is non-trivial, because of the garbage collector, curried function calls, tail recursion, etc. BTW, I was always surprised nobody had converted the bytecode interpreter into a via-C compiler by unwinding the C code the interpreter goes through to execute a specific OCaml bytecode program. You could even use this with something like Clang to get JIT compilation to native code. Not very efficient but surely much faster than interpreted bytecode... Cheers, Jon. ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
RE: [Caml-list] Compiling Ocaml sources to c sources
Hmm, this would only optimize the bytecode fetch/decode step of the ocaml execution cycle. I am not sure that it will result in much real-world speedup. Would be interesting to try. I suspect the C compiler would then optimize the sequences of operations on the data as well. For example, something like vector addition. In fact, that seems to be the main problem with many of these so-called JIT interpreters, which in my opinion, do not seem to have learnt from the HAL architectures of IBM OS's etc. Was probably also the problem with Transmeta; cheap compilation entails cheap performance. Can you elaborate? Cheers, Jon. ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
RE: [Caml-list] Re: How to re-implement the GC?
Hi Eray, Retrofitting a new multicore-friendly GC onto OCaml is difficult for two main reasons: 1. You want plug-and-play GCs but the OCaml compiler is tightly coupled to the old GC (although OC4MC has decoupled them!). 2. Recovering similar performance whilst reusing the same data representation is extremely difficult because the current design relies so heavily on lightweight allocation. You really want to change the data representation to avoid unnecessary boxing (e.g. never box or tag int, floats or tuples) in order to reduce the allocation rate and, consequently, the stress on the garbage collector but OCaml cannot express value types and its ability to represent polymorphic recursion makes this extremely difficult to implement. As Sylvain has said, OC4MC is your best bet if you want to try to write a new GC for OCaml. However, making more extensive changes has the potential to address many more problems (e.g. convey run-time type information for generic printing) so you might consider alternatives like trying to compile OCaml's bytecode into HLVM's IR for JIT compilation because HLVM already has a multicore friendly GC and it is much easier to develop. Ah, that's interesting. I wonder if it provides any real speedup on new architectures compared to storing the pointer in RAM. For a multicore GC, using a register to refer to thread-local data is a huge win because accessing thread-local data is very slow. I made that mistake in HLVM's first multicore-capable GC and it was basically useless as a consequence because all of the time was spent looking up thread-local data. When I started passing the thread-local data around as an extra argument to every function (not necessarily consuming a register all the time because LLVM is free to spill it), performance improved enormously. Cheers, Jon. ___ Caml-list mailing list. 
Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
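The point above about passing thread-local GC data as an explicit argument (rather than looking it up through slow thread-local storage on every allocation) can be sketched at the source level. This toy model is mine, not HLVM's actual code; it shows the shape of the transformation:

```ocaml
(* Each mutator thread owns its allocation state.  Threading it through
   as an explicit argument lets the code generator keep it in a register
   (or spill it locally) instead of doing a TLS lookup per allocation. *)
type thread_state = {
  mutable alloc_ptr : int;  (* next free word in this thread's nursery *)
  alloc_limit : int;        (* end of this thread's nursery *)
}

(* Bump allocation against the thread's own nursery: no locks, no TLS. *)
let alloc state n_words =
  let p = state.alloc_ptr in
  if p + n_words <= state.alloc_limit then begin
    state.alloc_ptr <- p + n_words;
    Some p                  (* address of the freshly allocated block *)
  end else
    None                    (* nursery exhausted: caller must collect *)
```

Every generated function takes a `thread_state` as an extra parameter; allocation then costs a compare and an add on the fast path.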
RE: [Caml-list] Re: How to re-implement the GC?
Other GC algorithms for Java/C# often make the assumption of long-lived objects with mutation. This is not the case for OCaml. They do favour mutation and, consequently, have cheaper write barriers, but both the JVM and CLR use pointer-bumping minor heaps for the first (nursery) generation to collect short-lived objects efficiently, just like OCaml. Cheers, Jon. ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
RE: [Caml-list] Re: Question about float refs.
You might prefer to multiply by 1.0 because adding 0.0 changes -0.0. Cheers, Jon. Some black magic is needed: let r = ref 0.0 in for i = 0 to 1000_000_000 do r := float i done; Printf.printf "%f\n" (!r +. 0.); Printf.printf "words: %f\n" (Gc.stat ()).Gc.minor_words and the reference will be placed in a register (the Caml heap will not be used inside the loop) - Dmitry Bely ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
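The difference between adding 0.0 and multiplying by 1.0 is observable even though 0.0 and -0.0 compare equal under (=); a small demonstration using the sign of the reciprocal:

```ocaml
(* Under IEEE 754 round-to-nearest, (-0.0) +. 0.0 = +0.0, whereas
   (-0.0) *. 1.0 preserves the sign of zero. *)
let neg_zero = -0.0
let after_add = neg_zero +. 0.0   (* +0.0: sign lost *)
let after_mul = neg_zero *. 1.0   (* -0.0: sign kept *)

(* 1.0 /. +0.0 = infinity and 1.0 /. -0.0 = neg_infinity, so the
   reciprocal exposes the sign that (=) cannot see. *)
let sign_of_zero z = if 1.0 /. z > 0.0 then `Pos else `Neg
```

So `+. 0.0` is not a semantics-preserving no-op on floats, while `*. 1.0` is (for zeros; NaN payloads aside).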
RE: [Caml-list] Llama Light: a simple implementation of Caml
Try to remove all assumptions of uniform run-time representation from the compiler because avoiding boxing gives huge performance gains and makes it much easier to write a performant garbage collector. You'll need to sacrifice polymorphic recursion though, which you probably already have anyway. Cheers, Jon. From: caml-list-boun...@yquem.inria.fr [mailto:caml-list-boun...@yquem.inria.fr] On Behalf Of ivan chollet Sent: 30 August 2010 11:57 To: Jeremy Bem Cc: caml-list List Subject: Re: [Caml-list] Llama Light: a simple implementation of Caml OK. This looks nice and I would be pleased if you could put a few pointers or explanations on your webpage about your typechecker implementation and how it differs from the OCaml typechecker. I will get some free time this week to implement yet another runtime and bytecode compiler from scratch. Not sure if it will be completed at the end of the week, but I'll definitely be interested to know more about the theoretical motivations of works like yours! On Mon, Aug 30, 2010 at 2:37 AM, Jeremy Bem jere...@gmail.com wrote: bluestorm: Thank you for the bug report. The toplevel issue has been fixed in the version now posted. Do you see a nice way to add let-generalization without reintroducing type levels? I was pleased to remove those. Ivan: It was originally forked from Caml Light but now includes more code from OCaml. The typechecker is mostly original code at this point; the compiler is OCaml's with minimal changes to accommodate the new typechecker; the runtime is almost identical to OCaml's. -Jeremy On Sun, Aug 29, 2010 at 6:52 AM, bluestorm bluestorm.d...@gmail.com wrote: When using the toplevel, declaration phrases fail (looks like a linking problem), but expressions work as intended: $ llama Llama Light version 0.0828 # 1 + 1;; - : int = 2 # let x = 1 + 1;; Error: Reference to undefined global `Toploop' I made my tests using llamac -i foo.ml.
I found it startling that the most important differences to my eyes are buried, on the web page, under lines of relatively boring documentation: In Llama Light (and in contrast to other Caml implementations): - let does not generalize. - Phrases are generalized immediately. In particular, let foo = ref [] does not typecheck. - The value restriction is not relaxed. (This is similar to Caml Light.) These choices simplify the implementation while having relatively little impact on the user. You cite the Let Should Not Be Generalised paper. There is however a difference in your application of the maxim: in the paper, local lets that have polymorphic type annotations are generalised, while in your system it is not possible to force generalisation. I had a look at the typer, and it's indeed rather simple; it seems it would be quite simple to implement generalized let (when encountering annotations or with a different syntactic construct: letg .. = .. in ...), but the added complexity is equivalent to adding let generalization by default. Is the presence of let-generalization a real issue in your work? ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
RE: [Caml-list] Llama Light: a simple implementation of Caml
If you can keep it agnostic wrt boxing then you should be able to plug it into HLVM easily for much faster numerics and parallelism. Cheers, Jon. From: ivan chollet [mailto:ivan.chol...@gmail.com] Sent: 30 August 2010 18:10 To: Jon Harrop Cc: Jeremy Bem; caml-list List Subject: Re: [Caml-list] Llama Light: a simple implementation of Caml clearly didn't plan to support polymorphic recursion in any way. I don't even plan to support lexical scoping... As for data representation I'm actually boxing everything except ints and function pointers. I found out that it leads to the simplest design, which is appropriate for a project that I don't want to take me more than a few days. On Tue, Aug 31, 2010 at 1:57 AM, Jon Harrop jonathandeanhar...@googlemail.com wrote: Try to remove all assumptions of uniform run-time representation from the compiler because avoiding boxing gives huge performance gains and makes it much easier to write a performant garbage collector. You'll need to sacrifice polymorphic recursion though, which you probably already have anyway. Cheers, Jon. From: caml-list-boun...@yquem.inria.fr [mailto:caml-list-boun...@yquem.inria.fr] On Behalf Of ivan chollet Sent: 30 August 2010 11:57 To: Jeremy Bem Cc: caml-list List Subject: Re: [Caml-list] Llama Light: a simple implementation of Caml OK. This looks nice and I would be pleased if you could put a few pointers or explanations on your webpage about your typechecker implementation and how it differs with OCaml typechecker. I will get some free time this week and to implement yet another runtime and bytecode compiler from scratch. Not sure if it will be completed at the end of the week, but i'll be definitely interested to know more about the theoretical motivations of works like yours! On Mon, Aug 30, 2010 at 2:37 AM, Jeremy Bem jere...@gmail.com wrote: bluestorm: Thank you for the bug report. The toplevel issue has been fixed in the version now posted. 
Do you see a nice way to add let-generalization without reintroducing type levels? I was pleased to remove those. Ivan: It was originally forked from Caml Light but now includes more code from OCaml. The typechecker is mostly original code at this point; the compiler is OCaml's with minimal changes to accommodate the new typechecker; the runtime is almost identical to OCaml's. -Jeremy On Sun, Aug 29, 2010 at 6:52 AM, bluestorm bluestorm.d...@gmail.com wrote: When using the toplevel, declaration phrases fail (looks like a linking problem), but expressions work as intented : $ llama Llama Light version 0.0828 # 1 + 1;; - : int = 2 # let x = 1 + 1;; Error: Reference to undefined global `Toploop' I made my tests using llamac -i foo.ml. I found it startling that the most important difference to my eyes are buried, on the web page, under lines of relatively boring documentation : In Llama Light (and in contrast to other Caml implementations): - let does not generalize. - Phrases are generalized immediately. In particular, let foo = ref [] does not typecheck. - The value restriction is not relaxed. (This is similar to Caml Light.) These choices simplify the implementation while having relatively little impact on the user. You cite the Let Should Not Be Generalised paper. There is however a difference in your application of the maxim : in the paper, local let that have polymorphic type annotations are generalised, while in your system it is not possible to force generalisation. I had a look at the typer, and it's indeed rather simple; it seems it would be quite simple to implement generalized let (when encountering annotations or with a different syntaxic construct : letg .. = .. in ...), but the added complexity is equivalent to adding let generalization by default. Is the presence of let-generalization a real issue in your work ? ___ Caml-list mailing list. 
Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
RE: [Caml-list] interest in a much simpler, but modern, Caml?
Presumably, in general, that would need to be: If Bool.(Int.(x=42) || Int.(x=45)) . At which point things start to look hairy. To be honest, I see this delimited overloading as the best of a bad job in the absence of equality types or per-type functionality. I think what we really want is to be able to define custom equality, comparison and hashing on types when structural versions are not applicable. F#'s solution is pretty good. Type classes are a generalization but I do not see that they buy you much for the added complexity. I'd have thought this (at least equality types) would be worth putting in a minimalistic language because it is so useful. Cheers, Jon. From: Jeremy Bem [mailto:jere...@gmail.com] Sent: 12 August 2010 01:22 To: Jon Harrop Cc: bluestorm; caml-list List; Florian Weimer Subject: Re: [Caml-list] interest in a much simpler, but modern, Caml? On Wed, Aug 11, 2010 at 9:02 AM, Jon Harrop jonathandeanhar...@googlemail.com wrote: What happens when you do: if Int.(x = 42 || x = 45) then ... else ... Presumably it either barfs on the assumption that || refers to bitwise-or between ints, or we're back to inventing progressively more absurd operator names for each individual combination of types over which they might work. How so? I think this is a borderline case (even in C++, || does not refer to bitwise-or). But even if Int.(||) *were* defined as some sort of integer operation, one could simply write: if Int.(x = 42) || Int.(x = 45) Also, I think the discussion has shifted. For me, the local open is a reasonably appealing way to stop using OCaml's exotic polymorphic operators whose behavior depends on the runtime representation and which don't respect type abstraction. (For example, ocamllex uses Pervasives.(=) to test whether Map's are empty, but this breaks if the Map representation changes.) Moreover the syntax even maintains OCaml compatibility thanks to the recent update. 
But now we seem to be talking about operator overloading, and I'm just not convinced it's necessary at all in a system with a minimalist aesthetic. Back to the local opens, I find that I'm hesitant to add so many of them, especially for equality. Polymorphic equality is hardly unnatural, after all (cf. higher-order logic). I wonder, do any practical languages use quotient types to implement custom equality predicates? In principle, Pervasives.(=) ought to be the finest reasonable equivalence relation on a type, which could then be coarsened: type foo = Foo of int | Goo of string let _ = assert (Foo 3 <> Goo "3") (* duh *) let foo_equiv x y = match x, y with Foo a, Foo b -> a = b | Goo a, Goo b -> a = b | Foo a, Goo b | Goo b, Foo a -> string_of_int a = b type goo = foo / foo_equiv (* automatically creates goo_of_foo *) let _ = assert (goo_of_foo (Foo 3) = goo_of_foo (Goo "3")) This would require runtime support. I envision that every goo is a block whose tag is Quotient_tag and which stores a foo_equiv closure in its first Obj field. As it happens, this approach would dovetail with my plans for an integrated proof assistant. Of course it lacks the conservatism I've been promoting :) -Jeremy ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
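The coarsened equivalence in the example can already be written in plain OCaml today; only the quotient type itself (`type goo = foo / foo_equiv`) is hypothetical syntax needing new runtime support. A runnable sketch of the equivalence part, reconstructed from the garbled code above:

```ocaml
type foo = Foo of int | Goo of string

(* A coarser equivalence than structural equality: Foo n and Goo s are
   identified when s is the decimal rendering of n. *)
let foo_equiv x y =
  match x, y with
  | Foo a, Foo b -> a = b
  | Goo a, Goo b -> a = b
  | Foo a, Goo b | Goo b, Foo a -> string_of_int a = b
```

Structural (=) still distinguishes `Foo 3` from `Goo "3"`; `foo_equiv` identifies them, which is exactly the coarsening a quotient type would make the default.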
RE: [Caml-list] about OcamIL
Xavier Clerc wrote: Jon Harrop jonathandeanhar...@googlemail.com a écrit : I don't think this is heated at all. We were talking about high performance languages and you cited a bunch of languages that get whipped by Python on this benchmark: http://flyingfrogblog.blogspot.com/2009/04/f-vs-ocaml-vs-haskell-hash-table.html Acknowledged. "Whipped" here means 2 times slower on that particular benchmark, while Python is rarely within an order of magnitude of OCaml code (cf. the language shootout). Moreover, hashtables are ubiquitous in Python (and hence probably particularly optimized), while they are not so common in Haskell or Caml. I greatly value efficient generic collections. and references to benchmarks that back up your claims in this thread. http://fsharpnews.blogspot.com/2010/05/java-vs-f.html Point taken. Just notice that the 17x factor is observed on the micro-benchmark, while on the larger one the two platforms seem on par. Sure, but those two benchmarks are testing completely different things. The shootout's knucleotide is a larger benchmark that still uses hash tables and Java is still 4x slower because the JVM cannot express a generic hash table efficiently. There are many such problems where the lack of value types on the JVM is a serious impediment to performance. Here is a question about the micro-benchmark: do you know whether F# monomorphizes the collection in this example? If it turns out to be done, one may probably argue that the problem is more related to the language than to the platform (just recycling an objection made on the page you pointed out). I'm not sure what you mean here. Both of the programs are just using the generic hash table implementation native to their platform. Neither language does anything of relevance to optimize the code.
.NET is a lot faster and uses a lot less memory than the JVM because it stores keys and values unboxed in the spine of the hash table (because they are value types, whereas the JVM boxes them), and because the insertion function is monomorphized at JIT time. Scala on the JVM is 7x slower than C++ on this benchmark: http://shootout.alioth.debian.org/u64q/benchmark.php?test=all&lang=scala&lang2=gpp Agreed, but it seems that if you aggregate the results of the different benchmarks, Scala is on average only 1.5x from C++ (but far away in terms of memory consumption). The 7x factor is the worst observed result, the median being 2x. Sure. This seems to be a difference in our interpretation of high performance. If a language or platform can be 17x slower than others then I would not call it high performance even if it is competitively performant elsewhere. Indeed, I would completely disregard averages on benchmark suites and focus on outliers because they convey more useful information. Granted that was a microbenchmark but the effect is severe enough to afflict many real programs. It may just end up that we have different perceptions of high performance, and of the trade-offs we are going to make in our language / platform choices. Probably. What languages do you not consider to be high performance? I am not sure it is that easy to compare languages, but measuring compiler performances: any compiler that produces code that runs within -let's say- 5x of the fastest one around, on a bunch of wide-spectrum benchmarks (e.g. numerical code *plus* string matching *plus* tree walking, etc). Maybe it should also be mentioned that I am more versed in symbolic computations. Then you're probably more interested in OCaml's GC vs parallelism when performance is important. Regarding trade-offs, I am also inclined to favor Open Source solutions and higher-level languages (the trade-off being here execution time vs programming/debugging time).
I agree in general, of course, but I'm not sure higher-level languages means much in this context. Is C# higher level than Java? Maybe, but I'm interested in the value types and approach to generics here. Monomorphization and unboxed tuples get you a long way towards decent performance without a top-notch GC tuned for functional languages. That makes it feasible to implement with a naïve multicore-capable GC, which is precisely the current direction of HLVM. PS: as an aside, I used the word references for academic publications that went through a reviewing process, not blog entries. I see. I value reproducible results far higher than peer reviewed academic papers, particularly in this context. Cheers, Jon. ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
RE: [Caml-list] about OcamIL
Xavier Clerc wrote: Jon Harrop jonathandeanhar...@googlemail.com a écrit : Xavier Clerc: Le 14 mai 2010 à 12:40, Jon Harrop a écrit : Xavier Clerc wrote: Limiting myself to the JVM... Moreover, at least Scala and Bigloo deliver excellent performances. I have benchmarks where the JVM is well over 10x slower than .NET. So I do not regard any JVM-based language as high performance. Quite ironically, by scratching the surface, one would discover that both quoted projects can also target .NET (not tested that though). Does Bigloo.NET support value types? Does Scala.NET use .NET (2.0) generics? Not AFAICT. Name dropping them in the context of high performance language implementations is more than a little bit silly... First off, public insult seems quite superfluous. I was not trying to insult you. Your examples are silly because they are incomplete and untested. Do you even have either of them working right now? AFAICT, Scala.NET is known not to work and Bigloo.NET is still having dozens of core bugs fixed. We should be able to handle a heated debate without resorting to that. I don't think this is heated at all. We were talking about high performance languages and you cited a bunch of languages that get whipped by Python on this benchmark: http://flyingfrogblog.blogspot.com/2009/04/f-vs-ocaml-vs-haskell-hash-table.html And I still wait for a clear statement of your level for high performance, Within 2x of ANSI C compiled with gcc on all practically-relevant benchmarks without dropping to low-level code, e.g. GHC's FFI in Haskell. and references to benchmarks that back up your claims in this thread. http://fsharpnews.blogspot.com/2010/05/java-vs-f.html As you seem to come from an academic background, I expect facts and references, and not ad hominem attacks and fuzzy unbacked claims. An ad-hominem attack is an attack against a person. I attacked your examples, not you.
Unless you show that neither Bigloo nor Scala meet your (to be defined) criteria for high performance, my counterexamples still stand. Are you talking about Bigloo.NET and Scala.NET or have you gone back to the original discussion about JVM-based languages? Scala on the JVM is 7x slower than C++ on this benchmark: http://shootout.alioth.debian.org/u64q/benchmark.php?test=all&lang=scala&lang2=gpp The JVM's hash table is 17x slower than .NET's on this benchmark: http://fsharpnews.blogspot.com/2010/05/java-vs-f.html I think that is not high performance by any reasonable definition and this reflects fundamental deficiencies in the VM itself, so there is no hope of working around it in general. I have not been able to get Bigloo to run: it was deleted from Debian and Ubuntu (and the shootout) and the source distribution barfs during configuration with ./install-gc-7.1: 39: patch: not found. It may just end up that we have different perceptions of high performance, and of the trade-offs we are going to make in our language / platform choices. Probably. What languages do you not consider to be high performance? Cheers, Jon. ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
RE: [Caml-list] about OcamIL
Xavier Clerc: On 14 May 2010, at 12:40, Jon Harrop wrote: Xavier Clerc wrote: Limiting myself to the JVM... Moreover, at least Scala and Bigloo deliver excellent performances. I have benchmarks where the JVM is well over 10x slower than .NET. So I do not regard any JVM-based language as high performance. Quite ironically, by scratching the surface, one would discover that both quoted projects can also target .NET (not tested that though). Does Bigloo.NET support value types? Does Scala.NET use .NET (2.0) generics? Not AFAICT. Name dropping them in the context of high performance language implementations is more than a little bit silly... Cheers, Jon. ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
RE: [Caml-list] about OcamIL
Goswin wrote: Hardly any business today has an inhomogeneous environment. And if the environment is homogeneous then the vm gives you 0 advantage. It just costs you overhead to emulate. A Common Language Runtime (CLR) is an obvious counter example: the shared VM gives you safe and high-level interoperability between languages. For example, you can pass garbage collected data structures between languages by reference because they share a single GC. Without a CLR, you generally resort to painstakingly copying everything for no reason, give up for anything non-trivial (like closures) and often end up with segmentation faults and no working code. I was once asked how someone might interoperate between Standard ML and OCaml (two very closely related languages) and my answer was XMLRPC. Contrast that with F# development where almost every program in existence relies entirely upon seamless two-way interoperability with C# libraries like WPF. Cheers, Jon. ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
RE: [Caml-list] about OcamIL
Raoul Duke wrote: On Fri, May 14, 2010 at 11:59 AM, ben kuin benk...@gmail.com wrote: but that would be the big benefit of a clr like vm: It doesn't matter how messed up, chaotic or just heterogen the environment is as long as you can count on a regular execution of your portable bytecode. of course it matters: there must be the resources to get the vm ported across all the fubar variations of the ecosystem. the combinatorics has to be dealt with somewhere. that kind of complexity is less in the hegemonic windows os world, i hypothesize. Not really. Windows supports a far wider variety of hardware than Linux and Apple supports even less. Providing consistency was one of the major advantages of .NET that had people building on it originally. If you want robustly-deployable hardware-accelerated GUI apps then WPF and, therefore, .NET is your only choice on Windows and you have zero choices on Linux. If you wanted to build something comparable on Linux you would use OpenGL and must then test all software/hardware combinations to see which were unreliable (OpenGL drivers are often very unreliable on Linux). I saw this first hand when I productized OpenGL-based presentation software written in OCaml. It was a catastrophe: with segfaults on 80% of customers machines that we could not reproduce. We canned it and never tried to sell Linux-based software again. Cheers, Jon. ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
RE: [Caml-list] about OcamIL
Xavier Clerc wrote: Limiting myself to the JVM... Moreover, at least Scala and Bigloo deliver excellent performances. I have benchmarks where the JVM is well over 10x slower than .NET. So I do not regard any JVM-based language as high performance. Cheers, Jon. ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
RE: [Caml-list] about OcamIL
Ben Kuin wrote: technical: - in .NET everything is easy (from the surface): you have your source file (hello.cs), you take your compiler (cs.exe) and compile it to a msil bytecode file (hello.dll). You can run reflection tools on hello.dll or link it to an exe or generate back to source. This bytecode file is your friend. You can run it on a bunch of runtimes like .net clr, or on mono, or rotor, or dotgnu. You can register libraries in a container to prevent versioning problems with future releases. I couldn't figure out those equivalents in hlvm or llvm. They haven't been written yet. :-) - with HLVM things are complex (for me): What is the role of the big underlying llvm infrastructure? LLVM provides the compilation to native code, both static and JIT. Why do we even need hlvm if we have llvm plus ocaml bindings? LLVM's intermediate representation is a low-level assembly language with no memory safety and no automatic memory management. HLVM is a layer on top of LLVM that provides a higher-level intermediate language that is memory safe and garbage collected. Does hlvm have its own bytecode? If yes, where is it specified? HLVM's IR is evolving much too quickly to be worth specifying yet. Is hlvm an ocaml library or is it a free standing vm? Today, HLVM is an OCaml library. In the future, HLVM will be a free standing VM. Maybe for you these are trivial questions, but I still don't get it, while with .net I never had a problem getting the idea. Ultimately, HLVM should be just as easy to understand. The only reason it appears complicated today is that HLVM is a set of evolving fragments that are gradually stabilizing and being pulled together to form a progressively more capable core VM. What if hlvm would really take off, could they set it free and move the homepage to sourceforge? HLVM is free. I would always keep the core (VM and standard library) open source. I might write an HLVM for Scientists book, sell libraries or implement features for a fee but nothing evil. 
:-) In fact, one dream I have is to create a commerce-friendly platform for tools and libraries. I think Linux has really suffered from being anti-commerce rather than truly free. Cheers, Jon. ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
RE: [Caml-list] about OcamIL
Ben Kuin wrote: A little off topic, but how is Mono/Unix these days? Still leaks memory, you refer to your examinations? (http://flyingfrogblog.blogspot.com/2009/01/mono-22-still-leaks-memory.html?showComment=1233522107493#c7872630239059031867) where you say yes and the mono devs say no to memory leaking? Yes. has broken TCO Again, I think other people do not have the same opinion on this (http://flyingfrogblog.blogspot.com/2009/01/mono-does-not-support-tail-calls.html) Yes. They are wrong. I've introduced the post with some license related concerns, maybe I should take a step back and think about what I want: 1. - programming in a ML like language (especially the caml family since I really like those lightweight type definitions and the pattern matching that can be applied on those) 2. - high performance runtime, preferably a jit based vm, no problems with TCO 3. - a true open source license (approved by Open Source Initiative or by Free Software Foundation) I think these 3 points are REASONABLE but the combination of those 3 items is NONEXISTENT. I'm afraid the situation is far worse. Even if you relax your conditions from ML-like to any functional language and even allow broken TCO, there are not only no language implementations satisfying those weaker conditions but nobody is even trying to create such a language implementation. Ocaml on HLVM: I would appreciate it if Jon could make a clear statement whether this is something serious or just a little toy. HLVM is not yet ready for serious use and it may well never compile OCaml but at least it is now compiling code like this: http://flyingfrogblog.blogspot.com/2010/04/variant-types-and-pattern-matching-in.html A last idea: What do you think about libjit? They claim that a jvm/clr like runtime could be built in weeks. Wouldn't it be nice to have a fast vm for Ocaml (ocamljit)? Does someone have experience with this? I think writing a fast vm is hard, but a fast vm for a functional language is nearly impossible? 
Maybe OcamIL could then be used as a model for a jit backend, since its MSIL output already runs on libjit (DotGnu, alias pnet) I think LLVM is a much better choice than libjit. Once you've got that kind of solid foundation to build upon and a decent language like OCaml to write in, you can write a decent FPL implementation in a few man-months. The problem is finding the time... Cheers, Jon. ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
RE: [Caml-list] about OcamIL
A little off topic, but how is Mono/Unix these days? Last I checked (2 years ago) it implemented the basic libraries and runtimes but had terrible performance. Is it now on par with Windows? Still leaks memory, has broken TCO and runs like a dog. Mono has also fallen even farther behind now that .NET 4 is out. However, they have at least stated that they intend to start trying to support F# on Mono. Then again, they stated about 6 years ago they were going to replace their crappy conservative GC with one that might actually work but they never managed to do that either... Cheers, Jon. ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: WAS Re: [Caml-list] Re: The need to specify 'rec' in a recursive function defintion
On Tuesday 16 February 2010 16:47:03 Grant Rettke wrote: On Tue, Feb 16, 2010 at 10:21 AM, Ashish Agarwal agarwal1...@gmail.com wrote: let rec Do OCaml'er look at let rec more as being a message to the programmer, rather than the compiler, that the way I want to define this function is recursively so even if 'f' was previously bound you know which one I mean? I see it as resolving an ambiguity for both the programmer and compiler. There are alternatives as others have mentioned but none seem particularly good or bad to me. Moreover, the burden of rec is tiny so I don't think it is worth discussing in such detail. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] Re: The need to specify 'rec' in a recursive function defintion
On Monday 15 February 2010 15:46:58 Stefan Monnier wrote: Till Varoquaux had written: Let's make things clear here: the rec *really* is a feature; Nobody said otherwise. Eliminating the rec is also a feature. Those two features are mostly incompatible, and many reasonable people disagree on which one of the two is more important. Stefan who extensively used that feature in SML, but happens to prefer the other feature nevertheless Standard ML doesn't have the feature that Till described. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
[Caml-list] deriving show
IIRC, there is an OCaml macro that will autogenerate the code to print values of a variant type from the type definition. Does it handle polymorphic variants? -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] An extended comparative study of language support for generic programming
On Wednesday 10 February 2010 23:00:44 Raoul Duke wrote: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.110.122&rep=rep1&type=pdf :-) http://groups.google.com/group/comp.lang.functional/browse_thread/thread/d654fedd6efdf753/3ee82770d5e79402#3ee82770d5e79402 See my response there: http://groups.google.com/group/comp.lang.functional/msg/2cb15a6281087b04 :-) I was wondering if anyone here was familiar with this work and/or had anything to say about their OCaml solutions and discussion? -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] llvm?
On Wednesday 10 February 2010 22:51:33 Raoul Duke wrote: hi, any news about / anybody working on ocaml-on-llvm? I don't believe anyone is working on porting OCaml to LLVM. The nearest work is probably my own HLVM project which reached a major milestone recently and is now capable of high-performance parallel programming: http://flyingfrogblog.blogspot.com/2010/01/naive-parallelism-with-hlvm.html From the point of view of an OCaml programmer, HLVM is a DSL that drops various features from OCaml (e.g. polymorphic recursion) in order to provide easily-obtained, predictable and high performance. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] Re: The need to specify 'rec' in a recursive function defintion
On Wednesday 10 February 2010 22:25:44 Till Varoquaux wrote: Some (including me) would even argue that it is sad that type definitions don't use rec. Agreed. Less useful than rec on function definitions but that would still be useful sometimes. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] The need to specify 'rec' in a recursive function defintion
On Tuesday 09 February 2010 22:01:03 Gerd Stolpmann wrote: In the ML community it is consensus that a recursive function is a total different thing than a non-recursive function... Note that Standard ML does this differently to CAML/OCaml. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] The need to specify 'rec' in a recursive function defintion
On Tuesday 09 February 2010 20:50:33 Saptarshi Guha wrote: Hello, I was wondering why recursive functions need to be specified with rec. According to Practical Ocaml, to inform the compiler that the function exists. But when entering the function definition, can't the compiler note that the function is being defined so that when it sees the function calling itself, it won't say Unbound value f? How is the knowledge of a function being rec taken advantage of (in ocaml) as opposed to other languages (leaving aside tail call optimization)? Wouldn't one way of detecting a recursive function be to see if the function indeed calls itself? let f x = x + 1 let f x = 2 * f x Is the latter f recursive or not? See my answer to the same question on stack overflow: http://stackoverflow.com/questions/900585/why-are-functions-in-ocaml-f-not-recursive-by-default/1891573 -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
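The distinction in the two-line example above can be seen in a complete (if contrived) program: without rec, the f in the body refers to the previous binding, so the second definition is shadowing, not recursion.

```ocaml
(* Without [rec], the [f] in the body refers to the earlier binding,
   so the second [f] shadows the first rather than recursing. *)
let f x = x + 1
let f x = 2 * f x        (* computes 2 * (x + 1) *)
let () = assert (f 3 = 8)
```

With `let rec f x = 2 * f x` the same body would instead loop forever, which is exactly the ambiguity the rec keyword resolves.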
Re: [Caml-list] The need to specify 'rec' in a recursive function defintion
On Tuesday 09 February 2010 22:31:49 Saptarshi Guha wrote: Yes, I see that f isn't recursive, because it simplifies down to 2*(x+1) but for a reader (at least myself) it can be a bit tricky to consume. But for a writer it can be useful to produce. For example, imagine a function in an interpreter that evaluates an expression to a value in the context of a set of variable bindings: let rec eval vars = function ... | Add(f, g) -> eval vars f + eval vars g | Mul(f, g) -> eval vars f * eval vars g ... Such functions make a *lot* of recursive calls. Now imagine you want to inject some code around every external call into that eval function but not recursive calls, e.g. logging exceptional returns. Do you want to change the name of that eval function not only in its definition but at every recursive call in its body? No. Fortunately, in OCaml you can just append a definition to supersede eval without touching the original definition at all: let eval vars expr = ... let result = eval vars expr in ... result This is a useful and common OCaml idiom. My experience of reading the /definition/ of a function which includes a call to itself is that it is recursive. Definitions are often superseded in ML. On the stackoverflow post, you mentioned that the British ML branch forces different names (since recursion is by default), and though it does pollute the namespace, personally I find it easier to read. You can do that in OCaml as well, if you choose. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
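To make the interpreter idiom above self-contained, here is a minimal sketch; the Num and Var cases and the association-list environment are assumptions added for illustration, not from the original message.

```ocaml
(* A minimal expression interpreter, sketching the idiom above. *)
type expr =
  | Num of int
  | Var of string
  | Add of expr * expr
  | Mul of expr * expr

let rec eval vars = function
  | Num n -> n
  | Var x -> List.assoc x vars
  | Add (f, g) -> eval vars f + eval vars g
  | Mul (f, g) -> eval vars f * eval vars g

(* Superseding definition: wraps every *external* call without
   renaming the many recursive calls inside the original [eval]. *)
let eval vars expr =
  (* logging of arguments or exceptional returns would go here *)
  let result = eval vars expr in
  result

let () = assert (eval [ "x", 3 ] (Add (Var "x", Mul (Num 2, Num 4))) = 11)
```

The non-rec wrapper sees the recursive eval because shadowing only takes effect after the whole definition, which is precisely the behavior the rec rule buys.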
[Caml-list] Power serious
I stumbled upon the following article that describes a remarkably simple implementation of arithmetic over power series in Haskell: http://www.cs.dartmouth.edu/~doug/powser.html This is the only compelling example of Haskell I have ever seen and I'd like to see this rewritten in other languages for comparison. Has anyone tried to translate this into OCaml? -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
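For the curious, a hedged sketch of what the first operations of McIlroy's power-series code (sum and product) might look like in OCaml using lazy streams; the names and representation here are my own, not taken from the article.

```ocaml
(* Power series as infinite lazy streams of coefficients. *)
type series = Cons of float * series Lazy.t

let rec add (Cons (f, fs)) (Cons (g, gs)) =
  Cons (f +. g, lazy (add (Lazy.force fs) (Lazy.force gs)))

let rec scale c (Cons (f, fs)) =
  Cons (c *. f, lazy (scale c (Lazy.force fs)))

(* (f0 + x F)(g0 + x G) = f0*g0 + x (f0*G + F*(g0 + x G)) *)
let rec mul (Cons (f, fs)) (Cons (g, gs) as gser) =
  Cons (f *. g,
        lazy (add (scale f (Lazy.force gs))
                  (mul (Lazy.force fs) gser)))

let rec take n (Cons (x, xs)) =
  if n = 0 then [] else x :: take (n - 1) (Lazy.force xs)

(* The geometric series 1/(1 - x) = 1 + x + x^2 + ... ; its square
   has coefficients 1, 2, 3, 4, ... *)
let rec ones = Cons (1., lazy ones)
let () = assert (take 4 (mul ones ones) = [ 1.; 2.; 3.; 4. ])
```

Haskell's pervasive laziness makes the original shorter; in OCaml the forcing is explicit, which is most of the translation cost.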
Re: [Caml-list] memory profiling
On Tuesday 05 May 2009 13:45:18 dmitry grebeniuk wrote: 2009/5/5 Christoph Bauer christoph.ba...@lmsintl.com: what is the best option to do memory profiling with ocaml? Is there a patch of ocaml-memprof for 3.11.0? What about objsize? If you want to use objsize with ocaml 3.11, you should get the new version of objsize -- 0.12: http://forge.ocamlcore.org/frs/?group_id=3 OCaml has new heap since 3.11... Can anyone elaborate on this? -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
[Caml-list] HLVM ray tracer performance
I just published results for the ray tracer benchmark written in HLVM and compared to other languages including OCaml: http://flyingfrogblog.blogspot.com/2010/01/hlvm-on-ray-tracer-language-comparison.html Note that these results were obtained with HLVM's multicore garbage collector enabled. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] ocaml as editor extension language
On Tuesday 05 January 2010 07:24:42 Joel Reymont wrote: You cannot embed OCaml and use it as an editor extension language unless 1) your editor is open source, or I believe there are several issues surrounding the QPL license that much of the OCaml distribution is under: 1. You do not have the right to distribute modified source code freely and must distribute modifications independently (e.g. as patches only). 2. The initial developer is given rights to your derivative work. 3. The QPL is not GPL compatible so your derivative work cannot use GPL libraries. See: http://www.gnu.org/licenses/license-list.html Note that OCaml's tools are under the QPL, including ocamllex and ocamlyacc. So you cannot even use that code to write a syntax highlighter for your editor without being subject to these requirements. 2) you are a member of the consortium and pay 2K EUR/year Is that correct? You could always ask nicely if they mind you violating their license. :-) -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] Favorite OCaml editor?
On Tuesday 05 January 2010 08:45:04 Maxence Guesdon wrote: My favorite editor is Chamo: http://home.gna.org/cameleon/chamo.en.html Nice! Why do you prefer it? -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] Favorite OCaml editor?
On Tuesday 05 January 2010 07:31:45 Daniel Bünzli wrote: Your favorite is key here; I appreciate your human input as I can use a search engine to find any old OCaml editor easily. Then I think a more interesting question is, what features do you absolutely need to be productive? I'm rather low tech and not the power user type but still I couldn't do it without (keyboard access to): 1) Syntax highlighting and reasonably automatic indentation following ocaml's programming guidelines [1] Yes. 2) Ability to invoke a build tool so that reported errors allow me to automatically jump to the offending lines. Yes but I'd rather have an IDE constantly recompiling and automatically flagging errors such that I can jump directly to them using the GUI. 3) Ability to invoke built programs so that reported stack traces allow me to automatically jump to the offending lines. I don't use that so often, but yes. 4) Ability to read annot files so that I can query the type of the symbol under my cursor. Absolutely essential. 5) Ability to switch rapidly between an ml file and its corresponding mli. Interesting. 6) Ability to edit C sources. Bah. Real men use LLVM. I guess many people would add 7) Ability to access the documentation of the symbol under my cursor. That should go in with the type throwback. Also, it should support typeset math and vector graphics. And the source should be unicode with easy access to common alphabets and symbols. Regarding 7) I have a low tech approach which is to use gnome do (on linux) or quicksilver (on osx) to index the documentation generated by ocamldoc. Since the latter intelligently produces an html file Module.html for a module named Module I can quickly access its documentation by invoking gnome do with its hot key, typing an abbreviation of Module and hitting return. This opens the document in my browser where I scroll or search in the page to get to the symbol. I tend to use ocamlbrowser. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. 
http://www.ffconsultancy.com/?e ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] Favorite OCaml editor?
On Tuesday 05 January 2010 06:03:39 Grant Rettke wrote: Hi, What is your favorite editor for hacking with OCaml? I use Emacs+Tuareg but I wouldn't call it my favorite because I loathe it. Two good reasons to use it are autoindentation of OCaml code with ALT+Q and type throwback on the currently selected expression with CTRL+C CTRL+T. OCaIDE looked good for something Eclipse-based when I tried it: http://www.algo-prog.info/ocaide/ Eclipse is grindingly slow, cumbersome and remarkably confusing for a GUI app but at least OCaIDE is still under active development. There is also the OCaml Development Tools (ODT) project: http://ocamldt.free.fr/ but it hasn't seen an update in 11 months. Your favorite is key here; I appreciate your human input as I can use a search engine to find any old OCaml editor easily. I think the best way to write a decent editor for OCaml would be to write one using LablGTK for the GUI and camlp4 to parse OCaml code. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] Re: general question, was Re: OCaml is broken
On Sunday 03 January 2010 10:49:38 Sylvain Le Gall wrote: The only point of the whole discussion -- which is a recurring point by some of those who participate -- is the lack of shared-memory parallelism in the core language. I solved the problem: the latest version of HLVM now facilitates high-performance shared-memory parallelism. The remaining challenges to making this more user friendly are: 1. High-level constructs for parallelism in HLVM (task queues). 2. OCaml-HLVM interop, probably by destructuring values passed from the OCaml world so that HLVM programs can use them directly and return results by mutating values on the OCaml side. 3. Camlp4 macros so users can write their HLVM code in an OCaml-like DSL. I believe this is basically an optimal solution for OCaml's multicore problem given the practical constraints. The future's looking bright again. :-) -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
[Caml-list] Prevalence of duplicate references in the heap
I took an odd design decision with HLVM and made references a struct of run-time type, metadata (e.g. array length), pointer to mark state and pointer to data. So every reference consumes 4x32=128 bits rather than the usual 32 bits but heap-allocated values no longer require a header. My performance results really surprised me. For example, the gc benchmark in the HLVM test suite fills a hash table that is represented by an array spine containing references to array buckets. Despite having (fat) references everywhere, HLVM is 2.2x faster than OCaml on x86. The main disadvantage of HLVM's approach is probably that every duplicate reference now duplicates the header information, wasting 96 bits. However, I do not believe references are duplicated in the heap very often. Both trees and hash tables contain many references but none are duplicated. So I'm wondering if anyone has studied the makeup of heaps in idiomatic OCaml code and could point me to data on the proportion of the heap typically consumed by duplicate references, i.e. how much space is HLVM likely to waste? Many thanks, -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
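For concreteness, the fat-reference layout described above can be modelled as a record; the field names here are my own guesses for illustration, not HLVM's actual names.

```ocaml
(* Sketch of a fat reference: run-time type, metadata (e.g. array
   length), pointer to mark state, and pointer to the data. *)
type 'a fat_ref = {
  rtti : int;             (* run-time type tag *)
  metadata : int;         (* e.g. array length *)
  mark_state : bool ref;  (* stands in for the pointer to mark state *)
  data : 'a;              (* stands in for the pointer to the data *)
}

(* Space accounting in 32-bit words: a fat reference is 4 words where
   a conventional pointer is 1, so each duplicate of a reference
   carries 3 extra words (96 bits) of header. *)
let words_per_fat_ref = 4
let wasted_bits_per_duplicate = 3 * 32

let r = { rtti = 1; metadata = 3; mark_state = ref false; data = [| 1; 2; 3 |] }
let () = assert (Array.length r.data = r.metadata)
```

The upside, as the benchmark suggests, is that the heap value itself needs no header, so allocation and scanning touch less memory when references are rarely duplicated.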
Re: [Caml-list] ocaml, llvm and generating code at runtime
On Friday 01 January 2010 17:39:15 Basile STARYNKEVITCH wrote: LLVM is rumored to be a bit faster, but is also rumored to be slow as a pure JIT (just in time) code generator (w.r.t. other non-Ocaml implementations - eg SBCL or CLISP common lisp). Are you saying that LLVM's JIT is slow to generate code or that the code it generates runs slow? Polyml http://polyml.org/ is also supposed to be a JIT-ed SML implementation (it is SML not Ocaml). A few years ago, metaocaml existed, but seems dead today. Walid said that MetaOCaml was going to be revived back in August 2008: http://ocamlnews.blogspot.com/2008/07/metaprogramming-with-metaocaml.html?showComment=121761762#c4695232398067055037 Happy new year to everyone! Happy New Year! -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] ocaml, llvm and generating code at runtime
On Friday 01 January 2010 16:45:34 Joel Reymont wrote: Does anybody have example code that shows how to generate OCaml bindings at runtime with LLVM? The OCaml Journal article Calling C code from OCaml (10th July 2008) described an OCaml program that invoked a C function both via a stub written in C and via JIT compiled interface code using LLVM (no more stinking C stubs!). I would like to write more on JIT-compiled interface code in the future and perhaps write a library to make it easy. My goal is to compile an AST into code that uses OCaml functions within the same binary that's doing the compiling. I don't think it can be done with OCaml since it requires a standalone assembler, linker, etc. Correct me if I'm wrong, though. Mine is a web-based compiler with potentially many concurrent sessions. Running gas, ld, etc. seems a much heavier and less scalable approach that generating code at runtime. Sounds like you really want HLVM. :-) Thanks and Happy New Year, Joel Happy New Year! -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] ocaml, llvm and generating code at runtime
On Friday 01 January 2010 20:23:35 Basile STARYNKEVITCH wrote: I heard that LLVM code generation time is significantly higher (i.e. slower) than other JIT technologies. So machine code generation time is apparently significant (which might be an issue inside a web server) but performance of the generated code is supposedly good (inside a web server this is important only if the generated code runs a lot, in particular more than in a single session). I don't have enough personal experience to validate that claim. However, both the Mono and Parrot sites are saying something similar: http://www.mono-project.com/Mono_LLVM http://trac.parrot.org/parrot/wiki/JITRewrite http://cliffhacks.blogspot.com/2007/03/experimenting-with-llvm.html But again, I may be wrong. Only real benchmarks on real applications can tell. I believe that libjit and GNU lightning should probably both generate machine code quicker than LLVM does, but the performance of the generated code (by libjit or by lightning) is worse than when using LLVM. And some benchmarks on http://www.phoronix.com/scan.php?page=article&item=apple_llvm_gcc&num=1 suggest that LLVM generated machine code is less efficient than GCC generated machine code. Again, take all this with a grain of salt... That's a fair assessment but LLVM is only about 2x slower than the fastest compilers and it generates code that runs 2-10x faster. For example, compiling the 155kLOC of LLVM IR generated by HLVM's test suite takes 9.65s. I think it was a big mistake for the Go developers at Google and the Mono developers at Novell to not build upon LLVM. My main concern about other JITs is maturity: AFAIK LLVM has orders of magnitude more users than all of the others combined. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e
Re: [Caml-list] ocaml, llvm and generating code at runtime
On Friday 01 January 2010 22:33:42 Yoann Padioleau wrote: On Jan 1, 2010, at 2:29 PM, Jon Harrop wrote: I think it was a big mistake for the Go developers at Google and the Mono developers at Novell to not build upon LLVM. My main concern about other JITs is maturity: AFAIK LLVM has orders of magnitude more users than all of the others combined. And ? That maturity gives me confidence in LLVM's reliability and on-going development. C has many orders of magnitude more users than ocaml ... Indeed. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e
Re: [Caml-list] Re: multicore wish
On Thursday 24 December 2009 12:58:18 Goswin von Brederlow wrote: Jon Harrop j...@ffconsultancy.com writes: On Tuesday 22 December 2009 18:02:32 Edgar Friendly wrote: On 12/22/2009 01:12 PM, Jon Harrop wrote: On Tuesday 22 December 2009 13:09:27 Goswin von Brederlow wrote: The advantage with ocaml though is that you never have pointers into a structure. Makes things a lot simpler for the GC and avoids large overheads in memory. I don't understand what you mean by OCaml never has pointers into a structure. Half the problem with OCaml is that OCaml almost always uses pointers and the programmer has no choice, e.g. for complex numbers. I think he means that ocaml structs (records, tuples) will only ever have pointers pointing to their beginning - you can't have a pointer to somewhere in the middle of a structure. Exactly. If so then I do not understand the relevance. You cannot have pointers into a structure in F# or HLVM either... If you have an array a of (int * int) then in ocaml a is an array of pointers to tuples. a.(5) is a pointer to the 6th tuple. Yes. You said that in F# the array will be really an array of tuples. Then a.(5) will be a pointer into the array at the position where the 6th tuple is. No. The expression a.(5) loads all fields of the value type. For example, if it were an array of complex numbers then a.(5) would load the real and imaginary parts. That means as long as a.(5) is reachable the full array has to remain allocated or the GC has to recognise that only one (a few) items of an array are reachable and copy them before freeing the array. The GC also needs a way to find the beginning of an allocated block from a pointer into the block. Which means extra overhead in both memory and time. Another thing is that tuples are immutable but arrays are mutable. 
In ocaml you get this nice behaviour: # let a = Array.init 5 (fun x -> (x,x));; val a : (int * int) array = [|(0, 0); (1, 1); (2, 2); (3, 3); (4, 4)|] # let x = a.(1);; val x : int * int = (1, 1) # a.(1) <- (2,2);; - : unit = () # x;; - : int * int = (1, 1) In F# you would get (2,2) for your fast sortable array. A different semantic. No. Even if F# represented tuples as structs (which it does not, unfortunately) you would still get the same result as you do in OCaml. HLVM already implements tuples as structs and your example works exactly as the OCaml one does: # let xs = create(6, (1, 1)) in let x = xs.(1) in xs.(1) <- (2, 2); x;; - : `Struct[`Int; `Int] = (1, 1) The only difference is that HLVM unboxes the pairs in the array so the heap contains only a single pointer and accessing a pair requires only one indirection. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e
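The boxed-tuple behaviour described above can be checked mechanically in plain OCaml: the array slot holds a pointer to the pair, so reading the slot shares the pair and overwriting the slot never mutates it. A minimal sketch (standard library only):

```ocaml
let () =
  (* An (int * int) array in OCaml is an array of pointers to boxed pairs. *)
  let a = Array.init 5 (fun x -> (x, x)) in
  (* Reading a slot copies the pointer, so x and a.(1) share one pair... *)
  let x = a.(1) in
  assert (x == a.(1));
  (* ...and overwriting the slot installs a pointer to a fresh pair, *)
  a.(1) <- (2, 2);
  (* leaving the pair that x refers to untouched. *)
  assert (x = (1, 1));
  assert (a.(1) = (2, 2))
```

The physical-equality check (==) is what distinguishes the boxed representation: an unboxing implementation would copy the fields on every read rather than share a pointer, yet the observable structural results above would be the same, which is the point being made.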
Re: [Caml-list] Re: multicore wish
On Thursday 24 December 2009 13:19:52 Goswin von Brederlow wrote: Jon Harrop j...@ffconsultancy.com writes: No, in OCaml I fork every child. That is the only transparent way to give the child a coherent view of the heap but it is extremely slow (~1ms): So if you add a (sleep 60) to the ocaml code then ocaml gets even worse than F#? You are comparing an implementation that is specifically optimized to use things F# is good at and ocaml is bad at. To give a fair comparison you need to optimize the implementation to the language. Post a better OCaml solution if you can. Each process then has a work queue which it works through. So they will always use the local data. Only when a queue runs dry they steal from another process and ruin locality. You are correctly describing the efficient solution based upon work-stealing queues that F# uses but OCaml cannot express it. You mean that you didn't implement it that way. No, I mean OCaml cannot express it. I can easily express that with closures and pre-forked worker threads. OCaml's threads do not run in parallel so that will not work. So I don't see where your argument fits. You are not creating children on the fly. Only at the start and they run till all the work is done. At least in this example. No, every recursive invocation of the parallel quicksort spawns another child on-the-fly. That's precisely why it parallelizes so efficiently when you have wait-free work-stealing task deques and a shared heap. In general, you rewrite algorithms into this recursive divide and conquer form and parallelize when possible. You can parallelize a *lot* of problems efficiently that way. That seems awfully inefficient. Then you have a million children running on maybe 4 cores and each job queue holds no job. I think we mean different things by children. By children I mean what the kernel sees as children. Newly forked threads. I think you mean jobs that get put into the queues. 
There is no reason to fork a new system thread every time the parallel quicksort splits an array in two. The children are lightweight tasks, not threads or processes. Why would you fork in invoke? Fork is currently the only transparent way to implement invoke but it is extremely slow and unreliable. No, it is the insane way. Use closures. Please demonstrate. parallel programs that leverage multicores and run a *lot* faster than anything that can be written in OCaml. You can even make this accessible to OCaml programmers as a DSL with automatic interop to make high performance parallel programming as easy as possible from OCaml. Or just implement it properly in ocaml. The problem part is the GC. The rest is easy. No kidding. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e
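For concreteness, the fork-based invoke under discussion can be sketched in a few lines of OCaml. This is a sketch assuming the unix library; the pipe-plus-Marshal protocol and the variant tags are illustrative, not the exact code from the benchmarks. The per-call fork and marshalling round-trip is what makes each invocation cost on the order of a millisecond:

```ocaml
(* Fork-based invoke: run f x in a child process and return a closure
   that blocks until the child's marshalled result arrives on a pipe.
   If fork fails, fall back to sequential evaluation. *)
let invoke (f : 'a -> 'b) (x : 'a) : unit -> 'b =
  let input, output = Unix.pipe () in
  match Unix.fork () with
  | -1 ->
      (* Fork failed: degrade gracefully to running f in this process. *)
      Unix.close input; Unix.close output;
      let v = f x in
      (fun () -> v)
  | 0 ->
      (* Child: compute, marshal the result (or exception) back, exit. *)
      Unix.close input;
      let oc = Unix.out_channel_of_descr output in
      Marshal.to_channel oc (try `Res (f x) with e -> `Exn e) [];
      close_out oc;
      exit 0
  | pid ->
      (* Parent: defer the read until the returned closure is applied. *)
      Unix.close output;
      let ic = Unix.in_channel_of_descr input in
      fun () ->
        let v = Marshal.from_channel ic in
        ignore (Unix.waitpid [] pid);
        close_in ic;
        match v with `Res r -> r | `Exn e -> raise e
```

Usage is `let g = invoke (fun n -> n * n) 7 in ... g ()`. Note the caveats the thread raises: the child gets a snapshot of the heap (no shared mutation), marshalled exceptions lose their identity across the process boundary, and processes are a scarce enough resource that the fallback branch matters.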
Re: [Caml-list] Re: multicore wish
On Tuesday 22 December 2009 13:09:27 Goswin von Brederlow wrote: Jon Harrop j...@ffconsultancy.com writes: 1. The array a is just an ordinary array of any type of values on the shared heap in F# but, for generality in OCaml, this must be both the underlying ordinary data and a manually-managed shared big array of indices into the ordinary data where the indices get sorted while the original data remain in place until they are permuted at the end. Unless you have a primitive type that isn't a pointer. In OCaml, you would need to write a custom quicksort optimized for that particular type. In F#, the generic version just works and works efficiently. The advantage with ocaml though is that you never have pointers into a structure. Makes things a lot simpler for the GC and avoids large overheads in memory. I don't understand what you mean by OCaml never has pointers into a structure. Half the problem with OCaml is that OCaml almost always uses pointers and the programmer has no choice, e.g. for complex numbers. 2. The sorted subarrays are contiguous in memory and, at some subdivision, will fit into L2 cache. So F# offers optimal locality. In contrast, there is no locality whatsoever in the OCaml code and most accesses into the unsorted original array will incur cache misses right up to main memory. So the OCaml approach does not scale as well and will never see superlinear speedup because it cannot be cache friendly. On the other hand swapping two elements in the array has a constant cost no matter what size they have. At some size there will be a break even point where copying the data costs more than the cache misses and with increasing size the cache won't help F# so much either. In theory, yes. In practice, that threshold is far larger than any value type that a real program would use so it is of no practical concern. Moreover, F# gives the programmer control over whether data are unboxed (value types) or boxed (reference types) anyway. 
In contrast, OCaml is tied to a few value types that are hard-coded into the GC. 3. Child tasks are likely to be executed on the same core as their parent and use a subset of their parent's data in F#, so they offer the best possible locality. In contrast, child processes are likely to be executed on another core in OCaml and offer the worst possible locality. But, if I understood you right, you first fork one process per core. No, in OCaml I fork every child. That is the only transparent way to give the child a coherent view of the heap but it is extremely slow (~1ms): F# can do 60MFLOPS of computation in the time it takes OCaml to fork a single process - http://caml.inria.fr/pub/ml-archives/caml-list/2009/06/542b8bed77022b4a4824de2da5b7f714.en.html Those should then also each pin themselves to one core. You have no idea which core a forked child process will run on. Each process then has a work queue which it works through. So they will always use the local data. Only when a queue runs dry they steal from another process and ruin locality. You are correctly describing the efficient solution based upon work-stealing queues that F# uses but OCaml cannot express it. So I don't see where your argument fits. You are not creating children on the fly. Only at the start and they run till all the work is done. At least in this example. No, every recursive invocation of the parallel quicksort spawns another child on-the-fly. That's precisely why it parallelizes so efficiently when you have wait-free work-stealing task deques and a shared heap. In general, you rewrite algorithms into this recursive divide and conquer form and parallelize when possible. You can parallelize a *lot* of problems efficiently that way. 4. Task deques can handle an arbitrary number of tasks limited only by memory whereas processes are a scarce resource and forking is likely to fail, whereupon the invoke combinator will simply execute sequentially. 
So it is much easier to write reliable and performant code in F# than OCaml. Why would you fork in invoke? Fork is currently the only transparent way to implement invoke but it is extremely slow and unreliable. 5. OCaml's fork-based invoke combinator is many orders of magnitude slower than pushing a closure onto a concurrent task deque in F#. 6. The threshold value is a magic number derived from measurements on a given machine in my OCaml code but is dynamically adjusted in a machine-independent way by the invoke combinator in my F# code using atomic operations and real time profiling of the proportion of time spent spawning tasks vs doing actual work. 5+6 seem to be an implementation detail of some specific implementation you are talking about. Yes. I'm talking about today's OCaml and F# implementations. I don't see anything in the theory that would require that. If by in theory you mean that a new performant concurrent GC for OCaml would solve
Re: [Caml-list] Re: multicore wish
On Tuesday 22 December 2009 18:02:32 Edgar Friendly wrote: On 12/22/2009 01:12 PM, Jon Harrop wrote: On Tuesday 22 December 2009 13:09:27 Goswin von Brederlow wrote: The advantage with ocaml though is that you never have pointers into a structure. Makes things a lot simpler for the GC and avoids large overheads in memory. I don't understand what you mean by OCaml never has pointers into a structure. Half the problem with OCaml is that OCaml almost always uses pointers and the programmer has no choice, e.g. for complex numbers. I think he means that ocaml structs (records, tuples) will only ever have pointers pointing to their beginning - you can't have a pointer to somewhere in the middle of a structure. If so then I do not understand the relevance. You cannot have pointers into a structure in F# or HLVM either... -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e
Re: [Caml-list] Looking for information regarding use of OCaml in scientific computing and simulation
On Tuesday 22 December 2009 13:11:58 Eray Ozkural wrote: On Tue, Dec 22, 2009 at 6:40 AM, Linas Vepstas linasveps...@gmail.com wrote: However, if you are interested in merely using the system to do your real work, then writing message-passing code is an utter waste of time -- it's difficult, time-consuming, error prone, hard to balance and optimize/tune, works well only for embarrassingly parallel code, etc. Even the evil slow-down of NUMA is often better than trying to performance-tune a message-passing system. Message passing doesn't work well only for embarrassingly parallel code. Message passing doesn't necessarily work well for embarrassingly-parallel problems either because you cannot use in-place algorithms and scatter and gather are O(n). For instance, you can implement the aforementioned parallel quicksort rather easily, But you cannot improve performance easily and performance is the *only* motivation for parallelism. So the fact that you can make naive use of message passing easily from OCaml is useless in practice. What message passing really is, it is the perfect match to a distributed memory architecture. Since, as you suggest, multicore chips have more or less a shared memory architecture, message passing is indeed not a good match. Yes. Conversely, shared memory is effectively a hardware accelerated form of message passing. Let me put it this way: suggesting that programmers can write their own message-passing system is kind of like telling them that they can write their own garbage-collection system, or design their own closures, or they can go create their own type system. Of course they can ... and if they wanted to do that, they would be programming in C or assembly, and would probably be designing new languages. Cause by the time you get done with message passing, you've created a significant and rich programming system that resembles a poorly-designed language... been there, done that. 
For a functional language, am I right in expecting a high-level and clean interface for explicit parallelism? I think that is a perfectly reasonable thing to expect but you still need to understand its characteristics and how to leverage them in order to make good use of the feature. I suppose a spawn directive would not be very hard to implement. You cannot implement it with useful efficiency in OCaml. Message Passing/Distributed Memory can also be accommodated I suppose. Sure but it is worth remembering that distributed parallelism across clusters is a tiny niche compared to multicores. OcamlP3l looks pretty cool. Parallel combinators? Definitely what I'm talking about, as usual the future is here with ocaml ;) http://ocamlp3l.inria.fr/eng.htm Try solving some real problems with OCamlP3L and F#. I'm sure you'll agree that the OCaml approach is certainly not the future. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e
Re: [Caml-list] Re: OCaml is broken
On Monday 21 December 2009 05:32:38 Markus Mottl wrote: On Sun, Dec 20, 2009 at 23:30, Jon Harrop j...@ffconsultancy.com wrote: Traffic here: 2007: 5814 2008: 4051 2009: 3071 That's because I don't have much time to post here nowadays. I'm sure if Jon followed my example, we would have a parallel GC for OCaml by the end of the year. HLVM already has a parallel GC. :-) -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e
Re: [Caml-list] Re: multicore wish
On Monday 21 December 2009 14:19:36 Mihamina Rakotomandimby wrote: Ok, so for the beginner I am (must I ask on the beginners ML?): is multicore support just useless or not? I have found a great many uses for multicores but you need a decent foundation to make effective use of it. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e
Re: [Caml-list] general question, was Re: OCaml is broken
On Monday 21 December 2009 14:19:05 Keyan wrote: Hi, i have a large project written in C++, for which i am planning to write add-ons and tools in ocaml, e.g. different tools to analyse my code (dependency stuff), an interpreter for a script-language i plan to include, etc, etc. from my time at the uni i remembered that ocaml allows to compile libraries which can be included in a c/c++ program, and i know people who use it extensively in other projects. therefore, i decided to give ocaml a try. i like functional programming, and my first steps with ocaml are very promising. following this discussion, i am not so sure anymore, if ocaml is a good decision. may be i got this discussion wrong, but if ocaml is dying out, i might have to look for another functional programming language to use with my project. OCaml isn't dying out and it is still ideal for the applications you listed because they are not performance critical and do rely heavily upon an efficient GC. This discussion was essentially about the exact opposite end of the spectrum of applications, where performance is critical but GC is largely irrelevant. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e
Re: multicore wish [Was: Re: [Caml-list] Re: OCaml is broken]
On Monday 21 December 2009 13:31:10 Goswin von Brederlow wrote: Jon Harrop j...@ffconsultancy.com writes: We've discussed the problems with that before. Writing a parallel generic quicksort seems to be a good test of a decent multicore capable language implementation. Currently, F# is a *long* way ahead of everything open source. How do you implement it? 1) divide at the top and then let each core sort its smaller array Say you have 2^n cores then the first n splits in merge sort would first sort the values into the 2 regions and then start a thread for each region (start a new one, do the other part in this thread). After n splits you would switch to a uni-core quicksort. For this you need to split well so each core ends up with a roughly equal sized chunk. That is basically what is now being called nested data parallelism. You precompute a parallel strategy, partition the work upfront and then execute embarassingly parallel tasks. This worked well decades ago for sparse linear algebra on supercomputers but it sucks on multicores because load is variable and there is no load balancing. So one of those equal-sized partitions is likely to take 10x longer than the others due to competing processes, scheduling or cache issues and all cores end up twiddling their thumbs waiting for that one to complete. 2) factory/worker approach Each core runs a factory/worker thread waiting on a job queue. You start by dumping the full array into the job queue. Some random core picks it up, splits it into 2 regions and dumps one into the job queue. If the job gets too small (PAGE_SIZE? cache line size? total size / cores^2?) the factory/worker switches to normal uni-core quicksort and sorts the whole chunk. The job queue should probably be priority based so larger chunks are sorted before smaller. Here the quality of each split is not so important. If a chunk is smaller, and therefore faster, the core just picks up the next job in the queue. 
But you need more synchronization between the cores for the job queue. On the other hand you aren't limited to 2^n cores. Any number will do. This is a work sharing queue which is better because it is dynamically load balanced but still bad because you have global contention for a shared resource: the shared queue. Cilk pioneered wait-free work-stealing task deques and Microsoft's Task Parallel Library (which will be part of .NET 4 in March 2010) copied the idea. You have a separate deque of tasks for each core. A core tries to pop a task off its deque. If there are no tasks on its deque then it tries to pop off a randomly-chosen other deque. If there is a task then the core begins executing that task and any child tasks are pushed back onto the local deque (and might get stolen before they are executed). An important characteristic of this design is that child tasks are likely to be executed on the same core as their parent and, therefore, locality between related tasks is valuable. Consequently, many algorithms that recursively subdivide a large dataset (like quicksort) attain superlinear speedups because they make effective use of the sum of all L2 caches rather than just a single L2 cache. So this works extremely well with the multi-level hierarchical caches with interconnects that are the hallmark of multicore architecture. You can then implement quicksort (and many other parallel divide and conquer algorithms) something like this:

let rec serial_qsort cmp a i0 i3 =
  if i3 - i0 > 1 then
    let i1, i2 = partition cmp a i0 i3 in
    serial_qsort cmp a i0 i1;
    serial_qsort cmp a i2 i3

let rec parallel_qsort cmp a i0 i3 =
  if i3 - i0 > 1 then
    if i3 - i0 < threshold then
      serial_qsort cmp a i0 i3
    else
      let i1, i2 = partition cmp a i0 i3 in
      let future = invoke (parallel_qsort cmp a i0) i1 in
      parallel_qsort cmp a i2 i3;
      future ()

where the invoke combinator pushes a task onto the local task deque and returns a closure that, when applied, waits for that task to complete and returns its (unit) result. 
The threshold value protects against the function spending more time spawning tasks than doing actual work. You can implement this in OCaml and F# but there are some important differences: 1. The array a is just an ordinary array of any type of values on the shared heap in F# but, for generality in OCaml, this must be both the underlying ordinary data and a manually-managed shared big array of indices into the ordinary data where the indices get sorted while the original data remain in place until they are permuted at the end. 2. The sorted subarrays are contiguous in memory and, at some subdivision, will fit into L2 cache. So F# offers optimal locality. In contrast, there is no locality whatsoever in the OCaml code and most accesses into the unsorted original array will incur cache misses right up to main memory. So the OCaml approach does not scale as well
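The serial half of that skeleton can be completed into a runnable OCaml version. The post leaves partition undefined, so the one below is an assumption: a simple Lomuto-style partition over the half-open range [i0, i3) that returns the bounds of the pivot cell:

```ocaml
(* Sequential instance of the quicksort skeleton above. partition is a
   hypothetical helper: it splits [i0, i3) around a pivot and returns
   (i1, i2), the bounds of the pivot region, so the recursive calls
   sort [i0, i1) and [i2, i3). *)
let swap a i j =
  let t = a.(i) in
  a.(i) <- a.(j);
  a.(j) <- t

let partition cmp a i0 i3 =
  let pivot = a.(i3 - 1) in
  let i = ref i0 in
  for j = i0 to i3 - 2 do
    if cmp a.(j) pivot <= 0 then begin swap a !i j; incr i end
  done;
  swap a !i (i3 - 1);
  (!i, !i + 1)

let rec serial_qsort cmp a i0 i3 =
  if i3 - i0 > 1 then begin
    let i1, i2 = partition cmp a i0 i3 in
    serial_qsort cmp a i0 i1;
    serial_qsort cmp a i2 i3
  end
```

Calling serial_qsort compare a 0 (Array.length a) sorts a in place; the parallel variant differs only in handing one recursive call to the task scheduler once the range exceeds the threshold.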
Re: [Caml-list] Re: OCaml is broken
On Sunday 20 December 2009 14:27:00 Dario Teixeira wrote: Hi, It's too bad that INRIA is not interested in fixing this bug. No matter what people say I consider this a bug. Two cores is standard by now, I'm used to 8, next year 32 and so on. OCaml will only become more and more irrelevant. I hate to see that happening. This is a perennial topic in this list. Without meaning to dwell too long on old arguments, I simply ask you to consider the following: - Do you really think a concurrent GC with shared memory will scale neatly to those 32 cores? - Will memory access remain homogeneous for all cores as soon as we get into the dozens of cores? The following web page describes a commercial machine sold by Azul Systems that has up to 16 54-core CPUs (=864 cores) and 768 GB of memory in a flat SMP configuration: http://www.azulsystems.com/products/compute_appliance.htm As you can see, a GC with shared memory can already scale across dozens of cores and memory access is no more heterogeneous than it was 20 years ago. Also, note that homogeneous memory access is a red herring in this context because it does not undermine the utility of a shared heap on a multicore. - Have you considered that many Ocaml users prefer a GC that offers maximum single core performance, OCaml's GC is nowhere near offering maximum single core performance. Its uniform data representation renders OCaml many times slower than its competitors for many tasks. For example, filling a 10M float-float hash table is over 18x slower with OCaml than with F#. FFT with a complex number type is 5.5x slower with OCaml than F#. Fibonacci with floats is 3.3x slower with OCaml than my own HLVM project (!). because their application is parallelised via multiple processes communicating via message passing? A circular argument based upon the self-selected group of remaining OCaml users. Today's OCaml users use OCaml despite its shortcomings. 
If you want to see the impact of OCaml's multicore unfriendliness, consider why the OCaml community has haemorrhaged 50% of its users in only 2 years. In this context, your bug is actually a feature. I'm not even sure you can substantiate that in the very specific context of distributed parallel theorem provers because other languages are so much more efficient at handling common abstractions like parametric polymorphism. Got any benchmarks? -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e
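The float hash table comparison above is easy to turn into a microbenchmark. The following sketch is illustrative only: the table size is cut from 10M so it runs quickly, and the exact workload of the original comparison is not specified here:

```ocaml
(* Fill a float-keyed, float-valued Hashtbl. Under OCaml's uniform
   value representation both the keys and the values are boxed, so
   every insertion allocates; that boxing is the overhead at issue. *)
let fill n =
  let h = Hashtbl.create n in
  for i = 1 to n do
    Hashtbl.replace h (float_of_int i) (float_of_int (2 * i))
  done;
  h

let () =
  let h = fill 100_000 in
  assert (Hashtbl.length h = 100_000);
  assert (Hashtbl.find h 7.0 = 14.0)
```

Wrapping the call to fill in Sys.time (or Unix.gettimeofday) and scaling n back up to 10M reproduces the kind of measurement being quoted; the absolute ratio against other runtimes will of course depend on the machine and compiler versions.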
Re: [Caml-list] Re: OCaml is broken
is not beneficial. You can consider multicore as a marketing trick of the chip industry to let the ordinary desktop user pay for a feature that is mostly interesting for datacenters. Ordinary desktop users have been paying top dollar for parallel computers in the form of GPUs for some time now. The use of GPUs for more general programming has been a really hot topic for years and just became viable. Even games consoles have multicores. ARM are making quadcores for your phone and netbook! If I can get HLVM to make parallel OCaml-style programming easy, I think a lot of people would love it. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e
Re: [Caml-list] Re: OCaml is broken
On Saturday 19 December 2009 19:38:41 Jeff Shaw wrote: My understanding is that since jocaml uses the regular ocaml runtime, it is also not multicore enabled. Haskell is a functional language that has good performance GHC and the Haskell language itself have serious performance problems. that can use multiple processors, but the learning curve is steeper and higher. And Haskell lacks many of the features OCaml programmers take for granted. OCaml is a close relative of Standard ML, so there might be some implementation of SML that you like. MLTon might allow multicore use, but I'm not sure how mature it is. SML/NJ has a library or language extension called Concurrent ML, but I think SML/NJ might not use multiple processors. MLton and SML/NJ are both multicore incapable. The PolyML implementation of SML is multicore friendly but last time I looked (many years ago) it was 100x slower than OCaml for floating point. As long as you're looking at OCaml's close relatives with multicore support, F# is your only viable option. Soon, HLVM will provide a cross-platform open source solution. If you look further you will also find Scala and Clojure. Note that if you're not using a lot of threads, you can use Unix.fork to do true multithreaded programming in ocaml. We've discussed the problems with that before. Writing a parallel generic quicksort seems to be a good test of a decent multicore capable language implementation. Currently, F# is a *long* way ahead of everything open source. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e
Re: [Caml-list] How to write a CUDA kernel in ocaml?
On Tuesday 15 December 2009 23:18:57 David Allsopp wrote: Basile Starynkevitch wrote: Eray Ozkural wrote: Compiling Ocaml to efficient C is not easy and probably impossible (or extremely difficult) in the general case. In particular, tail recursive calls are essential in Ocaml, and are not available in C in most compilers. What's this based on (out of interest)? Most C compilers don't necessarily identify tail recursion in *C* code but if you're emitting C as an OCaml backend then there's no reason not to convert tail recursive *OCaml* functions to C code based on goto or similar looping constructs (yes, you'd almost-always-virtually-never use goto in a hand-crafted C program without apologising profusely at Dijkstra's grave but if you're using C as an intermediate language then that's a different matter). If I recall correctly from an old post on this list, this is how Felix handles tail recursion when translating Felix to C++ And trampolines to eliminate tail calls that cannot be eliminated using goto. However, trampolines are ~10x slower than TCO in the code gen. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e
Re: [Caml-list] Looking for information regarding use of OCaml in scientific computing and simulation
% from 2007-2008 and another 50% from 2008-2009). I believe this is because the OCaml community was ~80% technical users in 2006 and most of them have since left for languages that address the disadvantages I just listed, primarily parallelism. The main problem as I see it is that all of the other solutions for technical users outside Windows (i.e. except F#) introduce serious problems of their own (e.g. no TCO in Clojure and Scala, poor inference in Scala and Haskell, unpredictable time and space in Haskell, conservative GC on Mono and ATS). The good news is that I've been working on a project (HLVM) specifically designed to address these problems, with high-performance parallel numerics as well as a C-like data representation with an easy-to-use FFI and many other advantages including an optimizing native-code REPL, generic printing and serialization. This is just a hobby but I'm amazed at how easily I've been able to progress thanks to the awesome OCaml+LLVM combo. - My colleagues are working a lot with Matlab, is there any synergy there (bindings, ways to integrate Matlab within OCaml code or vice versa, ...)? You can reply to me on this list or off list. In case of personal reply, let me know if I can reuse your name and affiliation. Please do. :-) If this presentation is done, I'll make the slides available under a free license. Great, thanks. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
[Caml-list] Parallel programming in HLVM
I managed to find the time to do a little more work on my HLVM project this week and created a first working parallel version with threads that run simultaneously using a stop-the-world GC. The current implementation is extremely inefficient but it seems to work correctly and should be easy to optimize. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] MetaOcaml and high-performance [was: AST versus Ocaml]
On Monday 09 November 2009 04:23:28 o...@okmij.org wrote: Because offshoring produces a portable C or Fortran code file, you can use the code on 32 or 64-bit platform. The reason the native MetaOCaml without offshoring does not work on amd64 is because at that time OCaml didn't emit PIC code for amd64. So, dynamic linking was impossible. That problem has long been fixed in later versions of OCaml... Has the problem been fixed in MetaOCaml? Fortunately, some people have considered MetaOCaml to be a viable option for performance users and have reported good results. For example, Tuning MetaOCaml Programs for High Performance Diploma Thesis of Tobias Langhammer. http://www.infosun.fmi.uni-passau.de/cl/arbeiten/Langhammer.pdf Here is a good quotation from the Introduction: ``This thesis proposes MetaOCaml for enriching the domain of high-performance computing by multi-staged programming. MetaOCaml extends the OCaml language. ... Benchmarks for all presented implementations confirm that the execution time can be reduced significantly by high-level optimizations. Some MetaOCaml programs even run as fast as respective C implementations. Furthermore, in situations where optimizations in pure MetaOCaml are limited, computation hotspots can be explicitly or implicitly exported to C. This combination of high-level and low-level techniques allows optimizations which cannot be obtained in pure C without enormous effort.'' That thesis contains three benchmarks: 1. Dense float matrix-matrix multiply. 2. Blur of an int image matrix as convolution with a 3x3 stencil matrix. 3. Polynomial multiplication with distributed parallelism. I don't know about polynomial multiplication (suffice to say that it is not leveraging shared-memory parallelism which is what performance users value in today's multicore era) but the code for the first two benchmarks is probably 10-100x slower than any decent implementation. 
For example, his fastest 2048x2048 matrix multiply takes 167s whereas Matlab takes only 3.6s here. In essence, the performance gain (if any) from offshoring to C or Fortran is dwarfed by the lack of shared-memory parallelism. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] Unboxed float tuples
On Sunday 08 November 2009 16:01:48 Daniel Bünzli wrote: Using tuples (vs records) to handles points and vectors is syntactically more lightweight (IMHO). If you want to handle points and vectors then you probably want structs (value types) and not boxed representations like tuples and records anyway. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] OC4MC : OCaml for Multicore architectures
On Friday 25 September 2009 00:28:57 Jon Harrop wrote: Just to quantify this with a data point: the fastest (serial) version of my ray tracer benchmark is 10x slower with the new GC. However, this is anomalous with respect to complexity and the relative performance is much better for simpler renderings. For example, the new GC is only 1.7x slower with n=6 instead of n=9. The new SmartPumpkin release of OC4MC does a lot better. Specifically, the version compiled with partial collections is now only 3.9x slower on a serial ray tracer with n=9 (compared to 10x slower before). I'll try it in more detail... -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] Unboxed float tuples
On Sunday 08 November 2009 19:39:03 Brian Hurt wrote: Fixing this would require a major rearchitecting and rewrite of not only the compiler, but also the garbage collector, the run time, and all the C bindings that have been written. If you're willing to endure a whole program optimization pass then you can avoid having to touch the GC and run-time by monomorphizing the code, e.g. converting all tuples into monomorphic record types. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
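The monomorphization pass described above can be sketched in OCaml itself (a hypothetical example): a whole-program transform replaces each concrete instantiation of a polymorphic tuple with its own record type, which the compiler can then represent unboxed.

```ocaml
(* Before: a polymorphic pair forces the uniform boxed representation,
   so each float is itself boxed inside the tuple. *)
let swap (x, y) = (y, x)

(* After monomorphization: the float * float instantiation becomes a
   dedicated record type. OCaml already stores all-float records with
   unboxed fields, so this instantiation needs no GC/runtime changes. *)
type float_pair = { fst : float; snd : float }

let swap_float { fst; snd } = { fst = snd; snd = fst }

let () =
  assert (swap (1, 2) = (2, 1));
  let p = swap_float { fst = 1.0; snd = 2.0 } in
  assert (p.fst = 2.0 && p.snd = 1.0)
```

The cost, as noted, is that this only works as a whole-program pass: separate compilation cannot know which instantiations exist.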
Re: [Caml-list] Re: ATS versus Ocaml
On Friday 06 November 2009 15:38:27 Jan Kybic wrote: I notice that in ATS you have to give the type of the array explicitly fn bubble_sort (a : array0 double ) : void = so you should also do so in the OCaml code, using You are right, I am sorry for this omission. Having done that, the ratio between OCaml and ATS times drops to 3:1 (OCaml being slower). On x86, I get: ATS: 0.189s HLVM: 0.486s OCaml: 0.552s On x64, I get: OCaml: 0.299s -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] tip for tail recursive map
On Monday 02 November 2009 20:00:31 Christophe Raffalli wrote: List of size 1, 1 times with standard map : 7.564473s List of size 1, 1 times with map with rev : 15.452965s List of size 1, 1 times with map with prelist : 12.672792s List of size 1, 1 times with map with obj : 11.572724s Note that standard map is still very fast on this list length. List of size 10, 1000 times with standard map : 33.018063s List of size 10, 1000 times with map with rev : 42.142634s List of size 10, 1000 times with map with prelist : 22.161385s List of size 10, 1000 times with map with obj : 20.801299s Standard map is now relatively slower but only because it is O(n^2). Look at page 152 figure 7.4 of OCaml for Scientists to see this effect clearly. It is caused by the periodic traversal of the O(n) deep stack by the GC and it slows everything down (you get a similar effect with hash tables because the GC traverses arrays of pointers, like the spine, atomically). standard map with size 100 segfaults on my machine List of size 100, 100 times with map with rev : 55.211450s List of size 100, 100 times with map with prelist : 23.549472s List of size 100, 100 times with map with obj : 21.777361s You can use ulimit to get a bigger function call stack and keep running the ordinary map as far as you want. Conclusion : dirty map wins for large lists, Standard map wins for small lists... I think you can do a lot better than this and I think Xavier's recommendation stands (Obj is a horrifically bad idea unless you wrote the compiler and run-time *and* have the memory of an elephant ;-). Specifically, you just need to get rid of the O(n^2) behaviour by bounding the stack depth, perhaps using a trampoline. IIRC, this was discussed on this list many years ago. One notable observation was that adding a depth accumulator does not degrade performance. 
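The depth-bounded approach suggested above can be sketched as follows — a hypothetical implementation that recurses normally up to a depth limit (fast, non-tail, but with a bounded stack) and falls back to the tail-recursive rev-based map for whatever remains:

```ocaml
(* Stack-safe map: ordinary recursion for the first [limit] elements,
   then the constant-stack List.rev_map version for the rest. The
   depth accumulator costs essentially nothing, per the observation
   above. *)
let map f l =
  let limit = 10_000 in
  let rec go depth = function
    | [] -> []
    | x :: xs when depth < limit -> f x :: go (depth + 1) xs
    | rest -> List.rev (List.rev_map f rest)
  in
  go 0 l

let () =
  (* Works on a list far deeper than the default call stack allows. *)
  let big = List.init 1_000_000 (fun i -> i) in
  assert (List.length (map succ big) = 1_000_000);
  assert (map succ [1; 2; 3] = [2; 3; 4])
```

Short lists never reach the limit, so they keep the speed of the standard map; only the tail beyond 10,000 elements pays for the reversal.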
Another alternative is to convert the list into an array rather than reversing it and use the array as a kind of alternative to the function call stack (I think F# does this). -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] OC4MC : OCaml for Multicore architectures
On Saturday 26 September 2009 00:26:50 Benjamin Canou wrote: On the maintenance side, as Philippe said, we already have some half working version with ocaml 3.11.x, but partly because of the changes made to the native runtime in this release and partly because of [1], porting the patch is not trivial. OC4MC seems to work very well for numerical problems that do not allocate at all but introducing even the slightest mutation (not even in the inner loop) completely destroys performance and scaling. I'm guessing the reason is that any allocations eventually trigger collections and those are copying the entire heap which, in this case, consists almost entirely of float array arrays. My guess was that using big arrays would alleviate this problem by placing most of the data outside the OCaml heap (I'm guessing that oc4mc leaves the element data of a big array alone and copies only the small reference to it?). However, it does not seem to handle bigarrays: ../out/lib/ocaml//libbigarray.a(bigarray_stubs.o): In function `caml_ba_compare': bigarray_stubs.c:(.text+0x1e5): undefined reference to `caml_compare_unordered' bigarray_stubs.c:(.text+0x28d): undefined reference to `caml_compare_unordered' collect2: ld returned 1 exit status Error during linking If I am correct then I would value functioning bigarrays above OCaml 3.11 support. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] Constructors are not functions
On Tuesday 06 October 2009 13:19:58 Philippe Wang wrote: Hello, I don't know the actual reason, but I guess it's simply a choice of semantics. It would be weird to be able to deconstruct (Bar 42) while not being able to deconstruct (Bar) as it's not constructed. I mean we can write (match x with Bar y -> y). If partial construction were accepted we may like to write (match x with Bar -> x) but we couldn't because Bar is like a function then. With type t = A | B of int what would be the error or warning for (match x with A -> 42 | B -> 43) ? Well, then I guess the reason is that it would be complicated to choose some sound enough semantics for partial application of constructors, since the solution of having to write (fun x -> Bar x) is much simpler. Can you not just say that Bar in an expression is a function (fun x -> Bar x)? -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ___ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
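The reading suggested at the end — a bare constructor in expression position behaving as a function — corresponds to the eta-expansion workaround OCaml requires today (a hypothetical illustration):

```ocaml
type t = Bar of int

(* OCaml constructors are not functions, so this is rejected:
     List.map Bar [1; 2; 3]
   The standard workaround is explicit eta-expansion: *)
let bars = List.map (fun x -> Bar x) [1; 2; 3]

let () = assert (bars = [Bar 1; Bar 2; Bar 3])
```

Treating `Bar` as sugar for `(fun x -> Bar x)` in expression (not pattern) position would make the first form legal without changing pattern-matching semantics.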