Re: [Caml-list] extending user-defined polymorphic variant types
Dan Bensen danben...@att.net writes: I'm trying to write a functor that extends a user-supplied polymorphic variant type (PVT). How do you declare the user's type in the signature for the argument to the functor without locking in the individual variant definitions? The code below (in revised syntax) generates an error message that says the type isn't a PVT. code: module type Reader = sig type ast; end; module Make (Read: Reader) = struct type ast = [= Read.ast | `Lid of string]; end; message: Error: The type Read.ast is not a polymorphic variant type How do you make ast a PVT while allowing the user to specify the variants? Even if that did work how usefull would that be? You couldn't write let foo = function | `Lid s - () | x - Read.foo x unless Read.foo allows [ Read.ast]. In which case you don't need a functor but can just implement foo straight up. What is your use case? What do you want to do? MfG Goswin -- Caml-list mailing list. Subscription management and archives: https://sympa-roc.inria.fr/wws/info/caml-list Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] A shallow option type
Jacques Garrigue garri...@math.nagoya-u.ac.jp writes: Sorry for all these mails. Looks like I don't think well when I'm sleepy... Anyway, I think that at last I have a reasonable (much simpler) solution. This still uses sopt_tag (i.e. lazy_tag-1), but this time the argument is the number of some constructors, so there is no extra cost for marshaling. The only values on which Sopt.some is not the identity are those with a single argument, which is additionally represented by an integer between 0 and last. Moreover, even if for some reason you build such a value using a real sum type, you still have Sopt.arg (Sopt.some v) = v, the only change being a loss of sharing (which is anyway not guaranteed by the ocaml specification). Jacques module Sopt : sig type +'a t val none : 'a t val some : 'a - 'a t val is_none : 'a t - bool val arg : 'a t - 'a val depth : 'a t - int end = struct type 'a t = Obj.t let sopt_tag = Obj.lazy_tag - 1 let none = Obj.new_block sopt_tag 1 let last = 255 let area = Array.create (last+1) none let () = Obj.set_field none 0 (Obj.repr 0); for i = 1 to last do let stub = Obj.new_block sopt_tag 1 in Obj.set_field stub 0 (Obj.repr i); area.(i) - stub done let is_none x = (x = none) let is_sopt x = Obj.is_block x Obj.tag x = sopt_tag Obj.size x = 1 let i = Obj.obj (Obj.field x 0) in i = 0 i = last let depth x = let x = Obj.repr x in if is_sopt x then Obj.obj (Obj.field x 0) else -1 let some (x : 'a) : 'a t = let i = depth x in if i 0 then Obj.magic x else if i = last then invalid_arg Sopt.some else Obj.obj area.(i+1) let arg (x : 'a t) : 'a = let i = depth x in if i 0 then Obj.magic x else if i = 0 then invalid_arg Sopt.arg else Obj.obj area.(i-1) end What exactly is the point of specially tagged blocks? All you need is a bunch of pointers to values to encode the depth. You can use the value pointed at to encode the index the pointer is at and physical equality to ensure it actualy is one of your pointers: let area = Array.init (last+1) (fun i - ref i) let is_sopt x = let r = Obj.repr x in Obj.is_block r Obj.size x = 1 Obj.is_int (Obj.field r 0) let i = Obj.obj (Obj.field r 0) in i = 0 i = last Obj.obj r == area.(i) # is_sopt 1;; - : bool = false # is_sopt area.(5);; - : bool = true # is_sopt (ref 5);; - : bool = false MfG Goswin -- Caml-list mailing list. Subscription management and archives: https://sympa-roc.inria.fr/wws/info/caml-list Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] A shallow option type
Gabriel Scherer gabriel.sche...@gmail.com writes: Thanks for the clearer use-case example. I think Andreas's comment is at a general level of language design: no, we don't want to add something close to what you're asking for in a language. His argumentation is spot on. What we could have is a type system with strong update, but that's dreaming about another language, not OCaml. In your case, have you considered having a private option type, that would be allocated by a call to a C primitive (building a genuine OCaml option, but outside the heap) instead of (Some foo) on the OCaml side? That would be completely safe, and you could still safely match it on the OCaml side. I have considered that shortly. Setting the option would be simple. A call to a C primitive that mallocs a block, fills in the GC metadata, tag and pointer to the unit. But what about removing the unit, setting the option to None again? At what point would it be save to free the old option? The other thread might be using it right now. Any kind of dynamic memory allocation will require some coordination between the threads to free the block. Now that I think about it again (and with what I say below for units) there can be at most one option per tile. So I could pre-allocate an array of them and assign each tile one option block to be reused whenever the tile needs one. That said, it's only pushing the problem one step further: now you still need to allocate your Unit.t value outside the OCaml heap, otherwise you still have this problem. Is it the case in your application? Yes that is the plan. The world has a fixed number of tiles and there can be at most one unit per tile so I can pre-allocate an array of units outside the heap large enough to never run out. MfG Goswin On Sat, May 5, 2012 at 6:22 PM, Goswin von Brederlow goswin-...@web.de wrote: Andreas Rossberg rossb...@mpi-sws.org writes: On May 5, 2012, at 15.33 h, Goswin von Brederlow wrote: What I want is a type 'a shallow = NULL | 'a (constraint 'a != 'b shallow) This is a form of negation, which cannot be expressed in conventional type systems. Just consider what it should mean in the presence of type abstraction: if you have M : sig type t ... end would `M.t shallow` be a legal type? You couldn't decide that properly without requiring that _every_ abstract type in every signature is annotated with a constraint saying that it is not shallow. True, so abstract types would have to be forbidden too becauseit can't be decided wether they are save or not. Since 'a shallow is an abstract type that would also forbid 'a shallow shallow. So constraint 'a != abstract would be the right thing. I have some ideas on how to implement this in a module as abstract type providing get/set/clear functions, which basically means I map None to a C NULL pointer and Some x to plain x. I know x can never be the NULL pointer, except when someone creates a 'a shallow shallow and sets Some None. That would turn into simply None. And how do you know that nobody else implements a _different_ type, say shallow2, that does the same thing? And a third party then constructs a value of type `int shallow2 shallow`? I don't and I can't. But such a type would be abstract so wouldn't be allowed by the above (reject abstract types). It seems to me that what you want effectively is a special case of non- disjoint union. Unfortunately, those are known to come with all kinds of problems, such as not being compositional. /Andreas What I want is to have an array of type t ={ ... } (* Unit.t *) and a matrix of type tile = { ... mutable unit : Unit.t option; } that I can safely access from a C thread in parallel with ocaml without having to look the runtime system or individual tiles (the array and matrix would both be created outside the ocaml heap). The problem I have is that tile.unit - Some unit will allocate an option block on the heap and accessing that from C while the GC runs causes a race condition. What I don't want to do is use type tile = { ... mutable unit : Unit.t; } tile.unit - Obj.magic 0 since then trying things out in the toplevel causes segfaults when the pretty printer prints the tile.unit and it would be easy to forget to compare the tile.unit to Obj.magic 0 on every use. I know my shallow type is basically nothing else but it adds a level of savety to it. Idealy I would even love to write match tile.unit with | NULL - () | unit - do_something unit I guess some camlp4 magic could be used to transform such a match into match Shallow.as_option tile.unit with | None - () | Some unit - do_something unit or similar constructs. MfG Goswin -- Caml-list mailing list. Subscription management and archives: https://sympa-roc.inria.fr/wws/info/caml-list Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Re: [Caml-list] A shallow option type
Goswin von Brederlow goswin-...@web.de writes: Andreas Rossberg rossb...@mpi-sws.org writes: On May 5, 2012, at 15.33 h, Goswin von Brederlow wrote: What I want is a type 'a shallow = NULL | 'a (constraint 'a != 'b shallow) This is a form of negation, which cannot be expressed in conventional type systems. Just consider what it should mean in the presence of type abstraction: if you have M : sig type t ... end would `M.t shallow` be a legal type? You couldn't decide that properly without requiring that _every_ abstract type in every signature is annotated with a constraint saying that it is not shallow. True, so abstract types would have to be forbidden too becauseit can't be decided wether they are save or not. Since 'a shallow is an abstract type that would also forbid 'a shallow shallow. So constraint 'a != abstract would be the right thing. I have some ideas on how to implement this in a module as abstract type providing get/set/clear functions, which basically means I map None to a C NULL pointer and Some x to plain x. I know x can never be the NULL pointer, except when someone creates a 'a shallow shallow and sets Some None. That would turn into simply None. And how do you know that nobody else implements a _different_ type, say shallow2, that does the same thing? And a third party then constructs a value of type `int shallow2 shallow`? I don't and I can't. But such a type would be abstract so wouldn't be allowed by the above (reject abstract types). Actualy this could be solved. The problem of a int shallow2 shallow is that both would be using NULL for None. But nobody says I must use NULL. All it needs is a bit pattern that can't be a valid value. Any pointer will do: #include caml/mlvalues.h static value nil = Val_unit; CAMLprim value caml_shallow_null(value unit) { (void)unit; // unused return nil; } Now if shallow2 uses my nil as well that would be a serious bug in shallow2 and pretty hard to do. It seems to me that what you want effectively is a special case of non- disjoint union. Unfortunately, those are known to come with all kinds of problems, such as not being compositional. /Andreas What I want is to have an array of type t ={ ... } (* Unit.t *) and a matrix of type tile = { ... mutable unit : Unit.t option; } that I can safely access from a C thread in parallel with ocaml without having to look the runtime system or individual tiles (the array and matrix would both be created outside the ocaml heap). The problem I have is that tile.unit - Some unit will allocate an option block on the heap and accessing that from C while the GC runs causes a race condition. What I don't want to do is use type tile = { ... mutable unit : Unit.t; } tile.unit - Obj.magic 0 since then trying things out in the toplevel causes segfaults when the pretty printer prints the tile.unit and it would be easy to forget to compare the tile.unit to Obj.magic 0 on every use. I know my shallow type is basically nothing else but it adds a level of savety to it. Idealy I would even love to write match tile.unit with | NULL - () | unit - do_something unit I guess some camlp4 magic could be used to transform such a match into match Shallow.as_option tile.unit with | None - () | Some unit - do_something unit or similar constructs. MfG Goswin MfG Goswin -- Caml-list mailing list. Subscription management and archives: https://sympa-roc.inria.fr/wws/info/caml-list Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] How do I declare a value of 'a t instead of '_a t in a module?
Gabriel Scherer gabriel.sche...@gmail.com writes: You need to specify in the signature that ('a t) is contravariant: type +'a t Thanks, that was what I was looking for. Why can I never find this in the docs when I need it? :) MfG Goswin -- Caml-list mailing list. Subscription management and archives: https://sympa-roc.inria.fr/wws/info/caml-list Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
[Caml-list] Implementation for a (nearly) typesafe shallow option type and a compiler bug?
Hi, as mentioned earlier I wanted to have a type 'a shallow = NULL | 'a that is like an 'a option but without the indirection. The first idea was that no ocaml value can be the C NULL pointer (usualy all bits set to 0) and the C NULL pointer points outside the ocaml heap so the GC is fine with that too. So NULL could be encoded as C NULL pointer and 'a used as is. First problem is that this does not work for an 'a shallow shallow. And there is no way to constrain 'a to not be a 'b shallow in the type definition. Second problem is that someone else might define a 'a shallow2 type with the same idea and 'a shallow2 shallow would also not work. And with abstract types it would be impossible to test for that even if ocaml would allow negative constraints. The common problem here is that NULL is not unique for each type. So I thought of a way to make it unique. What I came up with is using a functor to create a unique NULL value for each instance. Source code at the end. Output: NULL - None Some 5 - Some 5 NULL - None Some NULL - Some None Some Some 5 - Some Some 5 1 2 3 4 a b c d NULL - Some 70221466564508 As you can see from the output this solves the problem of 'a shallow shallow and would also solve the 'a shallow2 shallow case. But in the last line a new problem appears. Each instance of the functor has a unique NULL element. But instances of the functor with the same type are compatible. So an IntShallow2.t can be used instead of an IntShallow.t but then the NULL does not match. To me this looks like a bug in ocaml because the resulting type of the functor application does not match the type I specified for the functor: module type SHALLOW = sig type -'a t val null : 'a t val make : 'a - 'a t val as_option : 'a t - 'a option end module Make : functor (Type : TYPE) - SHALLOW module IntShallow2 : sig type 'a t = 'a Shallow.Make(Int).t val null : 'a t val make : 'a - 'a t val as_option : 'a t - 'a option end The type I specified has 'a t abstract while the IntShallow2 has the type 'a t more concret. Restricting the type to what it should already be results in the correct error: module IntShallow2 = (Shallow.Make(Int) : Shallow.SHALLOW) let () = Printf.printf NULL - %s\n (to_string (IntShallow2.null)) File shallow.ml, line 91, characters 42-60: Error: This expression has type 'a IntShallow2.t but an expression was expected of type int IntShallow.t = int Shallow.Make(Int).t Why is that? MfG Goswin == module Shallow : sig module type TYPE = sig type 'a t end module type SHALLOW = sig type -'a t val null : 'a t val make : 'a - 'a t val as_option : 'a t - 'a option end module type SHALLOWFUNCTOR = functor (Type : TYPE) - SHALLOW module Make : functor (Type : TYPE) - SHALLOW end = struct module type TYPE = sig type 'a t end module type SHALLOW = sig type -'a t val null : 'a t val make : 'a - 'a t val as_option : 'a t - 'a option end module type SHALLOWFUNCTOR = functor (Type : TYPE) - SHALLOW module Make_intern = functor (Type : TYPE) - struct type -'a t (* Dummy object unique to the type *) let null = Obj.magic (ref 0) let make x = Obj.magic x let as_option x = if x == null then None else Some (Obj.magic x) end module Make = (Make_intern : functor (Type : TYPE) - SHALLOW) end module Int = struct type 'a t = int end module IntShallow = Shallow.Make(Int) module IntShallowShallow = Shallow.Make(IntShallow) let to_string x = match IntShallow.as_option x with | None - None | Some x - Printf.sprintf Some %d x let to_string2 x = match IntShallowShallow.as_option x with | None - None | Some x - Printf.sprintf Some %s (to_string x) module List = struct module Item = struct type 'a t = 'a end module Shallow = Shallow.Make(Item) type 'a item = { mutable next : 'a list; data : 'a; } and 'a list = 'a item Shallow.t let null = Shallow.null let cons x y = Shallow.make { next = y; data = x; } let rec iter fn x = match Shallow.as_option x with | None - () | Some { next; data; } - fn data; iter fn next end let () = Printf.printf NULL - %s\n (to_string (IntShallow.null)); Printf.printf Some 5 - %s\n (to_string (IntShallow.make 5)); Printf.printf NULL - %s\n (to_string2 (IntShallowShallow.null)); Printf.printf Some NULL - %s\n (to_string2 (IntShallowShallow.make IntShallow.null)); Printf.printf Some Some 5 - %s\n (to_string2 (IntShallowShallow.make (IntShallow.make 5))); let x = List.null in let x = List.cons 4 x in let x = List.cons 3 x in let x = List.cons 2 x in let x = List.cons 1 x in List.iter (Printf.printf %d ) x; print_newline (); let y = List.null in let y = List.cons d y in let y = List.cons c y in let y
Re: [Caml-list] A shallow option type
Jacques Garrigue garri...@math.nagoya-u.ac.jp writes: Hi Goswin, Actually I remember discussing this issue with Xavier Leroy a long time ago, and we ended up concluding that this could be done in a clean way by using a bit pattern not yet used for ocaml values. Our idea was that rather than put some restriction on the shallowness of types (which is very hard to implement), one should just allow arbitrary nesting of option types, and represent the sequence Some (Some (... ( Some None) ...)) as an integer. I think I even tried an implementation, but we then concluded that there was not enough demand to put it in the compiler. The advantage of providing it by the compiler would be to allow pattern matching. However there is nothing wrong with providing it as a library. The basic idea is that you have only two kinds of values in ocaml: integers, which have their lower bit to 1, and pointers, which must all be multiples of 4 (on 32-bit architectures). So the pattern (v % 4) == 2 is not used. Actualy isn't (v % 4) == 2 an exception? caml/callback.h: #define Make_exception_result(v) ((v) | 2) #define Is_exception_result(v) (((v) 3) == 2) #define Extract_exception(v) ((v) ~3) So returning an Sopt.none in a callback would be taken as exception. :) Here is the code: module Sopt : sig type +'a t val none : 'a t val some : 'a - 'a t val is_none : 'a t - bool val arg : 'a t - 'a val is_sopt : 'a t - bool val depth : 'a t - int end = struct type 'a t = int let null = (Obj.magic (Some 0) land 2) land 1 let none = null + 1 let is_none x = x 0 x 1 let is_sopt x = is_none (x land 1) let some (x : 'a) : 'a t = let y : 'a t = Obj.magic x in if is_sopt y then y+2 else y let arg (x : 'a t) : 'a = if is_sopt x then Obj.magic (x-2) else Obj.magic x let depth x = if is_sopt x then x lsr 1 else 0 end WOW. That is some code. Thanks for sharing that. The code for null tricks the compiler into creating a null pointer. I avoiding using the simpler (Obj.magic (Some 0) land 0) in case land is optimized. By adding 1 to this value, one creates a 2 at the C level. For ocaml this value is strictly between 0 (i.e. 1) and 1 (i.e. 3). Sopt.some checks whether a value is such an sopt, in which case it adds 2 to it to add a Some constructor, otherwise it returns the value itself. Sopt.arg does just the opposite. Sopt.is_sopt and Sopt.depth are only exported for demonstrative purposes. # Sopt.depth (Sopt.some (Sopt.some (Sopt.some Sopt.none)));; - : int = 3 # Sopt.depth (Sopt.some (Sopt.some (Sopt.some 0)));; - : int = 0 Of course this precludes using the above bit pattern for anything else, but otherwise this should be completely type-safe. (At the theoretical level at least, I'm not 100% sure of the trickery used here to have the compiler work on C level integers) Cheers, Jacques Garrigue The implementation relies on specific aspect of the compiler to construct those values. I would give it a 99% chance to fail with an llvm backend for ocaml for example. Using C stubs that directly manipulate the bits seems a better approach. For example nothing says x + 2 can't be implemented as value add(value x, value y) { return (x ~1) + y; } as opposed to value add(value x, value y) { return x + y - 1; } MfG Goswin -- Caml-list mailing list. Subscription management and archives: https://sympa-roc.inria.fr/wws/info/caml-list Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] Implementation for a (nearly) typesafe shallow option type and a compiler bug?
Gabriel Scherer gabriel.sche...@gmail.com writes: What you observe is the so-called strengthening of type equalities in functor applications. See papers 5 or 6 in this list: http://caml.inria.fr/about/papers.en.html It is not a bug, but a feature: you can write functors F such that applying F(X) twice yields compatible, rather than incompatible, types. If you want to recover incompatible types, you can seal the functor result as you did in your workaround, or pass a non-path functor expression (that behave in a more generative way): F(struct include X end). Good to know. I found that surprising. I think it is bad that you can't specify the type of the functor so that both compatible and incompatible types would be an option. Just like you can use 'a, +'a and -'a to fine tune variance in types there could be some syntax to make the functor type strengthened or not. On your more general code: - I do not understand why you specify the abstract type ('a t) to be contravariant, and I suspect this will be unsound (is an ( m : int t) also an ( m : int; s : string t)?) Left over from trying to make the functor type not strengthened. - I am not sure using (Obj.magic (ref 0)) is safe wrt. types whose representation is not always a pointer (ie. floats) (Obj.magic (ref 0)) is always a pointer that is unique to the instance of the functor. No other value can legally have this bit pattern. As for floats: Manual 18.3.1 Atomic types Caml type Encoding float Blocks with tag Double_tag. A float is always a pointer and that can't be legally pointing to our (Obj.magic (ref 0)). MfG Goswin -- Caml-list mailing list. Subscription management and archives: https://sympa-roc.inria.fr/wws/info/caml-list Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] using modules to wrap c++ classes
Joel Reymont joe...@gmail.com writes: On Fri, May 4, 2012 at 9:43 AM, Goswin von Brederlow goswin-...@web.de wrote: As discussed on irc you need to create your callbacks[] array as ocaml block and register that itself as root. That also has the benefit that you only have to register a single root instead of 50. Assuming that I have a class named MyCallbacks with a public member callbacks of type value, is this correct? No. The callbacks[] can be public, protect, private or static. You could also register the individual callbacks but then you need to register and deregister all 50 of them seperate. Having just one makes things easier. So the need above might be to strong. It is just simpler that way. And I think the GC had some performance issues with too many roots. Also, can I use Is_block to check if there's a closure value stored in the callbacks block before I dispatch to the closure? Sure. Thanks, Joel --- // finalizer, stored in custom_operations void Callbacks_delete(value v) { CAMLparam1(v); MyCallbacks* o = Callbacks_val(v); caml_remove_global_root(o-callbacks); delete o; } CAMLprim value Callbacks_new() { CAMLparam0(); CAMLlocal2(v, cbks); v = caml_alloc_custom(ops, sizeof(MyCallbacks*), 0, 1); MyCallbacks* o = new MyCallbacks(); Callbacks_val(v) = o; cbks = caml_alloc(NUM_CBKS, 0); o-callbacks = cbks; caml_register_global_root(o-callbacks); CAMLreturn(v); } extern C CAMLprim value Callbacks_set(value self, value cbk) { CAMLparam2(self, cbk); MyCallbacks* o = Callbacks_val(self); Store_field(o-callbacks, Tag_val(cbk), Field(0, cbk)); CAMLreturn(Val_unit); } Don't you have to use caml_modify() here instead of Store_field()? I think Store_field is only alowed in freshly allocated blocks. MfG Goswin -- Caml-list mailing list. Subscription management and archives: https://sympa-roc.inria.fr/wws/info/caml-list Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
[Caml-list] A shallow option type
Hi, I wish there was an option type that would work without extra indirection (or more importantly without extra allocation of an ocaml value when setting it to something). Why? I'm interfacing with C in a multithreaded way. The data is allocated on the C side so it won't be moved around by the GC. The ocaml side will modify data and the C side will utilize it. Now if ocaml changes a multable 'a option then it will allocate a block for Some x and the GC will move that block around during compaction. Which means the 'a option can't be safely used without holding the runtime system lock. Which then means the threads can't run in parallel. What I want is a type 'a shallow = NULL | 'a (constraint 'a != 'b shallow) I have some ideas on how to implement this in a module as abstract type providing get/set/clear functions, which basically means I map None to a C NULL pointer and Some x to plain x. I know x can never be the NULL pointer, except when someone creates a 'a shallow shallow and sets Some None. That would turn into simply None. Is it possible to somehow declare the constraint that 'a in 'a shallow must not be itself a 'b shallow? MfG Goswin -- Caml-list mailing list. Subscription management and archives: https://sympa-roc.inria.fr/wws/info/caml-list Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
[Caml-list] How do I declare a value of 'a t instead of '_a t in a module?
Hi, I'm writing a module that reimplements the option type without redirection. For that I'm mapping None to a value with all bits set to 0, the NULL pointer in C. For that I have a little helper function: external make_null : unit - 'a t = caml_shallow_null But now I want to also have a polymorphic value for NULL: module Make : sig type 'a t val make_list : unit - 'a list val make_null : unit - 'a t end = struct type 'a t let make_list () = [] external make_null : unit - 'a t = caml_shallow_null end module Shallow : sig type 'a t val list : 'a list val null : 'a t end = struct type 'a t = 'a Make.t let list = Make.make_list () let null = Make.make_null () end File shallow.ml, line 15, characters 6-102: Error: Signature mismatch: Modules do not match: sig type 'a t = 'a Make.t val list : 'a list val null : '_a Make.t end is not included in sig type 'a t val list : 'a list val null : 'a t end Values do not match: val null : '_a Make.t is not included in val null : 'a t What makes the Make.make_list differ from Make.make_null? Is there a way that I can specify that null is still polymorphic or does that only work for constructors like None and compiler magic like []? -- And here something odd: module Make : sig type 'a t = 'a list val make_list : unit - 'a list val make_null : unit - 'a t end = struct type 'a t = 'a list let make_list () = [] external make_null : unit - 'a t = caml_shallow_null end module Shallow : sig type 'a t val list : 'a list val null : 'a t end = struct type 'a t = 'a Make.t let list = Make.make_list () let null = Make.make_null () end compiles fine. But module Make : sig type 'a t = private 'a list val make_list : unit - 'a list val make_null : unit - 'a t end = struct type 'a t = 'a list let make_list () = [] external make_null : unit - 'a t = caml_shallow_null end module Shallow : sig type 'a t val list : 'a list val null : 'a t end = struct type 'a t = 'a Make.t let list = Make.make_list () let null = Make.make_null () end fails again. Just making the Make.t private makes it fail. MfG Goswin -- Caml-list mailing list. Subscription management and archives: https://sympa-roc.inria.fr/wws/info/caml-list Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] A shallow option type
Andreas Rossberg rossb...@mpi-sws.org writes: On May 5, 2012, at 15.33 h, Goswin von Brederlow wrote: What I want is a type 'a shallow = NULL | 'a (constraint 'a != 'b shallow) This is a form of negation, which cannot be expressed in conventional type systems. Just consider what it should mean in the presence of type abstraction: if you have M : sig type t ... end would `M.t shallow` be a legal type? You couldn't decide that properly without requiring that _every_ abstract type in every signature is annotated with a constraint saying that it is not shallow. True, so abstract types would have to be forbidden too becauseit can't be decided wether they are save or not. Since 'a shallow is an abstract type that would also forbid 'a shallow shallow. So constraint 'a != abstract would be the right thing. I have some ideas on how to implement this in a module as abstract type providing get/set/clear functions, which basically means I map None to a C NULL pointer and Some x to plain x. I know x can never be the NULL pointer, except when someone creates a 'a shallow shallow and sets Some None. That would turn into simply None. And how do you know that nobody else implements a _different_ type, say shallow2, that does the same thing? And a third party then constructs a value of type `int shallow2 shallow`? I don't and I can't. But such a type would be abstract so wouldn't be allowed by the above (reject abstract types). It seems to me that what you want effectively is a special case of non- disjoint union. Unfortunately, those are known to come with all kinds of problems, such as not being compositional. /Andreas What I want is to have an array of type t ={ ... } (* Unit.t *) and a matrix of type tile = { ... mutable unit : Unit.t option; } that I can safely access from a C thread in parallel with ocaml without having to look the runtime system or individual tiles (the array and matrix would both be created outside the ocaml heap). The problem I have is that tile.unit - Some unit will allocate an option block on the heap and accessing that from C while the GC runs causes a race condition. What I don't want to do is use type tile = { ... mutable unit : Unit.t; } tile.unit - Obj.magic 0 since then trying things out in the toplevel causes segfaults when the pretty printer prints the tile.unit and it would be easy to forget to compare the tile.unit to Obj.magic 0 on every use. I know my shallow type is basically nothing else but it adds a level of savety to it. Idealy I would even love to write match tile.unit with | NULL - () | unit - do_something unit I guess some camlp4 magic could be used to transform such a match into match Shallow.as_option tile.unit with | None - () | Some unit - do_something unit or similar constructs. MfG Goswin -- Caml-list mailing list. Subscription management and archives: https://sympa-roc.inria.fr/wws/info/caml-list Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] using modules to wrap c++ classes
Joel Reymont joe...@gmail.com writes: So I ended up with something like this --- Callbacks.ml type t type 'a callback = | Alert of ('a - Alert.t - unit) | AskQuote of ('a - Ask.t - unit) | BestAskQuote of ('a - Ask.t - unit) | BidQuote of ('a - Bid.t - unit) | BestBidQuote of ('a - Bid.t - unit) | ClosePrice of ('a - ClosePrice.t - unit) | ClosingIndicator of ('a - ClosingIndicator.t - unit) | EndQuote of of ('a - EndQuote.t - unit) | EquityOptionList of of ('a - EquityOptionList.t - unit) | EquityOptionStrategyList of of ('a - EquityOptionStrategyList.t - unit) | HighPrice of ('a - HighPrice.t - unit) | Indicator of ('a - Indicator.t - unit) | IndicatorReplay of ('a - IndicatorReplay.t - unit) | LimitOrderBook of ('a - LimitOrderBook.t - unit) | LowPrice of ('a - LowPrice.t - unit) | MarketMode of ('a - MarketMode.t - unit) | OpenPrice of ('a - OpenPrice.t - unit) ... external make : unit - t = Callbacks_new external set : t - 'a callback - unit = Callbacks_set I now want to grab the closure from the constructor on the C side and stick it into a callback array. Is this sufficient? extern C CAMLprim value Callbacks_set(value v) { CAMLparam1(v); int i = Tag_val(v); CAMLlocal1(f); f = Field(0, v); caml_register_global_root(f); callbacks[i] = f; CAMLreturn(Val_unit); } Thanks, Joel This won't work. The value f is a pointer to a closure block butr that block is under the control of the GC. It will be moved around during compaction and then your callbacks[i] is invalid. As discussed on irc you need to create your callbacks[] array as ocaml block and register that itself as root. That also has the benefit that you only have to register a single root instead of 50. For an example you can look at my libfuse bindings: http://git.ocamlcore.org/cgi-bin/gitweb.cgi?p=libfuse-ocaml/libfuse-ocaml.git;a=tree;f=lib;h=72f44b5090e0d1341d679f52249bbf3974709c1c;hb=HEAD The idea there is to have a ocaml block with callbacks and a custom block with finalizer that contains a pointer to the ocaml block mixed with plain C data. The GC will not inspect the contents of the custom block so the block of callbacks is registered as global root to keep it alive and unregistered in the finalizer. Since you are going to use a static block of calbacks you only need to register the root, it never dies. In my case I use a record containing all callbacks to initialize the C side. With fuse most use cases will set most of the callbacks so having to set each in turn would be more work. I also use Obj.magic 0 to initialize callbacks that are unused. An alternative is to use 'a option types but I decided against that to avoid the extra indirection. Those callbacks get called a lot and speed is relevant. Callbacks can be set as needed and the intended use for this is let my_ops = { default_ops with init = my_init; destroy = my_destroy; } Using the with syntx of records one only overrides the callbckas one wishes to use. Changing the code to set callbacks individualy is left as an exercise to the reader. :) MfG Goswin -- Caml-list mailing list. Subscription management and archives: https://sympa-roc.inria.fr/wws/info/caml-list Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] using modules to wrap c++ classes
Joel Reymont joe...@gmail.com writes: Suppose I have a C++ class with a bunch of methods that act as callbacks. My callbacks will be hit if I derive the class and implement the methods I'm interested in. Now, I would like to wrap this class from OCaml. This is straightforward from using OCaml classes (section 18.3.5 of the manual) but is there a way to implement this using modules? I don't think this is possible but would like to confirm. Thanks, Joel From the top of my head: module M = struct type t external foo : t - unit = caml_mine_M_foo end class M { void foo(void) { } }; value caml_mine_M_foo(value ml_m) { CAMLparam1(ml_m); M *m = (M*)ml_m; m-foo(); CAMLreturn Val_unit; } Wouldn't that do? It gets harder with inheritance, when you have modules A, B, C for the different classes and want to be able to pass a C.t to a function expecting a B.t. You will need to declare a C.as_b function and such. MfG Goswin -- Caml-list mailing list. Subscription management and archives: https://sympa-roc.inria.fr/wws/info/caml-list Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] using infix operators/functions in modules when not using open
Gabriel Scherer gabriel.sche...@gmail.com writes: Module.(foo (op) bar) This is a local open (a more explicit notation is let open Module in foo op bar). I think this is a fine solution: I don't like open, but local mitigates its flaws. Another solution is to rebind your operator locally: let (op) = Module.blah in foo op bar (of course Module.blah doesn't need to be an infix itself) If you are going to use an infix operator in a wide scope, I think that such a rebinding solution is nice because it's more explicit, which is good when it helps users understand the code. And that also works more globally as in let (op) = Module.blah let foo x y = x (op) y MfG Goswin -- Caml-list mailing list. Subscription management and archives: https://sympa-roc.inria.fr/wws/info/caml-list Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] exn vs option
Daniel Bünzli daniel.buen...@erratique.ch writes: Le jeudi, 5 avril 2012 à 11:05, Goswin von Brederlow a écrit : If you are writing a module then consider providing both flavours for functions, one with exceptions and one with options. Even if you only do something like this: I don't think it's a rule that should be applied consistently, I don't like libraries that provide tons of convenience functions, too many names. If you stick with the option type (or variant cases) the user can quickly wrap the function if it needs an exception (or vice-versa but I prefer it that way since it's then documented by the function signature and is in my opinion better behaved in general). However in the particular case of finding something in a data structure where the user could be confronted with a situation where he can prove that the datum will be found I think it's justified to provide both flavours. For these cases, in my own code I settled on the following pattern : val find : 'a t - key - 'a option (** [find d k] is the value bound to [k] in [d], if any. *) val get : 'a t - key - 'a (** [get d k] is like {!find} but raises [Invalid_argument] if [k] is not bound in [d]. *) That pattern works well if you have just one or few such functions and can think of nice names to differentiate them. Otherwise submodules are usefull: module Foo = Foo.Options or module Foo = Foo.Exceptions Foo.do_things () or open Foo.Options do_things () The option and exception raising functions would be named the same and the user selects one or the other by selecting a submodule. As said, it is usualy a matter of taste and the users tastes might differ from the module authors taste. By providing an option and execption flavour of a function (by different names or submodules hardly matters) the user can use what he likes as opposed to what the modules author imposes on him. MfG Goswin -- Caml-list mailing list. Subscription management and archives: https://sympa-roc.inria.fr/wws/info/caml-list Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] exn vs option
Pierre Chopin pie...@punchup.com writes: Hi, I benchmarked two programs, in one case the main function throw an exception that is caught, in the other the function returns an option that is pattern matched on. I noticed that, whether the exception is thrown or not, the option version is always faster. Is there any case where it makes sense, performance wise, to use exception instead of 'a option ? I find that in most cases speed is not a major concern and then it comes down to taste and readability of the code. test1.ml -- exception Foo let f x = if x =1 then raise Foo else () ;; for i = 0 to 10_000_000 do try f 1 with Foo - () done 0.34s user 0.01s system 99% cpu 0.351 total That is rather short for a test. lets add 2 zeroes to the loop. And lets call f 0 and f 1 to test both cases: f 0: 17.72s user 0.02s system 99% cpu 17.792 total f 1: 35.30s user 0.02s system 99% cpu 35.371 total test2.ml: let f x = if x=1 then None else Some () ;; for i = 0 to 10_000_000 do match f 1 with None - () | Some s - s done f 0: 11.60s user 0.02s system 99% cpu 11.655 total f 1: 10.91s user 0.01s system 99% cpu 10.933 total And lets test the speed when the exception is actualy exceptional: exception Foo let f x = if x =1 then raise Foo else () let () = try for i = 0 to 10 do f 0 done with Foo - () 9.94s user 0.00s system 99% cpu 9.946 total Someone said in deep recursions exceptions are faster because they don't have to unwind the stack: exception Result of int let rec fac acc = function | 1 - raise (Result acc) | n - fac (n * acc) (n - 1) let () = for i = 0 to 100_000_000 do try fac 1 50 with Result _ - () done 71.88s user 0.00s system 99% cpu 1:11.90 total let rec fac acc = function | 1 - acc | n - fac (n * acc) (n - 1) let () = for i = 0 to 100_000_000 do ignore (fac 1 50) done 67.04s user 0.02s system 99% cpu 1:07.08 total Not feeling it. Lets try something not tail recursive: exception Error let rec foo = function | 1 - raise Error | n - n * (foo (n - 1)) let () = for i = 0 to 100_000_000 do try ignore (foo 50) with Error - () done 25.03s user 0.01s system 99% cpu 25.068 total let rec foo = function | 1 - None | n - match foo (n - 1) with None - None | Some x - Some (n * x) let () = for i = 0 to 100_000_000 do ignore (foo 50) done 49.48s user 0.01s system 99% cpu 49.508 total In conclusion I would have to say that exceptions are better if they are exceptional. When you do not catch them right away or not at all then they are better. The try command is more expensive than an option type and if you are going to catch the exception right away anyway then options are faster. But if you don't catch them right away the cost of the try can be amortized over many calls and becomes cheaper. Or if you don't catch the exception at all then you can get a nice backtrace of where the exception occured. If your code is not tail recursive then option types mean you have to match them on every level again and again and allocate a ton of 'Some x' if no error occurs. You can make your code tail recursive or use exception to improve performance there. If you are writing a module then consider providing both flavours for functions, one with exceptions and one with options. Even if you only do something like this: let find x y = let raise_on_none exn f arg = match f arg with | None - raise exn | Some x - x let find_exn x y = raise_on_none Not_found (find x) y Obviously option only works for exceptions like Not_found. If you want to return an error you might have to use something like type ('a, 'b) result = Result of 'a | Error of 'b Putting the 2 flavours into different submodules can keep things tidy too. MfG Goswin -- Caml-list mailing list. Subscription management and archives: https://sympa-roc.inria.fr/wws/info/caml-list Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] Wish: mutable variant types, equivalence with records
François Bobot bo...@lri.fr writes: On 30/03/2012 07:16, Goswin von Brederlow wrote: François Bobotbo...@lri.fr writes: On 24/03/2012 13:45, Wojciech Meyer wrote: Please see [1], Alain Frisch has been working recently on implementing in-line records for constructor arguments. It's more implementation/design implications than people might think. [1] http://caml.inria.fr/mantis/view.php?id=5528 In the thread of this proposed feature, there is a remark that inlined record and normal record can become competing features. There is also the burden that inlined record as proposed can be modified only in pattern matching. But can't we consider that, for a semantic, syntax and typing perspective: type t = | A of string | B of ({msg: string; mutable foo:int} as t2) | C is exactly the same thing than: type t = | A of string | B of t2 | C and t2 = {msg: string; mutable foo:int} That would change existing code because B of t2 currently is a Block of size 1 with tag B and pointer to a record of type t2. I don't propose to change how is compiled the second definition. Just that if a developer use the first definition in a module he can consider that he define t and t2 like in the second definition. The only exception is if you use Obj.magic: Obj.magic (B r) == Obj.magic r. But Obj.magic is not in the semantic I assume. The only difference is that when you create a record of type t2 the tag is directly the one of B in the first case and is the default tag for record (the first tag if I remember well) in the second case. So in the first case applying the constructor B is just the identity. Maybe the type of a record could be altered to, at least internally, include a tag value: Every records already include a tag value, it's just always the same: the tag of the first non-constant constructor of a sum type. No it doesn't. Not in its type. It just happens to have a tag 0 in its memory representation. Say you have type r1 = { x : int; } type Foo = Foo of int | Bar of ({ x : int; } as r2) type Foo = Foo of ({ x : int; } as r3) | Bar of int The type r1 and the internal type r2 are not the same. A record of type r2 would have a tag of 1. Same with r2 and r3, they should be different types. type t = A of string | B of ({msg: string; mutable foo:int} as t2) | C type t2 = {[B] x : int; } The inline record would be identical to the external record with tag value and (t2 : t) would work even implicitly. I'm not sure that we should make implicit conversion (inference? principality?). But that B r is a noop can be very interesting. The implicit parts would be in matches for example. match x with B r - would basically just translate to (x : t2) with the additional check that the tag is right. Or let bar = fun i - {[B] x = i; } let foo : int - t = fun i - bar i val bar : int - {[B] x : int; } val foo : int - t On the other hand bar could already be interfered as type int - t. It all depends on wether the [tag] syntax would be restricted to inline records in constructors or not. It probably makes sense to restrict declaring a record with tag to inline records. Otherwise it might get confusing what value the tag should actually have. One other thing: How do you declare a value of such a type? type r = { x : int; } type foo = Foo of r | Blub of int type bar = Bar of { y : int; } | Blubber of int let myfoo = Foo { x = 0; } let mybar = Bar { y = 0; } This would be confusing since the two look identical but do something quite different. Maybe the syntax should be more like this: type r = { x : int; } type foo = Foo of r | Blub of int type bar = { Bar y : int; } | Blubber of int let myfoo = Foo { x = 0; } let mybar = { Bar y = 0; } With the constructor actualy inside the record the two cases look different and it is visually clear that the Bar tag is part of the record while Foo is a constructor taking a record as argument. If we want to play to extend this feature (perhaps too much) we can play with conversion between sum type without cost by allowing Ap to be also the identity: type tpublic = | A of ({... } as ta) | B of ({... } as tb) type tprivate = | Ap of ta | Bp of tb | Cp of ... Since A,ta and Ap share the same tag the following function is the identity: Again no, at least it isn't a NOP. The current memory representation of Ap of ta is incompatible with the representation we want for A of ({... } as ta). MfG Goswin -- Caml-list mailing list. Subscription management and archives: https://sympa-roc.inria.fr/wws/info/caml-list Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] Wish: mutable variant types, equivalence with records
François Bobot bo...@lri.fr writes: On 24/03/2012 13:45, Wojciech Meyer wrote: Please see [1], Alain Frisch has been working recently on implementing in-line records for constructor arguments. It's more implementation/design implications than people might think. [1] http://caml.inria.fr/mantis/view.php?id=5528 In the thread of this proposed feature, there is a remark that inlined record and normal record can become competing features. There is also the burden that inlined record as proposed can be modified only in pattern matching. But can't we consider that, for a semantic, syntax and typing perspective: type t = | A of string | B of ({msg: string; mutable foo:int} as t2) | C is exactly the same thing than: type t = | A of string | B of t2 | C and t2 = {msg: string; mutable foo:int} That would change existing code because B of t2 currently is a Block of size 1 with tag B and pointer to a record of type t2. The only difference is that when you create a record of type t2 the tag is directly the one of B in the first case and is the default tag for record (the first tag if I remember well) in the second case. So in the first case applying the constructor B is just the identity. Maybe the type of a record could be altered to, at least internally, include a tag value: type t = A of string | B of ({msg: string; mutable foo:int} as t2) | C type t2 = {[B] x : int; } The inline record would be identical to the external record with tag value and (t2 : t) would work even implicitly. MfG Goswin -- Caml-list mailing list. Subscription management and archives: https://sympa-roc.inria.fr/wws/info/caml-list Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] Wish: mutable variant types, equivalence with records
Lukasz Stafiniak lukst...@gmail.com writes: On Sat, Mar 24, 2012 at 7:39 PM, Lukasz Stafiniak lukst...@gmail.com wrote: On Sat, Mar 24, 2012 at 7:32 PM, Lukasz Stafiniak lukst...@gmail.com wrote: I'm not sure about mutable but I'd appreciate labels :D As for syntax, I think that unboxed anonymous records would be better. For starters, one could make a Camlp4 extension that generates a record type named typ_Variant for a type typ and its constructor Variant whose fields are defined as a record. Hmm... Yes please. That wouldn't eliminate the risk of the two types getting out-of-sync and save keystrokes. Record unboxing might be handled as an orthogonal issue? MfG Goswin -- Caml-list mailing list. Subscription management and archives: https://sympa-roc.inria.fr/wws/info/caml-list Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
[Caml-list] GADT and optional arguments: Can they work?
Hi, after having just written a ton of functions that just differ in the argument type I started to think that this might be a good use acase for GADT types. So instead of seperate get_int, get_int32, get_int64, ... functions there would only be one get function with an extra GADT argument to specifiy the type to be used. For example Unix.lseek and Unix.LargeFile.lseek could be written as: (* GADT for LargeFile *) type _ size = Int : int size | Int64 : int64 size let lseek : type a . a size - Unix.file_descr - a - Unix.seek_command - a = fun size fd off cmd - match size with | Int - Unix.lseek fd off cmd | Int64 - Unix.LargeFile.lseek fd off cmd val lseek : 'a size - Unix.file_descr - 'a - Unix.seek_command - 'a = fun So far so good. But now one has to specify the argument size on every lseek call. To make it nicer I would like the size to be optional and specify a default size for the most common use: let lseek : type a . ?size:a size - Unix.file_descr - a - Unix.seek_command - a = fun ?(size=Int) fd off cmd - match size with | Int - Unix.lseek fd off cmd | Int64 - Unix.LargeFile.lseek fd off cmd ;; (* * fun ?(size=Int) fd off cmd - *^^^ * Error: This expression has type int size *but an expression was expected of type a size *) Is there a way to write this so it types correctly? I guess I would need to somehow specify a default type for a for the a size argument to be optional. MfG Goswin -- Caml-list mailing list. Subscription management and archives: https://sympa-roc.inria.fr/wws/info/caml-list Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] Strategies for finding memory leaks
Hans Ole Rafaelsen hrafael...@gmail.com writes: Hi, I have tried to trigger Gc.compact but that only have limited effect for a short time. Hans Ole My thinking was rather that something triggers the GC manually and over time starts to do that far too often so more and more time is spend in the GC. If it isn't triggered manually then custom types can also give the GC a wrong idea about how much memory is used/wasted. But that is also just an idea of where to look. MfG Goswin -- Caml-list mailing list. Subscription management and archives: https://sympa-roc.inria.fr/wws/info/caml-list Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
[Caml-list] Wish: mutable variant types, equivalence with records
Hi, consider the code below that counts how often a value is printed. The reason why this code works is that the memory layout of a variant with arguments is identical to a record with the same types. The only difference is that a variant sets the tag of the memory block to reflect which constructor it is (here Bar = 0, Baz = 1). But the code is fragile. It requires the use of Obj.magic and duplicates the definitions of bar and baz (once as variant and once as record). If the type of foo is changed but not the records then bad things will happen. There are ways to do this without Obj.magic: type foo = Bar of bar_record | Baz of baz_record type foo = Bar of int * int ref | Baz of float * int ref The first adds an indirection for every access to foo and breaks matching. The second adds an indirection for every mutable value in the type. In both cases the extra indirections increase the memory footprint and runtime. So why not allow mutable in variant types and some equivalence with records? For example: type name = Constructor of [mutable] type[:label] [| ...] as in: type foo = | Bar of int:i * mutable int:bar_used | Baz of float:f * mutable int:baz_used let use x = match x with | Bar bar - bar.bar_used - bar.bar_used + 1 | Baz baz - baz.baz_used - baz.baz_used + 1 let print x = use x; match x with | Bar bar - Printf.printf %d\n bar.i | Baz baz - Printf.printf %f\n baz.f The label is optional and any types in the constructor without label would be translated into anonymous fields in a record that are ineaccessible. type foo = Foo of int * mutable int:used would be equivalent to { _ : int; mutable used : int; } Taking it one step wurther one could even allow: let bar = { i = 1; bar_used = 0; } let foo = Bar bar let foo = let Bar bar = foo in Bar { bar with i = 2; } What do you think? MfG Goswin == module Foo : sig type foo val make_bar : int - foo val make_baz : float - foo val get_used : foo - int val print : foo - unit end = struct type foo = Bar of int * int | Baz of float * int type bar_record = { i : int; mutable bar_used : int; } type baz_record = { f : float; mutable baz_used : int; } let make_bar i = Bar (i, 0) let make_baz f = Baz (f, 0) let use x = match x with | Bar _ - let (bar : bar_record) = Obj.magic x in bar.bar_used - bar.bar_used + 1 | Baz _ - let (baz : baz_record) = Obj.magic x in baz.baz_used - baz.baz_used + 1 let get_used = function Bar (_, used) | Baz (_, used) - used let print x = use x; match x with | Bar (i, _) - Printf.printf %d\n i | Baz (f, _) - Printf.printf %f\n f end;; let foo = Foo.make_bar 1 let used_before = Foo.get_used foo let () = Foo.print foo let used_after = Foo.get_used foo -- Caml-list mailing list. Subscription management and archives: https://sympa-roc.inria.fr/wws/info/caml-list Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] Strategies for finding memory leaks
Hans Ole Rafaelsen hrafael...@gmail.com writes: Hello, Is there some tools / tricks that can be used to help find memory leaks? I have trouble with an application, that when running for a long time, starts to use a lot of CPU and consume more memory. It starts out by using about 20% CPU (reported by top) and after 24 hours it has increased to 80% usage. Also physical (RES) memory usage goes from 80M to 160M. The workload for the application is the same the whole time. Using OProfile (http://oprofile.sourceforge.net/news/) shows that that most of the time is being spent doing memory management. At startup: CPU: Core 2, speed 2667 MHz (estimated) Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (Unhalted core cycles) count 10 samples % image name symbol name 52764 22.3913 vc_client.native mark_slice 33580 14.2502 vc_client.native caml_page_table_lookup 25415 10.7853 vc_client.native sweep_slice 10119 4.2942 vc_client.native caml_fl_allocate 6423 2.7257 [vdso] (tgid:9015 range:0x7fff4256c000-0x7fff4256d000) [vdso] (tgid:9015 range:0x7fff4256c000-0x7fff4256d000) 5233 2.2207 vc_client.native camlLividisvc__Nalbuf_tools__replace_pattern_1033 2759 1.1708 vc_client.native caml_iterate_global_roots 2728 1.1577 vc_client.native caml_modify 2473 1.0495 vc_client.native caml_oldify_one 2204 0.9353 vc_client.native camlLividisvc__Nalbuf_bytestream__search_1047 2183 0.9264 vc_client.native caml_darken 1935 0.8212 vc_client.native caml_stash_backtrace 1843 0.7821 vc_client.native caml_do_roots 1838 0.7800 vc_client.native caml_delete_global_root After ca. 24 hours run: CPU: Core 2, speed 2667 MHz (estimated) Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (Unhalted core cycles) count 10 samples % image name symbol name 1137401 56.2697 vc_client.native mark_slice 405598 20.0658 vc_client.native sweep_slice 399832 19.7806 vc_client.native caml_page_table_lookup 10106 0.5000 vc_client.native caml_fl_allocate 3548 0.1755 vc_client.native caml_iterate_global_roots 3397 0.1681 [vdso] (tgid:26129 range:0x7fff747ff000-0x7fff7480) [vdso] (tgid:26129 range:0x7fff747ff000-0x7fff7480) 2797 0.1384 vc_client.native camlLividisvc__Nalbuf_tools__replace_ pattern_1033 2307 0.1141 vc_client.native camlLividisvc__Nalbuf_bytestream__sea rch_1047 2005 0.0992 vc_client.native caml_oldify_local_roots 1786 0.0884 vc_client.native caml_gc_stat 1441 0.0713 vc_client.native caml_oldify_one 1163 0.0575 vc_client.native caml_darken 1163 0.0575 vc_client.native caml_fl_merge_block 1032 0.0511 vc_client.native camlHashtbl__find_1093 The application uses several 3rd party libraries, including: LablGTK2, OCamlNet, LWT and others. Is there some clever trick that can by used to track down or get a hint of what is causing this? Thanks, Hans Ole Rafaelsen Are you calling the GC manually somewhere in the code or in one of the libs? MfG Goswin -- Caml-list mailing list. Subscription management and archives: https://sympa-roc.inria.fr/wws/info/caml-list Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
[Caml-list] Re: Wanted: GADT examples: string length, counting module x
o...@okmij.org writes: Somehow typed tagless interpreters (embeddings of a typed language) and length-parameterized lists with the append function are the most popular examples of GADTs. Neither of them seem to be particularly good or compelling examples. One can write typed interpreters in the final-tagless style, with no GADTs. Writing append function whose type says the length of the resulting list is the sum of the lengths of the argument lists is cute. However, this example does not go too far, as one discovers when trying to write List.filter for length-parameterized lists. The ML2010 talk on GADT emulation specifically used a different illustrating example: a sort of generic programming, or implementing N-morphic functions: http://okmij.org/ftp/ML/first-class-modules/first-class-modules-talk-notes.pdf An interesting talk to read. But unless I'm mistaken that isn't using ocamls new GADT types but implements it own. MfG Goswin -- Caml-list mailing list. Subscription management and archives: https://sympa-roc.inria.fr/wws/info/caml-list Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
[Caml-list] GADT example for universal list
Hi, I played some more with GADTs, this time with an universal list. Well, not truely universal as you can't store arbitrary types. Only Int and Float are allowed here: type _ kind = (* used for searching in the list *) | Int_kind : int kind | Float_kind : float kind type value = (* box the values so we have runtime type information *) | Int of int | Float of float type list = (* the universal list *) | Nil : list | Cons : value * list - list (* find the first value in the list of a given kind *) let rec get : type a . list - a kind - a = fun l k - match (k, l) with | (_, Nil) - raise Not_found | (Int_kind, Cons (Int x, xs)) - x | (Float_kind, Cons (Float x, xs)) - x | (_, Cons (_, xs)) - get xs k (* print out list *) let rec print = function | Nil - print_newline () | Cons ((Int x), xs) - Printf.printf %d x; print xs | Cons ((Float x), xs) - Printf.printf %f x; print xs (* testing *) let empty = Nil let l1 = Cons ((Int 1), empty) let l2 = Cons ((Float 2.), l1) let () = print l2 let i = get l2 Int_kind let f = get l2 Float_kind;; (* 2.00 1 type _ kind = Int_kind : int kind | Float_kind : float kind type value = Int of int | Float of float type list = Nil : list | Cons : value * list - list val get : list - 'a kind - 'a = fun val print : list - unit = fun val empty : list = Nil val l1 : list = Cons (Int 1, Nil) val l2 : list = Cons (Float 2., Cons (Int 1, Nil)) val i : int = 1 val f : float = 2. *) At first glance you might think: Why does that need GADTs? Why not simply use type value = Int of int | Float of float for this? Take a close look at the get funktion: val get : list - 'a kind - 'a = fun It does not return a value but directly the unboxed type. Because the value is unboxed you can directly use it and ocaml will detect if you screw up the type like in: let () = Printf.printf %d\n (get l2 Float_kind);; Error: This expression has type float but an expression was expected of type int MfG Goswin -- Caml-list mailing list. Subscription management and archives: https://sympa-roc.inria.fr/wws/info/caml-list Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] Wanted: GADT examples: string length, counting module x
Raoul Duke rao...@gmail.com writes: On Mon, Mar 19, 2012 at 9:13 AM, Jesper Louis Andersen If you want to play with dependent types, there are two ways which seem popular at the moment: Agda or Coq. and some not popular ones... http://www.ats-lang.org/ http://sandycat.info/blog/deptypes-shen/ et. al. Neigther of which are examples for GADT in ocaml. MfG Goswin -- Caml-list mailing list. Subscription management and archives: https://sympa-roc.inria.fr/wws/info/caml-list Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] Can one implement greedy/inline data structures in ocaml?
Markus Weißmann markus.weissm...@in.tum.de writes: Taking one step back, what you actually want is a data structure that: -is a container for elements of some type 'a -some elements are tagged with a state tag 'b -the elements have an ordering -you can add elements -you can tag elements with a state tag -you can remove the current element -you can remove the tag from the current element -you can get the current element -you can move the current element one step (forwardbackward) -you can get the tag of the current element -you can move the current element one step to the next/prev tagged one No, that was just the simplest example I could think of. And you want this structure to be fast, right? Or do you only care about the concrete implementation? You should think about the type of the elements and tags you want to be able to use: Do they have a natural ordering? Are they hashable? I would advice you to 1st write the interface with the functions you require, then think about the implementation. I'm under the impression that you have a decent C implementation and want to transfer that to OCaml at all costs. I want a set of standard containers that are combinable so that given an element from any one of the containers it can be removed from all of them efficiently. Efficient doesn't allways equal fast. What I don't want is an operation that should be O(1) to suddenly take O(log n). More generally given an element from any one of the containers all the normal operation of any other container should be possible. All containers should have a remove this element operation. Lists have a next. Double linked lists have next and prev. Priority queue should have change priority. Queue might have move to front and move to back. MfG Goswin -- Caml-list mailing list. Subscription management and archives: https://sympa-roc.inria.fr/wws/info/caml-list Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] Can one implement greedy/inline data structures in ocaml?
Gabriel Scherer gabriel.sche...@gmail.com writes: I think this is a case of using a fancy new construction when a simpler construction would do just as well. For some reasons some people, even beginners, are absolutely fond of first-class modules, and insist on using them in situation where they're really not needed. I not eager to see what they will do with GADTs... In your case, you could simply use ('a - 'a dllist) instead of your Accessor module. Your code works just fine with the following definitions of make_accessor and get_next: let make_accessor f = f (* useless, for code compatibility *) let get_next get y = (get y).next I think learning *not to use* fancy features is just as fun as using them. Sure. I think you missed the For the fun of it right at the begining. One advantage of using a first class module though is that they are extensible. I could define 2 accessor modules: let module DList_Accessor = struct type a = c let get_next x = get_next x.dlist let get_prev x = get_prev x.dlist let set_next x = set_next x.dlist let set_prev x = set_prev x.dlist end let module PrioQueue_Accessor = struct type a = c let get_next x = get_next x.queue let get_prev x = get_prev x.queue let set_next x = set_next x.queue let set_prev x = set_prev x.queue let get_prio x = get_prio x.queue let set_prio x = set_prio x.queue end Now any function that operates on (module DList) can also be called with (module PrioQueue). You can't do that with closures or records of closures. You can do it with objects though. Note: you probably wouldn't implement a priotity queue as doubly linked list. Just an example. On Fri, Mar 9, 2012 at 8:50 AM, Goswin von Brederlow goswin-...@web.de wrote: Gabriel Scherer gabriel.sche...@gmail.com writes: I have implemented a small example in two versions: one with external dllist (I'm using the Batteries Dllist module), which indeed needs a small hack to bootstrap a single-element cycle, and one with inline pointers as you show, using simple value recursion. In the second case I also have a small abstraction layer for traversals -- only instantiated in the 'iter' case currently -- that could save some code duplication; that must be one of the closures tricks you were thinking of, but I don't claim that it's as efficient than duplicated specialized traversal functions. The code is here: https://gitorious.org/gasche-snippets/outside-or-inline-lists/blobs/master/goswin_dllists.ml For the fun of it I wrote a version with first class modules. It is pretty much like your inline solution with records of closures, just a different syntax. The one thing I can't eliminate is the code duplication for the initialization. I'm starting to believe there is no way around using Obj.magic or option types there. At least for the verry first task (in each dlist). The right-hand restriction in a let rec simply prevents putting that code into a function. For DList the duplication is minimal because the container is so simple. But for more complex container more code would have to be duplicated. The code could be hidden behind camlp4 though. MfG Goswin -- (* Type for doubly linked list *) type 'a dlist = { mutable next : 'a; mutable prev : 'a; } (* Module to access a container in another type *) module type Accessor = sig type a val get : a - a dlist end (* create an access module from a closure *) let make_accessor : 'a . ('a - 'a dlist) - (module Accessor with type a = 'a) = fun (type c) (fn : c - c dlist) - let module M = struct type a = c let get x = fn x end in (module M : Accessor with type a = c) (* Iterator over a DList *) let get_next : 'a . (module Accessor with type a = 'a) - 'a - 'a = fun (type a) x y - let module D = (val x : Accessor with type a = a) in (D.get y).next (* Now lets use this: *) (* A task has 2 DList: all and state *) type task = { all : task dlist; state : task dlist; name : string; } (* Accessors for the two DLists *) let all = make_accessor (fun task - task.all) let state = make_accessor (fun task - task.state) (* Some sample tasks *) let rec task1 = { all = { next = task2; prev = task2; }; state = { next = task1; prev = task1; }; name = task1; } and task2 = { all = { next = task1; prev = task1; }; state = { next = task2; prev = task2; }; name = task2; } let () = Printf.printf all_next = %s, state_next = %s \n (get_next all task1).name (get_next state task1).name MfG Goswin -- Caml-list mailing list. Subscription management and archives: https://sympa-roc.inria.fr/wws/info/caml-list Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] Can one implement greedy/inline data structures in ocaml?
Gerd Stolpmann i...@gerd-stolpmann.de writes: Am Donnerstag, den 08.03.2012, 06:11 +0100 schrieb Goswin von Brederlow: Gabriel Scherer gabriel.sche...@gmail.com writes: So then you need mutable option types or mutables that you initialize with dummy values and then overwrite with the real thing once all members of a cycle are created. Or some other trickery to make it work. Overall cycles are generally not good. I believe the trick you need (either passing a dummy value of your type, or Obj.magic) is less ugly that your Camlp4 solution for inline access. If I really needed absolute performance, I'd use the inline version just like in C, with mutable fields, but without preprocessor sugar. Preprocessing could avoid a bit of duplication (corresponding to manual inlining) on the structure-definition side, but wouldn't simplify the structure-using code much. How? type task = { mutable all_next : task; mutable all_prev : task; mutable state_next : task; mutable state_prev : tast; ... } Now how do yout write DList.insert, DList.remove, DList.iter, ...? I'm looking for some nice tricks using closures, functors, first class modules or something to make it look pretty and easy to use. There is a solution with functors. However, I don't dare to predict how fast it is (functors add additional indirection): module type DLINKABLE = sig type t val next : t - t option val prev : t - t option val set_next : t - t option - unit val set_prev : t - t option - unit end module type DHEAD = sig type dlist type t val first : dlist - t option val last : dlist - t option val set_first : dlist - t option - unit val set_last : dlist - t option - unit end module DLIST(H : DHEAD)(L : DLINKABLE with type t = H.t) = struct (* Here define operations over lists, e.g. *) let remove (dl:H.dlist) (x:D.t) = ( match D.prev x with | None - H.set_first dl (D.next x) | Some p - D.set_next (D.next x) ); ( match D.next x with | None - H.set_last dl (D.prev x) | Some n - D.set_prev (D.prev x) ) end For a concrete application you would only have to implement DLINKABLE and DHEAD, and then apply DLIST on these implementations. If elements are members of several lists, do this for each list separately. I guess you could define macros for generating DLINKABLE and DHEAD (even the macros built into camlp4 could be good enough for this). Gerd That is a start. But you used an open ended doubly linked list. This makes it simpler because you can have an empty list and initialize items to not be linked to anything before inserting them. That avoids the cycles. If you have a closed (cyclic or ring) doubly linked list the problem becomes how to create the first element that must link to itself. Can you come up with something without option type or Obj.magic? Something that doesn't duplicate the initialization of the DList for each instance. MfG Goswin -- Caml-list mailing list. Subscription management and archives: https://sympa-roc.inria.fr/wws/info/caml-list Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] Can one implement greedy/inline data structures in ocaml?
Gabriel Scherer gabriel.sche...@gmail.com writes: I have implemented a small example in two versions: one with external dllist (I'm using the Batteries Dllist module), which indeed needs a small hack to bootstrap a single-element cycle, and one with inline pointers as you show, using simple value recursion. In the second case I also have a small abstraction layer for traversals -- only instantiated in the 'iter' case currently -- that could save some code duplication; that must be one of the closures tricks you were thinking of, but I don't claim that it's as efficient than duplicated specialized traversal functions. The code is here: https://gitorious.org/gasche-snippets/outside-or-inline-lists/blobs/master/goswin_dllists.ml For the fun of it I wrote a version with first class modules. It is pretty much like your inline solution with records of closures, just a different syntax. The one thing I can't eliminate is the code duplication for the initialization. I'm starting to believe there is no way around using Obj.magic or option types there. At least for the verry first task (in each dlist). The right-hand restriction in a let rec simply prevents putting that code into a function. For DList the duplication is minimal because the container is so simple. But for more complex container more code would have to be duplicated. The code could be hidden behind camlp4 though. MfG Goswin -- (* Type for doubly linked list *) type 'a dlist = { mutable next : 'a; mutable prev : 'a; } (* Module to access a container in another type *) module type Accessor = sig type a val get : a - a dlist end (* create an access module from a closure *) let make_accessor : 'a . ('a - 'a dlist) - (module Accessor with type a = 'a) = fun (type c) (fn : c - c dlist) - let module M = struct type a = c let get x = fn x end in (module M : Accessor with type a = c) (* Iterator over a DList *) let get_next : 'a . (module Accessor with type a = 'a) - 'a - 'a = fun (type a) x y - let module D = (val x : Accessor with type a = a) in (D.get y).next (* Now lets use this: *) (* A task has 2 DList: all and state *) type task = { all : task dlist; state : task dlist; name : string; } (* Accessors for the two DLists *) let all = make_accessor (fun task - task.all) let state = make_accessor (fun task - task.state) (* Some sample tasks *) let rec task1 = { all = { next = task2; prev = task2; }; state = { next = task1; prev = task1; }; name = task1; } and task2 = { all = { next = task1; prev = task1; }; state = { next = task2; prev = task2; }; name = task2; } let () = Printf.printf all_next = %s, state_next = %s \n (get_next all task1).name (get_next state task1).name -- Caml-list mailing list. Subscription management and archives: https://sympa-roc.inria.fr/wws/info/caml-list Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
[Caml-list] Can one implement greedy/inline data structures in ocaml?
Hi, I'm wondering if it is possible to implement greedy/inline data structures in ocaml in a generic and reusable way. Maybe you don't know what greedy/inline data structures are or maybe I'm using the wrong name. So here is what I mean: Say you have a task structure that is in 2 doubly linked list. One list links all the structures together and the second list links structures with the same internal state (running, sleeping, waiting, ...). In C you would then have something like this: struct task { DLIST(task, all); DLIST(task, state); foo_t data; } The DLIST makro adds the prev/next pointers to the struct, one set prefixed with all, the other with state. There would also be makros to to init, insert, remove and iterate of the two lists. The internal pointer needed for the DLIST are part of the struct task. The big advantage of this (and why I ask) is that given a task one can esily remove it from both lists. On the other hand if the list is external to the struct and only contains a pointer to the task then removing a task becomes difficult. The task then needs pointers to each of the lists data structures creating cycles. Not good for ocaml. It also would waste memory for 2 pointers (per list). So how would you design such greedy/inline data structures in ocaml? Or how else would you design containers (list, doubly linked list, queue, ...) so that things can be in multiple containers and removing them is fast? MfG Goswin -- Caml-list mailing list. Subscription management and archives: https://sympa-roc.inria.fr/wws/info/caml-list Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] Can one implement greedy/inline data structures in ocaml?
Edgar Friendly thelema...@gmail.com writes: On 03/07/2012 09:41 AM, Goswin von Brederlow wrote: The task then needs pointers to each of the lists data structures creating cycles. Not good for ocaml. It also would waste memory for 2 pointers (per list). Cycles are fine for ocaml, pointers are pretty cheap, and I think the answer to your question is no - there's a regularity to values that's required by the garbage collector, and what you seem to want is the ability to inline nested structures without pointers, and OCaml's data representation doesn't allow this, mainly because of the tag word needed by the GC. Beware premature optimization. E. The problem with cycles is that you need to define all structures in a cyled atomically as the same time. That means using let rec foo = ... and bar = ... and sometimes needing to lift function calls out of the initialization. For example and from memory you couldn't write this: let rec task = { all_list = all_list; state_list = state_list; foo = bar; } and all_list = DList.create_item all_list_head task and state_list DList.create_item state_list_head task At the time the DList.create_item is called the task binding is still incomplete and you get an error that it can't be on the right-hand side. So then you need mutable option types or mutables that you initialize with dummy values and then overwrite with the real thing once all members of a cycle are created. Or some other trickery to make it work. Overall cycles are generally not good. Inlining the DList wihtout another indirection like in C would be nice but would certainly require some camlp4 code to pull off. But it would certainly be enough to pull in a pointer to the list structure as long as you don't also need a back pointer inside the list structure. The cycle is the problem there, just one direction is fine. MfG Goswin -- Caml-list mailing list. Subscription management and archives: https://sympa-roc.inria.fr/wws/info/caml-list Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] Bigarray question: Detecting subarrays, overlaps, ...
Gerd Stolpmann i...@gerd-stolpmann.de writes: What I did not understand up to now: It is really easy to keep all this information in a helper data structure. Why extend Bigarray? Gerd A helper data structure would mean keeping track of that information on my own complicating the source and wasting memory at runtime. On the other hand what I suggested only makes use of what is already there in the Bigarray data structure. The functions would only make that information available to the user. I'm not a fan of duplicating information that is already there. Worst case the helper data structure could be wrong due to some bug in the code. MfG Goswin -- Caml-list mailing list. Subscription management and archives: https://sympa-roc.inria.fr/wws/info/caml-list Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] Bigarray question: Detecting subarrays, overlaps, ...
Gerd Stolpmann i...@gerd-stolpmann.de writes: Am Dienstag, den 28.02.2012, 21:17 +0100 schrieb Goswin von Brederlow: Hi, I'm implementing a RAID in userspace using ocaml and NBD (Network Block Device) as protocol to export the device. For this I'm using Bigarray.Array1 as buffer for data and wrote myself the right read(), write(), pread() and pwrite() stubs. The advantage of this (over strings) is that Bigarrays are not copied around by the GC so they don't need to copy data around between the Bigarray and a temporary buffer in the C stub. I used the same approach for PlasmaFS. The bigarray-based reads and writes are really missing in the stdlib. (If anybody wants to experiment, look into the Netsys_mem module of Ocamlnet, and use the functions mem_read and mem_write.) Btw, you should try to allocate the bigarrays in a page-aligned way. This makes the syscalls even more efficient. Which is actualy a requirement for libaio. At least alignment to blocksize. My libaio bindings include a stub to create an aligned Bigarray. In my use case, I did not write to devices directly, and could assume that the blocks are backed by the page cache. So I did not run into this problem you describe in the following. For efficiency each request stores all its data in a single Bigarray.Array1. For reasons of the RAID implementation large requests are split into 4k chunks using Bigarray.Array1.sub and grouped into stripes. The stripes are then acted upon independently until all stripes involved in a request are finished. Then the request is completed. Now I get a problem when 2 requests come in that overlap. Say I get a write request for blocks 0 - 6 and a read request for blocks 4-9 in that order. Since I already have the data for block 4-6 in memory from the write request I do not want to reread them from disk. On the stripe level the data looks like this: |W0 W1 W2 W3 W4 W5 W6 . . . | write 0-6 |W4 W5 W6 R7 R8 R9 | read 4-9 For the read request I need to copy W4-6 to R4-6 or send out W4-6 + R7-9 in two writes. I think I would like to avoid sending each stripe chunk out seperately. Why not? This seems to be an elegant solution, and I don't see why this should make the accesses slower. The time for the extra context switches in negligible compared to the disk accesses. A lot of the time data will just move between caches so disk speed is secondary. And sending each chunk seperately would mean up to 32 context switches instead of 1. And writing to a socket in chunks can lead to the data being send out in frames less than the MTU which would be wastefull. So I think there is some gain in limiting this. On the other hand I could implement (p)readv() and (p)writev() stubs. Anyway, my question now is how to detect which subarrays in the stripes are part of a common larger array? Do I need to write my own C stub that looks into the internas of the arrays to see if they share a common ancestor? I think that would be preferable to tracking the origin of each subarray myself. Yes, subarrays are tracked, but this feature exists only to catch the right moment for unmmapping bigarrays (if needed). As far as I remember, this is not tracked as a sub/super relationship, but the bigarrays accessing the same buffer share then the same buffer descriptor, and when the use count drops to 0, the buffer is finally destroyed. So, you cannot say which bigarray is the original one, and it can even be that the original bigarray is already deallocated but the backing buffer is not yet. All subarrays share a struct caml_ba_proxy, as you say to catch the right moment for unmmapping bigarrays. So two subarrays are part of the same bigarray if they have the same proxy object. That seems like an easy enough test. Which one is the original bigarray doesn't matter, at least to me. On a similar note how does Bigarray.Array1.blit handle arrays that are part of the same larger array and overlap? -- I'm missing functions like: val super : ('a, 'b, 'c) t - ('a, 'b, 'c) t - ('a, 'b, 'c) t Create a merged sub-array of 2 adjacent sub-arrays of the same big array. This function would be possible to implement. The requirement would be that both bigarrays use the same buffer descriptor. val same_parent : ('a, 'b, 'c) t - ('a, 'b, 'c) t - bool True if the 2 (sub-)arrays are part of the same big array. I would not call it same_parent, but same_backing_buffer. Maybe related would be better? val offset : ('a, 'b, 'c) t - int Offset of the sub-array in the underlying big array. I think this information is in the current implementation not available. As buffer sharing is also possible after reshaping, this is also not meaningful in the general case. offset = array-data - array-proxy-data; After reshaping the offset would be the offset into the original array after reshaping
[Caml-list] Bigarray question: Detecting subarrays, overlaps, ...
Hi, I'm implementing a RAID in userspace using ocaml and NBD (Network Block Device) as protocol to export the device. For this I'm using Bigarray.Array1 as buffer for data and wrote myself the right read(), write(), pread() and pwrite() stubs. The advantage of this (over strings) is that Bigarrays are not copied around by the GC so they don't need to copy data around between the Bigarray and a temporary buffer in the C stub. For efficiency each request stores all its data in a single Bigarray.Array1. For reasons of the RAID implementation large requests are split into 4k chunks using Bigarray.Array1.sub and grouped into stripes. The stripes are then acted upon independently until all stripes involved in a request are finished. Then the request is completed. Now I get a problem when 2 requests come in that overlap. Say I get a write request for blocks 0 - 6 and a read request for blocks 4-9 in that order. Since I already have the data for block 4-6 in memory from the write request I do not want to reread them from disk. On the stripe level the data looks like this: |W0 W1 W2 W3 W4 W5 W6 . . . | write 0-6 |W4 W5 W6 R7 R8 R9 | read 4-9 For the read request I need to copy W4-6 to R4-6 or send out W4-6 + R7-9 in two writes. I think I would like to avoid sending each stripe chunk out seperately. On the other hand I could implement (p)readv() and (p)writev() stubs. Anyway, my question now is how to detect which subarrays in the stripes are part of a common larger array? Do I need to write my own C stub that looks into the internas of the arrays to see if they share a common ancestor? I think that would be preferable to tracking the origin of each subarray myself. On a similar note how does Bigarray.Array1.blit handle arrays that are part of the same larger array and overlap? -- I'm missing functions like: val super : ('a, 'b, 'c) t - ('a, 'b, 'c) t - ('a, 'b, 'c) t Create a merged sub-array of 2 adjacent sub-arrays of the same big array. val same_parent : ('a, 'b, 'c) t - ('a, 'b, 'c) t - bool True if the 2 (sub-)arrays are part of the same big array. val offset : ('a, 'b, 'c) t - int Offset of the sub-array in the underlying big array. MfG Goswin -- Caml-list mailing list. Subscription management and archives: https://sympa-roc.inria.fr/wws/info/caml-list Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] Fwd: interval trees
Francois Berenger beren...@riken.jp writes: Hello, Anyone can translate this into being tail recursive if it's possible: let rec interval_tree intervals = match intervals with [] - Empty | _ - let x_mid = median intervals in let left, mid, right = partition intervals x_mid in let left_list = L.sort leftmost_bound_first mid in let right_list = L.sort rightmost_bound_first mid in Node (x_mid, left_list, right_list, interval_tree left, interval_tree right) I'm afraid my program could crash on a very long list of intervals. Thanks a lot, F. Each recursion halves the lists, right? That means you need a very very very long list to exceed the recursion depth. MfG Goswin -- Caml-list mailing list. Subscription management and archives: https://sympa-roc.inria.fr/wws/info/caml-list Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] Fwd: interval trees
Francois Berenger beren...@riken.jp writes: Hello, I did a naive implementation of interval trees for float intervals. It is available here: https://github.com/HappyCrow/interval-tree I wonder if it is possible to construct the trees in a tail recursive fashion. Maybe I knew how to do this when I was still at university. Regards, Francois. | Node of (* x_mid left_list right_list left_tree right_tree *) float * interval list * interval list * interval_tree * interval_tree Why interval list? You only need a single interval in leafes and none in other nodes (although it can help to store min and max in each node). You are also missing insert and remove operations, which is the actually hard part in this. Inserting an interval might require merging the rightmost interval left of the root and the leftmost interval right of the root. So you would have 2 removals and one insertion of a combined interval, which complicates balancing the whole thing efficiently. That is the part I'm struggling with. MfG Goswin -- Caml-list mailing list. Subscription management and archives: https://sympa-roc.inria.fr/wws/info/caml-list Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] Hashtbl and security
Gerd Stolpmann i...@gerd-stolpmann.de writes: Well, looks like that the upper 32 bits aren't considered at all for computing the hash value. This is for sure surprising - I would have expected that these bits somehow go in to the final value (e.g. that they are xor-ed with the lower half). My guess is that this was done to ensure that an int-to-something hashtable can be marshalled from 64 to 32 bit systems. Not sure for what this is important, but it looks like a nice symmetry. Marshaling hashtables of ints might be an argument. BUT: # Hashtbl.hash (1 lsr 40);; - : int = 0 # Hashtbl.hash (Int64.of_int (1 lsr 40));; - : int = 0 # Hashtbl.hash (Int64.of_int (1 lsr 40 + 1));; - : int = 1 In trunk the method for ensuring this seems to have changed (see function caml_hash_mix_intnat in http://caml.inria.fr/cgi-bin/viewvc.cgi/ocaml/trunk/byterun/hash.c?revision=12072view=markup). Gerd Lets hope that fixes this. MfG Goswin -- Caml-list mailing list. Subscription management and archives: https://sympa-roc.inria.fr/wws/info/caml-list Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] Hashtbl and security
AUGER Cédric cedric.au...@lri.fr writes: More seriously, hashtables are here to store values, so you don't want want to have a hash value bigger than (2^30)-1. In fact even such a value is too big for that purpouse IMHO, since on a 4Gio RAM computer, it will take half it to store the table, assuming each entry only takes 4 octets, that is it would be ok only for storing just the structure of the table (each entry has to be a pointer), so that in practice, either your table is ridiculously big to store too few values either it is relevant to have such a size and in this case your computer will spend all its time to swap since you won't have enough RAM. But I have 128GB ram. :) That allows for a hashtable of size 2^33 entries pointing at simple values (e.g. ints) without swapping. And ram sizes get bigger and bigger. [Doesn't a Hashtbl of ints store the ints directly? That would allow 2^34 entries.] Clearly you cannot a bijection from 63 bits integers to 31 bits integers; so there are a lot of collisions. I do not have a 64 arch, but I guesse that if you hash 2^48+168 you will get 168. I guess such a function is ways to be ideal, since when playing with bitvectors having two integers which differs by only one bit is very common, but at least it is a fast function, and has a good repartition (you have 2^32 collisions exactly per bucket). Ignoring the upper bits makes for a verry verry poor hash function. Also means it is 100% trivial to create a DOS by tuneing your input values to only differ in the upper bits. MfG Goswin -- Caml-list mailing list. Subscription management and archives: https://sympa-roc.inria.fr/wws/info/caml-list Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] Hashtbl and security
Gerd Stolpmann i...@gerd-stolpmann.de writes: Gerd Stolpmann i...@gerd-stolpmann.de writes: You are forgetting a variable in this. To create a DOS in the hashtable you need to hit the same bucket again and again. And bucket = hash % size. You focused on the hash but lets talk about the size for a moment. The size is rather limited and large hashtabels simply won't increate size anymore and allways have lots of collisions. So even if someone isn't actively trying to create DOS the performace still tanks as you add more items. I cannot completely follow here. If you add more elements, the bucket array is resized. The ocaml Hashtbl does this when the number of elements exceeds half of the array size. The argument of Hashtbl.create is only the initial size of the bucket array. The upper bound is Sys.max_array_length, of course. Granted, on 32 bit platforms it is low (4M). But nobody of sound mind would run ocaml programs on 32 bit for production purposes. (If you do, consider to switch. You can use more memory and get a little performance boost, not only for ocaml programs.) Gerd In practice I see a serious performance degradation with large hashtables. So much so that I had to use an array of hashtables or hashtable of hashtables and split the values across them. Maybe the size of the hashtable indeed isn't the problem but the hash function. Here is something odd: # for i = 1 to 63 do Printf.printf %d %d\n (1 lsl i) (Hashtbl.hash (1 lsl i)) done;; 2 2 4 4 8 8 16 16 32 32 64 64 128 128 256 256 512 512 1024 1024 2048 2048 4096 4096 8192 8192 16384 16384 32768 32768 65536 65536 131072 131072 262144 262144 524288 524288 1048576 1048576 2097152 2097152 4194304 4194304 8388608 8388608 16777216 16777216 33554432 33554432 67108864 67108864 134217728 134217728 268435456 268435456 536870912 536870912 1073741824 0 2147483648 0 4294967296 0 8589934592 0 17179869184 0 34359738368 0 68719476736 0 137438953472 0 274877906944 0 549755813888 0 1099511627776 0 219902322 0 4398046511104 0 8796093022208 0 17592186044416 0 35184372088832 0 70368744177664 0 140737488355328 0 281474976710656 0 562949953421312 0 1125899906842624 0 2251799813685248 0 4503599627370496 0 9007199254740992 0 18014398509481984 0 36028797018963968 0 72057594037927936 0 144115188075855872 0 288230376151711744 0 576460752303423488 0 1152921504606846976 0 2305843009213693952 0 -4611686018427387904 0 0 0 - : unit = () Is every value over a billion hashed to 0? MfG Goswin -- Caml-list mailing list. Subscription management and archives: https://sympa-roc.inria.fr/wws/info/caml-list Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] Hashtbl and security
Gerd Stolpmann i...@gerd-stolpmann.de writes: Hi, there was recently a security alert for web services that use hash tables to store web form parameters sent via POST (so that millions of such parameters can be sent in a single request). It is possible to keep the web service busy for hours with such a DoS (denial of service) attack. The type of attack boils down to a problem in most hash table implementations, namely that the hash functions are invertible, and it is possible for a malicious user to construct lots of keys that all map to the same bucket of the hash table, creating a mass collision. The text of the alert: http://www.nruns.com/_downloads/advisory28122011.pdf I'd like to discuss this issue, because it is not restricted to the processing of web requests, but may also occur for all other data coming from untrusted sources. The web is only the most exposed area where this issue exists. So how is Ocaml affected? The hash functions used in recent Ocaml releases are also insecure in the above mentioned sense (currently MurmurHash3, and even a simpler hash function in previous releases). A quick survey of the Internet revealed at least one site that tries to break it. Probably a good cryptographer could do it in minutes. A pure Hashtbl.add of the constructed keys does not yet lead to the performance degradation, but a Hashtbl.replace, and of course Hashtbl.find after the table is built up will. So it depends very much of the details of the programs whether they are affected or not. I've just checked that Ocamlnet uses only Hashtbl.add to collect POST parameters, so it is not directly vulnerable. But if the crafted request is actually served by a handler, the handler would get a degraded table, and could show in turn bad performance (again leading to DoS). What are possible fixes? 1) Avoid hash tables in contexts where security is relevant. The alternative is Set (actually a balanced binary tree), which does not show this problem. Unfortunately O(log n) O(1) and that can be a deciding factor in the overall runtime. Even for small n your code then runs 2,3,4,... times slower. 2) Use cryptographically secure hash functions. 3) Use randomized hash tables. The trick here is that there is not a single hash function h anymore, but a family h(1)...h(n). When the hash table is created, one of the functions is picked randomly. This makes it impossible to craft an attack request, because you cannot predict the function. I don't think 1) is viable - hash tables are too wide spread, and are loved by programmers because of their good performance. 2) would be good in extremely critical contexts - although it is then again questionable, because it is likely to have worse performance than 1). So, the question is how to do 3). I see two problems here: a) how to define the family of hash functions. Is it e.g. sufficient to introduce an initialization vector for the Murmurhash algorithm, and fill it randomly? How to get a random number that is good enough? b) the Hashtbl in the standard library does not allow it to set the hash function dynamically. Maybe one can get this effect by using first-class modules (haven't checked). Anyway, I think these problems are difficult enough to deserve some discussion here. At least I cannot solve them immediately, and this problem may exist for lots of Ocaml applications. Gerd You are forgetting a variable in this. To create a DOS in the hashtable you need to hit the same bucket again and again. And bucket = hash % size. You focused on the hash but lets talk about the size for a moment. The size is rather limited and large hashtabels simply won't increate size anymore and allways have lots of collisions. So even if someone isn't actively trying to create DOS the performace still tanks as you add more items. And this isn't only a problem in ocaml. The glib hashtable also has a maximum size in the 32bit range for example. So for ocaml the first thing that needs to be fixed is to allow larger size arrays of buckets. MfG Goswin -- Caml-list mailing list. Subscription management and archives: https://sympa-roc.inria.fr/wws/info/caml-list Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] Hashtbl and security
Xavier Leroy xavier.le...@inria.fr writes: On 01/01/2012 01:52 PM, Richard W.M. Jones wrote: On Fri, Dec 30, 2011 at 06:06:26PM +0100, Xavier Leroy wrote: Indeed. The optional seed parameter to Hashtbl.create does exactly this in the new implementation of Hashtbl (the one based on Murmur3). It may be worth noting that Perl solved this problem (back in 2003) by unconditionally using a seed which is a global set to a random number during interpreter initialization. That's how my initial reimplementation of Hashtbl worked, using the Random module to produce seeds, but I was told (correctly) that in security-sensitive applications it's better to leave the generation of random numbers under control of the programmer. For some applications Random.self_init might be good enough and for others stronger randomness is needed. Of course, you can trivially emulate Perl's behavior using the new Hashtbl implementation + the Random module. - Xavier Leroy It is also crucial if you are doing performance tests or debugging. You want the same behaviour on every run for that. MfG Goswin -- Caml-list mailing list. Subscription management and archives: https://sympa-roc.inria.fr/wws/info/caml-list Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] OCaml maintenance status / community fork (again)
Mehdi Dogguy me...@dogguy.org writes: On 12/12/2011 11:59 AM, Benedikt Meurer wrote: On Dec 12, 2011, at 11:21 , Xavier Leroy wrote: - It complicates the lives of OCaml users, packagers, and 3rd-party library developers: what version should they use? what will be the basis for the packagers's distribution-specific patches? what happens if a library is compatible with one version but not the other? what if the community effort runs out of steam in a few years? If we can adopt the eglibc model, then the community thing will be the version shipped by distributions, i.e. the community thing is the OCaml for distributions/packagers, not an alternative to the official version. That way we do no longer need to maintain specific patches / versions for Debian, Red Hat, MacPorts, etc. This ensures that versions are compatible across different distributions (because they do no longer need to maintain their own set of patches). No, distributions won't start shipping the community distribution just because it exists. I cannot speak for Fedora (and others), but Debian/Ubuntu won't switch to the community distribution that easily. In fact, it may appear as a seperate source package at some point but won't replace INRIA's ocaml in Debian. That is probably what they said about eglibc too in the beginning. Now eglibc is used in Debian/Ubuntu and is (or because it is) a great success. Obviously nobody is going to say: Yes, lets start this project, we will ship it right away. It first has to proof itself. And first adopters will be those that have the most problems with the standard ocaml. MfG Goswin -- Caml-list mailing list. Subscription management and archives: https://sympa-roc.inria.fr/wws/info/caml-list Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] OCaml maintenance status / community fork (again)
Xavier Leroy xavier.le...@inria.fr writes: On 12/08/2011 10:10 AM, Benedikt Meurer wrote: The relevant bug report PR/5404, which includes a backward compatible patch, is already waiting for a sign of life for 3 weeks now (maybe wait another 4 years to get the port fixed). More bile. What's so urgent about it? The next release of OCaml is 3-6 months in the future; your suggestion will be examined by then. You know, some people have busy professional lives. I can only devote 1 to 2 days per months to OCaml maintenance, and would prefer to spend them on technical work rather than on massaging egos or putting out the fires that you light up on this list. I don't want to be aggressive or offensive and if it comes across as that then please forgive me. No need to massage my ego also. I would rather be told to piss off or that a patch will never be added because it is stupid than have you play nice and have patches be left unatended in mantis. Because how else will I ever learn? That said ... So maybe PR/5404 was a bad example. How about these? [1] 1 month to acknowledge, 24 month silence since then [2] 1 month to acknowledge, 24 month silence since then [3] 11 month to acknowledge, 19 month silence since then [4] 16 month to acknowledge, 18 month silence since then [5] 15 month to acknowledge, 6 month silence since then [1] http://caml.inria.fr/mantis/view.php?id=4919 [2] http://caml.inria.fr/mantis/view.php?id=4909 [3] http://caml.inria.fr/mantis/view.php?id=4812 [4] http://caml.inria.fr/mantis/view.php?id=4987 [5] http://caml.inria.fr/mantis/view.php?id=5005 All of these have patches (or just needs a function declaration copied to the .h file). They don't need huge work consuming changes to the compiler and they don't add anything fundamentally new to it. They just improve/extend the existing stuff a little. For those to lazy to follow the links here is one of the patches that has been sitting there for 2 years now: -- --- ocaml-3.11.0.orig/otherlibs/unix/unixsupport.h +++ ocaml-3.11.0/otherlibs/unix/unixsupport.h @@ -20,6 +20,7 @@ #define Nothing ((value) 0) extern value unix_error_of_code (int errcode); +extern int code_of_unix_error (value error); extern void unix_error (int errcode, char * cmdname, value arg) Noreturn; extern void uerror (char * cmdname, value arg) Noreturn; --- ocaml-3.11.0.orig/otherlibs/unix/unixsupport.c +++ ocaml-3.11.0/otherlibs/unix/unixsupport.c @@ -263,6 +263,15 @@ return err; } +extern int code_of_unix_error (value error) +{ + if (Is_block(error)) { +return Int_val(Field(error, 0)); + } else { +return error_table[Int_val(error)]; + } +} + void unix_error(int errcode, char *cmdname, value cmdarg) { value res; -- Adding this does not take a week of your summer vacation but still it gets ignored. Sure it is more fun to write code for first-class modules and the core team has done a great job there. On the other hand trvial things are not being done and that is verry frustrating to the submitter. Frankly I've given up writing patches for ocaml because what is the point. They are not getting looked at. Through things like that you are loosing a workforce that could contribute at lot. On the other hand if the model of glibc / eglibc for example where used for ocaml the frustration would be reduced, the patches would get audited by others in the community and added to the one community fork before reaching the core team. You would get less patches, patches that are better tested and cleaner. Bad patches would get filtered out before they reach you. This is not a condemnation of you or the core team. It is probably not even something you can do anything about no matter how hard you try. It is probably just a matter of scale. As a project growes at some point there just seem to be a point where a single central core doesn't work anymore. Things pile up, things fall through the cracks, priotization leaves things starving. There plain aren't enough hours in the day to cover every single bug or reply to every mail on the mainlinglist. So let the community help and offload some work on them, like vetting a simple patch like the above, so that you have more time to concentrate on the difficult things like arm support or fist class modules (which I love by the way) or to take a deserved vacation. MfG Goswin -- Caml-list mailing list. Subscription management and archives: https://sympa-roc.inria.fr/wws/info/caml-list Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] Storing ocaml values outside ocaml's heap
William Le Ferrand william.le-ferr...@polytechnique.edu writes: Dear list, We are building a cache in ocaml and we're wondering if it would make sense to store ocaml values outside the reach of the gc. (gc on a 20GB cache hangs the process for a second or so). To run some experiments, we wrote a small library (https://github.com/besport/ ocaml-everlasting) that exposes two functions, get and set. When inserting a value, we copy recursively the blocs outside of the reach of the gc (and put the resulting value in some C array). When getting the value, we simply pass the pointer to the copied value to the ocaml code (the structure is still coherent and the value is directly usable). We also wrote an update function that compare a new value with the existing value in cache, to avoid unnecessary memory allocation/deallocation. It does not seems very stable though, but I don't know if it is a bug in the update function or simply because this approach is not reasonable. Do you have any thoughts? Is there any clever way to build a large cache in an ocaml app ? Thanks in advance for any tips! Best William For a generic case you will have to inspect the values, tell the GC about all the other ocaml values it points to and track any heap values that have a reference to your external value so you don't delete it before it is dead. In short you need to write a GC. For special cases you can store completly self contained values, e.g. a array of floats or a record of non pointer types. The simplest of types. For a cache you often have millions of objects of the same type. And often those contain pointers but do not share them. Then it becomes feasable to copy everything they point to into the cache as well. Or write a little C glue to store the data in an abstract or custom block that does not contain pointers to outside the block. Often you can also store the data much more compact than ocaml allows and thereby reduce the memory footprint and increase cache efficiency. MfG Goswin -- Caml-list mailing list. Subscription management and archives: https://sympa-roc.inria.fr/wws/info/caml-list Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] OCaml maintenance status / community fork
Gerd Stolpmann i...@gerd-stolpmann.de writes: Hi all, I will not jump in the how to save OCaml from dying because nothing moves discussion. But just in the nothing moves discussion. On Tue, Dec 6, 2011 at 2:52 PM, ivan chollet ivan.chol...@gmail.com wrote: The current status of OCaml is more than stable enough to serve its goals, which are to teach computer science to french undergrads and provide a playground for computer languages researchers. First, french undergrads sadly often still use camllight... Which is not the case for example of Harvard undergrad (http://www.seas.harvard.edu/courses/cs51/lectures.html) and some UPenn one (http://www.seas.upenn.edu/~cis341/). But you are right that I can't find any well known university out of France using OCaml to teach computer science... Well, if you ask whether _any_ FP language is taught, the results won't be much better. I'm currently doing consulting for a web company (in Germany) - around 60 developers, many fresh from the University. There are only three guys knowing FP languages at all - one Scala, one Erlang, and one R. It's a complete failure of the academic education. IMHO it does not matter which FP language you are taught in. The point is that the students understand the ideas, and that they recognize them as relevant. These web developers here in the company have no clue that they actually developing a big continuation-style FP program. Gerd In Tuebingen we started with scheme in the first year and later there were several classes on FP languages and a few using ocaml, for example the compiler construction class. But the later classes are pickchoose, you just need enough credits from each of the 3 major groups, so people can probably completly avoid FP based curses. Personally I think the introduction to computer languages class we had is a must. How else do you even know what is out there and if you like it? MfG Goswin -- Caml-list mailing list. Subscription management and archives: https://sympa-roc.inria.fr/wws/info/caml-list Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] OCaml maintenance status / community fork
oliver oli...@first.in-berlin.de writes: Hello, during the last years, more than one person mourned about this or that dark sides of OCaml. Even some of the mourning and the proposals had mentioned good ideas and had positive motivation, after a while it became clear, that the same people with the one or the other good idea, failed badly in other areas. Good, that they did not have had too much influence in the development of OCaml. Even in general I like the community/bazaar, I think in case of OCaml, there is a lot of high knowledge in the core team, which was criticized by others already, but in the long run, it turned out that the core team had their reasons for a lot of decisions, which were criticized. Ocaml of course will also have some history-related issues that might be changed, but maybe also a lot of decisions inside, which relies on theoretical reasoning. That is no excuse for not reacting to bug or feature patches. If there is a reason not to accept a patch then that can be communicated. There is no excuse for silence. MfG Goswin -- Caml-list mailing list. Subscription management and archives: https://sympa-roc.inria.fr/wws/info/caml-list Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] Some comments on recent discussions
Ashish Agarwal agarwal1...@gmail.com writes: A standard library does not imply big or that it is part of the standard distribution. Both Batteries and Core would make fine standard libraries. Neither is very big and both are independent of the standard distribution. But having 5 different standard libraries is annoying precisely because the library covers just basic concepts. We don't need 5 APIs for string functions. They are all easy to learn independently but a pain if you have to know them all. A beginner won't bother taking the time to do a google search and find which one to use; before that they'll use another language. We used to have 100 seperate little modules all over the place. Now we have 2 contenders for a standard library. I think we are going the right way there. Lets leave that allone for at least a year or two. And we do (initially) need 5 APIs for string functions and we need to test them all and decide which one fits best. And there can be more than one that should be kept. E.g. one API with exceptions and one with option types. But then all modules in the standard library should provide those APIs we want to keep consistently across all modules. Lets agree that we eventually want one standard library. Can we also agree that the standard library can be external to the compiler, like Batteries or Core is? I also think the debate about github, ocamlforge, etc is fruitless. People have different preferences and who knows what new tools will be developed a couple years from now. Instead of arguing about which one is best, we should accept that we will not agree on this. And why do we need to? We can aggregate the information from multiple sources and present it in a uniform way on a community website. ACK. Totaly different fruit. Even talking about a standard library is imho fruitless (at this point in time) and OT to the original problem. MfG Goswin -- Caml-list mailing list. Subscription management and archives: https://sympa-roc.inria.fr/wws/info/caml-list Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] OCaml maintenance status / community fork
Benedikt Meurer benedikt.meu...@googlemail.com writes: Dear caml-list, During the last year or two it seems that time and interest in OCaml maintenance from the official OCaml development team is diminishing. It takes several months to get a patch reviewed (if at all), which is quite frustrating for OCaml contributors and even worse for OCaml users. I suspect that this is one of the top reasons why there are only a few active contributors to OCaml (and the number of active users, at least on the mailing list, is declining). I understand that INRIA does not necessarily pay people for full time maintenance jobs on OCaml (and Coq), and the official dev team is probably already doing as much as possible to maintain OCaml. Given that OCaml is such a nice language with a lot of useful frameworks available, it is too sad to see it loosing ground just because of it's closed development process and lack of time of the official team. I'd therefore propose to open up OCaml development to a wider range of developers / contributors, to ensure that OCaml will be ready for the (functional programming) future. There are already various OCaml forks in the wild, with different goals and patch sets, so simply starting another fork would be rather useless. Instead I'd suggest to bundle efforts in a new OCaml community fork, which is always based on the most recent upstream OCaml release (starting point would be 3.12.1 for now), and takes care to review and integrate pending patches as well as developing and testing new features. Let's say we'd name the fork OCaml-ng, then we'd try to release a new patch set every month or two, based on the official OCaml release, i.e. ocaml-3.12.1+ng201112 and so on, to get early testing and feedback (should work together closely with the Debian/Ubuntu/etc. OCaml maintainers). With this process, OCaml upstream could merge (tested) patches from OCaml-ng once they proved working in the wild, and thereby 1. maintenance overhead for INRIA people is reduced, 2. maintenance status of OCaml would be way better, 3. there would be a lot less frustration for possible contributors, and 4. users benefit from a better and more up to date OCaml. Now that does of course raise a few questions: 1. What is the opinion of the official development team / INRIA on this? 2. Who would help with the community fork? 3. What about infrastructure? Feedback and suggestions are welcome. Benedikt +1 for getting patches better/faster reviewd and included. I'm still waiting to hear back for my Bigarray patch to support 31bit integers. MfG Goswin -- Caml-list mailing list. Subscription management and archives: https://sympa-roc.inria.fr/wws/info/caml-list Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] PlasmaFS, PlasmaKV, and MapReduce, version 0.5.1
Gerd Stolpmann i...@gerd-stolpmann.de writes: Hi, I've just released Plasma-0.5.1, fixing a possible lock-up. General information about Plasma: Plasma consists now of three parts, namely PlasmaFS, PlasmaKV, and Plasma Map/Reduce: * PlasmaFS is a distributed replicating filesystem. Unlike other such filesystems, it is transactional and exhibits transactions to the user. Also, it implements almost all of what is known as POSIX semantics, and it is mountable. * PlasmaKV is a key/value database on top of PlasmaFS. It is designed for ultra-high read workloads, and offers interesting properties borrowed from PlasmaFS (e.g. replication and ACID transactions). * Plasma Map/reduce implements a variant of the popular data processing scheme. All pieces of software are bundled together in one download. The project page with further links is http://projects.camlcity.org/projects/plasma.html There is now also a homepage at http://plasma.camlcity.org THIS IS NOW A BETA RELEASE! I'm searching for testers. Whoever has access to a cluster please check Plasma out! Plasma is installable via GODI for Ocaml 3.12. For discussions on specifics of Plasma there is a separate mailing list: https://godirepo.camlcity.org/mailman/listinfo/plasma-list Gerd Any plans to support fuse so the filesystem can be mounted locally? MfG Goswin -- Caml-list mailing list. Subscription management and archives: https://sympa-roc.inria.fr/wws/info/caml-list Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] Odd failure to infer types
Gabriel Scherer gabriel.sche...@gmail.com writes: Indeed, a let-bound [] is generalized to a polymorphic ('a list), while a let-bound (ref []), or (Array.make n []), is not generalized. It would be unsound to do so as if you had a reference to a polymorphic ('a list) you could add elements of different types in the list. When a type variable is not generalized (polymorphic), it stays an unknown of the type system until it is unified to a known type: # let r = ref [];; val r : '_a list ref = {contents = []} # r := 1 :: !r;; - : unit = () # r;; - : int list ref = {contents = [1]} '_a is sometimes called a weakly polymorphic variable, in fact it is not polymorphic at all. It is just an unknown. In the code example above, the real type for '_a could be determined after two phrases in the top level. It may also happen as a compilation unit (if the code was written in a file instead of fed phrase-per-phrase to the toplevel). We wouldn't observe the intermediary '_a inferred type, as the type are inferred globally. I think you are missing the point. I totaly get why it must be '_a instead if 'a. My question is why it doesn't infer the full type from the call to the print function. In my case the type inferences outputs something like this (using the above example): # let r = ref [];; val r : '_a list ref = {contents = []} # r := 1 :: !r;; - : unit = () # r;; - : '_a list ref = {contents = [1]} I see my print call as equivalent to '1 :: !r'. MfG Goswin -- Caml-list mailing list. Subscription management and archives: https://sympa-roc.inria.fr/wws/info/caml-list Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs