Re: [Caml-list] extending user-defined polymorphic variant types

2012-05-07 Thread Goswin von Brederlow
Dan Bensen danben...@att.net writes:

 I'm trying to write a functor that extends a user-supplied polymorphic
 variant type (PVT).  How do you declare the user's type in the signature
 for the argument to the functor without locking in the individual variant
 definitions? 
 The code below (in revised syntax) generates an error message that says
 the type isn't a PVT.

 code:

 module type Reader = sig type ast; end;

 module Make (Read: Reader) = struct
   type ast = [= Read.ast | `Lid of string];
 end;

 message:

 Error: The type Read.ast is not a polymorphic variant type

 How do you make ast a PVT while allowing the user to
 specify the variants?

Even if that did work how usefull would that be? You couldn't write

let foo = function
  | `Lid s - ()
  | x - Read.foo x

unless Read.foo allows [ Read.ast]. In which case you don't need a
functor but can just implement foo straight up.

What is your use case? What do you want to do?

MfG
Goswin

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] A shallow option type

2012-05-07 Thread Goswin von Brederlow
Jacques Garrigue garri...@math.nagoya-u.ac.jp writes:

 Sorry for all these mails.
 Looks like I don't think well when I'm sleepy...

 Anyway, I think that at last I have a reasonable (much simpler) solution.
 This still uses sopt_tag (i.e. lazy_tag-1), but this time the argument is
 the number of some constructors, so there is no extra cost for marshaling.
 The only values on which Sopt.some is not the identity are those with a
 single argument, which is additionally represented by an integer between 0 
 and last.
 Moreover, even if for some reason you build such a value using a real sum
 type, you still have Sopt.arg (Sopt.some v) = v, the only change being a loss
 of sharing (which is anyway not guaranteed by the ocaml specification).

 Jacques

 module Sopt : sig
  type +'a t
  val none : 'a t
  val some : 'a - 'a t
  val is_none : 'a t - bool
  val arg  : 'a t - 'a
  val depth : 'a t - int
 end = struct
  type 'a t = Obj.t
  let sopt_tag = Obj.lazy_tag - 1
  let none = Obj.new_block sopt_tag 1
  let last = 255
  let area = Array.create (last+1) none
  let () =
Obj.set_field none 0 (Obj.repr 0);
for i = 1 to last do
  let stub = Obj.new_block sopt_tag 1 in
  Obj.set_field stub 0 (Obj.repr i);
  area.(i) - stub
done
  let is_none x = (x = none)
  let is_sopt x =
Obj.is_block x  Obj.tag x = sopt_tag  Obj.size x = 1 
let i = Obj.obj (Obj.field x 0) in i = 0  i = last
  let depth x =
let x = Obj.repr x in
if is_sopt x then Obj.obj (Obj.field x 0) else -1
  let some (x : 'a) : 'a t =
let i = depth x in
if i  0 then Obj.magic x else
if i = last then invalid_arg Sopt.some else Obj.obj area.(i+1)
  let arg (x : 'a t) : 'a =
let i = depth x in
if i  0 then Obj.magic x else
if i = 0 then invalid_arg Sopt.arg else Obj.obj area.(i-1)
 end

What exactly is the point of specially tagged blocks? All you need is a
bunch of pointers to values to encode the depth. You can use the value
pointed at to encode the index the pointer is at and physical equality
to ensure it actualy is one of your pointers:

let area = Array.init (last+1) (fun i - ref i)

let is_sopt x =
  let r = Obj.repr x in
  Obj.is_block r  Obj.size x = 1  Obj.is_int (Obj.field r 0) 
  let i = Obj.obj (Obj.field r 0) in
  i = 0  i = last  Obj.obj r == area.(i)

# is_sopt 1;;
- : bool = false
# is_sopt area.(5);;
- : bool = true
# is_sopt (ref 5);;
- : bool = false

MfG
Goswin

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] A shallow option type

2012-05-06 Thread Goswin von Brederlow
Gabriel Scherer gabriel.sche...@gmail.com writes:

 Thanks for the clearer use-case example.

 I think Andreas's comment is at a general level of language design:
 no, we don't want to add something close to what you're asking for in
 a language. His argumentation is spot on. What we could have is a type
 system with strong update, but that's dreaming about another language,
 not OCaml.

 In your case, have you considered having a private option type, that
 would be allocated by a call to a C primitive (building a genuine
 OCaml option, but outside the heap) instead of (Some foo) on the OCaml
 side? That would be completely safe, and you could still safely match
 it on the OCaml side.

I have considered that shortly. Setting the option would be simple. A
call to a C primitive that mallocs a block, fills in the GC metadata,
tag and pointer to the unit.

But what about removing the unit, setting the option to None again? At
what point would it be save to free the old option? The other thread
might be using it right now. Any kind of dynamic memory allocation will
require some coordination between the threads to free the block.

Now that I think about it again (and with what I say below for units)
there can be at most one option per tile. So I could pre-allocate an
array of them and assign each tile one option block to be reused
whenever the tile needs one.

 That said, it's only pushing the problem one step further: now you
 still need to allocate your Unit.t value outside the OCaml heap,
 otherwise you still have this problem. Is it the case in your
 application?

Yes that is the plan. The world has a fixed number of tiles and there
can be at most one unit per tile so I can pre-allocate an array of units
outside the heap large enough to never run out.

MfG
Goswin

 On Sat, May 5, 2012 at 6:22 PM, Goswin von Brederlow goswin-...@web.de 
 wrote:
 Andreas Rossberg rossb...@mpi-sws.org writes:

 On May 5, 2012, at 15.33 h, Goswin von Brederlow wrote:
 What I want is a

    type 'a shallow = NULL | 'a  (constraint 'a != 'b shallow)

 This is a form of negation, which cannot be expressed in conventional
 type systems. Just consider what it should mean in the presence of
 type abstraction: if you have

   M : sig
     type t
     ...
   end

 would `M.t shallow` be a legal type? You couldn't decide that properly
 without requiring that _every_ abstract type in every signature is
 annotated with a constraint saying that it is not shallow.

 True, so abstract types would have to be forbidden too becauseit can't
 be decided wether they are save or not. Since 'a shallow is an abstract
 type that would also forbid 'a shallow shallow.

 So constraint 'a != abstract would be the right thing.

 I have some ideas on how to implement this in a module as abstract
 type
 providing get/set/clear functions, which basically means I map None
 to a
 C NULL pointer and Some x to plain x. I know x can never be the NULL
 pointer, except when someone creates a 'a shallow shallow and sets
 Some
 None. That would turn into simply None.

 And how do you know that nobody else implements a _different_ type,
 say shallow2, that does the same thing? And a third party then
 constructs a value of type `int shallow2 shallow`?

 I don't and I can't. But such a type would be abstract so wouldn't be
 allowed by the above (reject abstract types).

 It seems to me that what you want effectively is a special case of non-
 disjoint union. Unfortunately, those are known to come with all kinds
 of problems, such as not being compositional.

 /Andreas

 What I want is to have an array of

    type t ={ ... }  (* Unit.t *)

 and a matrix of

    type tile = { ... mutable unit : Unit.t option; }

 that I can safely access from a C thread in parallel with ocaml without
 having to look the runtime system or individual tiles (the array and
 matrix would both be created outside the ocaml heap).

 The problem I have is that

    tile.unit - Some unit

 will allocate an option block on the heap and accessing that from C
 while the GC runs causes a race condition.


 What I don't want to do is use

    type tile = { ... mutable unit : Unit.t; }
    tile.unit - Obj.magic 0

 since then trying things out in the toplevel causes segfaults when the
 pretty printer prints the tile.unit and it would be easy to forget to
 compare the tile.unit to Obj.magic 0 on every use. I know my shallow
 type is basically nothing else but it adds a level of savety to it.

 Idealy I would even love to write

    match tile.unit with
      | NULL - ()
      | unit - do_something unit

 I guess some camlp4 magic could be used to transform such a match into

    match Shallow.as_option tile.unit with
      | None - ()
      | Some unit - do_something unit

 or similar constructs.

 MfG
        Goswin

 --
 Caml-list mailing list.  Subscription management and archives:
 https://sympa-roc.inria.fr/wws/info/caml-list
 Beginner's list: http://groups.yahoo.com/group/ocaml_beginners

Re: [Caml-list] A shallow option type

2012-05-06 Thread Goswin von Brederlow
Goswin von Brederlow goswin-...@web.de writes:

 Andreas Rossberg rossb...@mpi-sws.org writes:

 On May 5, 2012, at 15.33 h, Goswin von Brederlow wrote:
 What I want is a

type 'a shallow = NULL | 'a  (constraint 'a != 'b shallow)

 This is a form of negation, which cannot be expressed in conventional
 type systems. Just consider what it should mean in the presence of
 type abstraction: if you have

   M : sig
 type t
 ...
   end

 would `M.t shallow` be a legal type? You couldn't decide that properly
 without requiring that _every_ abstract type in every signature is
 annotated with a constraint saying that it is not shallow.

 True, so abstract types would have to be forbidden too becauseit can't
 be decided wether they are save or not. Since 'a shallow is an abstract
 type that would also forbid 'a shallow shallow.

 So constraint 'a != abstract would be the right thing.

 I have some ideas on how to implement this in a module as abstract
 type
 providing get/set/clear functions, which basically means I map None
 to a
 C NULL pointer and Some x to plain x. I know x can never be the NULL
 pointer, except when someone creates a 'a shallow shallow and sets
 Some
 None. That would turn into simply None.

 And how do you know that nobody else implements a _different_ type,
 say shallow2, that does the same thing? And a third party then
 constructs a value of type `int shallow2 shallow`?

 I don't and I can't. But such a type would be abstract so wouldn't be
 allowed by the above (reject abstract types).

Actualy this could be solved. The problem of a int shallow2 shallow is
that both would be using NULL for None. But nobody says I must use
NULL. All it needs is a bit pattern that can't be a valid value. Any
pointer will do:

#include caml/mlvalues.h

static value nil = Val_unit;

CAMLprim value caml_shallow_null(value unit) {
(void)unit; // unused
return nil;
}

Now if shallow2 uses my nil as well that would be a serious bug in
shallow2 and pretty hard to do.

 It seems to me that what you want effectively is a special case of non-
 disjoint union. Unfortunately, those are known to come with all kinds
 of problems, such as not being compositional.

 /Andreas

 What I want is to have an array of

 type t ={ ... }  (* Unit.t *)

 and a matrix of

 type tile = { ... mutable unit : Unit.t option; }

 that I can safely access from a C thread in parallel with ocaml without
 having to look the runtime system or individual tiles (the array and
 matrix would both be created outside the ocaml heap).

 The problem I have is that

 tile.unit - Some unit

 will allocate an option block on the heap and accessing that from C
 while the GC runs causes a race condition.


 What I don't want to do is use

 type tile = { ... mutable unit : Unit.t; }
 tile.unit - Obj.magic 0

 since then trying things out in the toplevel causes segfaults when the
 pretty printer prints the tile.unit and it would be easy to forget to
 compare the tile.unit to Obj.magic 0 on every use. I know my shallow
 type is basically nothing else but it adds a level of savety to it.

 Idealy I would even love to write

 match tile.unit with
   | NULL - ()
   | unit - do_something unit

 I guess some camlp4 magic could be used to transform such a match into

 match Shallow.as_option tile.unit with
   | None - ()
   | Some unit - do_something unit

 or similar constructs.

 MfG
 Goswin

MfG
Goswin

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] How do I declare a value of 'a t instead of '_a t in a module?

2012-05-06 Thread Goswin von Brederlow
Gabriel Scherer gabriel.sche...@gmail.com writes:

 You need to specify in the signature that ('a t) is contravariant:
   type +'a t

Thanks, that was what I was looking for. Why can I never find this in
the docs when I need it? :)

MfG
Goswin

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



[Caml-list] Implementation for a (nearly) typesafe shallow option type and a compiler bug?

2012-05-06 Thread Goswin von Brederlow
Hi,

as mentioned earlier I wanted to have a

type 'a shallow = NULL | 'a

that is like an 'a option but without the indirection.

The first idea was that no ocaml value can be the C NULL pointer (usualy
all bits set to 0) and the C NULL pointer points outside the ocaml heap
so the GC is fine with that too. So NULL could be encoded as C NULL
pointer and 'a used as is.

First problem is that this does not work for an 'a shallow shallow. And
there is no way to constrain 'a to not be a 'b shallow in the type
definition.

Second problem is that someone else might define a 'a shallow2 type with
the same idea and 'a shallow2 shallow would also not work. And with
abstract types it would be impossible to test for that even if ocaml
would allow negative constraints.


The common problem here is that NULL is not unique for each type. So I
thought of a way to make it unique. What I came up with is using a 
functor to create a unique NULL value for each instance. Source code at
the end.

Output:
NULL - None
Some 5 - Some 5
NULL - None
Some NULL - Some None
Some Some 5 - Some Some 5
1 2 3 4 
a b c d 
NULL - Some 70221466564508

As you can see from the output this solves the problem of 'a shallow
shallow and would also solve the 'a shallow2 shallow case.

But in the last line a new problem appears. Each instance of the functor
has a unique NULL element. But instances of the functor with the same
type are compatible. So an IntShallow2.t can be used instead of an
IntShallow.t but then the NULL does not match.

To me this looks like a bug in ocaml because the resulting type of the
functor application does not match the type I specified for the functor:

  module type SHALLOW =
sig
  type -'a t
  val null : 'a t
  val make : 'a - 'a t
  val as_option : 'a t - 'a option
end
  module Make : functor (Type : TYPE) - SHALLOW

module IntShallow2 :
  sig
type 'a t = 'a Shallow.Make(Int).t
val null : 'a t
val make : 'a - 'a t
val as_option : 'a t - 'a option
  end

The type I specified has 'a t abstract while the IntShallow2 has the
type 'a t more concret.

Restricting the type to what it should already be results in the correct
error:

module IntShallow2 = (Shallow.Make(Int) : Shallow.SHALLOW)
let () =
  Printf.printf NULL - %s\n (to_string (IntShallow2.null))

File shallow.ml, line 91, characters 42-60:
Error: This expression has type 'a IntShallow2.t
   but an expression was expected of type
 int IntShallow.t = int Shallow.Make(Int).t

Why is that?

MfG
Goswin

==
module Shallow : sig
  module type TYPE = sig type 'a t end
  module type SHALLOW =
sig
  type -'a t
  val null : 'a t
  val make : 'a - 'a t
  val as_option : 'a t - 'a option
end
  module type SHALLOWFUNCTOR = functor (Type : TYPE) - SHALLOW
  module Make : functor (Type : TYPE) - SHALLOW
end = struct
  module type TYPE = sig type 'a t end
  module type SHALLOW =
sig
  type -'a t
  val null : 'a t
  val make : 'a - 'a t
  val as_option : 'a t - 'a option
end
  module type SHALLOWFUNCTOR = functor (Type : TYPE) - SHALLOW

  module Make_intern =
functor (Type : TYPE) -
  struct
type -'a t
(* Dummy object unique to the type *)
let null = Obj.magic (ref 0)
let make x = Obj.magic x
let as_option x = if x == null then None else Some (Obj.magic x)
  end
  module Make = (Make_intern :  functor (Type : TYPE) - SHALLOW)
end

module Int = struct type 'a t = int end
  
module IntShallow = Shallow.Make(Int)
module IntShallowShallow = Shallow.Make(IntShallow)

let to_string x =
  match IntShallow.as_option x with
| None - None
| Some x - Printf.sprintf Some %d x

let to_string2 x =
  match IntShallowShallow.as_option x with
| None - None
| Some x - Printf.sprintf Some %s (to_string x)

module List = struct
  module Item = struct
type 'a t = 'a
  end

  module Shallow = Shallow.Make(Item)

  type 'a item = { mutable next : 'a list; data : 'a; }
  and 'a list = 'a item Shallow.t

  let null = Shallow.null
  let cons x y = Shallow.make { next = y; data = x; }
  let rec iter fn x =
match Shallow.as_option x with
  | None - ()
  | Some { next; data; } - fn data; iter fn next
end

let () =
  Printf.printf NULL - %s\n (to_string (IntShallow.null));
  Printf.printf Some 5 - %s\n (to_string (IntShallow.make 5));
  Printf.printf NULL - %s\n (to_string2 (IntShallowShallow.null));
  Printf.printf Some NULL - %s\n
(to_string2 (IntShallowShallow.make IntShallow.null));
  Printf.printf Some Some 5 - %s\n
(to_string2 (IntShallowShallow.make (IntShallow.make 5)));
  let x = List.null in
  let x = List.cons 4 x in
  let x = List.cons 3 x in
  let x = List.cons 2 x in
  let x = List.cons 1 x in
  List.iter (Printf.printf %d ) x;
  print_newline ();
  let y = List.null in
  let y = List.cons d y in
  let y = List.cons c y in
  let y 

Re: [Caml-list] A shallow option type

2012-05-06 Thread Goswin von Brederlow
Jacques Garrigue garri...@math.nagoya-u.ac.jp writes:

 Hi Goswin,

 Actually I remember discussing this issue with Xavier Leroy a long time ago,
 and we ended up concluding that this could be done in a clean way by using
 a bit pattern not yet used for ocaml values.

 Our idea was that rather than put some restriction on the shallowness of
 types (which is very hard to implement), one should just allow arbitrary 
 nesting
 of option types, and represent the sequence Some (Some (... ( Some None) 
 ...))
 as an integer.
 I think I even tried an implementation, but we then concluded that there
 was not enough demand to put it in the compiler.

The advantage of providing it by the compiler would be to allow pattern
matching.

 However there is nothing wrong with providing it as a library.
 The basic idea is that you have only two kinds of values in ocaml:
 integers, which have their lower bit to 1, and pointers, which must
 all be multiples of 4 (on 32-bit architectures).
 So  the pattern (v % 4) == 2 is not used.

Actualy isn't (v % 4) == 2 an exception?

caml/callback.h:
#define Make_exception_result(v) ((v) | 2)
#define Is_exception_result(v) (((v)  3) == 2)
#define Extract_exception(v) ((v)  ~3)

So returning an Sopt.none in a callback would be taken as exception. :)

 Here is the code:

 module Sopt : sig
   type +'a t
   val none : 'a t
   val some : 'a - 'a t
   val is_none : 'a t - bool
   val arg : 'a t - 'a
   val is_sopt : 'a t - bool
   val depth : 'a t - int
 end = struct
   type 'a t = int
   let null = (Obj.magic (Some 0) land 2) land 1
   let none = null + 1
   let is_none x = x  0  x  1
   let is_sopt x = is_none (x land 1)
   let some (x : 'a) : 'a t =
 let y : 'a t = Obj.magic x in
 if is_sopt y then y+2 else y
   let arg (x : 'a t) : 'a =
 if is_sopt x then Obj.magic (x-2) else Obj.magic x
   let depth x =
 if is_sopt x then x lsr 1 else 0
 end

WOW. That is some code. Thanks for sharing that.

 The code for null tricks the compiler into creating a null pointer.
 I avoiding using the simpler (Obj.magic (Some 0) land 0) in case land is 
 optimized.
 By adding 1 to this value, one creates a 2 at the C level.
 For ocaml this value is strictly between 0 (i.e. 1) and 1 (i.e. 3).
 Sopt.some checks whether a value is such an sopt, in which case it adds 2 to 
 it to
 add a Some constructor, otherwise it returns the value itself.
 Sopt.arg does just the opposite.
 Sopt.is_sopt and Sopt.depth are only exported for demonstrative purposes.

 # Sopt.depth (Sopt.some (Sopt.some (Sopt.some Sopt.none)));;
 - : int = 3
 # Sopt.depth (Sopt.some (Sopt.some (Sopt.some 0)));;
 - : int = 0

 Of course this precludes using the above bit pattern for anything else,
 but otherwise this should be completely type-safe.
 (At the theoretical level at least, I'm not 100% sure of the trickery used 
 here
 to have the compiler work on C level integers)

 Cheers,

 Jacques Garrigue

The implementation relies on specific aspect of the compiler to
construct those values. I would give it a 99% chance to fail with an
llvm backend for ocaml for example. Using C stubs that directly
manipulate the bits seems a better approach.

For example nothing says x + 2 can't be implemented as

value add(value x, value y) {
return (x  ~1) + y;
}

as opposed to

value add(value x, value y) {
return x + y - 1;
}

MfG
Goswin

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] Implementation for a (nearly) typesafe shallow option type and a compiler bug?

2012-05-06 Thread Goswin von Brederlow
Gabriel Scherer gabriel.sche...@gmail.com writes:

 What you observe is the so-called strengthening of type equalities
 in functor applications. See papers 5 or 6 in this list:
   http://caml.inria.fr/about/papers.en.html

 It is not a bug, but a feature: you can write functors F such that
 applying F(X) twice yields compatible, rather than incompatible,
 types. If you want to recover incompatible types, you can seal the
 functor result as you did in your workaround, or pass a non-path
 functor expression (that behave in a more generative way): F(struct
 include X end).

Good to know. I found that surprising. I think it is bad that you can't
specify the type of the functor so that both compatible and incompatible
types would be an option. Just like you can use 'a, +'a and -'a to fine
tune variance in types there could be some syntax to make the functor
type strengthened or not.

 On your more general code:
 - I do not understand why you specify the abstract type ('a t) to be
 contravariant, and I suspect this will be unsound (is an ( m : int 
 t) also an ( m : int; s : string  t)?)

Left over from trying to make the functor type not strengthened.

 - I am not sure using (Obj.magic (ref 0)) is safe wrt. types whose
 representation is not always a pointer (ie. floats)

(Obj.magic (ref 0)) is always a pointer that is unique to the instance
of the functor. No other value can legally have this bit pattern.

As for floats: Manual 18.3.1 Atomic types

Caml type  Encoding
float  Blocks with tag Double_tag.

A float is always a pointer and that can't be legally pointing to our
(Obj.magic (ref 0)).

MfG
Goswin

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] using modules to wrap c++ classes

2012-05-05 Thread Goswin von Brederlow
Joel Reymont joe...@gmail.com writes:

 On Fri, May 4, 2012 at 9:43 AM, Goswin von Brederlow goswin-...@web.de 
 wrote:

 As discussed on irc you need to create your callbacks[] array as ocaml
 block and register that itself as root. That also has the benefit that
 you only have to register a single root instead of 50.

 Assuming that I have a class named MyCallbacks with a public member
 callbacks of type value, is this correct?

No. The callbacks[] can be public, protect, private or static.  You
could also register the individual callbacks but then you need to
register and deregister all 50 of them seperate. Having just one makes
things easier.

So the need above might be to strong. It is just simpler that way. And
I think the GC had some performance issues with too many roots.

 Also, can I use Is_block to check if there's a closure value stored in
 the callbacks block before I dispatch to the closure?

Sure.

 Thanks, Joel

 ---

 // finalizer, stored in custom_operations

 void Callbacks_delete(value v)
 {
   CAMLparam1(v);
   MyCallbacks* o = Callbacks_val(v);
   caml_remove_global_root(o-callbacks);
   delete o;
 }

 CAMLprim value
 Callbacks_new()
 {
   CAMLparam0();
   CAMLlocal2(v, cbks);
   v = caml_alloc_custom(ops, sizeof(MyCallbacks*), 0, 1);
   MyCallbacks* o = new MyCallbacks();
   Callbacks_val(v) = o;
   cbks = caml_alloc(NUM_CBKS, 0);
   o-callbacks = cbks;
   caml_register_global_root(o-callbacks);
   CAMLreturn(v);
 }

 extern C CAMLprim value
 Callbacks_set(value self, value cbk)
 {
   CAMLparam2(self, cbk);
   MyCallbacks* o = Callbacks_val(self);
   Store_field(o-callbacks, Tag_val(cbk), Field(0, cbk));
   CAMLreturn(Val_unit);
 }

Don't you have to use caml_modify() here instead of Store_field()? I
think Store_field is only alowed in freshly allocated blocks.

MfG
Goswin

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



[Caml-list] A shallow option type

2012-05-05 Thread Goswin von Brederlow
Hi,

I wish there was an option type that would work without extra
indirection (or more importantly without extra allocation of an ocaml
value when setting it to something).

Why? I'm interfacing with C in a multithreaded way. The data is
allocated on the C side so it won't be moved around by the GC. The ocaml
side will modify data and the C side will utilize it. Now if ocaml
changes a multable 'a option then it will allocate a block for Some x
and the GC will move that block around during compaction. Which means
the 'a option can't be safely used without holding the runtime system
lock. Which then means the threads can't run in parallel.

What I want is a

type 'a shallow = NULL | 'a  (constraint 'a != 'b shallow)

I have some ideas on how to implement this in a module as abstract type
providing get/set/clear functions, which basically means I map None to a
C NULL pointer and Some x to plain x. I know x can never be the NULL
pointer, except when someone creates a 'a shallow shallow and sets Some
None. That would turn into simply None.

Is it possible to somehow declare the constraint that 'a in 'a shallow must
not be itself a 'b shallow?

MfG
Goswin

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



[Caml-list] How do I declare a value of 'a t instead of '_a t in a module?

2012-05-05 Thread Goswin von Brederlow
Hi,

I'm writing a module that reimplements the option type without
redirection. For that I'm mapping None to a value with all bits set to
0, the NULL pointer in C. For that I have a little helper function:

external make_null : unit - 'a t = caml_shallow_null

But now I want to also have a polymorphic value for NULL:

module Make : sig
  type 'a t
  val make_list : unit - 'a list
  val make_null : unit - 'a t
end = struct
  type 'a t
  let make_list () = []
  external make_null : unit - 'a t = caml_shallow_null
end

module Shallow : sig
  type 'a t
  val list : 'a list
  val null : 'a t
end = struct
  type 'a t = 'a Make.t
  let list = Make.make_list ()
  let null = Make.make_null ()
end

File shallow.ml, line 15, characters 6-102:
Error: Signature mismatch:
   Modules do not match:
 sig
   type 'a t = 'a Make.t
   val list : 'a list
   val null : '_a Make.t
 end
   is not included in
 sig type 'a t val list : 'a list val null : 'a t end
   Values do not match:
 val null : '_a Make.t
   is not included in
 val null : 'a t


What makes the Make.make_list differ from Make.make_null?

Is there a way that I can specify that null is still polymorphic or does
that only work for constructors like None and compiler magic like []?

--

And here something odd:

module Make : sig
  type 'a t = 'a list
  val make_list : unit - 'a list
  val make_null : unit - 'a t
end = struct
  type 'a t = 'a list
  let make_list () = []
  external make_null : unit - 'a t = caml_shallow_null
end

module Shallow : sig
  type 'a t
  val list : 'a list
  val null : 'a t
end = struct
  type 'a t = 'a Make.t
  let list = Make.make_list ()
  let null = Make.make_null ()
end

compiles fine. But

module Make : sig
  type 'a t = private 'a list
  val make_list : unit - 'a list
  val make_null : unit - 'a t
end = struct
  type 'a t = 'a list
  let make_list () = []
  external make_null : unit - 'a t = caml_shallow_null
end

module Shallow : sig
  type 'a t
  val list : 'a list
  val null : 'a t
end = struct
  type 'a t = 'a Make.t
  let list = Make.make_list ()
  let null = Make.make_null ()
end

fails again. Just making the Make.t private makes it fail.

MfG
Goswin

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] A shallow option type

2012-05-05 Thread Goswin von Brederlow
Andreas Rossberg rossb...@mpi-sws.org writes:

 On May 5, 2012, at 15.33 h, Goswin von Brederlow wrote:
 What I want is a

type 'a shallow = NULL | 'a  (constraint 'a != 'b shallow)

 This is a form of negation, which cannot be expressed in conventional
 type systems. Just consider what it should mean in the presence of
 type abstraction: if you have

   M : sig
 type t
 ...
   end

 would `M.t shallow` be a legal type? You couldn't decide that properly
 without requiring that _every_ abstract type in every signature is
 annotated with a constraint saying that it is not shallow.

True, so abstract types would have to be forbidden too becauseit can't
be decided wether they are save or not. Since 'a shallow is an abstract
type that would also forbid 'a shallow shallow.

So constraint 'a != abstract would be the right thing.

 I have some ideas on how to implement this in a module as abstract
 type
 providing get/set/clear functions, which basically means I map None
 to a
 C NULL pointer and Some x to plain x. I know x can never be the NULL
 pointer, except when someone creates a 'a shallow shallow and sets
 Some
 None. That would turn into simply None.

 And how do you know that nobody else implements a _different_ type,
 say shallow2, that does the same thing? And a third party then
 constructs a value of type `int shallow2 shallow`?

I don't and I can't. But such a type would be abstract so wouldn't be
allowed by the above (reject abstract types).

 It seems to me that what you want effectively is a special case of non-
 disjoint union. Unfortunately, those are known to come with all kinds
 of problems, such as not being compositional.

 /Andreas

What I want is to have an array of

type t ={ ... }  (* Unit.t *)

and a matrix of

type tile = { ... mutable unit : Unit.t option; }

that I can safely access from a C thread in parallel with ocaml without
having to look the runtime system or individual tiles (the array and
matrix would both be created outside the ocaml heap).

The problem I have is that

tile.unit - Some unit

will allocate an option block on the heap and accessing that from C
while the GC runs causes a race condition.


What I don't want to do is use

type tile = { ... mutable unit : Unit.t; }
tile.unit - Obj.magic 0

since then trying things out in the toplevel causes segfaults when the
pretty printer prints the tile.unit and it would be easy to forget to
compare the tile.unit to Obj.magic 0 on every use. I know my shallow
type is basically nothing else but it adds a level of savety to it.

Idealy I would even love to write

match tile.unit with
  | NULL - ()
  | unit - do_something unit

I guess some camlp4 magic could be used to transform such a match into

match Shallow.as_option tile.unit with
  | None - ()
  | Some unit - do_something unit

or similar constructs.

MfG
Goswin

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] using modules to wrap c++ classes

2012-05-04 Thread Goswin von Brederlow
Joel Reymont joe...@gmail.com writes:

 So I ended up with something like this

 --- Callbacks.ml

 type t

 type 'a callback =
   | Alert of ('a - Alert.t - unit)
   | AskQuote of ('a - Ask.t - unit)
   | BestAskQuote of ('a - Ask.t - unit)
   | BidQuote of ('a - Bid.t - unit)
   | BestBidQuote of ('a - Bid.t - unit)
   | ClosePrice of ('a - ClosePrice.t - unit)
   | ClosingIndicator of ('a - ClosingIndicator.t - unit)
   | EndQuote of of ('a - EndQuote.t - unit)
   | EquityOptionList of of ('a - EquityOptionList.t - unit)
   | EquityOptionStrategyList of of ('a - EquityOptionStrategyList.t - unit)
   | HighPrice of ('a - HighPrice.t - unit)
   | Indicator of ('a - Indicator.t - unit)
   | IndicatorReplay of ('a - IndicatorReplay.t - unit)
   | LimitOrderBook of ('a - LimitOrderBook.t - unit)
   | LowPrice of ('a - LowPrice.t - unit)
   | MarketMode of ('a - MarketMode.t - unit)
   | OpenPrice of ('a - OpenPrice.t - unit)
   ...

 external make : unit - t = Callbacks_new
 external set : t - 'a callback - unit = Callbacks_set

 I now want to grab the closure from the constructor on the C side and
 stick it into a callback array. Is this sufficient?

 extern C CAMLprim value
 Callbacks_set(value v)
 {
   CAMLparam1(v);
   int i = Tag_val(v);
   CAMLlocal1(f);
   f = Field(0, v);
   caml_register_global_root(f);
   callbacks[i] = f;
   CAMLreturn(Val_unit);
 }

 Thanks, Joel

This won't work. The value f is a pointer to a closure block butr that
block is under the control of the GC. It will be moved around during
compaction and then your callbacks[i] is invalid.

As discussed on irc you need to create your callbacks[] array as ocaml
block and register that itself as root. That also has the benefit that
you only have to register a single root instead of 50.

For an example you can look at my libfuse bindings:

http://git.ocamlcore.org/cgi-bin/gitweb.cgi?p=libfuse-ocaml/libfuse-ocaml.git;a=tree;f=lib;h=72f44b5090e0d1341d679f52249bbf3974709c1c;hb=HEAD

The idea there is to have a ocaml block with callbacks and a custom
block with finalizer that contains a pointer to the ocaml block mixed
with plain C data. The GC will not inspect the contents of the custom
block so the block of callbacks is registered as global root to keep it
alive and unregistered in the finalizer. Since you are going to use a
static block of calbacks you only need to register the root, it never
dies.

In my case I use a record containing all callbacks to initialize the C
side. With fuse most use cases will set most of the callbacks so having
to set each in turn would be more work. I also use Obj.magic 0 to
initialize callbacks that are unused. An alternative is to use 'a option
types but I decided against that to avoid the extra indirection. Those
callbacks get called a lot and speed is relevant.

Callbacks can be set as needed and the intended use for this is

let my_ops = { default_ops with
init = my_init;
destroy = my_destroy;
}

Using the with syntx of records one only overrides the callbckas one
wishes to use.

Changing the code to set callbacks individualy is left as an exercise to
the reader. :)

MfG
Goswin

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] using modules to wrap c++ classes

2012-05-03 Thread Goswin von Brederlow
Joel Reymont joe...@gmail.com writes:

 Suppose I have a C++ class with a bunch of methods that act as callbacks.

 My callbacks will be hit if I derive the class and implement the
 methods I'm interested in.

 Now, I would like to wrap this class from OCaml.

 This is straightforward from using OCaml classes (section 18.3.5 of
 the manual) but is there a way to implement this using modules?

 I don't think this is possible but would like to confirm.

 Thanks, Joel

From the top of my head:

module M = struct
  type t
  external foo : t - unit = caml_mine_M_foo
end


class M {
  void foo(void) { }
};

value caml_mine_M_foo(value ml_m) {
  CAMLparam1(ml_m);
  M *m = (M*)ml_m;
  m-foo();
  CAMLreturn Val_unit;
}

Wouldn't that do?

It gets harder with inheritance, when you have modules A, B, C for the
different classes and want to be able to pass a C.t to a function
expecting a B.t. You will need to declare a C.as_b function and such.

MfG
Goswin

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] using infix operators/functions in modules when not using open

2012-04-27 Thread Goswin von Brederlow
Gabriel Scherer gabriel.sche...@gmail.com writes:

 Module.(foo (op) bar)

 This is a local open (a more explicit notation is let open Module in
 foo op bar). I think this is a fine solution: I don't like open, but
 local mitigates its flaws.

 Another solution is to rebind your operator locally:

   let (op) = Module.blah in
   foo op bar

 (of course Module.blah doesn't need to be an infix itself)

 If you are going to use an infix operator in a wide scope, I think
 that such a rebinding solution is nice because it's more explicit,
 which is good when it helps users understand the code.

And that also works more globally as in

let (op) = Module.blah

let foo x y = x (op) y

MfG
Goswin

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] exn vs option

2012-04-11 Thread Goswin von Brederlow
Daniel Bünzli daniel.buen...@erratique.ch writes:

 Le jeudi, 5 avril 2012 à 11:05, Goswin von Brederlow a écrit :
 If you are writing a module then consider providing both flavours for
 functions, one with exceptions and one with options. Even if you only do
 something like this:


 I don't think it's a rule that should be applied consistently, I don't like 
 libraries that provide tons of convenience functions, too many names. If you 
 stick with the option type (or variant cases) the user can quickly wrap the 
 function if it needs an exception (or vice-versa but I prefer it that way 
 since it's then documented by the function signature and is in my opinion 
 better behaved in general).

 However in the particular case of finding something in a data structure where 
 the user could be confronted with a situation where he can prove that the 
 datum will be found I think it's justified to provide both flavours. For 
 these cases, in my own code I settled on the following pattern :  

 val find : 'a t - key - 'a option
 (** [find d k] is the value bound to [k] in [d], if any. *)

 val get : 'a t - key - 'a  
 (** [get d k] is like {!find} but raises [Invalid_argument] if [k] is not 
 bound in [d]. *)

That pattern works well if you have just one or few such functions and
can think of nice names to differentiate them.

Otherwise submodules are usefull:

module Foo = Foo.Options
or
module Foo = Foo.Exceptions

Foo.do_things ()

or

open Foo.Options
do_things ()

The option and exception raising functions would be named the same and
the user selects one or the other by selecting a submodule.

As said, it is usualy a matter of taste and the users tastes might
differ from the module authors taste. By providing an option and
execption flavour of a function (by different names or submodules hardly
matters) the user can use what he likes as opposed to what the modules
author imposes on him.

MfG
Goswin

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] exn vs option

2012-04-05 Thread Goswin von Brederlow
Pierre Chopin pie...@punchup.com writes:

 Hi,

 I benchmarked two programs, in one case the main function throw an exception
 that is caught, in the other the function returns an option that is pattern
 matched on.

 I noticed that, whether the exception is thrown or not, the option version is
 always faster.

 Is there any case where it makes sense, performance wise, to use exception
 instead of 'a option ?

I find that in most cases speed is not a major concern and then it comes
down to taste and readability of the code.

 test1.ml
 --

 exception Foo
 let f x =
  if x =1 then raise Foo else ()

 ;;

  for i = 0 to 10_000_000 do
 try
     f 1
 with Foo - ()
 done

0.34s user 0.01s system 99% cpu 0.351 total

That is rather short for a test. lets add 2 zeroes to the loop. And lets
call f 0 and f 1 to test both cases:

f 0: 17.72s user 0.02s system 99% cpu 17.792 total
f 1: 35.30s user 0.02s system 99% cpu 35.371 total

 
 test2.ml:
 
 let f x =
     if x=1 then None else Some ()

 ;;
 for i = 0 to 10_000_000 do
     match f 1 with
         None - ()
     |   Some s - s
     done
 

f 0: 11.60s user 0.02s system 99% cpu 11.655 total
f 1: 10.91s user 0.01s system 99% cpu 10.933 total

And lets test the speed when the exception is actualy exceptional:

exception Foo
let f x = if x =1 then raise Foo else ()

let () =
  try
for i = 0 to 10 do
  f 0
done
  with Foo - ()

9.94s user 0.00s system 99% cpu 9.946 total

Someone said in deep recursions exceptions are faster because they don't
have to unwind the stack:

exception Result of int

let rec fac acc = function
  | 1 - raise (Result acc)
  | n - fac (n * acc) (n - 1)

let () =
  for i = 0 to 100_000_000 do
try
  fac 1 50
with Result _ - ()
  done

71.88s user 0.00s system 99% cpu 1:11.90 total


let rec fac acc = function
  | 1 - acc
  | n - fac (n * acc) (n - 1)

let () =
  for i = 0 to 100_000_000 do
ignore (fac 1 50)
  done

67.04s user 0.02s system 99% cpu 1:07.08 total


Not feeling it. Lets try something not tail recursive:

exception Error

let rec foo = function
  | 1 - raise Error
  | n - n * (foo (n - 1))

let () =
  for i = 0 to 100_000_000 do
try
  ignore (foo 50)
with Error - ()
  done

25.03s user 0.01s system 99% cpu 25.068 total

let rec foo = function
  | 1 - None
  | n - match foo (n - 1) with None - None | Some x - Some (n * x)

let () =
  for i = 0 to 100_000_000 do
ignore (foo 50)
  done

49.48s user 0.01s system 99% cpu 49.508 total


In conclusion I would have to say that exceptions are better if they are
exceptional.

When you do not catch them right away or not at all then they are
better. The try command is more expensive than an option type and if
you are going to catch the exception right away anyway then options are
faster.

But if you don't catch them right away the cost of the try can be
amortized over many calls and becomes cheaper. Or if you don't catch the
exception at all then you can get a nice backtrace of where the
exception occured.

If your code is not tail recursive then option types mean you have to
match them on every level again and again and allocate a ton of 'Some x'
if no error occurs. You can make your code tail recursive or use
exception to improve performance there.



If you are writing a module then consider providing both flavours for
functions, one with exceptions and one with options. Even if you only do
something like this:

let find x y = 

let raise_on_none exn f arg =
  match f arg with
  | None - raise exn
  | Some x - x

let find_exn x y = raise_on_none Not_found (find x) y

Obviously option only works for exceptions like Not_found. If you want
to return an error you might have to use something like

   type ('a, 'b) result = Result of 'a | Error of 'b

Putting the 2 flavours into different submodules can keep things tidy too.

MfG
Goswin

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] Wish: mutable variant types, equivalence with records

2012-03-31 Thread Goswin von Brederlow
François Bobot bo...@lri.fr writes:

 On 30/03/2012 07:16, Goswin von Brederlow wrote:
 François Bobotbo...@lri.fr  writes:

 On 24/03/2012 13:45, Wojciech Meyer wrote:
 Please see [1], Alain Frisch has been working recently on implementing
 in-line records for constructor arguments.

 It's more implementation/design implications than people might think.

 [1] http://caml.inria.fr/mantis/view.php?id=5528

 In the thread of this proposed feature, there is a remark that inlined
 record and normal record can become competing features. There is also
 the burden that inlined record as proposed can be modified only in
 pattern matching.

 But can't we consider that, for a semantic, syntax and typing perspective:

 type t =
 | A of string
 | B of ({msg: string; mutable foo:int} as t2)
 | C

 is exactly the same thing than:

 type t =
 | A of string
 | B of t2
 | C

 and t2 = {msg: string; mutable foo:int}

 That would change existing code because B of t2 currently is a Block of
 size 1 with tag B and pointer to a record of type t2.

 I don't propose to change how is compiled the second definition. Just
 that if a developer use the first definition in a module he can
 consider that he define t and t2 like in the second definition. The
 only exception is if you use Obj.magic: Obj.magic (B r) == Obj.magic r.
 But Obj.magic is not in the semantic I assume.


 The only difference is that when you create a record of type t2 the
 tag is directly the one of B in the first case and is the default tag
 for record (the first tag if I remember well) in the second case. So
 in the first case applying the constructor B is just the identity.

 Maybe the type of a record could be altered to, at least internally,
 include a tag value:
 Every records already include a tag value, it's just always the same:
 the tag of the first non-constant constructor of a sum type.

No it doesn't. Not in its type. It just happens to have a tag 0 in its
memory representation.

Say you have

type r1 = { x : int; }
type Foo = Foo of int | Bar of ({ x : int; } as r2)
type Foo = Foo of ({ x : int; } as r3) | Bar of int

The type r1 and the internal type r2 are not the same. A record of type
r2 would have a tag of 1. Same with r2 and r3, they should be different
types.

 type t = A of string | B of ({msg: string; mutable foo:int} as t2) | C
 type t2 = {[B] x : int; }

 The inline record would be identical to the external record with tag
 value and (t2 :  t) would work even implicitly.

 I'm not sure that we should make implicit conversion (inference?
 principality?). But that B r is a noop can be very interesting.

The implicit parts would be in matches for example. match x with B r -
would basically just translate to (x : t2) with the additional check
that the tag is right. Or

let bar = fun i - {[B] x = i; }
let foo : int - t = fun i - bar i

val bar : int - {[B] x : int; }
val foo : int - t

On the other hand bar could already be interfered as type int - t.
It all depends on wether the [tag] syntax would be restricted to
inline records in constructors or not.

It probably makes sense to restrict declaring a record with tag to
inline records. Otherwise it might get confusing what value the tag
should actually have.


One other thing: How do you declare a value of such a type?

type r = { x : int; }
type foo = Foo of r | Blub of int
type bar = Bar of { y : int; } | Blubber of int

let myfoo = Foo { x = 0; }
let mybar = Bar { y = 0; }

This would be confusing since the two look identical but do something
quite different.

Maybe the syntax should be more like this:

type r = { x : int; }
type foo = Foo of r | Blub of int
type bar = { Bar y : int; } | Blubber of int

let myfoo = Foo { x = 0; }
let mybar = { Bar y = 0; }

With the constructor actualy inside the record the two cases look
different and it is visually clear that the Bar tag is part of the
record while Foo is a constructor taking a record as argument.

 If we want to play to extend this feature (perhaps too much) we can
 play with conversion between sum type without cost by allowing Ap to
 be also the identity:


 type tpublic =
 | A of ({... } as ta)
 | B of ({... } as tb)

 type tprivate =
 | Ap of ta
 | Bp of tb
 | Cp of ...


 Since A,ta and Ap share the same tag the following function is the identity:

Again no, at least it isn't a NOP. The current memory representation of
Ap of ta is incompatible with the representation we want for A of
({... } as ta).

MfG
Goswin

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] Wish: mutable variant types, equivalence with records

2012-03-30 Thread Goswin von Brederlow
François Bobot bo...@lri.fr writes:

 On 24/03/2012 13:45, Wojciech Meyer wrote:
 Please see [1], Alain Frisch has been working recently on implementing
 in-line records for constructor arguments.

 It's more implementation/design implications than people might think.

 [1] http://caml.inria.fr/mantis/view.php?id=5528

 In the thread of this proposed feature, there is a remark that inlined
 record and normal record can become competing features. There is also
 the burden that inlined record as proposed can be modified only in
 pattern matching.

 But can't we consider that, for a semantic, syntax and typing perspective:

 type t =
| A of string
| B of ({msg: string; mutable foo:int} as t2)
| C

 is exactly the same thing than:

 type t =
| A of string
| B of t2
| C

 and t2 = {msg: string; mutable foo:int}

That would change existing code because B of t2 currently is a Block of
size 1 with tag B and pointer to a record of type t2.

 The only difference is that when you create a record of type t2 the
 tag is directly the one of B in the first case and is the default tag
 for record (the first tag if I remember well) in the second case. So
 in the first case applying the constructor B is just the identity.

Maybe the type of a record could be altered to, at least internally,
include a tag value:

type t = A of string | B of ({msg: string; mutable foo:int} as t2) | C
type t2 = {[B] x : int; }

The inline record would be identical to the external record with tag
value and (t2 : t) would work even implicitly.

MfG
Goswin

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] Wish: mutable variant types, equivalence with records

2012-03-25 Thread Goswin von Brederlow
Lukasz Stafiniak lukst...@gmail.com writes:

 On Sat, Mar 24, 2012 at 7:39 PM, Lukasz Stafiniak lukst...@gmail.com wrote:
 On Sat, Mar 24, 2012 at 7:32 PM, Lukasz Stafiniak lukst...@gmail.com wrote:

 I'm not sure about mutable but I'd appreciate labels :D

 As for syntax, I think that unboxed anonymous records would be better.

 For starters, one could make a Camlp4 extension that generates a
 record type named typ_Variant for a type typ and its constructor
 Variant whose fields are defined as a record. Hmm...

Yes please.

That wouldn't eliminate the risk of the two types getting out-of-sync
and save keystrokes.

 Record unboxing might be handled as an orthogonal issue?

MfG
Goswin

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



[Caml-list] GADT and optional arguments: Can they work?

2012-03-25 Thread Goswin von Brederlow
Hi,

after having just written a ton of functions that just differ in the
argument type I started to think that this might be a good use acase for
GADT types.

So instead of seperate get_int, get_int32, get_int64, ... functions
there would only be one get function with an extra GADT argument to
specifiy the type to be used.


For example Unix.lseek and Unix.LargeFile.lseek could be written as:

(* GADT for LargeFile *)

type _ size = Int : int size | Int64 : int64 size

let lseek : type a . a size - Unix.file_descr - a - Unix.seek_command - a =
  fun size fd off cmd -
match size with
  | Int - Unix.lseek fd off cmd
  | Int64 - Unix.LargeFile.lseek fd off cmd

val lseek : 'a size - Unix.file_descr - 'a - Unix.seek_command - 'a = fun


So far so good. But now one has to specify the argument size on every
lseek call. To make it nicer I would like the size to be optional and
specify a default size for the most common use:

let lseek : type a . ?size:a size - Unix.file_descr - a - Unix.seek_command 
- a =
  fun ?(size=Int) fd off cmd -
match size with
  | Int - Unix.lseek fd off cmd
  | Int64 - Unix.LargeFile.lseek fd off cmd
;;
(*
 * fun ?(size=Int) fd off cmd -
 *^^^
 * Error: This expression has type int size
 *but an expression was expected of type a size
*)

Is there a way to write this so it types correctly? I guess I would need
to somehow specify a default type for a for the a size argument to
be optional.

MfG
Goswin

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] Strategies for finding memory leaks

2012-03-24 Thread Goswin von Brederlow
Hans Ole Rafaelsen hrafael...@gmail.com writes:

 Hi,

 I have tried to trigger Gc.compact but that only have limited effect for a
 short time.

 Hans Ole

My thinking was rather that something triggers the GC manually and over
time starts to do that far too often so more and more time is spend in
the GC.

If it isn't triggered manually then custom types can also give the GC a
wrong idea about how much memory is used/wasted. But that is also just
an idea of where to look.

MfG
Goswin

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



[Caml-list] Wish: mutable variant types, equivalence with records

2012-03-24 Thread Goswin von Brederlow
Hi,

consider the code below that counts how often a value is printed.

The reason why this code works is that the memory layout of a
variant with arguments is identical to a record with the same
types. The only difference is that a variant sets the tag of the memory
block to reflect which constructor it is (here Bar = 0, Baz = 1).

But the code is fragile. It requires the use of Obj.magic and duplicates
the definitions of bar and baz (once as variant and once as record). If
the type of foo is changed but not the records then bad things will
happen.

There are ways to do this without Obj.magic:

type foo = Bar of bar_record | Baz of baz_record
type foo = Bar of int * int ref | Baz of float * int ref

The first adds an indirection for every access to foo and breaks
matching. The second adds an indirection for every mutable value in the
type. In both cases the extra indirections increase the memory footprint
and runtime.


So why not allow mutable in variant types and some equivalence with
records? For example:

type name = Constructor of [mutable] type[:label] [| ...]

as in:

type foo =
  | Bar of int:i * mutable int:bar_used
  | Baz of float:f * mutable int:baz_used

let use x =
  match x with
| Bar bar -
bar.bar_used - bar.bar_used + 1
| Baz baz -
baz.baz_used - baz.baz_used + 1

let print x =
  use x;
  match x with
  | Bar bar - Printf.printf %d\n bar.i
  | Baz baz - Printf.printf %f\n baz.f

The label is optional and any types in the constructor without label
would be translated into anonymous fields in a record that are
ineaccessible.

  type foo = Foo of int * mutable int:used

would be equivalent to { _ : int; mutable used : int; }

Taking it one step wurther one could even allow:

let bar = { i = 1; bar_used = 0; }
let foo = Bar bar
let foo = let Bar bar = foo in Bar { bar with i = 2; }


What do you think?

MfG
Goswin

==
module Foo : sig
  type foo
  val make_bar : int - foo
  val make_baz : float - foo
  val get_used : foo - int
  val print : foo - unit
end = struct
  type foo = Bar of int * int | Baz of float * int
  type bar_record = { i : int; mutable bar_used : int; }
  type baz_record = { f : float; mutable baz_used : int; }

  let make_bar i = Bar (i, 0)
  let make_baz f = Baz (f, 0)

  let use x =
match x with
| Bar _ -
let (bar : bar_record) = Obj.magic x
in bar.bar_used - bar.bar_used + 1
| Baz _ -
let (baz : baz_record) = Obj.magic x
in baz.baz_used - baz.baz_used + 1

  let get_used = function Bar (_, used) | Baz (_, used) - used

  let print x =
use x;
match x with
| Bar (i, _) - Printf.printf %d\n i
| Baz (f, _) - Printf.printf %f\n f
end;;

let foo = Foo.make_bar 1
let used_before = Foo.get_used foo
let () = Foo.print foo
let used_after = Foo.get_used foo




-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] Strategies for finding memory leaks

2012-03-22 Thread Goswin von Brederlow
Hans Ole Rafaelsen hrafael...@gmail.com writes:

 Hello,

 Is there some tools / tricks that can be used to help find memory leaks?

 I have trouble with an application, that when running for a long time, starts
 to use a lot of CPU and consume more memory. It starts out by using about 20%
 CPU (reported by top) and after 24 hours it has increased to 80% usage. Also
 physical (RES) memory usage goes from 80M to 160M. The workload for the
 application is the same the whole time.

 Using OProfile (http://oprofile.sourceforge.net/news/) shows that that most of
 the time is being spent doing memory management.

 At startup:
 CPU: Core 2, speed 2667 MHz (estimated)
 Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit 
 mask
 of 0x00 (Unhalted core cycles) count 10
 samples  %    image name   symbol name
 52764    22.3913  vc_client.native mark_slice
 33580    14.2502  vc_client.native caml_page_table_lookup
 25415    10.7853  vc_client.native sweep_slice
 10119 4.2942  vc_client.native caml_fl_allocate
 6423  2.7257  [vdso] (tgid:9015 range:0x7fff4256c000-0x7fff4256d000) 
 [vdso]
 (tgid:9015 range:0x7fff4256c000-0x7fff4256d000)
 5233  2.2207  vc_client.native
 camlLividisvc__Nalbuf_tools__replace_pattern_1033
 2759  1.1708  vc_client.native caml_iterate_global_roots
 2728  1.1577  vc_client.native caml_modify
 2473  1.0495  vc_client.native caml_oldify_one
 2204  0.9353  vc_client.native
 camlLividisvc__Nalbuf_bytestream__search_1047
 2183  0.9264  vc_client.native caml_darken
 1935  0.8212  vc_client.native caml_stash_backtrace
 1843  0.7821  vc_client.native caml_do_roots
 1838  0.7800  vc_client.native caml_delete_global_root

 After ca. 24 hours run:
 CPU: Core 2, speed 2667 MHz (estimated)
 Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit 
 mask
 of 0x00 (Unhalted core cycles) count 10
 samples  %    image name   symbol name
 1137401  56.2697  vc_client.native mark_slice
 405598   20.0658  vc_client.native sweep_slice
 399832   19.7806  vc_client.native caml_page_table_lookup
 10106 0.5000  vc_client.native caml_fl_allocate
 3548  0.1755  vc_client.native caml_iterate_global_roots
 3397  0.1681  [vdso] (tgid:26129 range:0x7fff747ff000-0x7fff7480)
 [vdso]
  (tgid:26129 range:0x7fff747ff000-0x7fff7480)
 2797  0.1384  vc_client.native
 camlLividisvc__Nalbuf_tools__replace_
 pattern_1033
 2307  0.1141  vc_client.native
 camlLividisvc__Nalbuf_bytestream__sea
 rch_1047
 2005  0.0992  vc_client.native caml_oldify_local_roots
 1786  0.0884  vc_client.native caml_gc_stat
 1441  0.0713  vc_client.native caml_oldify_one
 1163  0.0575  vc_client.native caml_darken
 1163  0.0575  vc_client.native caml_fl_merge_block
 1032  0.0511  vc_client.native camlHashtbl__find_1093

 The application uses several 3rd party libraries, including: LablGTK2,
 OCamlNet, LWT and others.

 Is there some clever trick that can by used to track down or get a hint of 
 what
 is causing this?

 Thanks,

 Hans Ole Rafaelsen

Are you calling the GC manually somewhere in the code or in one of the
libs?

MfG
Goswin

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



[Caml-list] Re: Wanted: GADT examples: string length, counting module x

2012-03-22 Thread Goswin von Brederlow
o...@okmij.org writes:

 Somehow typed tagless interpreters (embeddings of a typed language)
 and length-parameterized lists with the append function are the most
 popular examples of GADTs. Neither of them seem to be particularly
 good or compelling examples. One can write typed interpreters in the
 final-tagless style, with no GADTs. Writing append function whose type
 says the length of the resulting list is the sum of the lengths of the
 argument lists is cute. However, this example does not go too far, as
 one discovers when trying to write List.filter for
 length-parameterized lists.

 The ML2010 talk on GADT emulation specifically used a different
 illustrating example: a sort of generic programming, or implementing
 N-morphic functions:
  
 http://okmij.org/ftp/ML/first-class-modules/first-class-modules-talk-notes.pdf

An interesting talk to read. But unless I'm mistaken that isn't using
ocamls new GADT types but implements it own.

MfG
Goswin

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



[Caml-list] GADT example for universal list

2012-03-22 Thread Goswin von Brederlow
Hi,

I played some more with GADTs, this time with an universal list. Well,
not truely universal as you can't store arbitrary types. Only Int and
Float are allowed here:

type _ kind = (* used for searching in the list *)
  | Int_kind : int kind
  | Float_kind : float kind

type value = (* box the values so we have runtime type information *)
  | Int of int
  | Float of float

type list = (* the universal list *)
  | Nil : list
  | Cons : value * list - list

(* find the first value in the list of a given kind *)
let rec get : type a . list - a kind - a = fun l k -
  match (k, l) with
| (_, Nil) - raise Not_found
| (Int_kind, Cons (Int x, xs)) - x
| (Float_kind, Cons (Float x, xs)) - x
| (_, Cons (_, xs)) - get xs k

(* print out list *)
let rec print = function
  | Nil - print_newline ()
  | Cons ((Int x), xs) - Printf.printf %d  x; print xs
  | Cons ((Float x), xs) - Printf.printf %f  x; print xs

(* testing *)
let empty = Nil
let l1 = Cons ((Int 1), empty)
let l2 = Cons ((Float 2.), l1)
let () = print l2
let i = get l2 Int_kind
let f = get l2 Float_kind;;

(*
  2.00 1 
type _ kind = Int_kind : int kind | Float_kind : float kind
type value = Int of int | Float of float
type list = Nil : list | Cons : value * list - list
val get : list - 'a kind - 'a = fun
val print : list - unit = fun
val empty : list = Nil
val l1 : list = Cons (Int 1, Nil)
val l2 : list = Cons (Float 2., Cons (Int 1, Nil))
val i : int = 1
val f : float = 2.
*)

At first glance you might think: Why does that need GADTs? Why not
simply use 

type value = Int of int | Float of float

for this?


Take a close look at the get funktion:

val get : list - 'a kind - 'a = fun

It does not return a value but directly the unboxed type. Because the
value is unboxed you can directly use it and ocaml will detect if you
screw up the type like in:

let () = Printf.printf %d\n (get l2 Float_kind);;
Error: This expression has type float but an expression was expected
   of type int

MfG
Goswin

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] Wanted: GADT examples: string length, counting module x

2012-03-21 Thread Goswin von Brederlow
Raoul Duke rao...@gmail.com writes:

 On Mon, Mar 19, 2012 at 9:13 AM, Jesper Louis Andersen
 If you want to play with dependent types, there are two ways which
 seem popular at the moment: Agda or Coq.

 and some not popular ones...

 http://www.ats-lang.org/

 http://sandycat.info/blog/deptypes-shen/

 et. al.

Neigther of which are examples for GADT in ocaml.

MfG
Goswin

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] Can one implement greedy/inline data structures in ocaml?

2012-03-10 Thread Goswin von Brederlow
Markus Weißmann markus.weissm...@in.tum.de writes:

 Taking one step back, what you actually want is a data structure that:

 -is a container for elements of some type 'a
 -some elements are tagged with a state tag 'b
 -the elements have an ordering
 -you can add elements
 -you can tag elements with a state tag
 -you can remove the current element
 -you can remove the tag from the current element
 -you can get the current element
 -you can move the current element one step (forwardbackward)
 -you can get the tag of the current element
 -you can move the current element one step to the next/prev tagged one

No, that was just the simplest example I could think of.

 And you want this structure to be fast, right? Or do you only care about
 the concrete implementation?
 You should think about the type of the elements and tags you want to be
 able to use: Do they have a natural ordering? Are they hashable?

 I would advice you to 1st write the interface with the functions you
 require, then think about the implementation. I'm under the impression
 that you have a decent C implementation and want to transfer that to OCaml
 at all costs.

I want a set of standard containers that are combinable so that given an
element from any one of the containers it can be removed from all of
them efficiently. Efficient doesn't allways equal fast. What I don't
want is an operation that should be O(1) to suddenly take O(log n).

More generally given an element from any one of the containers all the
normal operation of any other container should be possible. All
containers should have a remove this element operation. Lists have a
next. Double linked lists have next and prev. Priority queue
should have change priority. Queue might have move to front and
move to back.

MfG
Goswin

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] Can one implement greedy/inline data structures in ocaml?

2012-03-10 Thread Goswin von Brederlow
Gabriel Scherer gabriel.sche...@gmail.com writes:

 I think this is a case of using a fancy new construction when a
 simpler construction would do just as well.
 For some reasons some people, even beginners, are absolutely fond of
 first-class modules, and insist on using them in situation where
 they're really not needed. I not eager to see what they will do with
 GADTs...

 In your case, you could simply use ('a - 'a dllist) instead of your
 Accessor module. Your code works just fine with the following
 definitions of make_accessor and get_next:

   let make_accessor f = f (* useless, for code compatibility *)
   let get_next get y = (get y).next

 I think learning *not to use* fancy features is just as fun as using them.

Sure. I think you missed the For the fun of it right at the begining.

One advantage of using a first class module though is that they are
extensible. I could define 2 accessor modules:

    let module DList_Accessor = struct
      type a = c
      let get_next x = get_next x.dlist
      let get_prev x = get_prev x.dlist
      let set_next x = set_next x.dlist
      let set_prev x = set_prev x.dlist
    end

    let module PrioQueue_Accessor = struct
      type a = c
      let get_next x = get_next x.queue
      let get_prev x = get_prev x.queue
      let set_next x = set_next x.queue
      let set_prev x = set_prev x.queue
  let get_prio x = get_prio x.queue
  let set_prio x = set_prio x.queue
    end

Now any function that operates on (module DList) can also be called with
(module PrioQueue). You can't do that with closures or records of
closures. You can do it with objects though.

Note: you probably wouldn't implement a priotity queue as doubly linked
list. Just an example.

 On Fri, Mar 9, 2012 at 8:50 AM, Goswin von Brederlow goswin-...@web.de 
 wrote:
 Gabriel Scherer gabriel.sche...@gmail.com writes:

 I have implemented a small example in two versions: one with external
 dllist (I'm using the Batteries Dllist module), which indeed needs a
 small hack to bootstrap a single-element cycle, and one with inline
 pointers as you show, using simple value recursion. In the second case
 I also have a small abstraction layer for traversals -- only
 instantiated in the 'iter' case currently -- that could save some code
 duplication; that must be one of the closures tricks you were
 thinking of, but I don't claim that it's as efficient than duplicated
 specialized traversal functions.

 The code is here:
   
 https://gitorious.org/gasche-snippets/outside-or-inline-lists/blobs/master/goswin_dllists.ml

 For the fun of it I wrote a version with first class modules. It is
 pretty much like your inline solution with records of closures, just a
 different syntax.

 The one thing I can't eliminate is the code duplication for the
 initialization. I'm starting to believe there is no way around using
 Obj.magic or option types there. At least for the verry first task (in
 each dlist). The right-hand restriction in a let rec simply prevents
 putting that code into a function.

 For DList the duplication is minimal because the container is so
 simple. But for more complex container more code would have to be
 duplicated. The code could be hidden behind camlp4 though.

 MfG
        Goswin

 --

 (* Type for doubly linked list *)
 type 'a dlist = {
  mutable next : 'a;
  mutable prev : 'a;
 }

 (* Module to access a container in another type *)
 module type Accessor = sig
  type a
  val get : a - a dlist
 end

 (* create an access module from a closure *)
 let make_accessor : 'a . ('a - 'a dlist) -
  (module Accessor with type a = 'a) =
  fun (type c) (fn : c - c dlist) -
    let module M = struct
      type a = c
      let get x = fn x
    end
    in
    (module M : Accessor with type a = c)

 (* Iterator over a DList *)
 let get_next : 'a . (module Accessor with type a = 'a) - 'a - 'a =
  fun (type a) x y -
    let module D = (val x : Accessor with type a = a)
    in
    (D.get y).next

 (* Now lets use this: *)

 (* A task has 2 DList: all and state *)
 type task = {
  all : task dlist;
  state : task dlist;
  name : string;
 }

 (* Accessors for the two DLists *)
 let all = make_accessor (fun task - task.all)
 let state = make_accessor (fun task - task.state)

 (* Some sample tasks *)
 let rec task1 = {
  all = { next = task2; prev = task2; };
  state = { next = task1; prev = task1; };
  name = task1;
 }
 and task2 = {
  all = { next = task1; prev = task1; };
  state = { next = task2; prev = task2; };
  name = task2;
 }

 let () =
  Printf.printf all_next = %s, state_next = %s \n
    (get_next all task1).name
    (get_next state task1).name

MfG
Goswin

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] Can one implement greedy/inline data structures in ocaml?

2012-03-08 Thread Goswin von Brederlow
Gerd Stolpmann i...@gerd-stolpmann.de writes:

 Am Donnerstag, den 08.03.2012, 06:11 +0100 schrieb Goswin von Brederlow:
 Gabriel Scherer gabriel.sche...@gmail.com writes:
 
  So then you need mutable option types or mutables that you initialize
  with dummy values and then overwrite with the real thing once all
  members of a cycle are created. Or some other trickery to make it
  work. Overall cycles are generally not good.
 
  I believe the trick you need (either passing a dummy value of your
  type, or Obj.magic) is less ugly that your Camlp4 solution for inline
  access.
  If I really needed absolute performance, I'd use the inline version
  just like in C, with mutable fields, but without preprocessor sugar.
  Preprocessing could avoid a bit of duplication (corresponding to
  manual inlining) on the structure-definition side, but wouldn't
  simplify the structure-using code much.
 
 How?
 
 type task = {
  mutable all_next : task;
  mutable all_prev : task;
  mutable state_next : task;
  mutable state_prev : tast;
  ...
 }
 
 Now how do yout write DList.insert, DList.remove, DList.iter, ...?
 
 I'm looking for some nice tricks using closures, functors, first class
 modules or something to make it look pretty and easy to use.

 There is a solution with functors. However, I don't dare to predict how
 fast it is (functors add additional indirection):

 module type DLINKABLE = sig
   type t
   val next : t - t option
   val prev : t - t option
   val set_next : t - t option - unit
   val set_prev : t - t option - unit
 end

 module type DHEAD = sig
   type dlist
   type t
   val first : dlist - t option
   val last : dlist - t option
   val set_first : dlist - t option - unit
   val set_last : dlist - t option - unit
 end

 module DLIST(H : DHEAD)(L : DLINKABLE with type t = H.t) = struct
   (* Here define operations over lists, e.g. *)

   let remove (dl:H.dlist) (x:D.t) =
 ( match D.prev x with
 | None -
 H.set_first dl (D.next x)
 | Some p -
 D.set_next (D.next x)
 );
 ( match D.next x with
 | None -
 H.set_last dl (D.prev x)
 | Some n -
 D.set_prev (D.prev x)
 )
 end

 For a concrete application you would only have to implement DLINKABLE
 and DHEAD, and then apply DLIST on these implementations. If elements
 are members of several lists, do this for each list separately. I guess
 you could define macros for generating DLINKABLE and DHEAD (even the
 macros built into camlp4 could be good enough for this).

 Gerd

That is a start. But you used an open ended doubly linked list. This
makes it simpler because you can have an empty list and initialize items
to not be linked to anything before inserting them. That avoids the
cycles.

If you have a closed (cyclic or ring) doubly linked list the problem
becomes how to create the first element that must link to itself. Can
you come up with something without option type or Obj.magic? Something
that doesn't duplicate the initialization of the DList for each
instance.

MfG
Goswin

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] Can one implement greedy/inline data structures in ocaml?

2012-03-08 Thread Goswin von Brederlow
Gabriel Scherer gabriel.sche...@gmail.com writes:

 I have implemented a small example in two versions: one with external
 dllist (I'm using the Batteries Dllist module), which indeed needs a
 small hack to bootstrap a single-element cycle, and one with inline
 pointers as you show, using simple value recursion. In the second case
 I also have a small abstraction layer for traversals -- only
 instantiated in the 'iter' case currently -- that could save some code
 duplication; that must be one of the closures tricks you were
 thinking of, but I don't claim that it's as efficient than duplicated
 specialized traversal functions.

 The code is here:
   
 https://gitorious.org/gasche-snippets/outside-or-inline-lists/blobs/master/goswin_dllists.ml

For the fun of it I wrote a version with first class modules. It is
pretty much like your inline solution with records of closures, just a
different syntax.

The one thing I can't eliminate is the code duplication for the
initialization. I'm starting to believe there is no way around using
Obj.magic or option types there. At least for the verry first task (in
each dlist). The right-hand restriction in a let rec simply prevents
putting that code into a function.

For DList the duplication is minimal because the container is so
simple. But for more complex container more code would have to be
duplicated. The code could be hidden behind camlp4 though.

MfG
Goswin

--

(* Type for doubly linked list *)
type 'a dlist = {
  mutable next : 'a;
  mutable prev : 'a;
}

(* Module to access a container in another type *)
module type Accessor = sig
  type a
  val get : a - a dlist
end

(* create an access module from a closure *)
let make_accessor : 'a . ('a - 'a dlist) -
  (module Accessor with type a = 'a) =
  fun (type c) (fn : c - c dlist) -
let module M = struct
  type a = c
  let get x = fn x
end
in
(module M : Accessor with type a = c)

(* Iterator over a DList *)
let get_next : 'a . (module Accessor with type a = 'a) - 'a - 'a =
  fun (type a) x y -
let module D = (val x : Accessor with type a = a)
in
(D.get y).next

(* Now lets use this: *)

(* A task has 2 DList: all and state *)
type task = {
  all : task dlist;
  state : task dlist;
  name : string;
}

(* Accessors for the two DLists *)
let all = make_accessor (fun task - task.all)
let state = make_accessor (fun task - task.state)

(* Some sample tasks *)
let rec task1 = {
  all = { next = task2; prev = task2; };
  state = { next = task1; prev = task1; };
  name = task1;
}
and task2 = {
  all = { next = task1; prev = task1; };
  state = { next = task2; prev = task2; };
  name = task2;
}

let () =
  Printf.printf all_next = %s, state_next = %s \n
(get_next all task1).name
(get_next state task1).name

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



[Caml-list] Can one implement greedy/inline data structures in ocaml?

2012-03-07 Thread Goswin von Brederlow
Hi,

I'm wondering if it is possible to implement greedy/inline data
structures in ocaml in a generic and reusable way.

Maybe you don't know what greedy/inline data structures are or maybe I'm
using the wrong name. So here is what I mean:

Say you have a task structure that is in 2 doubly linked list. One list
links all the structures together and the second list links structures
with the same internal state (running, sleeping, waiting, ...). In C you
would then have something like this:

struct task {
  DLIST(task, all);
  DLIST(task, state);
  foo_t data;
}

The DLIST makro adds the prev/next pointers to the struct, one set
prefixed with all, the other with state. There would also be makros to
to init, insert, remove and iterate of the two lists.

The internal pointer needed for the DLIST are part of the struct
task. The big advantage of this (and why I ask) is that given a task one
can esily remove it from both lists.

On the other hand if the list is external to the struct and only
contains a pointer to the task then removing a task becomes
difficult. The task then needs pointers to each of the lists data
structures creating cycles. Not good for ocaml. It also would waste
memory for 2 pointers (per list).



So how would you design such greedy/inline data structures in ocaml?

Or how else would you design containers (list, doubly linked list,
queue, ...) so that things can be in multiple containers and removing
them is fast?

MfG
Goswin

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] Can one implement greedy/inline data structures in ocaml?

2012-03-07 Thread Goswin von Brederlow
Edgar Friendly thelema...@gmail.com writes:

 On 03/07/2012 09:41 AM, Goswin von Brederlow wrote:
 The task then needs pointers to each of the lists data
 structures creating cycles. Not good for ocaml. It also would waste
 memory for 2 pointers (per list).

 Cycles are fine for ocaml, pointers are pretty cheap, and I think the
 answer to your question is no - there's a regularity to values that's
 required by the garbage collector, and what you seem to want is the
 ability to inline nested structures without pointers, and OCaml's data
 representation doesn't allow this, mainly because of the tag word
 needed by the GC.

 Beware premature optimization.

 E.

The problem with cycles is that you need to define all structures in a
cyled atomically as the same time. That means using let rec foo =
... and bar = ... and sometimes needing to lift function calls out of
the initialization.

For example and from memory you couldn't write this:

let rec task = {
all_list = all_list;
state_list = state_list;
foo = bar;
}
and all_list = DList.create_item all_list_head task
and state_list DList.create_item state_list_head task

At the time the DList.create_item is called the task binding is still
incomplete and you get an error that it can't be on the right-hand side.

So then you need mutable option types or mutables that you initialize
with dummy values and then overwrite with the real thing once all
members of a cycle are created. Or some other trickery to make it
work. Overall cycles are generally not good.


Inlining the DList wihtout another indirection like in C would be nice
but would certainly require some camlp4 code to pull off. But it would
certainly be enough to pull in a pointer to the list structure as long
as you don't also need a back pointer inside the list structure. The
cycle is the problem there, just one direction is fine.

MfG
Goswin

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] Bigarray question: Detecting subarrays, overlaps, ...

2012-03-01 Thread Goswin von Brederlow
Gerd Stolpmann i...@gerd-stolpmann.de writes:

 What I did not understand up to now: It is really easy to keep all this
 information in a helper data structure. Why extend Bigarray?

 Gerd

A helper data structure would mean keeping track of that information on
my own complicating the source and wasting memory at runtime.

On the other hand what I suggested only makes use of what is already
there in the Bigarray data structure. The functions would only make that
information available to the user. I'm not a fan of duplicating
information that is already there. Worst case the helper data structure
could be wrong due to some bug in the code.

MfG
Goswin

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] Bigarray question: Detecting subarrays, overlaps, ...

2012-02-29 Thread Goswin von Brederlow
Gerd Stolpmann i...@gerd-stolpmann.de writes:

 Am Dienstag, den 28.02.2012, 21:17 +0100 schrieb Goswin von Brederlow:
 Hi,
 
 I'm implementing a RAID in userspace using ocaml and NBD (Network Block
 Device) as protocol to export the device. For this I'm using
 Bigarray.Array1 as buffer for data and wrote myself the right read(),
 write(), pread() and pwrite() stubs. The advantage of this (over
 strings) is that Bigarrays are not copied around by the GC so they don't
 need to copy data around between the Bigarray and a temporary buffer in
 the C stub.

 I used the same approach for PlasmaFS. The bigarray-based reads and
 writes are really missing in the stdlib. (If anybody wants to
 experiment, look into the Netsys_mem module of Ocamlnet, and use the
 functions mem_read and mem_write.)

 Btw, you should try to allocate the bigarrays in a page-aligned way.
 This makes the syscalls even more efficient.

Which is actualy a requirement for libaio. At least alignment to
blocksize. My libaio bindings include a stub to create an aligned
Bigarray.

 In my use case, I did not write to devices directly, and could assume
 that the blocks are backed by the page cache. So I did not run into this
 problem you describe in the following.

 For efficiency each request stores all its data in a single
 Bigarray.Array1. For reasons of the RAID implementation large requests
 are split into 4k chunks using Bigarray.Array1.sub and grouped into
 stripes. The stripes are then acted upon independently until all stripes
 involved in a request are finished. Then the request is completed.
 
 Now I get a problem when 2 requests come in that overlap. Say I get a
 write request for blocks 0 - 6 and a read request for blocks 4-9 in
 that order. Since I already have the data for block 4-6 in memory from
 the write request I do not want to reread them from disk. On the stripe
 level the data looks like this:
 
 |W0 W1 W2 W3 W4 W5 W6 .  .  .  | write 0-6
 |W4 W5 W6 R7 R8 R9 | read  4-9
 
 For the read request I need to copy W4-6 to R4-6 or send out W4-6 + R7-9
 in two writes. I think I would like to avoid sending each stripe chunk
 out seperately.

 Why not? This seems to be an elegant solution, and I don't see why this
 should make the accesses slower. The time for the extra context switches
 in negligible compared to the disk accesses.

A lot of the time data will just move between caches so disk speed is
secondary. And sending each chunk seperately would mean up to 32 context
switches instead of 1. And writing to a socket in chunks can lead to the
data being send out in frames less than the MTU which would be
wastefull. So I think there is some gain in limiting this.

  On the other hand I could implement (p)readv() and
 (p)writev() stubs.
 
 Anyway, my question now is how to detect which subarrays in the stripes
 are part of a common larger array? Do I need to write my own C stub that
 looks into the internas of the arrays to see if they share a common
 ancestor?  I think that would be preferable to tracking the origin of
 each subarray myself.

 Yes, subarrays are tracked, but this feature exists only to catch the
 right moment for unmmapping bigarrays (if needed). As far as I remember,
 this is not tracked as a sub/super relationship, but the bigarrays
 accessing the same buffer share then the same buffer descriptor, and
 when the use count drops to 0, the buffer is finally destroyed. So, you
 cannot say which bigarray is the original one, and it can even be that
 the original bigarray is already deallocated but the backing buffer is
 not yet.

All subarrays share a struct caml_ba_proxy, as you say to catch the
right moment for unmmapping bigarrays. So two subarrays are part of the
same bigarray if they have the same proxy object. That seems like an
easy enough test. Which one is the original bigarray doesn't matter, at
least to me.

 On a similar note how does Bigarray.Array1.blit handle arrays that are
 part of the same larger array and overlap?
 
 --
 I'm missing functions like:
 
 val super : ('a, 'b, 'c) t - ('a, 'b, 'c) t - ('a, 'b, 'c) t
 Create a merged sub-array of 2 adjacent sub-arrays of the same
 big array.

 This function would be possible to implement. The requirement would be
 that both bigarrays use the same buffer descriptor.

 val same_parent : ('a, 'b, 'c) t - ('a, 'b, 'c) t - bool
 True if the 2 (sub-)arrays are part of the same big array.

 I would not call it same_parent, but same_backing_buffer.

Maybe related would be better?

 val offset : ('a, 'b, 'c) t - int
 Offset of the sub-array in the underlying big array.

 I think this information is in the current implementation not available.
 As buffer sharing is also possible after reshaping, this is also not
 meaningful in the general case.

offset = array-data - array-proxy-data;

After reshaping the offset would be the offset into the original array
after reshaping

[Caml-list] Bigarray question: Detecting subarrays, overlaps, ...

2012-02-28 Thread Goswin von Brederlow
Hi,

I'm implementing a RAID in userspace using ocaml and NBD (Network Block
Device) as protocol to export the device. For this I'm using
Bigarray.Array1 as buffer for data and wrote myself the right read(),
write(), pread() and pwrite() stubs. The advantage of this (over
strings) is that Bigarrays are not copied around by the GC so they don't
need to copy data around between the Bigarray and a temporary buffer in
the C stub.

For efficiency each request stores all its data in a single
Bigarray.Array1. For reasons of the RAID implementation large requests
are split into 4k chunks using Bigarray.Array1.sub and grouped into
stripes. The stripes are then acted upon independently until all stripes
involved in a request are finished. Then the request is completed.

Now I get a problem when 2 requests come in that overlap. Say I get a
write request for blocks 0 - 6 and a read request for blocks 4-9 in
that order. Since I already have the data for block 4-6 in memory from
the write request I do not want to reread them from disk. On the stripe
level the data looks like this:

|W0 W1 W2 W3 W4 W5 W6 .  .  .  | write 0-6
|W4 W5 W6 R7 R8 R9 | read  4-9

For the read request I need to copy W4-6 to R4-6 or send out W4-6 + R7-9
in two writes. I think I would like to avoid sending each stripe chunk
out seperately. On the other hand I could implement (p)readv() and
(p)writev() stubs.

Anyway, my question now is how to detect which subarrays in the stripes
are part of a common larger array? Do I need to write my own C stub that
looks into the internas of the arrays to see if they share a common
ancestor?  I think that would be preferable to tracking the origin of
each subarray myself.


On a similar note how does Bigarray.Array1.blit handle arrays that are
part of the same larger array and overlap?

--
I'm missing functions like:

val super : ('a, 'b, 'c) t - ('a, 'b, 'c) t - ('a, 'b, 'c) t
Create a merged sub-array of 2 adjacent sub-arrays of the same
big array.

val same_parent : ('a, 'b, 'c) t - ('a, 'b, 'c) t - bool
True if the 2 (sub-)arrays are part of the same big array.

val offset : ('a, 'b, 'c) t - int
Offset of the sub-array in the underlying big array.

MfG
Goswin

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] Fwd: interval trees

2012-02-16 Thread Goswin von Brederlow
Francois Berenger beren...@riken.jp writes:

 Hello,

 Anyone can translate this into being tail recursive
 if it's possible:

 let rec interval_tree intervals =
   match intervals with
   [] - Empty
 | _ -
 let x_mid = median intervals in
 let left, mid, right = partition intervals x_mid in
 let left_list = L.sort leftmost_bound_first mid in
 let right_list = L.sort rightmost_bound_first mid in
 Node (x_mid,
   left_list, right_list,
   interval_tree left, interval_tree right)

 I'm afraid my program could crash on a very long list of intervals.

 Thanks a lot,
 F.

Each recursion halves the lists, right? That means you need a very very
very long list to exceed the recursion depth.

MfG
Goswin

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] Fwd: interval trees

2012-02-15 Thread Goswin von Brederlow
Francois Berenger beren...@riken.jp writes:

 Hello,

 I did a naive implementation of interval trees for float intervals.

 It is available here:
 https://github.com/HappyCrow/interval-tree

 I wonder if it is possible to construct the trees in a tail recursive
 fashion. Maybe I knew how to do this when I was still at university.

 Regards,
 Francois.

| Node of
   (* x_mid left_list right_list left_tree right_tree *)
  float * interval list * interval list * interval_tree * interval_tree

Why interval list? You only need a single interval in leafes and none in
other nodes (although it can help to store min and max in each node).

You are also missing insert and remove operations, which is the actually
hard part in this. Inserting an interval might require merging the
rightmost interval left of the root and the leftmost interval right of
the root. So you would have 2 removals and one insertion of a combined
interval, which complicates balancing the whole thing efficiently.

That is the part I'm struggling with.

MfG
Goswin

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] Hashtbl and security

2012-02-09 Thread Goswin von Brederlow
Gerd Stolpmann i...@gerd-stolpmann.de writes:

 Well, looks like that the upper 32 bits aren't considered at all for
 computing the hash value. This is for sure surprising - I would have
 expected that these bits somehow go in to the final value (e.g. that they
 are xor-ed with the lower half).

 My guess is that this was done to ensure that an int-to-something
 hashtable can be marshalled from 64 to 32 bit systems. Not sure for what
 this is important, but it looks like a nice symmetry.

Marshaling hashtables of ints might be an argument. BUT:

# Hashtbl.hash (1 lsr 40);;
- : int = 0
# Hashtbl.hash (Int64.of_int (1 lsr 40));;
- : int = 0
# Hashtbl.hash (Int64.of_int (1 lsr 40 + 1));;
- : int = 1

 In trunk the method for ensuring this seems to have changed (see function
 caml_hash_mix_intnat in
 http://caml.inria.fr/cgi-bin/viewvc.cgi/ocaml/trunk/byterun/hash.c?revision=12072view=markup).

 Gerd

Lets hope that fixes this.

MfG
Goswin

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] Hashtbl and security

2012-02-09 Thread Goswin von Brederlow
AUGER Cédric cedric.au...@lri.fr writes:

 More seriously, hashtables are here to store values, so you don't want
 want to have a hash value bigger than (2^30)-1.

 In fact even such a value is too big for that purpouse IMHO, since on a
 4Gio RAM computer, it will take half it to store the table, assuming
 each entry only takes 4 octets, that is it would be ok only for storing
 just the structure of the table (each entry has to be a pointer),
 so that in practice, either your table is ridiculously big to store
 too few values either it is relevant to have such a size and in this
 case your computer will spend all its time to swap since you won't
 have enough RAM.

But I have 128GB ram. :) That allows for a hashtable of size 2^33
entries pointing at simple values (e.g. ints) without swapping. And ram
sizes get bigger and bigger.

[Doesn't a Hashtbl of ints store the ints directly? That would allow
2^34 entries.]

  Clearly you cannot a bijection from 63 bits
 integers to 31 bits integers; so there are a lot of collisions. I do
 not have a 64 arch, but I guesse that if you hash 2^48+168 you will
 get 168. I guess such a function is ways to be ideal, since when
 playing with bitvectors having two integers which differs by only one
 bit is very common, but at least it is a fast function, and has a good
 repartition (you have 2^32 collisions exactly per bucket).

Ignoring the upper bits makes for a verry verry poor hash function. Also
means it is 100% trivial to create a DOS by tuneing your input values to
only differ in the upper bits.

MfG
Goswin

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] Hashtbl and security

2012-02-08 Thread Goswin von Brederlow
Gerd Stolpmann i...@gerd-stolpmann.de writes:

 Gerd Stolpmann i...@gerd-stolpmann.de writes:
 You are forgetting a variable in this. To create a DOS in the hashtable
 you need to hit the same bucket again and again. And bucket = hash % size.

 You focused on the hash but lets talk about the size for a moment.

 The size is rather limited and large hashtabels simply won't increate
 size anymore and allways have lots of collisions. So even if someone
 isn't actively trying to create DOS the performace still tanks as you
 add more items.

 I cannot completely follow here. If you add more elements, the bucket
 array is resized. The ocaml Hashtbl does this when the number of elements
 exceeds half of the array size. The argument of Hashtbl.create is only the
 initial size of the bucket array.

 The upper bound is Sys.max_array_length, of course. Granted, on 32 bit
 platforms it is low (4M). But nobody of sound mind would run ocaml
 programs on 32 bit for production purposes. (If you do, consider to
 switch. You can use more memory and get a little performance boost, not
 only for ocaml programs.)

 Gerd

In practice I see a serious performance degradation with large
hashtables. So much so that I had to use an array of hashtables or
hashtable of hashtables and split the values across them.

Maybe the size of the hashtable indeed isn't the problem but the hash
function. Here is something odd:

# for i = 1 to 63 do Printf.printf %d %d\n (1 lsl i) (Hashtbl.hash (1 lsl i)) 
done;;
2 2
4 4
8 8
16 16
32 32
64 64
128 128
256 256
512 512
1024 1024
2048 2048
4096 4096
8192 8192
16384 16384
32768 32768
65536 65536
131072 131072
262144 262144
524288 524288
1048576 1048576
2097152 2097152
4194304 4194304
8388608 8388608
16777216 16777216
33554432 33554432
67108864 67108864
134217728 134217728
268435456 268435456
536870912 536870912
1073741824 0
2147483648 0
4294967296 0
8589934592 0
17179869184 0
34359738368 0
68719476736 0
137438953472 0
274877906944 0
549755813888 0
1099511627776 0
219902322 0
4398046511104 0
8796093022208 0
17592186044416 0
35184372088832 0
70368744177664 0
140737488355328 0
281474976710656 0
562949953421312 0
1125899906842624 0
2251799813685248 0
4503599627370496 0
9007199254740992 0
18014398509481984 0
36028797018963968 0
72057594037927936 0
144115188075855872 0
288230376151711744 0
576460752303423488 0
1152921504606846976 0
2305843009213693952 0
-4611686018427387904 0
0 0
- : unit = ()

Is every value over a billion hashed to 0?

MfG
Goswin

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] Hashtbl and security

2012-01-30 Thread Goswin von Brederlow
Gerd Stolpmann i...@gerd-stolpmann.de writes:

 Hi,

 there was recently a security alert for web services that use hash
 tables to store web form parameters sent via POST (so that millions of
 such parameters can be sent in a single request). It is possible to keep
 the web service busy for hours with such a DoS (denial of service)
 attack. The type of attack boils down to a problem in most hash table
 implementations, namely that the hash functions are invertible, and it
 is possible for a malicious user to construct lots of keys that all map
 to the same bucket of the hash table, creating a mass collision.

 The text of the alert: 
 http://www.nruns.com/_downloads/advisory28122011.pdf

 I'd like to discuss this issue, because it is not restricted to the
 processing of web requests, but may also occur for all other data coming
 from untrusted sources. The web is only the most exposed area where this
 issue exists.

 So how is Ocaml affected? The hash functions used in recent Ocaml
 releases are also insecure in the above mentioned sense (currently
 MurmurHash3, and even a simpler hash function in previous releases). A
 quick survey of the Internet revealed at least one site that tries to
 break it. Probably a good cryptographer could do it in minutes. 

 A pure Hashtbl.add of the constructed keys does not yet lead to the
 performance degradation, but a Hashtbl.replace, and of course
 Hashtbl.find after the table is built up will. So it depends very much
 of the details of the programs whether they are affected or not.

 I've just checked that Ocamlnet uses only Hashtbl.add to collect POST
 parameters, so it is not directly vulnerable. But if the crafted request
 is actually served by a handler, the handler would get a degraded table,
 and could show in turn bad performance (again leading to DoS).

 What are possible fixes?

 1) Avoid hash tables in contexts where security is relevant. The
 alternative is Set (actually a balanced binary tree), which does not
 show this problem.

Unfortunately O(log n)  O(1) and that can be a deciding factor in the
overall runtime. Even for small n your code then runs 2,3,4,... times
slower.

 2) Use cryptographically secure hash functions.

 3) Use randomized hash tables. The trick here is that there is not a
 single hash function h anymore, but a family h(1)...h(n). When the hash
 table is created, one of the functions is picked randomly. This makes it
 impossible to craft an attack request, because you cannot predict the
 function.

 I don't think 1) is viable - hash tables are too wide spread, and are
 loved by programmers because of their good performance. 2) would be good
 in extremely critical contexts - although it is then again questionable,
 because it is likely to have worse performance than 1).

 So, the question is how to do 3). I see two problems here:

 a) how to define the family of hash functions. Is it e.g. sufficient to
 introduce an initialization vector for the Murmurhash algorithm, and
 fill it randomly? How to get a random number that is good enough?

 b) the Hashtbl in the standard library does not allow it to set the hash
 function dynamically. Maybe one can get this effect by using first-class
 modules (haven't checked).

 Anyway, I think these problems are difficult enough to deserve some
 discussion here. At least I cannot solve them immediately, and this
 problem may exist for lots of Ocaml applications.

 Gerd

You are forgetting a variable in this. To create a DOS in the hashtable
you need to hit the same bucket again and again. And bucket = hash % size.

You focused on the hash but lets talk about the size for a moment.

The size is rather limited and large hashtabels simply won't increate
size anymore and allways have lots of collisions. So even if someone
isn't actively trying to create DOS the performace still tanks as you
add more items.

And this isn't only a problem in ocaml. The glib hashtable also has a
maximum size in the 32bit range for example.


So for ocaml the first thing that needs to be fixed is to allow larger
size arrays of buckets.

MfG
Goswin

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] Hashtbl and security

2012-01-30 Thread Goswin von Brederlow
Xavier Leroy xavier.le...@inria.fr writes:

 On 01/01/2012 01:52 PM, Richard W.M. Jones wrote:
 On Fri, Dec 30, 2011 at 06:06:26PM +0100, Xavier Leroy wrote:
 Indeed.  The optional seed parameter to Hashtbl.create does exactly
 this in the new implementation of Hashtbl (the one based on Murmur3).
 
 It may be worth noting that Perl solved this problem (back in 2003) by
 unconditionally using a seed which is a global set to a random number
 during interpreter initialization.  

 That's how my initial reimplementation of Hashtbl worked, using the
 Random module to produce seeds, but I was told (correctly) that in
 security-sensitive applications it's better to leave the generation of
 random numbers under control of the programmer.  For some applications
 Random.self_init might be good enough and for others stronger
 randomness is needed.

 Of course, you can trivially emulate Perl's behavior using the new
 Hashtbl implementation + the Random module.

 - Xavier Leroy

It is also crucial if you are doing performance tests or debugging. You
want the same behaviour on every run for that.

MfG
Goswin

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] OCaml maintenance status / community fork (again)

2011-12-12 Thread Goswin von Brederlow
Mehdi Dogguy me...@dogguy.org writes:

 On 12/12/2011 11:59 AM, Benedikt Meurer wrote:
 
 On Dec 12, 2011, at 11:21 , Xavier Leroy wrote:
 
 - It complicates the lives of OCaml users, packagers, and
 3rd-party library developers: what version should they use?  what
 will be the basis for the packagers's distribution-specific
 patches?  what happens if a library is compatible with one version
 but not the other?  what if the community effort runs out of steam
 in a few years?
 
 If we can adopt the eglibc model, then the community thing will be 
 the version shipped by distributions, i.e. the community thing is the
 OCaml for distributions/packagers, not an alternative to the official
 version. That way we do no longer need to maintain specific patches /
 versions for Debian, Red Hat, MacPorts, etc. This ensures that
 versions are compatible across different distributions (because they
 do no longer need to maintain their own set of patches).
 

 No, distributions won't start shipping the community distribution just
 because it exists. I cannot speak for Fedora (and others), but
 Debian/Ubuntu won't switch to the community distribution that easily. In
 fact, it may appear as a seperate source package at some point but won't
 replace INRIA's ocaml in Debian.

That is probably what they said about eglibc too in the beginning. Now
eglibc is used in Debian/Ubuntu and is (or because it is) a great
success.

Obviously nobody is going to say: Yes, lets start this project, we will
ship it right away. It first has to proof itself. And first adopters
will be those that have the most problems with the standard ocaml.

MfG
Goswin

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] OCaml maintenance status / community fork (again)

2011-12-11 Thread Goswin von Brederlow
Xavier Leroy xavier.le...@inria.fr writes:

 On 12/08/2011 10:10 AM, Benedikt Meurer wrote:
 The relevant bug report PR/5404, which includes a backward
 compatible patch, is already waiting for a sign of life for 3 weeks
 now (maybe wait another 4 years to get the port fixed).

 More bile.  What's so urgent about it?  The next release of OCaml is 3-6
 months in the future; your suggestion will be examined by then.  You know,
 some people have busy professional lives. I can only devote 1 to 2 days per
 months to OCaml maintenance, and would prefer to spend them on technical
 work rather than on massaging egos or putting out the fires that you light
 up on this list.

I don't want to be aggressive or offensive and if it comes across as
that then please forgive me. No need to massage my ego also. I would
rather be told to piss off or that a patch will never be added because
it is stupid than have you play nice and have patches be left unatended
in mantis. Because how else will I ever learn?  That said ...


So maybe PR/5404 was a bad example. How about these?

[1] 1 month to acknowledge, 24 month silence since then
[2] 1 month to acknowledge, 24 month silence since then
[3] 11 month to acknowledge, 19 month silence since then
[4] 16 month to acknowledge, 18 month silence since then
[5] 15 month to acknowledge, 6 month silence since then

[1] http://caml.inria.fr/mantis/view.php?id=4919
[2] http://caml.inria.fr/mantis/view.php?id=4909
[3] http://caml.inria.fr/mantis/view.php?id=4812
[4] http://caml.inria.fr/mantis/view.php?id=4987
[5] http://caml.inria.fr/mantis/view.php?id=5005

All of these have patches (or just needs a function declaration copied to
the .h file). They don't need huge work consuming changes to the
compiler and they don't add anything fundamentally new to it. They just
improve/extend the existing stuff a little.

For those to lazy to follow the links here is one of the patches that
has been sitting there for 2 years now:

--
--- ocaml-3.11.0.orig/otherlibs/unix/unixsupport.h
+++ ocaml-3.11.0/otherlibs/unix/unixsupport.h
@@ -20,6 +20,7 @@
 #define Nothing ((value) 0)
 
 extern value unix_error_of_code (int errcode);
+extern int code_of_unix_error (value error);
 extern void unix_error (int errcode, char * cmdname, value arg) Noreturn;
 extern void uerror (char * cmdname, value arg) Noreturn;
 
--- ocaml-3.11.0.orig/otherlibs/unix/unixsupport.c
+++ ocaml-3.11.0/otherlibs/unix/unixsupport.c
@@ -263,6 +263,15 @@
   return err;
 }
 
+extern int code_of_unix_error (value error)
+{
+  if (Is_block(error)) {
+return Int_val(Field(error, 0));
+  } else {
+return error_table[Int_val(error)];
+  }
+}
+
 void unix_error(int errcode, char *cmdname, value cmdarg)
 {
   value res;
--

Adding this does not take a week of your summer vacation but still it
gets ignored.

Sure it is more fun to write code for first-class modules and the core
team has done a great job there. On the other hand trvial things are not
being done and that is verry frustrating to the submitter. Frankly I've
given up writing patches for ocaml because what is the point. They are
not getting looked at. Through things like that you are loosing a
workforce that could contribute at lot.


On the other hand if the model of glibc / eglibc for example where used
for ocaml the frustration would be reduced, the patches would get
audited by others in the community and added to the one community fork
before reaching the core team. You would get less patches, patches that
are better tested and cleaner. Bad patches would get filtered out
before they reach you.

This is not a condemnation of you or the core team. It is probably not
even something you can do anything about no matter how hard you try. It
is probably just a matter of scale. As a project growes at some point
there just seem to be a point where a single central core doesn't work
anymore. Things pile up, things fall through the cracks, priotization
leaves things starving. There plain aren't enough hours in the day to
cover every single bug or reply to every mail on the mainlinglist.

So let the community help and offload some work on them, like vetting a
simple patch like the above, so that you have more time to concentrate on the
difficult things like arm support or fist class modules (which I love by
the way) or to take a deserved vacation.

MfG
Goswin


-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] Storing ocaml values outside ocaml's heap

2011-12-08 Thread Goswin von Brederlow
William Le Ferrand william.le-ferr...@polytechnique.edu writes:

 Dear list, 

 We are building a cache in ocaml and we're wondering if it would make sense to
 store ocaml values outside the reach of the gc. (gc on a 20GB cache hangs the
 process for a second or so).

 To run some experiments, we wrote a small library (https://github.com/besport/
 ocaml-everlasting) that exposes two functions, get and set. 

 When inserting a value, we copy recursively the blocs outside of the reach of
 the gc (and put the resulting value in some C array). When getting the value,
 we simply pass the pointer to the copied value to the ocaml code (the 
 structure
 is still coherent and the value is directly usable). We also wrote an update
 function that compare a new value with the existing value in cache, to avoid
 unnecessary memory allocation/deallocation.

 It does not seems very stable though, but I don't know if it is a bug in the
 update function or simply because this approach is not reasonable. Do you have
 any thoughts? Is there any clever way to build a large cache in an ocaml app 
 ? 

 Thanks in advance for any tips!

 Best

 William

For a generic case you will have to inspect the values, tell the GC
about all the other ocaml values it points to and track any heap values
that have a reference to your external value so you don't delete it
before it is dead. In short you need to write a GC.

For special cases you can store completly self contained values, e.g. a
array of floats or a record of non pointer types. The simplest of types.


For a cache you often have millions of objects of the same type. And
often those contain pointers but do not share them. Then it becomes
feasable to copy everything they point to into the cache as well. Or
write a little C glue to store the data in an abstract or custom block
that does not contain pointers to outside the block. Often you can also
store the data much more compact than ocaml allows and thereby reduce
the memory footprint and increase cache efficiency.

MfG
Goswin

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] OCaml maintenance status / community fork

2011-12-07 Thread Goswin von Brederlow
Gerd Stolpmann i...@gerd-stolpmann.de writes:

 Hi all,

 I will not jump in the how to save OCaml from dying because nothing
 moves discussion. But just in the nothing moves discussion.

 On Tue, Dec 6, 2011 at 2:52 PM, ivan chollet ivan.chol...@gmail.com
 wrote:
 The current status of OCaml is more than stable enough to serve its
 goals,
 which are to teach computer science to french undergrads and provide a
 playground for computer languages researchers.

 First, french undergrads sadly often still use camllight... Which is
 not the case for example of Harvard undergrad
 (http://www.seas.harvard.edu/courses/cs51/lectures.html) and some
 UPenn one (http://www.seas.upenn.edu/~cis341/). But you are right that
 I can't find any well known university out of France using OCaml to
 teach computer science...

 Well, if you ask whether _any_ FP language is taught, the results won't be
 much better.

 I'm currently doing consulting for a web company (in Germany) - around 60
 developers, many fresh from the University. There are only three guys
 knowing FP languages at all - one Scala, one Erlang, and one R. It's a
 complete failure of the academic education.

 IMHO it does not matter which FP language you are taught in. The point is
 that the students understand the ideas, and that they recognize them as
 relevant. These web developers here in the company have no clue that they
 actually developing a big continuation-style FP program.

 Gerd

In Tuebingen we started with scheme in the first year and later there
were several classes on FP languages and a few using ocaml, for example
the compiler construction class. But the later classes are pickchoose,
you just need enough credits from each of the 3 major groups, so people
can probably completly avoid FP based curses.

Personally I think the introduction to computer languages class we had
is a must. How else do you even know what is out there and if you like
it?

MfG
Goswin

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] OCaml maintenance status / community fork

2011-12-07 Thread Goswin von Brederlow
oliver oli...@first.in-berlin.de writes:

 Hello,


 during the last years, more than one person mourned about
 this or that dark sides of OCaml.

 Even some of the mourning and the proposals had mentioned good ideas and had
 positive motivation, after a while it became clear, that the same people with
 the one or the other good idea, failed badly in other areas. Good, that they
 did not have had too much influence in the development of OCaml.

 Even in general I like the community/bazaar, I think in case of OCaml,
 there is a lot of high knowledge in the core team, which was criticized
 by others already, but in the long run, it turned out that the core team
 had their reasons for a lot of decisions, which were criticized.
 Ocaml of course will also have some history-related issues that might be
 changed, but maybe also a lot of decisions inside, which relies on theoretical
 reasoning.

That is no excuse for not reacting to bug or feature patches. If
there is a reason not to accept a patch then that can be communicated.

There is no excuse for silence.

MfG
Goswin

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] Some comments on recent discussions

2011-12-07 Thread Goswin von Brederlow
Ashish Agarwal agarwal1...@gmail.com writes:

 A standard library does not imply big or that it is part of the standard
 distribution. Both Batteries and Core would make fine standard libraries.
 Neither is very big and both are independent of the standard distribution. But
 having 5 different standard libraries is annoying precisely because the 
 library
 covers just basic concepts. We don't need 5 APIs for string functions. They 
 are
 all easy to learn independently but a pain if you have to know them all. A
 beginner won't bother taking the time to do a google search and find which one
 to use; before that they'll use another language.

We used to have 100 seperate little modules all over the place. Now we
have 2 contenders for a standard library. I think we are going the
right way there. Lets leave that allone for at least a year or two.

And we do (initially) need 5 APIs for string functions and we need to
test them all and decide which one fits best. And there can be more than
one that should be kept. E.g. one API with exceptions and one with
option types.  But then all modules in the standard library should
provide those APIs we want to keep consistently across all modules.

Lets agree that we eventually want one standard library.

Can we also agree that the standard library can be external to the
compiler, like Batteries or Core is?

 I also think the debate about github, ocamlforge, etc is fruitless. People 
 have
 different preferences and who knows what new tools will be developed a couple
 years from now. Instead of arguing about which one is best, we should accept
 that we will not agree on this. And why do we need to? We can aggregate the
 information from multiple sources and present it in a uniform way on a
 community website.

ACK. Totaly different fruit. Even talking about a standard library is
imho fruitless (at this point in time) and OT to the original problem.

MfG
Goswin

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] OCaml maintenance status / community fork

2011-12-06 Thread Goswin von Brederlow
Benedikt Meurer benedikt.meu...@googlemail.com writes:

 Dear caml-list,

 During the last year or two it seems that time and interest in OCaml 
 maintenance from the official OCaml development team is diminishing. It takes 
 several months to get a patch reviewed (if at all), which is quite 
 frustrating for OCaml contributors and even worse for OCaml users. I suspect 
 that this is one of the top reasons why there are only a few active 
 contributors to OCaml (and the number of active users, at least on the 
 mailing list, is declining).

 I understand that INRIA does not necessarily pay people for full time 
 maintenance jobs on OCaml (and Coq), and the official dev team is probably 
 already doing as much as possible to maintain OCaml. Given that OCaml is such 
 a nice language with a lot of useful frameworks available, it is too sad to 
 see it loosing ground just because of it's closed development process and 
 lack of time of the official team.

 I'd therefore propose to open up OCaml development to a wider range of 
 developers / contributors, to ensure that OCaml will be ready for the 
 (functional programming) future. There are already various OCaml forks in 
 the wild, with different goals and patch sets, so simply starting another 
 fork would be rather useless. Instead I'd suggest to bundle efforts in a new 
 OCaml community fork, which is always based on the most recent upstream 
 OCaml release (starting point would be 3.12.1 for now), and takes care to 
 review and integrate pending patches as well as developing and testing new 
 features. Let's say we'd name the fork OCaml-ng, then we'd try to release a 
 new patch set every month or two, based on the official OCaml release, i.e. 
 ocaml-3.12.1+ng201112 and so on, to get early testing and feedback (should 
 work together closely with the Debian/Ubuntu/etc. OCaml maintainers).

 With this process, OCaml upstream could merge (tested) patches from OCaml-ng 
 once they proved working in the wild, and thereby

 1. maintenance overhead for INRIA people is reduced,
 2. maintenance status of OCaml would be way better,
 3. there would be a lot less frustration for possible contributors, and
 4. users benefit from a better and more up to date OCaml.

 Now that does of course raise a few questions:

 1. What is the opinion of the official development team / INRIA on this?
 2. Who would help with the community fork?
 3. What about infrastructure?

 Feedback and suggestions are welcome.

 Benedikt

+1 for getting patches better/faster reviewd and included.

I'm still waiting to hear back for my Bigarray patch to support 31bit
integers.

MfG
Goswin

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] PlasmaFS, PlasmaKV, and MapReduce, version 0.5.1

2011-11-25 Thread Goswin von Brederlow
Gerd Stolpmann i...@gerd-stolpmann.de writes:

 Hi,

 I've just released Plasma-0.5.1, fixing a possible lock-up.

 General information about Plasma:

 Plasma consists now of three parts, namely PlasmaFS, PlasmaKV, and Plasma
 Map/Reduce:

   * PlasmaFS is a distributed replicating filesystem. Unlike other
 such filesystems, it is transactional and exhibits transactions
 to the user. Also, it implements almost all of what is known as
 POSIX semantics, and it is mountable.
   * PlasmaKV is a key/value database on top of PlasmaFS. It is
 designed for ultra-high read workloads, and offers interesting
 properties borrowed from PlasmaFS (e.g. replication and ACID
 transactions).
   * Plasma Map/reduce implements a variant of the popular
 data processing scheme.

 All pieces of software are bundled together in one download. The
 project page with further links is

 http://projects.camlcity.org/projects/plasma.html

 There is now also a homepage at

 http://plasma.camlcity.org

 THIS IS NOW A BETA RELEASE! I'm searching for testers. Whoever has
 access to a cluster please check Plasma out!

 Plasma is installable via GODI for Ocaml 3.12.

 For discussions on specifics of Plasma there is a separate mailing list:

 https://godirepo.camlcity.org/mailman/listinfo/plasma-list

 Gerd

Any plans to support fuse so the filesystem can be mounted locally?

MfG
Goswin

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] Odd failure to infer types

2011-09-17 Thread Goswin von Brederlow
Gabriel Scherer gabriel.sche...@gmail.com writes:

 Indeed, a let-bound [] is generalized to a polymorphic ('a list), while a
 let-bound (ref []), or (Array.make n []), is not generalized. It would be
 unsound to do so as if you had a reference to a polymorphic ('a list) you 
 could
 add elements of different types in the list.

 When a type variable is not generalized (polymorphic), it stays an unknown 
 of
 the type system until it is unified to a known type:

 # let r = ref [];;
 val r : '_a list ref = {contents = []}
 # r := 1 :: !r;;
 - : unit = ()
 # r;;
 - : int list ref = {contents = [1]}

 '_a is sometimes called a weakly polymorphic variable, in fact it is not
 polymorphic at all. It is just an unknown.

 In the code example above, the real type for '_a could be determined after two
 phrases in the top level. It may also happen as a compilation unit (if the 
 code
 was written in a file instead of fed phrase-per-phrase to the toplevel). We
 wouldn't observe the intermediary '_a inferred type, as the type are inferred
 globally.

I think you are missing the point.

I totaly get why it must be '_a instead if 'a. My question is why it
doesn't infer the full type from the call to the print function.

In my case the type inferences outputs something like this (using the
above example):

 # let r = ref [];;
 val r : '_a list ref = {contents = []}
 # r := 1 :: !r;;
 - : unit = ()
 # r;;
 - : '_a list ref = {contents = [1]}

I see my print call as equivalent to '1 :: !r'.

MfG
Goswin

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs