Re: [containers-users] Possible additions to Containers and Friends

2018-03-06 Thread SP
Speaking-too-soon is a valid and powerful code verification technique; it exploits tempting the bugs to make their move. -- SP

Re: [containers-users] Possible additions to Containers and Friends

2018-03-06 Thread Simon Cruanes
Of course I spoke too soon, and missed some validation cases (that would have been accepted by Peter's code). In particular, I just learnt about some interesting corner cases of UTF8, namely overlong encodings. If anyone is knowledgeable about UTF8, reviewing the code would be greatly appreciated!
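For readers unfamiliar with overlong encodings, here is a minimal sketch of the idea. It is not the code from CCUtf8_string; the names min_code_for_len, is_overlong and decode2_unchecked are hypothetical and only illustrate the check. A sequence is overlong when the code point it decodes to would have fit in fewer bytes.

```ocaml
(* Hypothetical helper, not taken from CCUtf8_string: the smallest code
   point that legitimately needs [len] bytes in UTF-8. *)
let min_code_for_len = function
  | 1 -> 0x0
  | 2 -> 0x80
  | 3 -> 0x800
  | 4 -> 0x10000
  | _ -> invalid_arg "min_code_for_len"

(* A decoded value is overlong if a shorter sequence could have encoded it. *)
let is_overlong ~len code = code < min_code_for_len len

(* Decode a two-byte sequence with no validation at all, to show the problem:
   5 payload bits from the first byte, 6 from the continuation byte. *)
let decode2_unchecked b0 b1 = ((b0 land 0x1f) lsl 6) lor (b1 land 0x3f)

(* The bytes 0xC0 0x80 decode to 0, an overlong spelling of NUL. *)
let overlong_nul = decode2_unchecked 0xC0 0x80

(* The bytes 0xC3 0xA9 decode to U+00E9, which really needs two bytes. *)
let e_acute = decode2_unchecked 0xC3 0xA9
```

With these definitions, is_overlong ~len:2 overlong_nul is true and is_overlong ~len:2 e_acute is false, so a strict validator would reject the first sequence and accept the second.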

Re: [containers-users] Possible additions to Containers and Friends

2018-03-06 Thread Simon Cruanes
I merged and adapted the code from Peter: https://github.com/c-cube/ocaml-containers/blob/master/src/core/CCUtf8_string.mli https://github.com/c-cube/ocaml-containers/blob/master/src/core/CCUtf8_string.ml It's stricter (only accepts valid UTF8) and the random tests should ensure that it agrees

Re: [containers-users] Possible additions to Containers and Friends

2018-03-01 Thread peter frey
I'm not sure I understand, what is the point of supporting "more" than utf8? In the original utf8 standard the encoding is: The code is encoded as a string of length 1 + additional length. The additional length is a 0-ary encoding of the length '10' to '110'  (i.e.: 1.. 6) The first char
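As I read the description above, the "additional length" can be recovered from the leading byte alone. The sketch below assumes the original scheme with sequences of up to 6 bytes; the function name additional_length is hypothetical and not from Peter's code. Current UTF-8 (RFC 3629) restricts sequences to at most 4 bytes, so a strict decoder would also reject the last two cases.

```ocaml
(* Hypothetical helper: number of continuation bytes announced by a
   leading byte under the original (up to 6-byte) scheme.
   Returns None for a continuation byte (10xxxxxx) or for 0xFE / 0xFF,
   which never start a sequence. *)
let additional_length (b : int) : int option =
  if b land 0x80 = 0x00 then Some 0        (* 0xxxxxxx *)
  else if b land 0xE0 = 0xC0 then Some 1   (* 110xxxxx *)
  else if b land 0xF0 = 0xE0 then Some 2   (* 1110xxxx *)
  else if b land 0xF8 = 0xF0 then Some 3   (* 11110xxx *)
  else if b land 0xFC = 0xF8 then Some 4   (* 111110xx, not valid today *)
  else if b land 0xFE = 0xFC then Some 5   (* 1111110x, not valid today *)
  else None
```

The total length of an encoded character is then 1 plus this value.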

Re: [containers-users] Possible additions to Containers and Friends

2018-02-26 Thread peter frey
Simon occasionally includes code from some other part of the libraries to avoid requiring, say, Gen to access Sequence or Containers; I don't remember offhand. In the case of some tiny piece of code that's sensible. (And so far that is all I have provided) Pervasives now has a type uchar

Re: [containers-users] Possible additions to Containers and Friends

2018-02-25 Thread Simon Cruanes
Well, there's the standard uchar type, I think compatibility is achievable :)
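To make the compatibility point concrete, here is a small sketch that goes through the standard Uchar.t type. It assumes OCaml 4.06 or later (for Buffer.add_utf_8_uchar); the function name utf8_of_code is hypothetical and is not part of containers.

```ocaml
(* Hypothetical helper: encode an integer code point as a UTF-8 string,
   going through the standard Uchar.t type. Returns None when the integer
   is not a Unicode scalar value (out of range, or a surrogate). *)
let utf8_of_code (code : int) : string option =
  if Uchar.is_valid code then begin
    let buf = Buffer.create 4 in
    Buffer.add_utf_8_uchar buf (Uchar.of_int code);
    Some (Buffer.contents buf)
  end else None
```

Any library that exposes or accepts Uchar.t can interoperate at this level without agreeing on a string representation.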

Re: [containers-users] Possible additions to Containers and Friends

2018-02-24 Thread Simon Cruanes
On Sat, 24 Feb 2018, Drup wrote: > Shouldn't we just standardize on bunzli's libraries (including the new > https://github.com/dbuenzli/utext) instead of trying to re-write code that > usually ends up being quite subtle in each standard library? We could build on uutf, it's relatively small and

Re: [containers-users] Possible additions to Containers and Friends

2018-02-22 Thread SP
> Thanks for the suggestions. I'm no expert in unicode, but I do agree > that such basic functionalities should be more easily available. > Maybe a `Ustring` module in containers would make sense (as a private > alias to `string`); most functionalities below would fit there Is this for

[containers-users] Possible additions to Containers and Friends

2018-02-10 Thread peter frey
(* Reading recent posts on discuss.ocaml.org gives me the impression that some tiny number of UTF-related routines should be more easily available. Containers' Sequence.t and Gen.t, in particular, could benefit from a couple of simple routines. The code below fits well into that framework. I