Re: Clean-room Engine.IO implementation committed to git 8.0/8.1

2016-11-23 Thread Marcus Comstedt (ACROSS) (Hail Ilpalazzo!) @ Pike (-) developers forum
The simplest runtime implementation would be a non-zero subtype on the string, to mark that it is binary data and not Unicode text. (Although that might make the stringp operator a bit ambiguous.) The main benefits are in static typechecking, making sure you don't send unencoded text to I/O functions

Re: Clean-room Engine.IO implementation committed to git 8.0/8.1

2016-11-23 Thread Mirar @ Pike developers forum
I'm not sure I follow. Which problem should this solve, a mark in the string struct saying what type of data the string contains?

Re: Clean-room Engine.IO implementation committed to git 8.0/8.1

2016-11-23 Thread Chris Angelico
On Thu, Nov 24, 2016 at 12:40 AM, Marcus Comstedt (ACROSS) (Hail Ilpalazzo!) @ Pike (-) developers forum <10...@lyskom.lysator.liu.se> wrote: >>In Python, it's done with a prefix - u"asdf" is a Unicode string, and >>b"asdf" is a byte string. > > Since nominally strings are Unicode (with the extende

Re: Clean-room Engine.IO implementation committed to git 8.0/8.1

2016-11-23 Thread Marcus Comstedt (ACROSS) (Hail Ilpalazzo!) @ Pike (-) developers forum
Yes, s will be Unicode. Of course, you need to declare the character encoding of your source file using a #charset tag (or use a BOM to indicate UTF encoding).

Re: Clean-room Engine.IO implementation committed to git 8.0/8.1

2016-11-23 Thread Mirar @ Pike developers forum
I think it would be a good idea as well, see 21907878. The only thing that should have to care about the encoding should be the endpoints. How are string constants handled today? If I do string s = "räksmörgås"; am I guaranteed a certain encoding of s?

Re: Clean-room Engine.IO implementation committed to git 8.0/8.1

2016-11-23 Thread Marcus Comstedt (ACROSS) (Hail Ilpalazzo!) @ Pike (-) developers forum
Well, I'm not sure that's actually abusing it; Stdio.Buffer is a sort of compromise for getting some of the benefits of a native buffer type while not getting all of the problems (it does not affect compatibility as it uses a separate set of APIs, and while that does lead to inconsistency it's not

Re: Clean-room Engine.IO implementation committed to git 8.0/8.1

2016-11-23 Thread Stephen R. van den Berg
Marcus Comstedt (ACROSS) (Hail Ilpalazzo!) @ Pike (-) developers forum wrote: >can also look at Java, which has byte[] as the type for byte strings, >requiring literals like {'a','s','d','f'}, but I would like to see In the EngineIO implementation I currently abuse Stdio.Buffer to fulfill this bin

Re: Clean-room Engine.IO implementation committed to git 8.0/8.1

2016-11-23 Thread Marcus Comstedt (ACROSS) (Hail Ilpalazzo!) @ Pike (-) developers forum
Yup, the thing we were discussing was how it would be nice to actually be able to declare when they contain something else. :-) But it is a valid point that binary encoded data is not necessarily 8-bit. You should definitely be allowed to declare something as buffer(12bit) if you want to store 1

Re: Clean-room Engine.IO implementation committed to git 8.0/8.1

2016-11-23 Thread Mirar @ Pike developers forum
>It's valid Pike. Pike supports the full ISO/IEC 10646 31-bit range, >plus an equally large negative range. Also note that Pike strings don't necessarily contain Unicode, even if they usually do. They _could_ just as well contain RGB pixels or random memory access data from a 12-bit-word syst

Re: Clean-room Engine.IO implementation committed to git 8.0/8.1

2016-11-23 Thread Marcus Comstedt (ACROSS) (Hail Ilpalazzo!) @ Pike (-) developers forum
>Right, and that's something that can't be done in the current >standard. Hence this entire proposal has to wait until some major >changes can be done. Yup. And then those changes should not be a repurposing of an existing mechanism (element ranges on the string type) but something more appropria

Re: Clean-room Engine.IO implementation committed to git 8.0/8.1

2016-11-23 Thread Chris Angelico
On Thu, Nov 24, 2016 at 12:20 AM, Marcus Comstedt (ACROSS) (Hail Ilpalazzo!) @ Pike (-) developers forum <10...@lyskom.lysator.liu.se> wrote: >>\U12345678 possibly should be an error, as it's not valid Unicode. > > It's valid Pike. Pike supports the full ISO/IEC 10646 31-bit range, > plus an equal

Re: Clean-room Engine.IO implementation committed to git 8.0/8.1

2016-11-23 Thread Marcus Comstedt (ACROSS) (Hail Ilpalazzo!) @ Pike (-) developers forum
>By "binary data", I mean eight-bit strings of arbitrary bytes - like >you'd read from a file or something. Currently, functions like >Stdio.read_file simply return "string", but they'll effectively be >returning string(8bit). No, Stdio.read_file currently returns string(8bit). That simply means

Re: Clean-room Engine.IO implementation committed to git 8.0/8.1

2016-11-23 Thread Chris Angelico
On Wed, Nov 23, 2016 at 11:10 PM, Marcus Comstedt (ACROSS) (Hail Ilpalazzo!) @ Pike (-) developers forum <10...@lyskom.lysator.liu.se> wrote: >>I agree, but using string(8bit) to mean "binary data" is something >>that's 100% backward compatible. > > It would not be backwards compatible, since that

Re: Clean-room Engine.IO implementation committed to git 8.0/8.1

2016-11-23 Thread Mirar @ Pike developers forum
Strings with a known encoding that can be converted into other strings with a known encoding easily and readably (and in some cases without any interaction) would be useful. For instance, Stdio.FILE x = ...; x->set_encoding("utf8"); string s = "räksmörgås"; String t = String.JP2022("\33(BHello, world!

Re: Clean-room Engine.IO implementation committed to git 8.0/8.1

2016-11-23 Thread Marcus Comstedt (ACROSS) (Hail Ilpalazzo!) @ Pike (-) developers forum
>I agree, but using string(8bit) to mean "binary data" is something >that's 100% backward compatible. It would not be backwards compatible, since that is not what string(8bit) means today. >Unicode text would always be referred >to as string(21bit), even if it happens to contain nothing but Latin

Re: Clean-room Engine.IO implementation committed to git 8.0/8.1

2016-11-23 Thread Chris Angelico
On Wed, Nov 23, 2016 at 10:30 PM, Marcus Comstedt (ACROSS) (Hail Ilpalazzo!) @ Pike (-) developers forum <10...@lyskom.lysator.liu.se> wrote: > I think you are conflating range with interpretation. Both a > Latin1 string and a UTF-8 encoded one are 8-bit strings (with a 0-255 > range). What w

Re: Clean-room Engine.IO implementation committed to git 8.0/8.1

2016-11-23 Thread Marcus Comstedt (ACROSS) (Hail Ilpalazzo!) @ Pike (-) developers forum
Even if it hadn't been, fixing that would have been the correct course of action. ;-)

Re: Clean-room Engine.IO implementation committed to git 8.0/8.1

2016-11-23 Thread Marcus Comstedt (ACROSS) (Hail Ilpalazzo!) @ Pike (-) developers forum
I think you are conflating range with interpretation. Both a Latin1 string and a UTF-8 encoded one are 8-bit strings (with a 0-255 range). What would be useful is a datatype that declares that the elements are not Unicode characters (as they are in the Latin1 string case) but some raw binary

Re: Clean-room Engine.IO implementation committed to git 8.0/8.1

2016-11-23 Thread Stephen R. van den Berg
Marcus Comstedt (ACROSS) (Hail Ilpalazzo!) @ Pike (-) developers forum wrote: >If there are no character values >127, then the encoding step is a >no-op, so skipping it buys you nothing except making your code harder >to read. I see. I should have guessed that string_to_utf8() is already smart en

Re: Clean-room Engine.IO implementation committed to git 8.0/8.1

2016-11-23 Thread Chris Angelico
On Wed, Nov 23, 2016 at 10:00 PM, Marcus Comstedt (ACROSS) (Hail Ilpalazzo!) @ Pike (-) developers forum <10...@lyskom.lysator.liu.se> wrote: > If there are no character values >127, then the encoding step is a > no-op, so skipping it buys you nothing except making your code harder > to read. I en

Re: Clean-room Engine.IO implementation committed to git 8.0/8.1

2016-11-23 Thread Marcus Comstedt (ACROSS) (Hail Ilpalazzo!) @ Pike (-) developers forum
If there are no character values >127, then the encoding step is a no-op, so skipping it buys you nothing except making your code harder to read.
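A minimal Pike sketch of this point, assuming only the built-in string_to_utf8(); the sample string is made up for illustration and is not taken from the Engine.IO code:

```pike
int main()
{
  // Illustrative sample: every character value is below 128.
  string ascii = "Hello, world!";

  // For pure 7-bit text the UTF-8 encoding is the same sequence of
  // values, so encoding unconditionally changes nothing here.
  string encoded = string_to_utf8(ascii);
  write("unchanged after encoding: %d\n", encoded == ascii);
  return 0;
}
```

Under that assumption, the unconditional call can be left in place for readability without altering 7-bit input.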

Re: Clean-room Engine.IO implementation committed to git 8.0/8.1

2016-11-23 Thread Stephen R. van den Berg
Martin Nilsson (Coppermist) @ Pike (-) developers forum wrote: >>Please review, any comments are welcome. >This looks wrong: > if (String.width(msg) > 8) >msg = string_to_utf8(msg); >You are always utf8-decoding the string, so you should always >utf8-encode them. Well spott

Clean-room Engine.IO implementation committed to git 8.0/8.1

2016-11-22 Thread Martin Nilsson (Coppermist) @ Pike (-) developers forum
>Please review, any comments are welcome.

This looks wrong:

    if (String.width(msg) > 8)
      msg = string_to_utf8(msg);

You are always utf8-decoding the string, so you should always utf8-encode it.
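One way to read this objection, sketched in Pike. The helper names encode_always() and encode_if_wide() are hypothetical and not part of the committed module, and the sample string is made up. The sketch assumes the receiving side always calls utf8_to_string(), as the review comment states:

```pike
// Hypothetical helper: encode every outgoing message.
string encode_always(string msg)
{
  return string_to_utf8(msg);
}

// Hypothetical helper mirroring the pattern under review: strings
// whose widest character fits in 8 bits are passed through as-is.
string encode_if_wide(string msg)
{
  if (String.width(msg) > 8)
    msg = string_to_utf8(msg);
  return msg;
}

int main()
{
  // Latin-1 range characters: String.width() is 8, yet some values
  // are above 127, so the UTF-8 form differs from the string itself.
  string msg = "r\u00e4ksm\u00f6rg\u00e5s";

  // Unconditional encoding round-trips through the decode step.
  write("round trip ok: %d\n",
        utf8_to_string(encode_always(msg)) == msg);

  // The conditional version hands over raw Latin-1 values; decoding
  // them as UTF-8 is expected to throw, which catch reports here.
  mixed err = catch { utf8_to_string(encode_if_wide(msg)); };
  write("decode failed: %d\n", !!err);
  return 0;
}
```

The width test separates 8-bit strings from wider ones, not 7-bit text from everything else, which is why strings in the 128-255 range slip through unencoded.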

Re: Clean-room Engine.IO implementation committed to git 8.0/8.1

2016-09-20 Thread Stephen R. van den Berg
Martin Karlgren wrote: >I guess the API user could keep track of sid:s and Server objects separately, > if they don't want a .farm? It was/is not intended to be used like that, though the only dependency here is that in the Server object there are exactly three references to the global clients l

Re: Clean-room Engine.IO implementation committed to git 8.0/8.1

2016-09-20 Thread Martin Karlgren
Very nice, good work! I guess the API user could keep track of sid:s and Server objects separately, if they don’t want a .farm? I'd imagine that the globally shared sid lookup mapping might be regarded as a security issue in more complex setups, such as multiple listener ports with different us

Re: Clean-room Engine.IO implementation committed to git 8.0/8.1

2016-09-20 Thread Stephen R. van den Berg
For the record, I'd like to mention that using just the "specs" at https://github.com/socketio/engine.io-protocol results in an incorrect implementation. I tried porting the JavaScript implementation at first: it resulted in a mess of event-hell. So, I finally used the specs first, then some good-

Clean-room Engine.IO implementation committed to git 8.0/8.1

2016-09-20 Thread Stephen R. van den Berg
Please review, any comments are welcome. The docs still need improvement, working on that. Currently tackling Socket.IO. -- Stephen.