> * Richard L. Barnes wrote:
> >If frames are valid utf-8, then you don't need to keep any state (on
> >either end of the connection).
> 
> That's somewhat misleading. If you accept for instance both binary and text
> frames, you have to maintain the type of the first frame so you can reject
> continuation frames of the wrong type. You may also have to maintain state
> coming from extension data or your protocol may require other state to be
> maintained. A minimal byte-oriented UTF-8 validator has nine states
> http://bjoern.hoehrmann.de/utf-8/decoder/dfa/ and that seems less

very cool. thanks!

> onerous to maintain than requiring fragmenting senders and forwarders to find
> where character and byte boundaries coincide (you need to have the
> encoded text available and resynchronize to character boundaries, or have
> the character stream available and analyze the encoded widths, or you have
> to include padding within the text, before you can start sending).

absolutely.
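
To make the point concrete, here is a rough sketch of a receiver that
carries UTF-8 validation state across fragment boundaries, so senders can
split a text message at any byte. This is not Bjoern's table-driven DFA;
it is a hand-written equivalent with nine states, and the state names and
the step/feed helpers are made up for illustration.

```python
# Nine hypothetical states; any nine-state machine fits in 4 bits.
ACCEPT, REJECT, CONT1, CONT2, CONT3, E0, ED, F0, F4 = range(9)

# For each mid-sequence state: allowed range of the next byte, and next state.
_NEXT = {
    CONT1: (0x80, 0xBF, ACCEPT),  # one continuation byte left
    CONT2: (0x80, 0xBF, CONT1),   # two continuation bytes left
    CONT3: (0x80, 0xBF, CONT2),   # three continuation bytes left
    E0: (0xA0, 0xBF, CONT1),      # after E0: excludes overlong 3-byte forms
    ED: (0x80, 0x9F, CONT1),      # after ED: excludes UTF-16 surrogates
    F0: (0x90, 0xBF, CONT2),      # after F0: excludes overlong 4-byte forms
    F4: (0x80, 0x8F, CONT2),      # after F4: excludes values above U+10FFFF
}

def step(state, byte):
    """Advance the validator by one byte and return the new state."""
    if state == REJECT:
        return REJECT
    if state == ACCEPT:
        if byte <= 0x7F:
            return ACCEPT
        if 0xC2 <= byte <= 0xDF:
            return CONT1
        if byte == 0xE0:
            return E0
        if byte == 0xED:
            return ED
        if 0xE1 <= byte <= 0xEF:   # E1..EC and EE..EF; ED was handled above
            return CONT2
        if byte == 0xF0:
            return F0
        if 0xF1 <= byte <= 0xF3:
            return CONT3
        if byte == 0xF4:
            return F4
        return REJECT              # 80..C1 and F5..FF never start a sequence
    lo, hi, nxt = _NEXT[state]
    return nxt if lo <= byte <= hi else REJECT

def feed(state, fragment):
    """Run one fragment's payload through the validator."""
    for byte in fragment:
        state = step(state, byte)
    return state
```

A fragment may end in the middle of a character: feed() then returns one
of the mid-sequence states, and the next fragment simply resumes from it.
The message as a whole is valid only if the state after the final
fragment is ACCEPT.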

In total:

3 bits: opcode of first frame
1 bit: continuation state
4 bits: UTF-8 DFA state
============

1 octet state
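
As a sketch of how that tally could be stored, the three fields pack into
a single byte as below. The bit order (opcode in the high bits, DFA state
in the low nibble) and the pack_state/unpack_state names are my own
assumptions; nothing in the draft prescribes a layout for receiver state.

```python
def pack_state(opcode, in_continuation, utf8_state):
    """Pack per-connection receive state into one octet (assumed layout).

    bits 7..5: opcode of the first frame (3 bits)
    bit  4   : continuation state (1 bit)
    bits 3..0: UTF-8 DFA state (4 bits)
    """
    if not 0 <= opcode <= 0x7:
        raise ValueError("opcode must fit in 3 bits")
    if not 0 <= utf8_state <= 0xF:
        raise ValueError("UTF-8 DFA state must fit in 4 bits")
    return (opcode << 5) | (int(bool(in_continuation)) << 4) | utf8_state

def unpack_state(octet):
    """Inverse of pack_state: return (opcode, in_continuation, utf8_state)."""
    if not 0 <= octet <= 0xFF:
        raise ValueError("state must be a single octet")
    return (octet >> 5) & 0x7, bool((octet >> 4) & 0x1), octet & 0xF
```

Round-tripping any in-range triple through pack_state and unpack_state
gives the same triple back, and the packed value never exceeds 0xFF.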

 
_______________________________________________
Gen-art mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/gen-art
