> * Richard L. Barnes wrote:
> >If frames are valid utf-8, then you don't need to keep any state (on
> >either end of the connection).
>
> That's somewhat misleading. If you accept for instance both binary and text
> frames, you have to maintain the type of the first frame so you can reject
> continuation frames of the wrong type. You may also have to maintain state
> coming from extension data or your protocol may require other state to be
> maintained. A minimal byte-oriented UTF-8 validator has nine states
> http://bjoern.hoehrmann.de/utf-8/decoder/dfa/ and that seems less

very cool. thanks!

> onerous to maintain than having fragmenting senders and forwarders to find
> where character and byte boundaries coincide (you need to have the
> encoded text available and re-synchronize to character boundaries, or have
> the character stream available and analyze the encoded widths, or you have
> to include padding within the text, to start sending).

absolutely. In total:

  3 bits: opcode of first frame
  1 bit:  continuation state
  4 bits: UTF-8 DFA state
  ============
  1 octet state

_______________________________________________
Gen-art mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/gen-art
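
For illustration, the one-octet accounting above (3 + 1 + 4 bits) could be packed roughly like this. This is only a sketch: the bit order and the names pack_state and unpack_state are my own assumptions, not taken from the thread or from any draft.

```python
# Hypothetical layout (an assumption, not specified in the thread):
#   bits 7-5: opcode of the first frame (3 bits)
#   bit  4  : continuation state (1 bit)
#   bits 3-0: UTF-8 DFA state (4 bits, enough for nine states)

def pack_state(opcode: int, continuing: bool, dfa_state: int) -> int:
    """Pack the three per-connection fields into a single octet."""
    if not 0 <= opcode < 8:
        raise ValueError("opcode must fit in 3 bits")
    if not 0 <= dfa_state < 16:
        raise ValueError("DFA state must fit in 4 bits")
    return (opcode << 5) | (int(continuing) << 4) | dfa_state


def unpack_state(octet: int) -> tuple:
    """Recover (opcode, continuing, dfa_state) from a packed octet."""
    return (octet >> 5) & 0x7, bool((octet >> 4) & 0x1), octet & 0xF
```

A receiver would presumably unpack the octet when a frame arrives, feed the payload bytes through the validator starting from the stored DFA state, and pack the result again before waiting for the next fragment.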
