On Mon, Jan 02, 2017 at 11:42:30AM -0800, David Fotland wrote:

> I think the character set property just refers to the contents
> of comments and similar fields. The sgf format itself is entirely
> in the common characters in UTF-8 and US-ASCII.
> There is no need to assume a character set before the property.
> If you find the character set property in the root node,
> it should apply to a root comment, even if it comes earlier
> in the properties in the root node.

In order to recognize the end of a comment C[text], one has
to recognize the closing ]. If the character set is multibyte,
it may have characters whose second byte is 0x5D, the ASCII code for ].
(For example, the word Honinbo, spelled in Big5, contains a byte ']'.)
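
To make the Big5 example concrete, here is a small sketch in Python. It assumes
that Honinbo is written 本因坊 and that Python's built-in "big5" codec matches the
Big5 variant used in the file; the helper name naive_comment_end is made up
for this illustration.

```python
# Assumption: "Honinbo" is spelled 本因坊 and encoded with Python's "big5" codec.
encoded = "本因坊".encode("big5")

def naive_comment_end(data: bytes) -> int:
    """Index of the first unescaped ']' byte, scanning bytes with no
    knowledge of the character set (hypothetical helper)."""
    i = 0
    while i < len(data):
        if data[i] == 0x5C:      # backslash: skip the escaped byte
            i += 2
            continue
        if data[i] == 0x5D:      # ']' as a raw byte
            return i
        i += 1
    return -1

# The bytes that would follow "C[" in the file, including the real closing ].
prop_value = encoded + b"]"

has_bracket = 0x5D in encoded            # a ']' byte inside the word itself
naive_end = naive_comment_end(prop_value)
true_end = len(prop_value) - 1           # position of the real closing ]
```

With these assumptions the byte-wise scan stops inside the second character,
before the real end of the comment.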

If one escapes such bytes in the middle of a character (by inserting a
backslash before them), the resulting file is no longer a valid text file
in that character set, and corruption is the result.

If one does not escape bytes in the middle of a character,
one needs to be able to recognize the characters, i.e., know
the character set. That is why the CA[] property needs to come
before any non-ASCII text.
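
A minimal sketch of the second approach, again in Python: once the character
set is known (from CA[]), decode first and only then look for the closing ]
among characters rather than bytes. The function name read_comment is
hypothetical, and the escape handling is simplified (a backslash just keeps
the next character; soft line breaks are not treated).

```python
def read_comment(data: bytes, charset: str) -> str:
    """Return the property value up to the first unescaped ']',
    scanning decoded characters instead of raw bytes (hypothetical helper)."""
    text = data.decode(charset)
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            out.append(text[i + 1])   # simplified: keep the escaped character
            i += 2
        elif ch == "]":
            return "".join(out)
        else:
            out.append(ch)
            i += 1
    raise ValueError("unterminated property value")

# Bytes that would follow "C[" in a Big5 file (assumed spelling 本因坊).
raw = "本因坊".encode("big5") + b"]"

with_charset = read_comment(raw, "big5")      # charset known in advance
wrong_charset = read_comment(raw, "latin-1")  # charset guessed wrongly
```

With the right character set the whole word comes back; with a single-byte
guess the value is cut short at the ] byte inside the second character.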

Andries

[Let me also bcc you directly - my previous reply did not make it to the list 
yet.]