LGTM.
There is a tricky problem here, probably deserving a comment in the SOYC
code. Ideally, the XML file should be non-lossy, and the original
string text should be recoverable. The best approach I have come across
would be to convert the string data back into string
literal
Personally, I would just transform every character ==0 or 127 into a \x
or \u escape (or since this is XML you could use a numeric character
reference, &#x...;). There shouldn't be a ton of them and it isn't like XML is small
anyway.
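
A minimal sketch of the escaping idea, not the actual SOYC code. It assumes the intent is to escape every UTF-16 unit outside printable ASCII, and it also escapes the backslash itself so the original string stays recoverable. The class and method names (XmlSafeEscaper, escape, unescape) are made up for illustration; XML metacharacters such as < and & are left to whatever writes the XML.

```java
public final class XmlSafeEscaper {
    private XmlSafeEscaper() {}

    /**
     * Rewrites backslashes as a doubled backslash and every UTF-16 unit
     * outside printable ASCII (0x20..0x7E) as a backslash-u escape with
     * four hex digits. Surrogate halves are escaped one unit at a time.
     */
    public static String escape(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '\\') {
                out.append("\\\\");
            } else if (c < 0x20 || c > 0x7E) {
                out.append(String.format("\\u%04X", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    /** Inverse of escape(); input that escape() did not produce is passed through. */
    public static String unescape(String s) {
        StringBuilder out = new StringBuilder(s.length());
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (c == '\\' && i + 1 < s.length()) {
                char next = s.charAt(i + 1);
                if (next == '\\') {
                    out.append('\\');
                    i += 2;
                    continue;
                }
                if (next == 'u' && i + 6 <= s.length()) {
                    // Four hex digits follow the 'u'.
                    out.append((char) Integer.parseInt(s.substring(i + 2, i + 6), 16));
                    i += 6;
                    continue;
                }
            }
            out.append(c);
            i++;
        }
        return out.toString();
    }
}
```

Because the backslash is escaped as well, unescape(escape(s)) should give back s for any input, which is the non-lossy property asked for above.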
http://gwt-code-reviews.appspot.com/61801
I like the &#x...; idea. There is just one potential problem: will XML
readers support it? The linked XML spec has the same restrictions on
encoded character entities as on raw characters appearing in the file.
Does anyone know if that restriction is honored in practice? Anyone
want to test on
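
One way to probe this on a given reader is a small check like the one below. It is only a sketch: CharRefProbe and parses are hypothetical names, and it exercises whatever parser DocumentBuilderFactory hands back on the local JVM, so it says nothing about other XML readers. A parser that follows the XML 1.0 rule that character references must name a legal character should accept the first document and reject the second.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.SAXException;

public final class CharRefProbe {
    private CharRefProbe() {}

    /** Returns true if the default JAXP parser accepts the document, false if it reports a parse error. */
    public static boolean parses(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        try {
            builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (SAXException e) {
            // Well-formedness errors surface as SAXException (usually SAXParseException).
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // Reference to an ordinary character ('A').
        System.out.println("&#x41; accepted: " + parses("<s>&#x41;</s>"));
        // Reference to a control character that XML 1.0 does not allow.
        System.out.println("&#x1;  accepted: " + parses("<s>&#x1;</s>"));
    }
}
```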
On Wed, Aug 19, 2009 at 6:28 AM, sp...@google.com wrote:
> I like the &#x...; idea. There is just one potential problem: will XML
> readers support it? The linked XML spec has the same restrictions on
> encoded character entities as on raw characters appearing in the file.
> Does anyone know if that
Thanks, Lex.
I didn't try the &#x...; idea (see Ian's comment), but I also added the
other illegal characters. I'll leave the recoverability (in the
dashboard) for another day: (x00) and (u) seem good to me for human
consumption, and the surrogate-block characters shouldn't really ever
appear in an