Re: [libxml-devel] problem with UTF-16 encoding

Tim Perrett Tue, 27 Nov 2007 02:08:53 -0800

Interesting stuff. Just changed back to utf-16, and using doc.dump I  
see the byte order mark and the rest of the xml - result :)


Cheers guys

Tim

On 27 Nov 2007, at 04:39, Dan Janowski wrote:

> I have modified Document#to_s to permit the inclusion of a second
> encoding argument (didn't know there was a first one, eh?). It will
> not change the document encoding, but will cause libxml to produce a
> representation of the document in the requested encoding (transcoding
> it if necessary). The default for it is nil, and results in the
> document's encoding.
>
> A few other notes about UTF-16 specifically: UTF-16 will result in a
> two-byte lead-in (the byte order mark); UTF-16BE and UTF-16LE will
> not. These latter encodings are less familiar, but may be of interest.
>
> You were getting two 8-bit chars and nothing else because of the
> UTF-16 lead-in, but it was also getting truncated because the wrong
> Ruby string constructor was being called (which did not use the
> length returned by the libxml dump, so a NUL byte, ^@, was stopping
> the string).
>
> In other words, it was always broken (I had not previously modified
> this code), now it is less broken.
>
> Dan
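
For anyone reading this thread later, the two effects Dan describes can be reproduced roughly in plain Ruby. This is only a sketch: it uses Ruby's own String#encode rather than libxml-ruby, the helper names hex and c_string_view are made up for illustration, and the byte order libxml itself picks for "UTF-16" may differ from Ruby's.

```ruby
# Illustration only: plain Ruby String#encode, not the libxml-ruby API.
xml = '<a/>'

with_bom = xml.encode('UTF-16')    # Ruby writes a byte order mark first
be       = xml.encode('UTF-16BE')  # explicit byte order, no BOM
le       = xml.encode('UTF-16LE')  # explicit byte order, no BOM

# Hypothetical helper: render the raw bytes of a string as hex.
def hex(str)
  str.bytes.map { |b| format('%02X', b) }.join(' ')
end

puts "UTF-16:   #{hex(with_bom)}"
puts "UTF-16BE: #{hex(be)}"
puts "UTF-16LE: #{hex(le)}"

# Hypothetical helper: mimic a C-style string constructor that stops at
# the first NUL byte instead of using the length reported by the dump.
def c_string_view(str)
  bytes = str.bytes
  cut = bytes.index(0) || bytes.length
  bytes.first(cut).pack('C*')
end

truncated = c_string_view(with_bom)
puts "NUL-terminated view keeps #{truncated.bytesize} of #{with_bom.bytesize} bytes"
```

Since almost every ASCII character in UTF-16 carries a zero byte, any constructor that trusts NUL termination will cut the dump short; passing the explicit length avoids that.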

_______________________________________________
libxml-devel mailing list
libxml-devel@rubyforge.org
http://rubyforge.org/mailman/listinfo/libxml-devel
