I have modified Document#to_s to accept a second argument, the
encoding (didn't know there was a first one, eh?). It will not change
the document's encoding, but will cause libxml to produce a
representation of the document in the requested encoding (transcoding
it if necessary). It defaults to nil, which results in the document's
own encoding being used.
A few other notes about UTF-16 specifically: UTF-16 will result in a
two-byte lead-in (the byte order mark); UTF-16BE will not, nor will
UTF-16LE. These latter encodings are less familiar, but may be of
interest.
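To see the difference in lead-in for yourself, here is a rough sketch.
It uses Ruby's own String#encode (Ruby 1.9 and later) as a stand-in
for the libxml transcoder, which is an assumption on my part; the
sample string is made up, and the byte order of the mark Ruby writes
may not match what libxml writes on your machine.

```ruby
# Stand-in for the libxml transcoder: Ruby's built-in String#encode.
sample = "<root_node/>"

%w[UTF-16 UTF-16BE UTF-16LE].each do |name|
  bytes = sample.encode(name).bytes
  # Show the first four bytes in hex so a byte order mark is easy to spot.
  lead = bytes.first(4).map { |b| format("%02X", b) }.join(" ")
  puts "#{name.ljust(8)} #{bytes.length} bytes, starts with #{lead}"
end
```

The plain UTF-16 result should come out two bytes longer than the
other two, and those two extra bytes are the lead-in.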
You were getting two 8-bit chars and nothing else because of the
UTF-16 lead-in, but the output was also getting truncated because the
wrong Ruby string constructor was being called (one which did not use
the length returned by the libxml dump, so a ^@, i.e. a NUL byte, was
ending the string). In other words, it was always broken (I had not
previously modified this code); now it is less broken.
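The truncation can be imitated in plain Ruby. This is only a sketch,
not the binding's actual C code: c_string_view is a hypothetical
helper that mimics a length-unaware read stopping at the first NUL,
and the dump is assumed to be little-endian UTF-16 with an FF FE
lead-in, which is what the output in Tim's report suggests.

```ruby
# Hypothetical helper: mimics reading a buffer as a NUL-terminated
# C string, so everything from the first NUL byte onward is lost.
def c_string_view(bytes)
  nul = bytes.index("\x00".b)
  nul ? bytes.byteslice(0, nul) : bytes
end

xml = '<?xml version="1.0" encoding="utf-16"?><root_node/>'

# Assumed shape of the dump: FF FE lead-in, then little-endian UTF-16.
dump = "\xFF\xFE".b + xml.encode("UTF-16LE").b

truncated = c_string_view(dump)
puts "full dump: #{dump.bytesize} bytes"
puts "truncated: #{truncated.bytesize} bytes"

# Viewed as Latin-1, the surviving bytes are what the terminal showed.
shown = truncated.dup.force_encoding("ISO-8859-1").encode("UTF-8")
puts shown
```

Every ASCII character in UTF-16 carries a zero byte, so a read that
ignores the returned length stops right after the first "<".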
Dan
On Nov 26, 2007, at 22:38, Dan Janowski wrote:
> I don't have 0.3x on my system anymore, but I do not think UTF16 will
> behave any differently. .to_s is written incorrectly, from what I can
> tell, since it just feeds the encoding of the document back into the
> formatter. But in either case, if you want the as-encoded document,
> you really want to use doc.dump.
>
> Encoding has never worked correctly within the library. It only
> functions properly when fed UTF-8 as I have had to employ Iconv for
> anything else.
>
>
> Dan
>
> On Nov 26, 2007, at 16:05, Tim Perrett wrote:
>
>> Hey Chaps
>>
>> There seems to be some kind of issue with UTF-16 encoding in libxml-
>> ruby version 0.5.2.0.
>>
>> When I do this:
>>
>> doc = XML::Document.new()
>> # doc.encoding = 'utf-16'
>> doc.root = XML::Node.new('root_node')
>> root = doc.root
>> puts doc
>> ## => <?xml version="1.0"?><root_node/>
>>
>> Uncomment the encoding however and you get this:
>>
>> doc = XML::Document.new()
>> doc.encoding = 'utf-16'
>> doc.root = XML::Node.new('root_node')
>> root = doc.root
>> puts doc
>> ## => ÿþ<
>>
>> Any idea what's going on here and how to fix it? The encoding features
>> used to work no problem at all. I'm running ruby 1.8.6 (2007-06-07
>> patchlevel 36) [universal-darwin9.0]
>>
>> Cheers
>>
>> Tim
>>
>>
>> _______________________________________________
>> libxml-devel mailing list
>> [email protected]
>> http://rubyforge.org/mailman/listinfo/libxml-devel