I have modified Document#to_s to accept a second argument, the encoding (didn't know there was a first one, eh?). It will not change the document's encoding, but it will cause libxml to produce a representation of the document in the requested encoding (transcoding it if necessary). The argument defaults to nil, which results in the document's own encoding.
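To make the byte-level effect of transcoding concrete, here is a minimal sketch in plain Ruby. It makes no libxml calls: it assumes a Ruby with String#encode (1.9 or later), and cut_at_nul is a hypothetical helper that only imitates a C-style read stopping at the first NUL byte. It is an illustration of the symptom, not the library's actual code path.

```ruby
# Plain-Ruby illustration only; nothing here calls libxml.
xml = '<?xml version="1.0"?><root_node/>'

utf16   = xml.encode('UTF-16')    # Ruby writes a two-byte BOM (FE FF) first
utf16le = xml.encode('UTF-16LE')  # no BOM
utf16be = xml.encode('UTF-16BE')  # no BOM

# Hypothetical helper: keep only the bytes before the first NUL,
# the way a NUL-terminated C string would be read.
def cut_at_nul(bytes)
  idx = bytes.index(0)
  idx ? bytes[0, idx] : bytes
end

# Little-endian bytes preceded by a little-endian BOM (FF FE),
# chosen to match the bytes visible in the reported output.
le_with_bom = [0xFF, 0xFE] + utf16le.bytes.to_a
truncated   = cut_at_nul(le_with_bom)

# Viewed as 8-bit Latin-1 characters, the surviving bytes are FF FE 3C.
puts truncated.pack('C*').force_encoding('ISO-8859-1').encode('UTF-8')
```

Every ASCII character in UTF-16 carries a zero byte, so anything that treats the dump as a NUL-terminated string stops after the first character.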
A few other notes about UTF-16 specifically: UTF-16 will result in a two-byte lead-in (the byte order mark); UTF-16BE will not, nor will UTF-16LE. These latter encodings are less familiar, but may be of interest. You were getting two 8-bit chars and nothing else because of the UTF-16 lead-in, but the output was also getting truncated because the wrong Ruby string constructor was being called (it did not use the length returned by the libxml dump, so a NUL, ^@, was ending the string).

In other words, it was always broken (I had not previously modified this code); now it is less broken.

Dan

On Nov 26, 2007, at 22:38, Dan Janowski wrote:

> I don't have 0.3x on my system anymore, but I do not think UTF16 will
> behave any differently. .to_s is written incorrectly, from what I can
> tell, since it just feeds the encoding of the document back into the
> formatter. But in either case, if you want the as-encoded document,
> you really want to use doc.dump.
>
> Encoding has never worked correctly within the library. It only
> functions properly when fed UTF-8 as I have had to employ Iconv for
> anything else.
>
>
> Dan
>
> On Nov 26, 2007, at 16:05, Tim Perrett wrote:
>
>> Hey Chaps
>>
>> There seems to be some kind of issue with UTF-16 encoding in libxml-
>> ruby version 0.5.2.0.
>>
>> When I do this:
>>
>> doc = XML::Document.new()
>> # doc.encoding = 'utf-16'
>> doc.root = XML::Node.new('root_node')
>> root = doc.root
>> puts doc
>> ## => <?xml version="1.0"?><root_node/>
>>
>> Uncomment the encoding however and you get this:
>>
>> doc = XML::Document.new()
>> doc.encoding = 'utf-16'
>> doc.root = XML::Node.new('root_node')
>> root = doc.root
>> puts doc
>> ## => ÿþ<
>>
>> Any idea what's going on here and how to fix it? The encoding features
>> used to work no problem at all.
>> I'm running ruby 1.8.6 (2007-06-07
>> patchlevel 36) [universal-darwin9.0]
>>
>> Cheers
>>
>> Tim
>>
>> _______________________________________________
>> libxml-devel mailing list
>> libxml-devel@rubyforge.org
>> http://rubyforge.org/mailman/listinfo/libxml-devel