There is a serious inconsistency when "round-tripping" XML that contains non-ASCII UTF-8 characters. If you convert the whole document to a string after parsing, you get the UTF-8 characters back out unchanged. If you instead grab a node and convert it to a string, the UTF-8 characters are replaced with numeric character entities:
utf8test.rb:

require 'xml/libxml'

xml = <<XML
<?xml version="1.0" encoding="UTF-8"?>
<title>This is a UTF-8 pi: π</title>
XML

parser = XML::Parser.new
parser.string = xml
doc = parser.parse

puts doc.to_s
puts doc.root.to_s

This outputs:

<?xml version="1.0" encoding="UTF-8"?>
<title>This is a UTF-8 pi: π</title>
<title>This is a UTF-8 pi: &#x3C0;</title>

I would expect to_s, by default, to write the XML out as a string just as it was parsed. A separate variant could be provided for cases where character conversion is desired.

--Paul
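P.S. Until this is settled, one possible stopgap is to post-process the node string and turn the numeric references back into literal characters. This is only a rough sketch in plain Ruby, assuming the node output uses numeric character references such as &#x3C0;. The name decode_char_refs is made up for illustration and is not part of libxml-ruby.

```ruby
# Hypothetical helper (not part of libxml-ruby): convert numeric character
# references for non-ASCII code points back into literal UTF-8 characters.
def decode_char_refs(xml)
  xml.gsub(/&#(?:x([0-9A-Fa-f]+)|([0-9]+));/) do |ref|
    hex = Regexp.last_match(1)
    dec = Regexp.last_match(2)
    codepoint = hex ? hex.to_i(16) : dec.to_i
    # Leave ASCII references such as &#38; or &#x3C; untouched so the
    # markup stays well-formed; only non-ASCII code points are decoded.
    codepoint > 0x7F ? [codepoint].pack('U') : ref
  end
end

# Stand-in for the string returned by doc.root.to_s in the script above.
node_xml = '<title>This is a UTF-8 pi: &#x3C0;</title>'
puts decode_char_refs(node_xml)
```

This works on the serialized string only, so it does not address the underlying inconsistency in to_s.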