[xml] xmlSaveFormatFileEnc() creating invalid XML
Here is a simple test case that takes the text from an apparently-valid UTF-8 file, puts it in a text child node, and then writes the XML file out. But the XML file fails validation with xmllint with this error:

./output.xml:4: parser error : PCDATA invalid Char value 12

Am I doing something wrong?

--
murr...@murrayc.com
www.murrayc.com
www.openismus.com

The precise terms and conditions for copying, distribution and
modification follow.

GNU GENERAL PUBLIC LICENSE
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION

/* Build like so:
 * gcc test_writes_invalid.c `pkg-config gio-2.0 libxml-2.0 --cflags --libs`
 *
 * then try:
 * xmllint --noout ./output.xml
 * to see:
 * ./output.xml:4: parser error : PCDATA invalid Char value 12
 */

/* The header names were lost in the archive; these are the ones
 * that the calls below need. */
#include <gio/gio.h>
#include <libxml/parser.h>
#include <libxml/tree.h>

int main(int argc, char** argv)
{
  g_type_init ();

  /* Create a document with a single root element. */
  xmlDocPtr document = xmlNewDoc(BAD_CAST "1.0");
  xmlNodePtr root_node = xmlNewNode(NULL, BAD_CAST "root");
  xmlDocSetRootElement(document, root_node);

  /* Read the whole input file into memory. */
  GFile* file = g_file_new_for_path("./input.txt");
  char* contents = 0;
  gsize length = 0;
  if(!g_file_load_contents(file, 0, &contents, &length, NULL, NULL))
  {
    g_warning("g_file_load_contents() failed");
    return -1;
  }

  /* Add the file contents, unchanged, as a text node. */
  xmlNodePtr child_node = xmlNewText(BAD_CAST contents);
  xmlAddChild(root_node, child_node);
  g_free(contents);

  /* Write the document out as UTF-8, with indentation. */
  xmlSaveFormatFileEnc("output.xml", document, "UTF-8", 1);

  return 0;
}

___
xml mailing list, project page http://xmlsoft.org/
xml@gnome.org
http://mail.gnome.org/mailman/listinfo/xml
Re: [xml] xmlSaveFormatFileEnc() creating invalid XML
On Wed, 2011-09-14 at 16:10 +0800, Daniel Veillard wrote:
> On Fri, Sep 09, 2011 at 04:30:45PM +0200, Murray Cumming wrote:
> > On Fri, 2011-09-09 at 10:21 -0400, Jason Viers wrote:
> > > On 9/9/2011 05:37, Murray Cumming wrote:
> > > > Here is a simple test case that takes the text from an apparently-valid
> > > > UTF-8 file
> > >
> > > Not all valid UTF-8 is valid in XML. Only a subset, as defined in
> > > http://www.w3.org/TR/2008/REC-xml-20081126/#charsets
> > >
> > > Note that Form Feed (0xC) is not allowed. Your original input document
> > > contains a formfeed character, and this is what ends up being invalid.
> > > It's not a matter of escaping; form feed as a literal byte, numeric
> > > reference, etc., is not allowed.
> > > Stripping the form feed from the input allows it to serialize properly.
> >
> > Ah, I didn't know that it couldn't be there even if escaped. Thanks.
> >
> > Shouldn't libxml warn about that at the same time that it would escape
> > characters such as & and < rather than writing invalid XML?
>
> It's a choice, either you make all APIs validate all input strings
> or you rely on the client to do it. In libxml2 I took the second path
> and that was decided 10+ years ago. The parser on the other hand is
> strict but that's mandatory to follow the spec.

OK. Thanks. Is that documented?

--
murr...@murrayc.com
www.murrayc.com
www.openismus.com
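Since the reply above leaves validation of input strings to the client, one way a caller might handle it is to filter the text before handing it to xmlNewText(). The following is a minimal sketch, not libxml2 API: strip_xml_forbidden_c0() and is_forbidden_c0() are hypothetical helper names. It only removes the C0 control characters that the XML 1.0 Char production excludes (everything below 0x20 except tab, LF and CR), and it assumes the buffer is NUL-terminated UTF-8. It does not check for U+FFFE/U+FFFF or for malformed UTF-8.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of libxml2): returns nonzero if the byte is a
 * C0 control character that XML 1.0 does not allow, such as form feed (0x0C).
 * Tab (0x09), LF (0x0A) and CR (0x0D) are allowed. */
int is_forbidden_c0(unsigned char c)
{
    return c < 0x20 && c != 0x09 && c != 0x0A && c != 0x0D;
}

/* Hypothetical helper: removes the forbidden bytes in place from a
 * NUL-terminated UTF-8 string and returns the new length in bytes. */
size_t strip_xml_forbidden_c0(char *text)
{
    assert(text != NULL);

    size_t out = 0;
    for (size_t in = 0; text[in] != '\0'; ++in) {
        unsigned char c = (unsigned char)text[in];
        if (!is_forbidden_c0(c))
            text[out++] = text[in]; /* keep every other byte unchanged */
    }
    text[out] = '\0';
    return out;
}
```

In the test case from this thread, that would presumably mean calling strip_xml_forbidden_c0(contents) just before xmlNewText(BAD_CAST contents). Filtering byte by byte is safe for UTF-8 because bytes below 0x80 never occur inside a multi-byte sequence.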
Re: [xml] xmlSaveFormatFileEnc() creating invalid XML
On Fri, 2011-09-09 at 10:21 -0400, Jason Viers wrote:
> On 9/9/2011 05:37, Murray Cumming wrote:
> > Here is a simple test case that takes the text from an apparently-valid
> > UTF-8 file
>
> Not all valid UTF-8 is valid in XML. Only a subset, as defined in
> http://www.w3.org/TR/2008/REC-xml-20081126/#charsets
>
> Note that Form Feed (0xC) is not allowed. Your original input document
> contains a formfeed character, and this is what ends up being invalid.
> It's not a matter of escaping; form feed as a literal byte, numeric
> reference, etc., is not allowed.
> Stripping the form feed from the input allows it to serialize properly.

Ah, I didn't know that it couldn't be there even if escaped. Thanks.

Shouldn't libxml warn about that at the same time that it would escape characters such as & and < rather than writing invalid XML?

--
murr...@murrayc.com
www.murrayc.com
www.openismus.com
Re: [xml] xmllint: validating a document that doesn't specify a DTD
On Tue, 2009-11-17 at 14:24 +0100, Daniel Veillard wrote:
> On Tue, Nov 17, 2009 at 01:33:44PM +0100, Murray Cumming wrote:
> > Should I be able to validate an XML document (such as a .glade file)
> > that has no DOCTYPE line, and therefore doesn't specify a DTD?
> >
> > When I try it with xmllint, I get this error
> > validity error : Validation failed: no DTD found !
> > even when I have specified a local DTD with --dtdvalid.
>
> Works for me with the version from git head:

Thanks. I was actually using

xmllint --valid --dtdvalid mydtd.dtd mydoc.xml

So is --dtdvalid an alternative to --valid rather than a way of using --valid?

--
murr...@murrayc.com
www.murrayc.com
www.openismus.com
[xml] xmllint: validating a document that doesn't specify a DTD
Should I be able to validate an XML document (such as a .glade file) that has no DOCTYPE line, and therefore doesn't specify a DTD?

When I try it with xmllint, I get this error

validity error : Validation failed: no DTD found !

even when I have specified a local DTD with --dtdvalid.

--
murr...@murrayc.com
www.murrayc.com
www.openismus.com
[xml] Preserving the SAX tutorial
We are moving most good GNOME documentation out of developer.gnome.org, usually into library.gnome.org. That should mean that it's better organized, kept up-to-date, and translated.

This libxml SAX tutorial seems to still be relevant and useful. Would you like to add it to the regular libxml documentation? James, is that OK?

http://developer.gnome.org/doc/tutorials/xml-sax/xml-sax.html

--
[EMAIL PROTECTED]
www.murrayc.com
www.openismus.com
[xml] xmlSaveFormatFileEnc() doesn't always write encoding declaration.
The xmlSaveFormatFileEnc function accepts a NULL for the encoding, though the documentation doesn't say what that would mean:
http://xmlsoft.planetmirror.com/html/libxml-tree.html#xmlSaveFormatFileEnc

int xmlSaveFormatFileEnc(const char * filename, xmlDocPtr cur, const char * encoding, int format)

It does seem to default to UTF-8 encoding when NULL is used, but NULL also means that the encoding declaration will not be written at the start of the XML file. Is that ever a good thing? It seems like a bug.

Would it break anything to make it always write the encoding declaration?

--
Murray Cumming
[EMAIL PROTECTED]
www.murrayc.com
www.openismus.com
Re: [xml] Problems with xmlChar* as a key in STL Map
On Wed, 2005-05-25 at 09:12 +0300, Antti Mäkinen wrote:
[snip]
> when I use the acquired xmlChar* as a key in a STL map
[snip]

This is a fairly basic C/C++ mistake. It will probably be easier for you if you use the C++ interface: libxml++
http://libxmlplusplus.sourceforge.net

--
Murray Cumming
[EMAIL PROTECTED]
www.murrayc.com
www.openismus.com
[xml] Re: [sigc] spec file and include path name
On Fri, 2005-04-29 at 17:38 -0400, Carl Nygard wrote:
> I just downloaded 2.10, and I'm seeing a discrepancy between make
> install and what the spec file is expecting.
>
> lib version is 2.10
> make install tries to put includes in:
> test -z "/usr/local/include/libxml++-2.6/libxml++/parsers" || mkdir -p -- "/usr/local/include/libxml++-2.6/libxml++/parsers"
> spec file is:
> %files devel
> %defattr(-,root,root)
> /usr/include/libxml++-2.8
> /usr/lib/*.a
> /usr/lib/*.la
> /usr/lib/pkgconfig/libxml++-2.8.pc
>
> rpmbuild errors:
>
> RPM build errors:
> File not found: /var/tmp/libxml++-root/usr/include/libxml++-2.8
> File not found: /var/tmp/libxml++-root/usr/lib/pkgconfig/libxml++-2.8.pc
>
> At first I thought s/2.8/2.10/ in the specfile would work, but the
> makefile using 2.6 makes me want to ask what's going on.
>
> Guidance?

You sent this to the libsigc++ list instead of the libxml++ list. I do that sometimes too.

The library is called libxml++-2.6. The latest version of libxml++-2.6 is 2.10.

A patch to the .spec file would be welcome. I doubt many people use it. As usual, I'd recommend removing it if it remains broken, in favour of people using their distro packages anyway.

--
Murray Cumming
[EMAIL PROTECTED]
www.murrayc.com
www.openismus.com