[xml] xmlSaveFormatFileEnc() creating invalid XML

2011-09-15 Thread Murray Cumming
Here is a simple test case that takes the text from an apparently-valid
UTF-8 file, puts it in a text child node, and then writes the XML file
out. But the XML file fails validation with xmllint with this error:
./output.xml:4: parser error : PCDATA invalid Char value 12

Am I doing something wrong?

-- 
murr...@murrayc.com
www.murrayc.com
www.openismus.com

/* Build like so:
 *   gcc test_writes_invalid.c `pkg-config gio-2.0 libxml-2.0 --cflags --libs`
 *
 * then try:
 *   xmllint --noout ./output.xml
 * to see:
 *   ./output.xml:4: parser error : PCDATA invalid Char value 12
 */

#include <gio/gio.h>
#include <libxml/parser.h>
#include <libxml/tree.h>


int
main(int argc, char** argv)
{
  g_type_init ();
  
  xmlDocPtr document = xmlNewDoc(BAD_CAST "1.0");
  xmlNodePtr root_node = xmlNewNode(NULL, BAD_CAST "root");
  xmlDocSetRootElement(document, root_node);

  GFile* file = g_file_new_for_path("./input.txt");
  char* contents = 0;
  gsize length = 0;
  if(!g_file_load_contents(file, 0, &contents, &length, NULL, NULL))
  {
    g_warning("g_file_load_contents() failed");
    return -1;
  }
  
  xmlNodePtr child_node = xmlNewText(BAD_CAST contents);
  xmlAddChild(root_node, child_node); 
  g_free(contents);

  xmlSaveFormatFileEnc("output.xml", document, "UTF-8", 1);

  return 0;
}

___
xml mailing list, project page  http://xmlsoft.org/
xml@gnome.org
http://mail.gnome.org/mailman/listinfo/xml


Re: [xml] xmlSaveFormatFileEnc() creating invalid XML

2011-09-14 Thread Murray Cumming
On Wed, 2011-09-14 at 16:10 +0800, Daniel Veillard wrote:
> On Fri, Sep 09, 2011 at 04:30:45PM +0200, Murray Cumming wrote:
> > On Fri, 2011-09-09 at 10:21 -0400, Jason Viers wrote:
> > > On 9/9/2011 05:37, Murray Cumming wrote:
> > > > Here is a simple test case that takes the text from an apparently-valid
> > > > UTF-8 file
> > > 
> > > Not all valid UTF-8 is valid in XML.  Only a subset, as defined in
> > > http://www.w3.org/TR/2008/REC-xml-20081126/#charsets
> > > 
> > > Note that Form Feed (0xC) is not allowed.  Your original input document 
> > > contains a formfeed character, and this is what ends up being invalid.  
> > > It's not a matter of escaping; form feed as a literal byte, numeric 
> > > reference, etc., is not allowed.
> > > Stripping the form feed from the input allows it to serialize properly.
> > 
> > Ah, I didn't know that it couldn't be there even if escaped. Thanks.
> > 
> > Shouldn't libxml warn about that at the same time that it would escape
> > characters such as & and < rather than writing invalid XML?
> 
>   It's a choice, either you make all APIs validate all input strings
> or you rely on the client to do it. In libxml2 I took the second path
> and that was decided 10+ years ago. The parser on the other hand is
> strict but that's mandatory to follow the spec.

OK. Thanks. Is that documented?
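
Since the tree API leaves input checking to the caller, a client-side filter along these lines could be applied before the text reaches xmlNewText(). This is only a minimal sketch: the function names xml_char_is_valid() and strip_invalid_xml_chars() are hypothetical (they are not libxml2 API), and the code assumes the buffer is NUL-terminated, well-formed UTF-8. The accepted ranges are the Char production from the XML 1.0 recommendation linked above.

```c
#include <stddef.h>
#include <string.h>

/* XML 1.0 "Char" production: the only code points a document may contain.
 * Hypothetical helper, not part of libxml2. */
static int xml_char_is_valid(unsigned long cp)
{
  return cp == 0x9 || cp == 0xA || cp == 0xD ||
         (cp >= 0x20 && cp <= 0xD7FF) ||
         (cp >= 0xE000 && cp <= 0xFFFD) ||
         (cp >= 0x10000 && cp <= 0x10FFFF);
}

/* Removes, in place, every code point that XML 1.0 forbids (form feed,
 * other C0 controls, U+FFFE, U+FFFF, ...).
 * Assumes text is NUL-terminated, well-formed UTF-8.
 * Returns the number of code points removed. */
static size_t strip_invalid_xml_chars(char *text)
{
  unsigned char *in = (unsigned char *)text;
  unsigned char *out = in;
  size_t removed = 0;

  while (*in != 0) {
    size_t len, used;
    unsigned long cp;

    /* Sequence length and payload bits come from the lead byte. */
    if (*in < 0x80)      { len = 1; cp = *in; }
    else if (*in < 0xE0) { len = 2; cp = *in & 0x1F; }
    else if (*in < 0xF0) { len = 3; cp = *in & 0x0F; }
    else                 { len = 4; cp = *in & 0x07; }

    /* Fold in the continuation bytes, never reading past the terminator. */
    for (used = 1; used < len && in[used] != 0; ++used)
      cp = (cp << 6) | (in[used] & 0x3F);

    if (xml_char_is_valid(cp)) {
      memmove(out, in, used);   /* keep this code point */
      out += used;
    } else {
      ++removed;                /* drop it */
    }
    in += used;
  }

  *out = 0;
  return removed;
}
```

In the test case from this thread, calling strip_invalid_xml_chars(contents) just before xmlNewText(BAD_CAST contents) would drop the form feed, so the saved file should then parse. Whether silently dropping characters is acceptable depends on the application; returning an error instead may be the better choice.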

-- 
murr...@murrayc.com
www.murrayc.com
www.openismus.com



Re: [xml] xmlSaveFormatFileEnc() creating invalid XML

2011-09-09 Thread Murray Cumming
On Fri, 2011-09-09 at 10:21 -0400, Jason Viers wrote:
> On 9/9/2011 05:37, Murray Cumming wrote:
> > Here is a simple test case that takes the text from an apparently-valid
> > UTF-8 file
> 
> Not all valid UTF-8 is valid in XML.  Only a subset, as defined in
> http://www.w3.org/TR/2008/REC-xml-20081126/#charsets
> 
> Note that Form Feed (0xC) is not allowed.  Your original input document 
> contains a formfeed character, and this is what ends up being invalid.  
> It's not a matter of escaping; form feed as a literal byte, numeric 
> reference, etc., is not allowed.
> Stripping the form feed from the input allows it to serialize properly.

Ah, I didn't know that it couldn't be there even if escaped. Thanks.

Shouldn't libxml warn about that at the same time that it would escape
characters such as & and < rather than writing invalid XML?

-- 
murr...@murrayc.com
www.murrayc.com
www.openismus.com



[xml] xmlSaveFormatFileEnc() creating invalid XML

2011-09-09 Thread Murray Cumming
Here is a simple test case that takes the text from an apparently-valid
UTF-8 file, puts it in a text child node, and then writes the XML file
out. But the XML file fails validation with xmllint with this error:
./output.xml:4: parser error : PCDATA invalid Char value 12

Am I doing something wrong?

-- 
murr...@murrayc.com
www.murrayc.com
www.openismus.com

/* Build like so:
 *   gcc test_writes_invalid.c `pkg-config gio-2.0 libxml-2.0 --cflags --libs`
 *
 * then try:
 *   xmllint --noout ./output.xml
 * to see:
 *   ./output.xml:4: parser error : PCDATA invalid Char value 12
 */

#include <gio/gio.h>
#include <libxml/parser.h>
#include <libxml/tree.h>


int
main(int argc, char** argv)
{
  g_type_init ();
  
  xmlDocPtr document = xmlNewDoc(BAD_CAST "1.0");
  xmlNodePtr root_node = xmlNewNode(NULL, BAD_CAST "root");
  xmlDocSetRootElement(document, root_node);

  GFile* file = g_file_new_for_path("./input.txt");
  char* contents = 0;
  gsize length = 0;
  if(!g_file_load_contents(file, 0, &contents, &length, NULL, NULL))
  {
    g_warning("g_file_load_contents() failed");
    return -1;
  }
  
  xmlNodePtr child_node = xmlNewText(BAD_CAST contents);
  xmlAddChild(root_node, child_node); 
  g_free(contents);

  xmlSaveFormatFileEnc("output.xml", document, "UTF-8", 1);

  return 0;
}



Re: [xml] xmllint: validating a document that doesn't specify a DTD

2009-11-17 Thread Murray Cumming
On Tue, 2009-11-17 at 14:24 +0100, Daniel Veillard wrote:
> On Tue, Nov 17, 2009 at 01:33:44PM +0100, Murray Cumming wrote:
> > Should I be able to validate an XML document (such as a .glade file)
> > that has no DOCTYPE line, and therefore doesn't specify a DTD?
> > 
> > When I try it with xmllint, I get this error
> >   validity error : Validation failed: no DTD found !
> > even when I have specified a local DTD with --dtdvalid.
> 
>   Works for me with the version from git head:

Thanks. I was actually using 
  xmllint --valid --dtdvalid mydtd.dtd mydoc.xml

So is --dtdvalid an alternative to --valid rather than a way of using
--valid?

-- 
murr...@murrayc.com
www.murrayc.com
www.openismus.com



[xml] xmllint: validating a document that doesn't specify a DTD

2009-11-17 Thread Murray Cumming
Should I be able to validate an XML document (such as a .glade file)
that has no DOCTYPE line, and therefore doesn't specify a DTD?

When I try it with xmllint, I get this error
  validity error : Validation failed: no DTD found !
even when I have specified a local DTD with --dtdvalid.

-- 
murr...@murrayc.com
www.murrayc.com
www.openismus.com



[xml] Preserving the SAX tutorial

2008-10-24 Thread Murray Cumming
We are moving most good GNOME documentation out of developer.gnome.org,
usually into library.gnome.org. That should mean that it's better
organized, kept up-to-date, and translated.

This libxml SAX tutorial still seems to be relevant and useful. Would
you like to add it to the regular libxml documentation? James, is that
OK?
http://developer.gnome.org/doc/tutorials/xml-sax/xml-sax.html

-- 
[EMAIL PROTECTED]
www.murrayc.com
www.openismus.com



[xml] xmlSaveFormatFileEnc() doesn't always write encoding declaration.

2006-10-23 Thread Murray Cumming
The xmlSaveFormatFileEnc function accepts a NULL for the encoding,
though the documentation doesn't say what that would mean:
http://xmlsoft.planetmirror.com/html/libxml-tree.html#xmlSaveFormatFileEnc

int xmlSaveFormatFileEnc(const char * filename, 
 xmlDocPtr cur, 
 const char * encoding, 
 int format)

It does seem to default to UTF-8 encoding when NULL is used, but NULL also 
means that the encoding declaration will not be written at the start of the 
XML file. Is that ever a good thing? It seems like a bug. Would it break 
anything to make it always write the encoding declaration? 

-- 
Murray Cumming
[EMAIL PROTECTED]
www.murrayc.com
www.openismus.com



Re: [xml] Problems with xmlChar* as a key in STL Map

2005-05-25 Thread Murray Cumming
On Wed, 2005-05-25 at 09:12 +0300, Antti Mäkinen wrote:
[snip]
> when I use the acquired xmlChar* as a key in a STL map
[snip]

This is a fairly basic C/C++ mistake. 

It will probably be easier for you if you use the C++ interface:
libxml++
http://libxmlplusplus.sourceforge.net

-- 
Murray Cumming
[EMAIL PROTECTED]
www.murrayc.com
www.openismus.com



[xml] Re: [sigc] spec file and include path name

2005-04-30 Thread Murray Cumming
On Fri, 2005-04-29 at 17:38 -0400, Carl Nygard wrote:
> I just downloaded 2.10, and I'm seeing a discrepancy between make
> install and what the spec file is expecting.
> 
> lib version is 2.10
> make install  tries to put includes in:
>   test -z "/usr/local/include/libxml++-2.6/libxml++/parsers" || mkdir -p -- 
> "/usr/local/include/libxml++-2.6/libxml++/parsers"
> spec file is:
> %files devel
> %defattr(-,root,root)
> /usr/include/libxml++-2.8
> /usr/lib/*.a
> /usr/lib/*.la
> /usr/lib/pkgconfig/libxml++-2.8.pc
> 
> rpmbuild errors:
> 
> RPM build errors:
> File not found: /var/tmp/libxml++-root/usr/include/libxml++-2.8
> File not found: /var/tmp/libxml++-root/usr/lib/pkgconfig/libxml++-2.8.pc
> 
> 
> At first I thought s/2.8/2.10/ in the specfile would work, but the
> makefile using 2.6 makes me want to ask what's going on.
> 
> Guidance?

You sent this to the libsigc++ list instead of the libxml++ list. I do
that sometimes too.

The library is called libxml++-2.6. The latest version of libxml++-2.6
is 2.10. 

A patch to the .spec file would be welcome. I doubt many people use it.
As usual, I'd recommend removing it if it remains broken, in favour of
people using their distro packages anyway.

-- 
Murray Cumming
[EMAIL PROTECTED]
www.murrayc.com
www.openismus.com
