Re: [xml] xmlReadFile Fails Where xmlParseFile Succeeds

2015-10-16 Thread Daniel Veillard
On Mon, Jul 27, 2015 at 02:38:23PM -0400, Paul Braman wrote:
> The following bit of code fails
> 
> xmlInitParser();
> xmlDocPtr maindoc = xmlReadFile("maindoc.xml", NULL, 0);
> xmlDocPtr subdoc = xmlReadFile("subdoc.xml", NULL, 0);
> xmlNodePtr content = xmlDocGetRootElement(subdoc);
> xmlUnlinkNode(content);
> xmlAddChild(xmlDocGetRootElement(maindoc), content);
> xmlFreeDoc(subdoc);
> xmlFreeDoc(maindoc);
> xmlCleanupParser();
> 
> 
> with a crash upon calling xmlFreeDoc(maindoc) (problem freeing memory)
> where this code succeeds just fine and dandy
> 
> xmlInitParser();
> xmlDocPtr maindoc = xmlParseFile("maindoc.xml");
> xmlDocPtr subdoc = xmlParseFile("subdoc.xml");
> xmlNodePtr content = xmlDocGetRootElement(subdoc);
> xmlUnlinkNode(content);
> xmlAddChild(xmlDocGetRootElement(maindoc), content);
> xmlFreeDoc(subdoc);
> xmlFreeDoc(maindoc);
> xmlCleanupParser();
> 
> 
> I understand I should use xmlReadFile instead of xmlParseFile. However, I
> can't figure on what's different between the two that could be causing the
> crash in the first block of code?
> 
> Note, I've tried with multiple versions, even 2.9.2.
> 
> Alternatively, how *should* I be structuring the code to do what I'm doing
> here? (I know a call to
> 
> xmlAddChild(xmlDocGetRootElement(maindoc),
> xmlCopyNode(xmlDocGetRootElement(subdoc), 1));
> 
> 
> in place of the get/unlink/add sequence works but I'd like to understand
> why the code above fails.)

  The problem is dictionaries. The xmlRead* functions boost processing speed
by using a dictionary for all the strings in markup etc. The dictionary is
attached to the document. When you move content, that part of subdoc still
holds pointers into the subdoc dictionary, but it gets grafted onto maindoc,
which uses a different dictionary. When freeing, the reference to the
original document's dictionary is lost; libxml2 then tries to free the
strings that came from the subdoc dictionary, and that fails.
  There are 2 ways around that:

  Option 1: disable dictionaries when parsing if you do that kind of
 copy/paste between documents (XML_PARSE_NODICT option, see parser.h).

  Option 2: set things up so that all the files get parsed with the same
 dictionary, shared by all documents. This is more complex; libxslt does
 it, for example. Unfortunately I don't have a simple document explaining it.

Daniel

> ___
> xml mailing list, project page  http://xmlsoft.org/
> xml@gnome.org
> https://mail.gnome.org/mailman/listinfo/xml


-- 
Daniel Veillard  | Open Source and Standards, Red Hat
veill...@redhat.com  | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | virtualization library  http://libvirt.org/


[xml] xmlReadFile Fails Where xmlParseFile Succeeds

2015-07-27 Thread Paul Braman
The following bit of code fails

xmlInitParser();
xmlDocPtr maindoc = xmlReadFile("maindoc.xml", NULL, 0);
xmlDocPtr subdoc = xmlReadFile("subdoc.xml", NULL, 0);
xmlNodePtr content = xmlDocGetRootElement(subdoc);
xmlUnlinkNode(content);
xmlAddChild(xmlDocGetRootElement(maindoc), content);
xmlFreeDoc(subdoc);
xmlFreeDoc(maindoc);
xmlCleanupParser();


with a crash upon calling xmlFreeDoc(maindoc) (problem freeing memory)
where this code succeeds just fine and dandy

xmlInitParser();
xmlDocPtr maindoc = xmlParseFile("maindoc.xml");
xmlDocPtr subdoc = xmlParseFile("subdoc.xml");
xmlNodePtr content = xmlDocGetRootElement(subdoc);
xmlUnlinkNode(content);
xmlAddChild(xmlDocGetRootElement(maindoc), content);
xmlFreeDoc(subdoc);
xmlFreeDoc(maindoc);
xmlCleanupParser();


I understand I should use xmlReadFile instead of xmlParseFile. However, I
can't figure on what's different between the two that could be causing the
crash in the first block of code?

Note, I've tried with multiple versions, even 2.9.2.

Alternatively, how *should* I be structuring the code to do what I'm doing
here? (I know a call to

xmlAddChild(xmlDocGetRootElement(maindoc),
xmlCopyNode(xmlDocGetRootElement(subdoc), 1));


in place of the get/unlink/add sequence works but I'd like to understand
why the code above fails.)

- Paul Braman