Dang!  And I really intended to butt out of this discussion :-)

On Thu, 2007-23-08 at 12:14 -0600, Charlie Savage wrote:
> > The problems that I have discovered stem from libxml freeing entire
> > document trees while there are still ruby objects referencing the nodes.
> > When those ruby objects are subsequently garbage-collected, the xmlNodes
> > in those objects have already been freed and sadness ensues.  But I
> > think the current mechanism is so close to working that it does not
> > warrant a complete rewrite.
> 
> Why would a Ruby object "own" a xmlnode in a document tree?  Can that be 
> avoided?

I don't see how it can be avoided.  My app creates xml documents by
processing other documents.  Each node in the created document is added
by first creating it as a ruby object, then adding attributes, etc, and
finally adding it to the document.

> If not, can you install a callback into libxml to be alerted when a node 
> is freed?  If so, then you could decrement the reference count.  But 
> you'd still have to keep some long-term memory around so that any 
> existing Ruby objects pointing to the node could check the reference 
> count and see if it is zero. How do you know when to free that memory? 
> I see no way, unless you go to the Ruby object <-> libxml object mapping 
> like I described in the last email.

I don't know if libxml2 provides such callbacks, but I don't see how
this would work anyway.  The xmlNode has no idea what ruby objects are
referencing it.  Instead of trying to ensure that we free the ruby
object when the xmlNode is freed, we have to ensure that nodes
referenced by ruby objects are never freed by libxml's tree-freeing
calls (xmlFreeNode, xmlFreeDoc) in the first place.
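
A toy model may make the failure concrete.  The sketch below is plain
Ruby and does not touch libxml at all: ToyNode, free_tree, ruby_refs
and the freed flag are made-up names that only imitate a C node, its
count of Ruby wrappers, and a recursive free that ignores that count.

```ruby
# Toy stand-in for an xmlNode (hypothetical, not the binding's real type):
# a parent link, children, a count of Ruby wrappers pointing at the node,
# and a flag that models "the memory has been released".
class ToyNode
  attr_reader :name, :children
  attr_accessor :parent, :ruby_refs, :freed

  def initialize(name)
    @name = name
    @children = []
    @parent = nil
    @ruby_refs = 0
    @freed = false
  end

  # Attach node as a child and return it.
  def add_child(node)
    node.parent = self
    @children << node
    node
  end
end

# Models a recursive free that never looks at ruby_refs.
def free_tree(node)
  node.children.each { |child| free_tree(child) }
  node.freed = true
end

root  = ToyNode.new("root")
child = root.add_child(ToyNode.new("child"))
child.ruby_refs += 1   # pretend a Ruby wrapper still points at child

free_tree(root)        # the whole tree is released, child included

dangling = child.freed && child.ruby_refs > 0
puts "child freed while still referenced: #{dangling}"
```

In the real extension the equivalent of that last state is a Ruby
object holding a pointer to released memory, which is where the
crashes at garbage-collection time come from.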

> > As far as I can see, all trees should simply be walked prior to calling
> > XMLFree to check for nodes referenced from ruby. 
> 
> Yes, that is another approach.  It would still require mapping ruby 
> objects to libxml objects.  Unless you want to use object space and 
> iterate over Ruby objects that way.

I don't think it requires any more mapping than is in place now, with
each ruby object knowing the xmlNode it represents.  Beyond that the
reference counting should be enough.

Right now, each xmlNode knows how many ruby objects reference it
(assuming the reference counting works).  When a ruby object is freed,
it decrements that count.  If it was the last object referencing the
xmlNode, and the xmlNode is not attached to a parent node, then the
xmlNode is freed using xmlFreeNode.  The problem is that xmlFreeNode
frees child nodes recursively without checking whether they are
referenced from ruby.  That's why I think we need to walk the tree and
unlink such nodes before letting xmlFreeNode do its stuff.
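
A minimal sketch of that walk, again as a toy model in plain Ruby
rather than the actual C code.  ToyNode, detach_referenced, free_tree
and release_wrapper are hypothetical names; in the extension the
unlinking step would presumably be done with libxml2's xmlUnlinkNode,
but that is an assumption about how a fix might look, not a
description of the current source.

```ruby
# Toy stand-in for an xmlNode plus its count of Ruby wrappers.
class ToyNode
  attr_reader :name, :children
  attr_accessor :parent, :ruby_refs, :freed

  def initialize(name)
    @name = name
    @children = []
    @parent = nil
    @ruby_refs = 0
    @freed = false
  end

  def add_child(node)
    node.parent = self
    @children << node
    node
  end

  # Remove this node (and its subtree) from its parent.
  def unlink
    @parent.children.delete(self) if @parent
    @parent = nil
  end
end

# Walk the tree and unlink every subtree whose root is still referenced
# from Ruby.  An unlinked subtree becomes a standalone tree; it is
# released later, when its own last wrapper goes away.
def detach_referenced(node)
  node.children.dup.each do |child|
    if child.ruby_refs > 0
      child.unlink
    else
      detach_referenced(child)
    end
  end
end

# Recursive free; only safe once referenced nodes have been unlinked.
def free_tree(node)
  node.children.each { |child| free_tree(child) }
  node.freed = true
end

# Models what happens when a Ruby wrapper is garbage-collected.
def release_wrapper(node)
  node.ruby_refs -= 1
  return unless node.ruby_refs.zero? && node.parent.nil?
  detach_referenced(node)
  free_tree(node)
end

root = ToyNode.new("root")
root.ruby_refs = 1
mid  = root.add_child(ToyNode.new("mid"))   # no wrapper
leaf = mid.add_child(ToyNode.new("leaf"))
leaf.ruby_refs = 1                          # wrapper still alive

release_wrapper(root)   # frees root and mid, leaves leaf standalone
release_wrapper(leaf)   # last wrapper gone, leaf is freed now
```

The walk adds no mapping beyond what already exists: it only reads
the per-node reference count and the parent/child links.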

> It would be easier to avoid the whole issue, and not have the Ruby 
> objects own the libxml objects at all.  You would have an issue that a 
> programmer would have to make sure not to use a Ruby object pointing to 
> an invalid libxml object though.

I don't see how this would be possible.  My target documents are built
one node at a time from ruby objects.  I think ruby needs to own those
objects.

> > If it weren't so close to working already I'd probably be keen to see
> > such a proven approach used, even if it is tricky.  As it is, I'd prefer
> > to see reasonably simple bug-fixes applied.  Not my choice though - I'm
> > not one of the project developers so I'll butt out now :-)
> 
> Your insights are much appreciated, please keep contributing your 
> thoughts :)

Thanks.

__
Marc


_______________________________________________
libxml-devel mailing list
libxml-devel@rubyforge.org
http://rubyforge.org/mailman/listinfo/libxml-devel
