I am parsing roughly 120 KB of XML into a document and then running

  # Find child elements named +node+, binding the "dn" prefix to the
  # given namespace URI for this XPath query.
  def get_nodes(node, namespace)
    self.find("./dn:#{node}", "dn:#{namespace}")
  end

several times.

Memory usage for my test driver sits at around 20 MB if I run get_nodes fewer than 10 times. If I run it 1000 times, memory jumps from 20 MB to around 140 MB and does not come back down until the process exits. If I force a GC.start at the end of each loop iteration I can keep the memory usage down, but that is not practical in the real world, where I need this code to be at least somewhat fast.
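
For illustration, here is a minimal driver that reproduces the pattern; the file name, element name, and namespace URI are placeholders rather than my real ones:

  require 'xml/libxml'

  # Parse the ~120 KB payload exactly once, up front.
  doc  = XML::Document.file('payload.xml')
  root = doc.root

  1000.times do
    # Each find allocates a node set that lingers until the GC runs.
    nodes = root.find('./dn:item', 'dn:http://example.com/ns')
    nodes.each { |n| n.content }
    # GC.start   # uncommenting this keeps memory flat, but is too slow
  end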

I only build the document once during the entire run of the test program, so parsing the large string should not be the problem.

Any ideas as to why my memory usage grows and then never comes down?

If the memory usage caps off at a certain level but isn't continually growing (i.e., an actual leak), then this is a "problem" with the Ruby GC and not with libxml; libxml just leverages Ruby's GC for memory allocation and the like. See if there is an updated GC patch that you can apply. I don't have the URL handy, but this post references it:

http://antoniocangiano.com/2007/02/10/top-10-ruby-on-rails-performance-tips/

One could argue, however, that using GC.start is practical if done in tight loops. What exactly are you trying to do with your fragments? Maybe there's a more efficient way of getting the result you're interested in.
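
For instance, something along these lines bounds the growth while amortizing the collection cost; the setup mirrors the sketch above, and the every-50-iterations interval is an arbitrary value to tune, not a recommendation:

  require 'xml/libxml'

  doc  = XML::Document.file('payload.xml')   # placeholder input
  root = doc.root

  1000.times do |i|
    nodes = root.find('./dn:item', 'dn:http://example.com/ns')
    nodes.each { |n| n.content }
    # Collect periodically instead of on every pass; tune the interval
    # against your own throughput and memory ceiling.
    GC.start if (i % 50).zero?
  end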

-sc

--
Sean Chittenden
[EMAIL PROTECTED]



_______________________________________________
libxml-devel mailing list
libxml-devel@rubyforge.org
http://rubyforge.org/mailman/listinfo/libxml-devel
