At 13.53 12/12/2006 +0200, Motti Shneor wrote:
Wow!!! That puts my problem on another level.
First, as far as I know, recycled nodes are NOT reused for imported
nodes. In my code
[...]
There is just ONE GLOBAL DOMDocument per generated library which holds
all the created nodes and sub-trees. I was not, of course, involved in
the design and development of that code-generator, but I guess they
thought that in-document manipulations were lighter than inter-document
manipulations. So instead of keeping many documents, and maintaining
their cached grammar and schemas they simply maintain lots of sub-trees,
that are most of the time not connected to any parent --- the document
is just a container in which sub-trees float.
[...]
My alternative is to completely rewrite the code-generator to use
independent documents for each object, which is a lot of work.
Any ideas?
Unfortunately you will have to rewrite that code; having a single
DOMDocument will cause your memory to grow indefinitely (even if
DOMNodes can be reused, there is a lot of other memory that isn't
going to be, e.g. arrays used by attributes, strings for node names,
arrays returned by getElementsByTagName...).
Alberto
Thanks again -
Motti Shneor
-----Original Message-----
From: Alberto Massari [mailto:[EMAIL PROTECTED]
Sent: Tuesday, December 12, 2006 1:13 PM
To: [email protected]
Cc: Ziv Tsoref
Subject: RE: DOMDocument memory bloating problem
Hi Motti,
At 12.11 12/12/2006 +0200, Motti Shneor wrote:
>Hello Alberto, and thanks a lot for the enlightening answer. However, I
>need a few more clarifications.
>
>1. Debugging through the DOMNode->remove()->release() process, I have
>seen these strings being pushed into a "recycled" container. Why does
>the code bother to do that, if the "pages" as you call them are never
>actually cleaned?
Because they are recycled the next time a new node is created.
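
The recycling described above can be pictured with a small toy model. This
is only a sketch and not the Xerces-C++ implementation: the names ToyNode,
ToyNodePool and kNodesPerPage are hypothetical, and it models only node
slots, not the strings or arrays a real document also allocates. Released
nodes go onto a "recycled" list and are handed out again by the next
create(), while the pages themselves are freed only when the pool is
destroyed.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Toy model only: NOT the Xerces-C++ implementation.
constexpr std::size_t kNodesPerPage = 4;  // hypothetical page size

struct ToyNode { int payload = 0; };

class ToyNodePool {
public:
    // Hand out a node, preferring one that was released earlier.
    ToyNode* create() {
        if (!recycled_.empty()) {
            ToyNode* node = recycled_.back();
            recycled_.pop_back();
            return node;
        }
        // No recycled node available: carve one out of the newest page,
        // allocating a fresh page when the newest one is full.
        if (pages_.empty() || usedInLastPage_ == kNodesPerPage) {
            pages_.push_back(
                std::unique_ptr<ToyNode[]>(new ToyNode[kNodesPerPage]));
            usedInLastPage_ = 0;
        }
        return &pages_.back()[usedInLastPage_++];
    }

    // Releasing only remembers the slot for reuse; no page is ever freed.
    void release(ToyNode* node) {
        assert(node != nullptr);
        recycled_.push_back(node);
    }

    std::size_t pageCount() const { return pages_.size(); }

private:
    std::vector<std::unique_ptr<ToyNode[]>> pages_;  // freed with the pool
    std::vector<ToyNode*> recycled_;                 // released, reusable
    std::size_t usedInLastPage_ = 0;
};

// Demo: a released node comes back from the next create() call.
bool demoSamePointerReturned() {
    ToyNodePool pool;
    ToyNode* first = pool.create();
    pool.release(first);
    return pool.create() == first;
}
```

In this model the page count never shrinks, even after every node has been
released, which matches the symptom of seeing the same pointers again
while the overall memory footprint keeps its high-water mark.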
>2. I have noticed, too, that on some occasions xerces DOES reuse
>released nodes (I got the same pointers again and again when creating
>elements, attributes etc.) What is the rule here? I suspect that
>repeated doc->importNode() is the call that bloats my DOMDocument. But
>I have no proof...
Calling importNode is the way you can copy DOMNodes from one
DOMDocument to another (all the DOMNodes in a DOM tree must come from
the same memory pool owned by the DOMDocument at the root); it will
end up creating copies of the source nodes, recycling released nodes
if they are available.
>3. If DOMDocument does not keep track of the cleaned "pages" (Are they
>the "buckets" in the code?) can I add a cleanup function to
>DOMDocumentImpl.cpp/hpp to EXPLICITLY scan and release such "pages"?
>Can you hint at the implications? I don't need to do it very often, so
>such a function can be (for my purposes) inefficient, but I absolutely
>need to do this at times.
The implication is that you should track all the
allocations/deallocations made by DOMDocument from each page, and
that would slow down the entire program, not just the cleanup phase.
The alternative approach (scanning the entire tree to check where the
nodes are pointing to) is both inefficient and prone to errors, as
some pointers could be held by arrays or maps, many levels down.
So, you are left with two choices:
1) redesign your code to avoid allocating/deallocating many nodes
(why do you need to call importNode so many times? would a brand new
DOMDocument that is deleted at the end of the processing do the same
work?)
2) change the code of the DOMDocument memory manager to track all the
memory pieces (e.g. on Windows, you could use a private heap with
HeapCreate/HeapAlloc/HeapFree/HeapDestroy)
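
To give a feel for the bookkeeping that choice 2 implies, here is a
minimal, portable sketch of a page pool that counts live allocations per
page and frees a page once it is full and nothing in it is still in use.
It is an illustration under stated assumptions, not a patch for
DOMDocumentImpl: TrackedPage, TrackedPagePool, kSlotSize and kSlotsPerPage
are hypothetical names, all allocations are the same size, and freed slots
in a partially filled page are not reused.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>

// Hypothetical sizes, chosen only to keep the example small.
constexpr std::size_t kSlotSize = 32;
constexpr std::size_t kSlotsPerPage = 4;

struct TrackedPage {
    unsigned char slots[kSlotsPerPage][kSlotSize];
    std::size_t used = 0;  // slots handed out so far
    std::size_t live = 0;  // slots handed out and not yet deallocated
};

class TrackedPagePool {
public:
    void* allocate() {
        if (current_ == nullptr || current_->used == kSlotsPerPage) {
            std::unique_ptr<TrackedPage> page(new TrackedPage());
            current_ = page.get();
            pages_[&current_->slots[0][0]] = std::move(page);
        }
        ++current_->live;
        return current_->slots[current_->used++];
    }

    // Every deallocation must first find the owning page: this lookup is
    // the extra cost paid on each call, not just at cleanup time.
    void deallocate(void* ptr) {
        auto it = pages_.upper_bound(static_cast<const unsigned char*>(ptr));
        assert(it != pages_.begin());  // ptr must come from this pool
        --it;                          // last page starting at or before ptr
        TrackedPage* page = it->second.get();
        assert(page->live > 0);
        --page->live;
        // Free the page only when it is full and entirely unused.
        if (page->live == 0 && page->used == kSlotsPerPage) {
            if (page == current_) current_ = nullptr;
            pages_.erase(it);
        }
    }

    std::size_t pageCount() const { return pages_.size(); }

private:
    // Keyed by page start address; std::map orders pointers via std::less.
    std::map<const unsigned char*, std::unique_ptr<TrackedPage>> pages_;
    TrackedPage* current_ = nullptr;
};
```

The map lookup and the two counters run on every allocate/deallocate
call, which is the slowdown mentioned above; a private heap that is
destroyed as a whole avoids the per-call lookup at the price of
platform-specific code.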
Hope this helps,
Alberto
>Thanks a lot -
>Motti Shneor
>
>-----Original Message-----
>From: Alberto Massari [mailto:[EMAIL PROTECTED]
>Sent: Tuesday, December 12, 2006 10:15 AM
>To: [email protected]
>Subject: Re: DOMDocument memory bloating problem
>
>Hi Motti,
>unfortunately there is no such control on the memory allocated by
>DOMDocument: all of the nodes and strings used in the DOM tree come
>from a memory pool allocated by the DOMDocument, and they can be
>freed only by deleting the entire page (and DOMDocument doesn't keep
>statistics to check whether an entire page contains only released
>nodes). So the only way to release the memory is by releasing the
>entire DOMDocument.
>
>Sorry if this is not the answer you would have liked,
>Alberto
>
>At 09.36 12/12/2006 +0200, Motti Shneor wrote:
> >Hello everyone. Happy to join the list.
> >
> >I use a system that reuses the same xerces DOMDocument for a long
> >period, adding and releasing DOMNodes (elements, attributes etc.)
> >continuously.
> >
> >Although I DOMNode->remove()->release() every unneeded node, the
> >memory taken up by the DOMDocument seems to increase forever, to the
> >point where the program becomes unusable.
> >
> >In the docs, it is recommended that I release unused nodes, but they
> >are only guaranteed to be actually freed when the document is
> >released. This is not good enough in my situation.
> >
> >I see that the xerces memory manager's "deallocate()" is never called
> >on my nodes until I explicitly call myDoc->release() on the
> >DOMDocument.
> >
> >
> >I am seeking a way to instruct a DOMDocument to actually clear and
> >free its RELEASED nodes. Something like a partial
> >DOMDocument->release() that will only clean up its heap from released
> >stuff.
> >
> >Is it possible? Is there a simple way to do this? What are the costs?
> >
> >Any ideas?
> >
> >
> >Motti Shneor
> >Senior Software Engineer
> >Orbograph Ltd.
> >[EMAIL PROTECTED]
> >http://www.orbograph.com
> >
> >---------------------------------------------------------------------
> >To unsubscribe, e-mail: [EMAIL PROTECTED]
> >For additional commands, e-mail: [EMAIL PROTECTED]