On Thu, 2009-01-29 at 09:22 +0100, Niklas Nebel wrote:
> On 01/28/09 20:38, Kohei Yoshida wrote:
> >>> That may be fixed with
> >>> http://qa.openoffice.org/issues/show_bug.cgi?id=93998
> >>> in CWS koheidatapilot02 integrated to DEV300_m37, please check.
> >>>   
> >> m39 still tries to allocate >1GB of RAM.
> > 
> > Well, the fix in koheidatapilot02 deals with an entirely different issue
> > than i55266.  So, I'm not at all surprised.
> 
> That's right. "Opposite ends" of the calculation, in a way.
> 
> > Fixing i55266 (and probably i97886) needs a non-trivial refactoring
> > since the reason for such a large memory footprint may be a design issue.
> > Current data pilot's result computation code uses a recursive algorithm,
> > and each recursion instantiates a new set of objects.  Unfortunately,
> > when the number of unique field member values becomes large, which is
> > certainly the case for the above referenced issues, the size of
> > instantiated objects grows, and the memory usage spikes.  Note that this
> > algorithm is *not* sensitive to the total size of the source data, but it
> > is sensitive to a large number of unique field member values.
> > 
> > We could first try to minimize memory use without changing the algorithm
> > itself (by reusing some of the same tricks I used to reduce memory usage
> > in the cache table).  But if that turns out to be insufficient, then we
> > may have to re-write the algorithm itself, in which case the required
> > effort becomes pretty significant.
> 
> I don't think this really needs fundamental changes. The recursive 
> structure isn't wrong. To avoid the excessive memory usage, the 
> ScDPResultMember objects in a ScDPResultDimension have to be created 
> on-demand, instead of all in advance. This would require changes to the 
> inter-item calculations ("Displayed value") and sorting, so it's not a 
> trivial change, but it seems possible.

Ok.  Thanks for your input.

I'm not saying the recursive structure is wrong for this particular
case, but in general a recursive algorithm can be more memory-hungry
than an iterative one, *if* the algorithm can be designed either way.
Having said that, a recursive structure does lead to a simpler
implementation (which is always good), and sometimes recursion is the
only option.
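
To make the on-demand idea a bit more concrete, here is a rough sketch
of the difference between creating all members in advance and creating
them only when a source row needs them.  This is not the actual
ScDPResultMember / ScDPResultDimension code (which is C++); the class
names ResultMember and ResultDimension, the build() helper and the data
layout are all made up for illustration, and Python is used only for
brevity.

```python
class ResultMember:
    """Hypothetical stand-in for one result member (not the real class)."""

    created = 0  # instance counter, used only to compare the two strategies

    def __init__(self, field_members, depth, eager):
        ResultMember.created += 1
        self.total = 0
        is_last_field = depth + 1 == len(field_members)
        # Each member owns a nested dimension for the next field (the recursion).
        self.child = None if is_last_field else ResultDimension(
            field_members, depth + 1, eager)

    def add_row(self, row, value):
        self.total += value
        if self.child is not None:
            self.child.add_row(row, value)


class ResultDimension:
    """Hypothetical stand-in for one result dimension (not the real class)."""

    def __init__(self, field_members, depth=0, eager=True):
        self.field_members = field_members  # unique member values per field
        self.depth = depth
        self.eager = eager
        self.members = {}
        if eager:
            # Eager: instantiate every possible member up front.
            for name in field_members[depth]:
                self._create(name)

    def _create(self, name):
        member = ResultMember(self.field_members, self.depth, self.eager)
        self.members[name] = member
        return member

    def add_row(self, row, value):
        name = row[self.depth]
        member = self.members.get(name)
        if member is None:
            # On-demand: create the member only when a row refers to it.
            member = self._create(name)
        member.add_row(row, value)


def build(field_members, rows, eager):
    """Build the result tree and report how many members were created."""
    ResultMember.created = 0
    root = ResultDimension(field_members, 0, eager)
    for row, value in rows:
        root.add_row(row, value)
    return root, ResultMember.created


if __name__ == "__main__":
    fields = [list(range(20))] * 3              # 3 fields, 20 unique values each
    rows = [((i, i, i), 1) for i in range(20)]  # only 20 combinations occur
    print(build(fields, rows, eager=True)[1])   # 8420
    print(build(fields, rows, eager=False)[1])  # 60
```

With three fields of 20 unique values each, the eager variant creates
20 + 400 + 8000 = 8420 members, while the on-demand variant creates only
the 60 that the 20 source rows actually touch.  The sketch deliberately
ignores the inter-item calculations ("Displayed value") and sorting,
which are exactly the parts that would need changes in the real code.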

> Speaking of issue 93998, the case of pressing Ctrl-A and starting the 
> data pilot, which took seconds in 2.4, doesn't crash anymore, but the 
> time (and memory) it takes can still be considered a regression.

Yes, there are still ways to squeeze more memory savings out of the
cache table, especially for the Ctrl-A case.  So, not all hope is lost
there.

Kohei


