On Thu, 2009-01-29 at 09:22 +0100, Niklas Nebel wrote:
> On 01/28/09 20:38, Kohei Yoshida wrote:
> >>> That may be fixed with
> >>> http://qa.openoffice.org/issues/show_bug.cgi?id=93998
> >>> in CWS koheidatapilot02 integrated to DEV300_m37, please check.
> >>>
> >> m39 still tries to allocate >1GB of RAM.
> >
> > Well, the fix in koheidatapilot02 deals with an entirely different
> > issue than i55266.  So, I'm not at all surprised.
>
> That's right. "Opposite ends" of the calculation, in a way.
>
> > Fixing i55266 (and probably i97886) needs a non-trivial refactoring
> > since the reason for such a large memory footprint may be a design
> > issue.  The current data pilot's result computation code uses a
> > recursive algorithm, and each recursion instantiates a new set of
> > objects.  Unfortunately, when the number of unique field member values
> > becomes large, which is certainly the case for the above referenced
> > issues, the size of instantiated objects grows, and the memory usage
> > spikes.  Note that this algorithm is *not* weak against the total size
> > of the source data, but is weak against a large number of unique
> > field member values.
> >
> > We could first try to minimize memory use without changing the
> > algorithm itself (by reusing some of the same tricks I used to reduce
> > memory usage in the cache table).  But if that turns out to be
> > insufficient, then we may have to re-write the algorithm itself, in
> > which case the required effort becomes pretty significant.
>
> I don't think this really needs fundamental changes. The recursive
> structure isn't wrong. To avoid the excessive memory usage, the
> ScDPResultMember objects in a ScDPResultDimension have to be created
> on-demand, instead of all in advance. This would require changes to the
> inter-item calculations ("Displayed value") and sorting, so it's not a
> trivial change, but it seems possible.
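To make the "on-demand instead of all in advance" idea above concrete, here is a minimal Python sketch. It is not the actual Calc code: `ResultMember`, `ResultDimension`, `populate_in_advance`, `add_row` and `build` are hypothetical stand-ins, and it assumes that the "in advance" variant creates one member per unique value at every nesting level, which may not match what the real ScDPResultDimension does in detail.

```python
class ResultMember:
    """Hypothetical stand-in for one result member (one unique field value)."""
    created = 0  # counts every instance, to compare the two strategies

    def __init__(self, name, child_fields):
        ResultMember.created += 1
        self.name = name
        self.total = 0
        # Each member owns a nested dimension for the remaining fields.
        self.child = ResultDimension(child_fields) if child_fields else None


class ResultDimension:
    """Hypothetical stand-in for one result dimension (one field level)."""

    def __init__(self, fields):
        self.field = fields[0]
        self.rest = fields[1:]
        self.members = {}

    def populate_in_advance(self, unique_values):
        # Assumed eager strategy: a member for every unique value, recursively.
        for value in unique_values[self.field]:
            member = ResultMember(value, self.rest)
            self.members[value] = member
            if member.child is not None:
                member.child.populate_in_advance(unique_values)

    def add_row(self, row, amount):
        # On-demand strategy: create a member only when a row needs it.
        value = row[self.field]
        member = self.members.get(value)
        if member is None:
            member = ResultMember(value, self.rest)
            self.members[value] = member
        member.total += amount
        if member.child is not None:
            member.child.add_row(row, amount)


def build(rows, fields, in_advance):
    """Build the result tree and return it with the number of members created."""
    ResultMember.created = 0
    root = ResultDimension(fields)
    if in_advance:
        unique_values = {f: sorted({row[f] for row in rows}) for f in fields}
        root.populate_in_advance(unique_values)
    for row in rows:
        root.add_row(row, row["amount"])
    return root, ResultMember.created


# 100 rows, two fields, every value unique: many unique members, little data.
rows = [{"customer": f"c{i}", "order": f"o{i}", "amount": 1} for i in range(100)]
fields = ["customer", "order"]
eager_root, eager_count = build(rows, fields, in_advance=True)
lazy_root, lazy_count = build(rows, fields, in_advance=False)
print(eager_count, lazy_count)  # prints "10100 200"
```

With only 100 source rows the eager variant already creates 10100 members against 200 for the on-demand one, and both produce the same totals. The sketch leaves out the part described above as non-trivial: inter-item calculations ("Displayed value") and sorting may need to see members that were never created.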

Ok.  Thanks for your input.  I'm not saying the recursive structure is
wrong for this particular case, but a recursive algorithm can in general
be more memory-hungry than an iterative one, *if* the algorithm can be
designed either way.  Having said that, a recursive structure does lead
to a simpler implementation (which is always good), and sometimes
recursion is the only option.

> Speaking of issue 93998, the case of pressing Ctrl-A and starting the
> data pilot, which took seconds in 2.4, doesn't crash anymore, but the
> time (and memory) it takes can still be considered a regression.

Yes, there are still ways to squeeze more memory out of the cache
table, especially for the Ctrl-A case.  So, not all hope is lost there.

Kohei
