Re: [sc-dev] [almost] Crasher is alive for 3 years - issue 55266
On 01/28/09 20:38, Kohei Yoshida wrote:

>>> That may be fixed with http://qa.openoffice.org/issues/show_bug.cgi?id=93998 in CWS koheidatapilot02 integrated to DEV300_m37, please check.
>>
>> m39 still tries to allocate 1GB of RAM.
>
> Well, the fix in koheidatapilot02 deals with an entirely different issue than i55266. So, I'm not at all surprised.

That's right. Opposite ends of the calculation, in a way.

> Fixing i55266 (and probably i97886) needs a non-trivial refactoring since the reason for such a large memory footprint may be a design issue. The current data pilot's result computation code uses a recursive algorithm, and each recursion instantiates a new set of objects. Unfortunately, when the number of unique field member values becomes large, which is certainly the case for the above referenced issues, the size of instantiated objects grows, and the memory usage spikes. Note that this algorithm is *not* weak against the total size of the source data, but is weak against a large number of unique field member values.
>
> We could first try to minimize memory use without changing the algorithm itself (by reusing some of the same tricks I used to reduce memory usage in the cache table). But if that turns out to be insufficient, then we may have to re-write the algorithm itself, in which case the required effort becomes pretty significant.

I don't think this really needs fundamental changes. The recursive structure isn't wrong. To avoid the excessive memory usage, the ScDPResultMember objects in a ScDPResultDimension have to be created on-demand, instead of all in advance. This would require changes to the inter-item calculations (Displayed value) and sorting, so it's not a trivial change, but it seems possible.

Speaking of issue 93998, the case of pressing Ctrl-A and starting the data pilot, which took seconds in 2.4, doesn't crash anymore, but the time (and memory) it takes can still be considered a regression.

Comparing the problems and benefits of the cache table so far, a rewrite of the results calculation doesn't seem like something we should do.

Niklas

- To unsubscribe, e-mail: dev-unsubscr...@sc.openoffice.org
For additional commands, e-mail: dev-h...@sc.openoffice.org
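The on-demand idea from this message can be illustrated with a toy model. The sketch below is Python rather than Calc's C++, and `LazyMember` and `aggregate` are hypothetical names, not the real ScDPResultMember or ScDPResultDimension interfaces. It assumes each source row carries one key per field plus a numeric value, and it creates a member only when a row actually refers to it.

```python
from typing import Dict, Iterable, Tuple


class LazyMember:
    """Hypothetical result member that creates its children on first use."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.total = 0.0
        self.children: Dict[str, "LazyMember"] = {}

    def child(self, name: str) -> "LazyMember":
        # Create the child only when a source row actually refers to it.
        if name not in self.children:
            self.children[name] = LazyMember(name)
        return self.children[name]

    def count(self) -> int:
        # Number of member objects in this subtree, including this one.
        return 1 + sum(c.count() for c in self.children.values())


def aggregate(rows: Iterable[Tuple[Tuple[str, ...], float]]) -> LazyMember:
    """Sum row values into a member tree that is built row by row."""
    root = LazyMember("<root>")
    for keys, value in rows:
        node = root
        node.total += value
        for key in keys:
            node = node.child(key)
            node.total += value
    return root
```

In this model the number of member objects is bounded by the key combinations that occur in the data, not by the product of all unique values. It does not address the inter-item calculations and sorting mentioned above, which may need to know about members that have not been created yet.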
Re: [sc-dev] [almost] Crasher is alive for 3 years - issue 55266
On Thu, 2009-01-29 at 09:22 +0100, Niklas Nebel wrote:

> On 01/28/09 20:38, Kohei Yoshida wrote:
>
>>>> That may be fixed with http://qa.openoffice.org/issues/show_bug.cgi?id=93998 in CWS koheidatapilot02 integrated to DEV300_m37, please check.
>>>
>>> m39 still tries to allocate 1GB of RAM.
>>
>> Well, the fix in koheidatapilot02 deals with an entirely different issue than i55266. So, I'm not at all surprised.
>
> That's right. Opposite ends of the calculation, in a way.
>
>> Fixing i55266 (and probably i97886) needs a non-trivial refactoring since the reason for such a large memory footprint may be a design issue. The current data pilot's result computation code uses a recursive algorithm, and each recursion instantiates a new set of objects. Unfortunately, when the number of unique field member values becomes large, which is certainly the case for the above referenced issues, the size of instantiated objects grows, and the memory usage spikes. Note that this algorithm is *not* weak against the total size of the source data, but is weak against a large number of unique field member values.
>>
>> We could first try to minimize memory use without changing the algorithm itself (by reusing some of the same tricks I used to reduce memory usage in the cache table). But if that turns out to be insufficient, then we may have to re-write the algorithm itself, in which case the required effort becomes pretty significant.
>
> I don't think this really needs fundamental changes. The recursive structure isn't wrong. To avoid the excessive memory usage, the ScDPResultMember objects in a ScDPResultDimension have to be created on-demand, instead of all in advance. This would require changes to the inter-item calculations (Displayed value) and sorting, so it's not a trivial change, but it seems possible.

Ok. Thanks for your input.

I'm not saying the recursive structure is wrong for this particular case, but a recursive algorithm can in general be more memory-hungry than an iterative one, *if* the algorithm can be designed either way. Having said that, using a recursive structure does lead to a simpler implementation (which is always good), and sometimes recursion is the only option.

> Speaking of issue 93998, the case of pressing Ctrl-A and starting the data pilot, which took seconds in 2.4, doesn't crash anymore, but the time (and memory) it takes can still be considered a regression.

Yes, there are still ways to squeeze more memory savings out of the cache table, especially for the Ctrl-A case. So, not all hope is lost there.

Kohei
Re: [sc-dev] [almost] Crasher is alive for 3 years - issue 55266
Hi Kirill,

On Tuesday, 2009-01-27 22:13:50 +0300, Kirill Palagin wrote:

> Please see http://www.openoffice.org/issues/show_bug.cgi?id=55266 - Calc allocates 1.1GB of RAM for 81kb file with DataPilot.

That may be fixed with http://qa.openoffice.org/issues/show_bug.cgi?id=93998 in CWS koheidatapilot02 integrated to DEV300_m37, please check.

> Strictly speaking this is not a crash, but in order to complete the task the user needs to have at least 1.5GB of RAM, otherwise the machine (not just Office) will appear unresponsive, forcing the user to power it off and lose data in all applications.

Having to power off the machine under such circumstances I'd consider an issue of the operating system though..

Eike

--
OOo/SO Calc core developer. Number formatter stricken i18n transpositionizer.
SunSign 0x87F8D412 : 2F58 5236 DB02 F335 8304 7D6C 65C9 F9B5 87F8 D412
OpenOffice.org Engineering at Sun: http://blogs.sun.com/GullFOSS
Please don't send personal mail to the e...@sun.com account, which I use for mailing lists only and don't read from outside Sun. Use er...@sun.com Thanks.
Re: [sc-dev] [almost] Crasher is alive for 3 years - issue 55266
Hi Eike.

Eike Rathke wrote:

> Hi Kirill,
>
> On Tuesday, 2009-01-27 22:13:50 +0300, Kirill Palagin wrote:
>
>> Please see http://www.openoffice.org/issues/show_bug.cgi?id=55266 - Calc allocates 1.1GB of RAM for 81kb file with DataPilot.
>
> That may be fixed with http://qa.openoffice.org/issues/show_bug.cgi?id=93998 in CWS koheidatapilot02 integrated to DEV300_m37, please check.

m39 still tries to allocate 1GB of RAM.

Thanks for responding.

Regards,
Kirill.
Re: [sc-dev] [almost] Crasher is alive for 3 years - issue 55266
On Wed, 2009-01-28 at 20:52 +0300, Kirill Palagin wrote:

> Hi Eike.
>
> Eike Rathke wrote:
>
>> Hi Kirill,
>>
>> On Tuesday, 2009-01-27 22:13:50 +0300, Kirill Palagin wrote:
>>
>>> Please see http://www.openoffice.org/issues/show_bug.cgi?id=55266 - Calc allocates 1.1GB of RAM for 81kb file with DataPilot.
>>
>> That may be fixed with http://qa.openoffice.org/issues/show_bug.cgi?id=93998 in CWS koheidatapilot02 integrated to DEV300_m37, please check.
>
> m39 still tries to allocate 1GB of RAM.

Well, the fix in koheidatapilot02 deals with an entirely different issue than i55266. So, I'm not at all surprised.

Fixing i55266 (and probably i97886) needs a non-trivial refactoring since the reason for such a large memory footprint may be a design issue. The current data pilot's result computation code uses a recursive algorithm, and each recursion instantiates a new set of objects. Unfortunately, when the number of unique field member values becomes large, which is certainly the case for the above referenced issues, the size of instantiated objects grows, and the memory usage spikes. Note that this algorithm is *not* weak against the total size of the source data, but is weak against a large number of unique field member values.

We could first try to minimize memory use without changing the algorithm itself (by reusing some of the same tricks I used to reduce memory usage in the cache table). But if that turns out to be insufficient, then we may have to re-write the algorithm itself, in which case the required effort becomes pretty significant.

Anyway, this is my initial thought on this issue. Looking deeper may reveal a more detailed view of the problem.

Kohei
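The scaling behaviour described in this message can be shown with a small toy model. It is a sketch in Python, not the actual Calc C++ code: `ResultMember`, `build_eagerly`, and `count_members` are hypothetical names, and the model assumes that each recursion level creates one member object for every unique value of its field, up front.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ResultMember:
    """Hypothetical stand-in for one result member of a field."""
    name: str
    children: Dict[str, "ResultMember"] = field(default_factory=dict)


def build_eagerly(unique_values_per_field: List[List[str]],
                  depth: int = 0) -> Dict[str, ResultMember]:
    """Create a member for every unique value of every field, recursively."""
    if depth == len(unique_values_per_field):
        return {}
    return {
        value: ResultMember(value,
                            build_eagerly(unique_values_per_field, depth + 1))
        for value in unique_values_per_field[depth]
    }


def count_members(members: Dict[str, ResultMember]) -> int:
    """Count every member object in the tree."""
    return sum(1 + count_members(m.children) for m in members.values())
```

Under these assumptions the object count depends only on how many unique values each field has, not on how many source rows there are: two fields with 100 and 50 unique values give 100 + 100 * 50 = 5100 members whether the source has 50 rows or 50,000.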
Re: [sc-dev] [almost] Crasher is alive for 3 years - issue 55266
Kohei Yoshida wrote:

> Fixing i55266 (and probably i97886) needs a non-trivial refactoring since the reason for such a large memory footprint may be a design issue. The current data pilot's result computation code uses a recursive algorithm, and each recursion instantiates a new set of objects. Unfortunately, when the number of unique field member values becomes large, which is certainly the case for the above referenced issues, the size of instantiated objects grows, and the memory usage spikes. Note that this algorithm is *not* weak against the total size of the source data, but is weak against a large number of unique field member values.
>
> We could first try to minimize memory use without changing the algorithm itself (by reusing some of the same tricks I used to reduce memory usage in the cache table). But if that turns out to be insufficient, then we may have to re-write the algorithm itself, in which case the required effort becomes pretty significant.
>
> Anyway, this is my initial thought on this issue. Looking deeper may reveal a more detailed view of the problem.

Kohei, thank you very much for your analysis and explanation! Now I feel a whole lot better knowing why we crash and why the issue has not been fixed for this long.

With best regards,
Kirill Palagin.