Re: [sc-dev] [almost] Crasher is alive for 3 years - issue 55266

2009-01-29 Thread Niklas Nebel

On 01/28/09 20:38, Kohei Yoshida wrote:

That may be fixed with
http://qa.openoffice.org/issues/show_bug.cgi?id=93998
in CWS koheidatapilot02 integrated to DEV300_m37, please check.
  

m39 still tries to allocate 1GB of RAM.


Well, the fix in koheidatapilot02 deals with an entirely different issue
than i55266.  So, I'm not at all surprised.


That's right. Opposite ends of the calculation, in a way.


Fixing i55266 (and probably i97886) needs a non-trivial refactoring,
since the large memory footprint may stem from a design issue.
The data pilot's current result computation code uses a recursive algorithm,
and each recursion instantiates a new set of objects.  Unfortunately,
when the number of unique field member values becomes large, which is
certainly the case for the issues referenced above, the total size of the
instantiated objects grows, and memory usage spikes.  Note that this
algorithm is *not* sensitive to the total size of the source data, but
to a large number of unique field member values.

We could first try to minimize memory use without changing the algorithm
itself (by reusing some of the same tricks I used to reduce memory usage
in the cache table).  But if that turns out to be insufficient, then we
may have to re-write the algorithm itself, in which case the required
effort becomes pretty significant.


I don't think this really needs fundamental changes. The recursive 
structure isn't wrong. To avoid the excessive memory usage, the 
ScDPResultMember objects in a ScDPResultDimension have to be created 
on-demand, instead of all in advance. This would require changes to the 
inter-item calculations (Displayed value) and sorting, so it's not a 
trivial change, but it seems possible.
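
As a rough illustration of the on-demand idea, here is a minimal Python sketch. It is not the Calc implementation: the class names LazyResultDimension and LazyResultMember, and the add_row and count_members methods, are hypothetical stand-ins for ScDPResultDimension and ScDPResultMember. Members are created only when a source row actually refers to them, so combinations that never occur in the data cost nothing.

```python
class LazyResultMember:
    """Hypothetical stand-in for ScDPResultMember: holds one running total."""

    def __init__(self, remaining_depth):
        self.total = 0.0
        # A child dimension exists only if more fields lie below this one.
        self.child = (
            LazyResultDimension(remaining_depth) if remaining_depth > 0 else None
        )

    def add_row(self, keys, value):
        self.total += value
        if self.child is not None:
            self.child.add_row(keys, value)

    def child_count(self):
        return self.child.count_members() if self.child is not None else 0


class LazyResultDimension:
    """Hypothetical stand-in for ScDPResultDimension with on-demand members."""

    def __init__(self, depth):
        self.depth = depth    # number of fields at and below this level
        self.members = {}     # filled on demand, keyed by member value

    def add_row(self, keys, value):
        member = self.members.get(keys[0])
        if member is None:
            # Create the member only when a row actually needs it.
            member = LazyResultMember(self.depth - 1)
            self.members[keys[0]] = member
        member.add_row(keys[1:], value)

    def count_members(self):
        return sum(1 + m.child_count() for m in self.members.values())


# Three source rows over two fields; only combinations that occur are created.
rows = [("a1", "b1", 10.0), ("a1", "b2", 5.0), ("a2", "b1", 7.0)]
dim = LazyResultDimension(2)
for a, b, v in rows:
    dim.add_row((a, b), v)
print(dim.count_members())  # 5 members: a1, a2, a1/b1, a1/b2, a2/b1
```

The sketch leaves out exactly the parts named above as non-trivial: inter-item calculations and sorting would still need a way to treat members that were never created.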


Speaking of issue 93998, the case of pressing Ctrl-A and starting the 
data pilot, which took seconds in 2.4, doesn't crash anymore, but the 
time (and memory) it takes can still be considered a regression.


Comparing the problems and benefits of the cache table so far, a rewrite 
of the results calculation doesn't seem like something we should do.


Niklas

-
To unsubscribe, e-mail: dev-unsubscr...@sc.openoffice.org
For additional commands, e-mail: dev-h...@sc.openoffice.org



Re: [sc-dev] [almost] Crasher is alive for 3 years - issue 55266

2009-01-29 Thread Kohei Yoshida
On Thu, 2009-01-29 at 09:22 +0100, Niklas Nebel wrote:
 On 01/28/09 20:38, Kohei Yoshida wrote:
  That may be fixed with
  http://qa.openoffice.org/issues/show_bug.cgi?id=93998
  in CWS koheidatapilot02 integrated to DEV300_m37, please check.

  m39 still tries to allocate 1GB of RAM.
  
  Well, the fix in koheidatapilot02 deals with an entirely different issue
  than i55266.  So, I'm not at all surprised.
 
 That's right. Opposite ends of the calculation, in a way.
 
  Fixing i55266 (and probably i97886) needs a non-trivial refactoring,
  since the large memory footprint may stem from a design issue.
  The data pilot's current result computation code uses a recursive algorithm,
  and each recursion instantiates a new set of objects.  Unfortunately,
  when the number of unique field member values becomes large, which is
  certainly the case for the issues referenced above, the total size of the
  instantiated objects grows, and memory usage spikes.  Note that this
  algorithm is *not* sensitive to the total size of the source data, but
  to a large number of unique field member values.
  
  We could first try to minimize memory use without changing the algorithm
  itself (by reusing some of the same tricks I used to reduce memory usage
  in the cache table).  But if that turns out to be insufficient, then we
  may have to re-write the algorithm itself, in which case the required
  effort becomes pretty significant.
 
 I don't think this really needs fundamental changes. The recursive 
 structure isn't wrong. To avoid the excessive memory usage, the 
 ScDPResultMember objects in a ScDPResultDimension have to be created 
 on-demand, instead of all in advance. This would require changes to the 
 inter-item calculations (Displayed value) and sorting, so it's not a 
 trivial change, but it seems possible.

Ok.  Thanks for your input.

I'm not saying the recursive structure is wrong for this particular
case, but a recursive algorithm in general can be more memory-hungry
than an iterative algorithm, *if* the algorithm can be designed
either way.  Having said that, using a recursive structure
does lead to a simpler implementation (which is always good), and
sometimes recursion is the only option.

 Speaking of issue 93998, the case of pressing Ctrl-A and starting the 
 data pilot, which took seconds in 2.4, doesn't crash anymore, but the 
 time (and memory) it takes can still be considered a regression.

Yes, there are still ways to squeeze out memory use further from the
cache table, especially for the Ctrl-A case.  So, not all hope is lost
there.

Kohei






Re: [sc-dev] [almost] Crasher is alive for 3 years - issue 55266

2009-01-28 Thread Eike Rathke
Hi Kirill,

On Tuesday, 2009-01-27 22:13:50 +0300, Kirill Palagin wrote:

 Please see http://www.openoffice.org/issues/show_bug.cgi?id=55266 - Calc  
 allocates 1.1GB of RAM for 81kb file with DataPilot..

That may be fixed with
http://qa.openoffice.org/issues/show_bug.cgi?id=93998
in CWS koheidatapilot02 integrated to DEV300_m37, please check.

 Strictly speaking  
 this is not a crash, but in order to complete the task the user needs to  
 have at least 1.5GB of RAM, otherwise the machine (not just Office) will  
 appear unresponsive, forcing the user to power it off and lose data in all  
 applications.

Having to power off the machine under such circumstances I'd consider an
issue of the operating system though..

  Eike

-- 
 OOo/SO Calc core developer. Number formatter stricken i18n transpositionizer.
 SunSign   0x87F8D412 : 2F58 5236 DB02 F335 8304  7D6C 65C9 F9B5 87F8 D412
 OpenOffice.org Engineering at Sun: http://blogs.sun.com/GullFOSS
 Please don't send personal mail to the e...@sun.com account, which I use for
 mailing lists only and don't read from outside Sun. Use er...@sun.com Thanks.




Re: [sc-dev] [almost] Crasher is alive for 3 years - issue 55266

2009-01-28 Thread Kirill Palagin

Hi Eike.

Eike Rathke wrote:

Hi Kirill,

On Tuesday, 2009-01-27 22:13:50 +0300, Kirill Palagin wrote:

  
Please see http://www.openoffice.org/issues/show_bug.cgi?id=55266 - Calc  
allocates 1.1GB of RAM for 81kb file with DataPilot..



That may be fixed with
http://qa.openoffice.org/issues/show_bug.cgi?id=93998
in CWS koheidatapilot02 integrated to DEV300_m37, please check.
  

m39 still tries to allocate 1GB of RAM.


Thanks for responding.
Regards,

Kirill.




Re: [sc-dev] [almost] Crasher is alive for 3 years - issue 55266

2009-01-28 Thread Kohei Yoshida
On Wed, 2009-01-28 at 20:52 +0300, Kirill Palagin wrote:
 Hi Eike.
 
 Eike Rathke wrote:
  Hi Kirill,
 
  On Tuesday, 2009-01-27 22:13:50 +0300, Kirill Palagin wrote:
 

  Please see http://www.openoffice.org/issues/show_bug.cgi?id=55266 - Calc  
  allocates 1.1GB of RAM for 81kb file with DataPilot..
  
 
  That may be fixed with
  http://qa.openoffice.org/issues/show_bug.cgi?id=93998
  in CWS koheidatapilot02 integrated to DEV300_m37, please check.

 m39 still tries to allocate 1GB of RAM.

Well, the fix in koheidatapilot02 deals with an entirely different issue
than i55266.  So, I'm not at all surprised.

Fixing i55266 (and probably i97886) needs a non-trivial refactoring,
since the large memory footprint may stem from a design issue.
The data pilot's current result computation code uses a recursive algorithm,
and each recursion instantiates a new set of objects.  Unfortunately,
when the number of unique field member values becomes large, which is
certainly the case for the issues referenced above, the total size of the
instantiated objects grows, and memory usage spikes.  Note that this
algorithm is *not* sensitive to the total size of the source data, but
to a large number of unique field member values.
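
To make the scaling concrete, here is a toy Python model. It is an assumption for illustration only, not the actual data pilot code: ResultMember, build_eager and count_members are hypothetical names. Every field level creates one object per unique member value and recurses into the next field, so the object count follows the product of the unique-value counts and never looks at the number of source rows.

```python
from dataclasses import dataclass, field


@dataclass
class ResultMember:
    """Hypothetical stand-in for one member node in the result tree."""
    name: str
    children: dict = field(default_factory=dict)


def build_eager(unique_values_per_field, level=0):
    """Create a member for every unique value of every field, all up front."""
    if level == len(unique_values_per_field):
        return {}
    return {
        value: ResultMember(value, build_eager(unique_values_per_field, level + 1))
        for value in unique_values_per_field[level]
    }


def count_members(members):
    """Count all member objects in the tree, including nested ones."""
    return sum(1 + count_members(m.children) for m in members.values())


# Two fields with 100 unique values each. No source rows are involved at all:
# the tree size depends only on the unique member values.
fields = [[f"a{i}" for i in range(100)], [f"b{i}" for i in range(100)]]
tree = build_eager(fields)
print(count_members(tree))  # 10100 = 100 + 100 * 100
```

In this model, adding a third field with 100 unique values would push the count past one million objects, which is the kind of growth described above.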

We could first try to minimize memory use without changing the algorithm
itself (by reusing some of the same tricks I used to reduce memory usage
in the cache table).  But if that turns out to be insufficient, then we
may have to re-write the algorithm itself, in which case the required
effort becomes pretty significant.

Anyway, this is my initial thought on this issue.  Looking deeper may
reveal a more detailed view of the problem.

Kohei





Re: [sc-dev] [almost] Crasher is alive for 3 years - issue 55266

2009-01-28 Thread Kirill Palagin

Kohei Yoshida wrote:

Fixing i55266 (and probably i97886) needs a non-trivial refactoring,
since the large memory footprint may stem from a design issue.
The data pilot's current result computation code uses a recursive algorithm,
and each recursion instantiates a new set of objects.  Unfortunately,
when the number of unique field member values becomes large, which is
certainly the case for the issues referenced above, the total size of the
instantiated objects grows, and memory usage spikes.  Note that this
algorithm is *not* sensitive to the total size of the source data, but
to a large number of unique field member values.

We could first try to minimize memory use without changing the algorithm
itself (by reusing some of the same tricks I used to reduce memory usage
in the cache table).  But if that turns out to be insufficient, then we
may have to re-write the algorithm itself, in which case the required
effort becomes pretty significant.

Anyway, this is my initial thought on this issue.  Looking deeper may
reveal a more detailed view of the problem.

  

Kohei,
thank you very much for your analysis and explanation!
Now I feel a whole lot better knowing why we crash
and why the issue has not been fixed for this long.


With best regards,
Kirill Palagin.
