Re: [sc-dev] [almost] Crasher is alive for 3 years - issue 55266

2009-01-29 Thread Niklas Nebel

On 01/28/09 20:38, Kohei Yoshida wrote:

That may be fixed with
http://qa.openoffice.org/issues/show_bug.cgi?id=93998
in CWS koheidatapilot02 integrated to DEV300_m37, please check.
  

m39 still tries to allocate 1GB of RAM.


Well, the fix in koheidatapilot02 deals with an entirely different issue
than i55266.  So, I'm not at all surprised.


That's right. Opposite ends of the calculation, in a way.


Fixing i55266 (and probably i97886) needs a non-trivial refactoring,
since the reason for such a large memory footprint may be a design issue.
The current data pilot's result computation code uses a recursive algorithm,
and each recursion instantiates a new set of objects.  Unfortunately,
when the number of unique field member values becomes large, which is
certainly the case for the above referenced issues, the size of
instantiated objects grows, and the memory usage spikes.  Note that this
algorithm is *not* weak against the total size of the source data, but is
weak against a large number of unique field member values.

We could first try to minimize memory use without changing the algorithm
itself (by reusing some of the same tricks I used to reduce memory usage
in the cache table).  But if that turns out to be insufficient, then we
may have to re-write the algorithm itself, in which case the required
effort becomes pretty significant.


I don't think this really needs fundamental changes. The recursive 
structure isn't wrong. To avoid the excessive memory usage, the 
ScDPResultMember objects in a ScDPResultDimension have to be created 
on-demand, instead of all in advance. This would require changes to the 
inter-item calculations (Displayed value) and sorting, so it's not a 
trivial change, but it seems possible.
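
To make the on-demand idea concrete, here is a minimal sketch. It does not use the real ScDPResultMember / ScDPResultDimension classes; `Node`, `FieldMembers`, `buildAll`, `addRow` and `countNodes` are hypothetical names invented for illustration, and each `Node` stands in for one result member together with the members of the next field below it. The eager variant creates every combination of unique member values in advance, while the lazy variant only creates the nodes that a source row actually touches. The sketch ignores the inter-item calculations and sorting mentioned above, which are the parts that would need real work.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a result member plus its child dimension.
struct Node {
    double sum = 0.0;                                      // aggregated value
    std::map<std::string, std::unique_ptr<Node>> children; // members of the next field
};

// fields[i] holds the unique member names of the i-th field (assumed layout).
using FieldMembers = std::vector<std::vector<std::string>>;

// Eager: create a node for every combination of member names in advance.
// The node count grows with the product of the unique member counts.
void buildAll(Node& node, const FieldMembers& fields, std::size_t level) {
    if (level == fields.size())
        return;
    for (const std::string& name : fields[level]) {
        std::unique_ptr<Node>& child = node.children[name];
        child = std::make_unique<Node>();
        buildAll(*child, fields, level + 1);
    }
}

// On-demand: create only the nodes on the path of one source row,
// adding the row's value to every node along that path.
void addRow(Node& root, const std::vector<std::string>& row, double value) {
    Node* cur = &root;
    cur->sum += value;
    for (const std::string& name : row) {
        std::unique_ptr<Node>& child = cur->children[name];
        if (!child)
            child = std::make_unique<Node>();
        cur = child.get();
        cur->sum += value;
    }
}

// Count all nodes in the tree, including the root.
std::size_t countNodes(const Node& node) {
    std::size_t n = 1;
    for (const auto& entry : node.children)
        n += countNodes(*entry.second);
    return n;
}
```

With three fields of ten unique members each, `buildAll` creates 1 + 10 + 100 + 1000 nodes regardless of the data, whereas `addRow` creates at most one node per field for each row, so combinations that never occur in the source cost nothing.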


Speaking of issue 93998, the case of pressing Ctrl-A and starting the 
data pilot, which took seconds in 2.4, doesn't crash anymore, but the 
time (and memory) it takes can still be considered a regression.


Comparing the problems and benefits of the cache table so far, a rewrite 
of the results calculation doesn't seem like something we should do.


Niklas

-
To unsubscribe, e-mail: dev-unsubscr...@sc.openoffice.org
For additional commands, e-mail: dev-h...@sc.openoffice.org



Re: [sc-dev] [almost] Crasher is alive for 3 years - issue 55266

2009-01-29 Thread Kohei Yoshida
On Thu, 2009-01-29 at 09:22 +0100, Niklas Nebel wrote:
 On 01/28/09 20:38, Kohei Yoshida wrote:
  That may be fixed with
  http://qa.openoffice.org/issues/show_bug.cgi?id=93998
  in CWS koheidatapilot02 integrated to DEV300_m37, please check.

  m39 still tries to allocate 1GB of RAM.
  
  Well, the fix in koheidatapilot02 deals with an entirely different issue
  than i55266.  So, I'm not at all surprised.
 
 That's right. Opposite ends of the calculation, in a way.
 
  Fixing i55266 (and probably i97886) needs a non-trivial refactoring,
  since the reason for such a large memory footprint may be a design issue.
  The current data pilot's result computation code uses a recursive algorithm,
  and each recursion instantiates a new set of objects.  Unfortunately,
  when the number of unique field member values becomes large, which is
  certainly the case for the above referenced issues, the size of
  instantiated objects grows, and the memory usage spikes.  Note that this
  algorithm is *not* weak against the total size of the source data, but is
  weak against a large number of unique field member values.
  
  We could first try to minimize memory use without changing the algorithm
  itself (by reusing some of the same tricks I used to reduce memory usage
  in the cache table).  But if that turns out to be insufficient, then we
  may have to re-write the algorithm itself, in which case the required
  effort becomes pretty significant.
 
 I don't think this really needs fundamental changes. The recursive 
 structure isn't wrong. To avoid the excessive memory usage, the 
 ScDPResultMember objects in a ScDPResultDimension have to be created 
 on-demand, instead of all in advance. This would require changes to the 
 inter-item calculations (Displayed value) and sorting, so it's not a 
 trivial change, but it seems possible.

Ok.  Thanks for your input.

I'm not saying the recursive structure is wrong for this particular
case, but a recursive algorithm can in general be more memory-hungry
than an iterative one, *if* the algorithm can be designed either way.
Having said that, using a recursive structure does lead to a simpler
implementation (which is always good), and sometimes recursion is the
only option.
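
As a generic illustration of "designed either way" (not data pilot code; `TreeNode`, `countRecursive` and `countIterative` are hypothetical names), the same tree walk can be written recursively or with an explicit work list. The recursive form keeps one call-stack frame per level, while the iterative form keeps its pending nodes in a heap-allocated vector. This only shows the two shapes of the algorithm; it does not address the per-recursion object creation discussed above.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical tree node used only for this illustration.
struct TreeNode {
    std::vector<std::unique_ptr<TreeNode>> children;
};

// Recursive form: one call-stack frame per tree level on the current path.
std::size_t countRecursive(const TreeNode& node) {
    std::size_t n = 1;
    for (const auto& child : node.children)
        n += countRecursive(*child);
    return n;
}

// Iterative form: the pending nodes live in an explicit vector instead
// of the call stack.
std::size_t countIterative(const TreeNode& root) {
    std::vector<const TreeNode*> pending{&root};
    std::size_t n = 0;
    while (!pending.empty()) {
        const TreeNode* cur = pending.back();
        pending.pop_back();
        ++n;
        for (const auto& child : cur->children)
            pending.push_back(child.get());
    }
    return n;
}
```

Both functions visit every node exactly once and return the same count; the recursive one is shorter, which matches the point about simpler implementations.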

 Speaking of issue 93998, the case of pressing Ctrl-A and starting the 
 data pilot, which took seconds in 2.4, doesn't crash anymore, but the 
 time (and memory) it takes can still be considered a regression.

Yes, there are still ways to reduce the cache table's memory use
further, especially for the Ctrl-A case.  So, not all hope is lost
there.

Kohei


