Re: [math] stat package design

Mark R. Diggory Tue, 08 Jul 2003 09:16:12 -0700

Thanks Anton, I just beat you to the punch with this idea last night. (If you look at the interface for Storeless, increment no longer returns a value.) This really helped clear up where certain computations should occur: updating the internal state versus calculating the higher-level stat info. See, for instance, "incrementing" moments versus calculating variance, skew and kurtosis from them.

Anton Tagunov wrote:

Hello, Developers!
1) The ongoing effort of modularizing stat computations
  is a very worthy thing. I'm tracking this thread and
  just wish you success.
2) As I have observed, you currently are implementing
  two sorts of methods:
storageless, like

double computeXXX( double[] )

storage-empowered, like

double increment( double )

It's actually the opposite:

storageless, like

double increment( double )

storage-empowered, like

double computeXXX( double[] )

The "storelessness" lies in the concept that the entire double[] does not need to be maintained, not in whether the value is stored as a property in the object. I should javadoc this further in the interfaces so that it is clearer.
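To make the distinction concrete, here is a minimal sketch of the two styles for a mean. The class and method names (MeanSketch, computeMean, StorelessMean, getResult) are hypothetical and chosen only for illustration; they are not the actual interfaces under discussion.

```java
// Hypothetical illustration only; not the actual [math] interfaces.
final class MeanSketch {

    /** Array-based style: the caller must hold every value in a double[]. */
    static double computeMean(double[] values) {
        if (values.length == 0) {
            return Double.NaN;
        }
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    /** Storeless style: only a count and a running sum are kept. */
    static final class StorelessMean {
        private long n = 0;
        private double sum = 0.0;

        /** Updates internal state only; no statistic is returned here. */
        void increment(double d) {
            n++;
            sum += d;
        }

        /** Derives the mean from the accumulated state on request. */
        double getResult() {
            return n == 0 ? Double.NaN : sum / n;
        }
    }

    public static void main(String[] args) {
        double[] data = {1.0, 2.0, 3.0, 4.0};

        StorelessMean mean = new StorelessMean();
        for (double v : data) {
            mean.increment(v); // values fed one by one, never stored
        }

        System.out.println(computeMean(data));
        System.out.println(mean.getResult());
    }
}
```

Both calls produce the same mean; the storeless object still keeps state (a count and a sum), it just never keeps the values themselves.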

but the latter sort of methods do return the value accumulated up to the moment on every call, which often costs a certain amount of CPU cycles.

Prior to my changes last night, "increment" was actually calculating the entire statistic, and "getValue" was only returning the value of a precalculated property. The problem is that the CPU cycles need to be "spent" no matter which approach is taken: a full calculation in "increment", or a partial calculation in "increment" plus a partial calculation in "getValue". I think addValue is going to get called a lot more than getValue, so I've optimized to reduce the amount of calculation going on in increment as much as possible. (It would be possible to maintain a boolean state recording whether "increment" has been called since the last calculation; that way, calling getValue repeatedly would not recalculate the same statistical value over and over. I'll look into adding this.)
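The boolean-state idea could be sketched roughly as follows. This is an assumption-laden illustration, not the committed code: the class name LazyVariance, the dirty flag, and the recomputations counter are all hypothetical, and the moment update uses Welford's running-variance recurrence, which may differ from what the package actually does.

```java
// Hypothetical sketch of "cheap increment, lazy and cached result".
final class LazyVariance {
    private long n = 0;
    private double mean = 0.0;
    private double m2 = 0.0;          // sum of squared deviations from the mean
    private double cached = Double.NaN;
    private boolean dirty = true;     // true if increment was called since the last calculation
    int recomputations = 0;           // illustration only: counts real calculations

    /** Updates the running moments (Welford's recurrence) and marks the cache stale. */
    void increment(double d) {
        n++;
        double delta = d - mean;
        mean += delta / n;
        m2 += delta * (d - mean);
        dirty = true;
    }

    /** Calculates the sample variance only if new data arrived since the last call. */
    double getResult() {
        if (dirty) {
            cached = n < 2 ? Double.NaN : m2 / (n - 1);
            recomputations++;
            dirty = false;
        }
        return cached;
    }

    public static void main(String[] args) {
        LazyVariance var = new LazyVariance();
        for (double v : new double[] {1.0, 2.0, 3.0, 4.0}) {
            var.increment(v);
        }
        System.out.println(var.getResult());
        System.out.println(var.getResult()); // served from the cache
        System.out.println(var.recomputations);
    }
}
```

For a variance the final division is so cheap that the flag buys little; the saving would matter more for statistics whose last step is expensive, at the cost of one extra boolean write per increment.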

Can we imagine another contract,

      void increment( double )
      double getResult()

  which would allow to feed in values one-by-one,
  but get the result only once?

  This would probably be closer in efficiency
  to the storageless approach, yet we won't
  have to create a double[], DoubleArray or a Collection
  and hand it in, but would be able to feed-in values
  the way we like it w/o any restriction.

-Anton

"getResult" might be conceptually clearer and more appealing than "getValue" too.

thanks,

--
Mark Diggory
Software Developer
Harvard MIT Data Center
http://www.hmdc.harvard.edu
