Greetings!

I recently came across the Commons Math project: nice design, good
abstractions, "smart updates" and even unit tests! :-)

The smart updates are a key feature for event stream processing and time
series simulation.  The only piece missing, from a time series analysis
and simulation perspective, is the ability to supply a lag that defines a
fixed sample size and to perform rolling calculations over that window.

I was very happy to see this as an item on the wish list.

A ThoughtWorks colleague (Yaxin Wang) and I are prototyping a Java time
series simulation engine, and we are considering Commons Math as the base
of our numerical libraries.  To do this we need to complete the rolling
calculations, so here is our first spike (a spike is a prototype that can
be thrown away, not a real patch).  We thought we would start with an easy
case: mean, which uses sum.
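
To make the idea concrete, here is a rough sketch of a rolling mean that
combines an incremental ("smart") update with a fixed window.  This is
only an illustration and is not the code in the attached tarball; the
class name RollingMean and its methods are hypothetical and are not part
of Commons Math.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/**
 * Hypothetical sketch: mean over the most recent windowSize values.
 * Not a Commons Math class; all names here are illustrative assumptions.
 */
final class RollingMean {
    private final int windowSize;                       // the "lag": fixed sample size
    private final Deque<Double> window = new ArrayDeque<>();
    private double sum = 0.0;                           // running sum of values in the window

    RollingMean(int windowSize) {
        if (windowSize <= 0) {
            throw new IllegalArgumentException("windowSize must be positive");
        }
        this.windowSize = windowSize;
    }

    /** Adds a value; once the window is full, the oldest value is evicted. */
    void addValue(double value) {
        window.addLast(value);
        sum += value;
        if (window.size() > windowSize) {
            // The queue exists to remember which value leaves the window.
            sum -= window.removeFirst();
        }
    }

    /** Number of values currently in the window (at most windowSize). */
    int getN() {
        return window.size();
    }

    /** Mean of the values currently in the window, or NaN if it is empty. */
    double getResult() {
        return window.isEmpty() ? Double.NaN : sum / window.size();
    }
}
```

Each update is O(1), but the queue still has to hold the whole window so
that the oldest value can be subtracted.  Repeated add/subtract can also
accumulate floating-point error over long streams, so a real
implementation might recompute the sum from the queue periodically.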

We have combined rolling calculations with smart update algorithms before,
in the numerical libraries for our previous time series simulation engine.
As you mention in the wish list notes, and as our past experience
confirms, some of the algorithms cannot avoid using queues in the rolling
update case.  Obviously this is something pretty fundamental to the design,
and it requires a bit of work across a lot of places to do it for all the
statistics (at least starting with the summary statistics).

Please give feedback on the design, any performance issues (for example, a
better data structure than the queue we used), etc!

If the community is OK with this initial spike, then we can start submitting
patches. :-)


/brad

Attachment: statistics.tar.gz
Description: GNU Zip compressed data
