Just to note again, these changes are to the Univariate Implementations to get them working with the UnivariateStatistic library. If we do decide to move away from using the Univariate Interfaces, this is a stepping stone in that direction. I would welcome others to explore alternate strategies for UnivariateStatistic "containers/facades".
-Mark
Mark R. Diggory wrote:
Phil, I really value your input and work; you really help keep us on track and keep adventurers like me from going "too far overboard". I am approaching the contents of this patch in an attempt to show how the usage of the individual UnivariateStatistics initially relates back to what we have already implemented.
Phil Steitz wrote:
Given the consensus to move in the direction of disaggregated statistics, I would agree that there is no internal need for StatUtils.
As a final comment on this, I would like to point out that my opposition to
this approach was based on what I now see was a naive view that we could
actually agree on a set of commonly used univariate statistics and limit our
support to these. I never envisioned Univariate as a "large, monolithic
interface." I see now that this is an inherently limiting perspective and I
should not have proposed it. I was relying too much on my biased practical
experience/observation that once you get past the basic stuff, practical
applications drop off quickly. I was also overly concerned about performance
and overhead, again largely due to my own experience and application needs.
The one thing that I don't understand about the new approach and I would
suggest reconsidering is why you want to retain the Univariate interfaces at
all. As long as you have these and people depend on them, I don't think that
you will really have the full extensibility that you want and you will have
added complexity and overhead to deal with. Sort of the worst of both worlds.
The only thing that you *need* is a way to aggregate data (actually you have
this already -- just need shared aggregation). Why not just move to a model
where a Univariate has a dynamic List of Statistics and do away with the getXXX
methods in the Univariate interfaces altogether?
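
For illustration only, here is a minimal sketch of what a Univariate holding a dynamic List of Statistics might look like. Every name in it (`Statistic`, `SumStat`, `MeanStat`, `DynamicUnivariate`) is hypothetical and is not taken from the patch or from the existing Univariate interfaces; it only shows shared aggregation with no getXXX methods.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical interface: a single statistic that is updated one value at a time.
interface Statistic {
    String getName();
    void increment(double value);
    double getResult();
}

// Hypothetical running sum.
class SumStat implements Statistic {
    private double sum = 0.0;
    public String getName() { return "sum"; }
    public void increment(double value) { sum += value; }
    public double getResult() { return sum; }
}

// Hypothetical running mean; returns NaN until a value has been added.
class MeanStat implements Statistic {
    private double sum = 0.0;
    private long n = 0;
    public String getName() { return "mean"; }
    public void increment(double value) { sum += value; n++; }
    public double getResult() { return n == 0 ? Double.NaN : sum / n; }
}

// Hypothetical container: aggregates data once and feeds every registered statistic.
class DynamicUnivariate {
    private final List<Statistic> statistics = new ArrayList<>();

    // Statistics registered after values were added do not see the earlier values.
    public void addStatistic(Statistic s) { statistics.add(s); }

    public void addValue(double v) {
        for (Statistic s : statistics) {
            s.increment(v);
        }
    }

    // Lookup by name replaces fixed getXXX methods.
    public double getResult(String name) {
        for (Statistic s : statistics) {
            if (s.getName().equals(name)) {
                return s.getResult();
            }
        }
        throw new IllegalArgumentException("No such statistic: " + name);
    }
}
```

A caller would register only the statistics it needs, call `addValue` for each observation, and then ask for results by name. The trade-off is that a misspelled name fails at run time rather than at compile time.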
Phil
Do you think that Store/Univariate still provides a good example of how stats can be aggregated together under a "beanlike" interface? Maybe as such they are good initial examples of library usage. I do still believe you are right that there is a logical subset of statistics that could be categorized as "Descriptive Statistics", and that we could place such a set within Univariate and still keep it simple and lightweight.
I also recognize there's always going to be an interest in "expanding" capabilities, and having this modular UnivariateStatistic strategy at the core of the implementations makes various aggregations of statistics much more flexible and dynamic. I think the Store/Univariate Interface/Implementations show us an initial strategy for aggregation of the individual statistics under a bean-like interface. As such they are still very useful for immediate usage of a "subset" of statistics. Maybe we should keep them but document them as "front end tools" for users. Then we could also begin to work on something along the lines that Brent recommended for an Aggregation Container, but as a separate set of Interface/Implementations for now. What do you think?
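
As a rough sketch of the "front end tool" idea, a bean-like facade could keep a small fixed set of getters while delegating each one to an individual statistic object. The class names below (`RunningMean`, `RunningMax`, `DescriptiveFacade`) are made up for this example and do not correspond to classes in the patch.

```java
// Hypothetical individual statistic: incremental mean.
class RunningMean {
    private double sum = 0.0;
    private long n = 0;
    void increment(double value) { sum += value; n++; }
    double getResult() { return n == 0 ? Double.NaN : sum / n; }
}

// Hypothetical individual statistic: incremental maximum.
class RunningMax {
    private double max = Double.NaN;
    void increment(double value) {
        // The first value always replaces the initial NaN.
        if (Double.isNaN(max) || value > max) {
            max = value;
        }
    }
    double getResult() { return max; }
}

// Hypothetical bean-like facade over a fixed subset of statistics.
class DescriptiveFacade {
    private final RunningMean mean = new RunningMean();
    private final RunningMax max = new RunningMax();
    private long n = 0;

    public void addValue(double value) {
        mean.increment(value);
        max.increment(value);
        n++;
    }

    // Each getter simply delegates to the underlying statistic.
    public double getMean() { return mean.getResult(); }
    public double getMax() { return max.getResult(); }
    public long getN() { return n; }
}
```

Here the facade stays simple for users who only want the common descriptive statistics, while the statistic classes remain usable on their own or inside a separate aggregation container.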
-Mark
-- Mark Diggory Software Developer Harvard MIT Data Center http://www.hmdc.harvard.edu
