Phil Steitz wrote:
--- "Mark R. Diggory" <[EMAIL PROTECTED]> wrote:
Well, I for one would prefer to have the simple computational methods in one
place. I would support making the class require instantiation, however, i.e.
making the methods non-static.
Yes, but again it is a question of having big, flat, monolithic classes vs. having extensible implementations that can easily be expanded on. I'm not particularly thrilled at the idea of being totally locked into an interface like Univariate or StatUtils. It is just totally inflexible, and there is always too much restriction and argument about what we want to put in it vs. not put in it.
I think that it is a good idea to have these discussions and I don't understand
what you mean by "inflexible".
Inflexibility in that a "well designed" interface doesn't really need to grow or change over time. With an interface that grows as the project grows, there are always going to be A LOT of growing pains (between developers, and in terms of organization of the code).
In retrospect, we probably should have named StoreUnivariate "ExtendedUnivariate", since it really represents a statistical object supporting more statistics. Univariate can always be extended --
This is the problem: every time we implement another stat, are we going to extend Univariate and create a whole new set of implementations just to support that stat?
statistics can be added to the base interface as well as to the abstract and concrete classes that implement the base interface. Some of these statistics can be based on computational methods in StatUtils. If we eliminate the static methods in StatUtils, then we can make the computational strategies pluggable.
This was my whole intention with the separate statistics. These eliminate the need to delegate to the static StatUtils and provide both pluggability and individual implementations, thereby establishing an organized library with room for growth. If you want to use an individual implementation, you can; if you want to use a facade, you can. The facades just delegate to the individual stats in the same fashion that we currently have UnivariateImpl delegating to StatUtils.
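A minimal sketch of the arrangement described above may help: each statistic is its own small class, and a facade exposes the same statistics as bean-style getters by delegating to them. All names here (SketchStatistic, SketchMean, SketchMax, SummaryFacade) are hypothetical and chosen for illustration; they are not the classes attached to this message.

```java
/** Hypothetical: one statistic, evaluated over a slice of an array. */
interface SketchStatistic {
    double evaluate(double[] values, int begin, int length);
}

/** Arithmetic mean of values[begin .. begin + length - 1]; NaN if empty. */
class SketchMean implements SketchStatistic {
    public double evaluate(double[] values, int begin, int length) {
        if (length == 0) {
            return Double.NaN;
        }
        double sum = 0.0;
        for (int i = begin; i < begin + length; i++) {
            sum += values[i];
        }
        return sum / length;
    }
}

/** Largest value in the slice; NaN if empty. */
class SketchMax implements SketchStatistic {
    public double evaluate(double[] values, int begin, int length) {
        double max = Double.NaN;
        for (int i = begin; i < begin + length; i++) {
            if (Double.isNaN(max) || values[i] > max) {
                max = values[i];
            }
        }
        return max;
    }
}

/** Facade: statistics appear as properties, but the work is delegated. */
class SummaryFacade {
    private final double[] data;
    private final SketchStatistic mean;
    private final SketchStatistic max;

    SummaryFacade(double[] data) {
        this(data, new SketchMean(), new SketchMax());
    }

    /** Alternate implementations can be plugged in here. */
    SummaryFacade(double[] data, SketchStatistic mean, SketchStatistic max) {
        this.data = data;
        this.mean = mean;
        this.max = max;
    }

    public double getMean() {
        return mean.evaluate(data, 0, data.length);
    }

    public double getMax() {
        return max.evaluate(data, 0, data.length);
    }
}

class StatisticFacadeDemo {
    public static void main(String[] args) {
        double[] data = {1.0, 2.0, 3.0, 6.0};
        // Direct use of an individual statistic.
        System.out.println(new SketchMean().evaluate(data, 0, data.length));
        // The same statistic reached through the facade.
        System.out.println(new SummaryFacade(data).getMean());
    }
}
```

In this sketch, adding a statistic means adding a class; the facade only changes if that statistic should also be exposed as a property.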
One more sort of philosophical point that makes me want to keep Univariates as
objects with statistics as properties: to me a Univariate is in fact a java
bean. Its state is the data that it is characterizing and its properties are
the statistics describing these data.
And this will still be the case. I'm just modularizing the statistical implementations so there's more room for alternate implementations and alternate usage.
Univariates that support only a limited
set of statistics don't have to hold all of the individual data values
comprising their state internally.
Extended Univariates require more overhead.
It is natural, therefore, to define the extended statistics in an extended
interface.
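To make the distinction concrete, here is a rough sketch of a basic summary that keeps only running totals and an extended one that must store every value in order to answer order-statistic queries. The class names (RunningSummary, StoredSummary) are hypothetical, and the median stands in for any statistic that needs the full data set.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical basic summary: constant memory, no values retained. */
class RunningSummary {
    private long n = 0;
    private double sum = 0.0;
    private double min = Double.NaN;
    private double max = Double.NaN;

    public void addValue(double v) {
        n++;
        sum += v;
        if (n == 1 || v < min) {
            min = v;
        }
        if (n == 1 || v > max) {
            max = v;
        }
    }

    public long getN() { return n; }
    public double getMean() { return n == 0 ? Double.NaN : sum / n; }
    public double getMin() { return min; }
    public double getMax() { return max; }
}

/** Hypothetical extended summary: stores the data, so it costs more memory. */
class StoredSummary extends RunningSummary {
    private final List<Double> values = new ArrayList<Double>();

    @Override
    public void addValue(double v) {
        super.addValue(v);
        values.add(v);
    }

    /** The median needs the sorted data, so it only exists at this level. */
    public double getMedian() {
        if (values.isEmpty()) {
            return Double.NaN;
        }
        double[] sorted = new double[values.size()];
        for (int i = 0; i < sorted.length; i++) {
            sorted[i] = values.get(i);
        }
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return sorted.length % 2 == 1
            ? sorted[mid]
            : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }
}
```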
Yes, simple, but not very organized, and not as extensible as a framework like "solvers" is. You can implement any new "solver" we could desire right now without much complaint, but try to implement a new statistic and blam, all this argument starts up as to whether it's appropriate or not in the Univariate interface.
You are confusing strategies with implementations. The rootfinding framework exists to support multiple strategies to do rootfinding, not to support arbitrary numerical methods. A better analogy would be to the distribution framework which supports creation of different probability distributions. You could argue that a "statistic" is as natural an abstraction as a probability distribution. I disagree with that. There is lots of structure in a probability distribution, very little in a statistic from an abstract standpoint.
Ok, I do like your analogy better. And I agree that a statistic does not have as much "structure" as a probability distribution. But I disagree that this is grounds for not approaching my strategy.
There's not room for
growth here! If I decide to go down the road and try to implement things like auto-correlation coefficients (which would be a logical addition someday), then I end up having to get permission just to "add" the implementation, whereas if there's a logical framework, there's more room for growth without stepping on each other's toes so much. This is very logical to me.
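As an illustration of that point, a lag-k sample autocorrelation coefficient could be written as a standalone class without touching any existing interface. This is only a sketch: the class name is hypothetical, and it assumes the common estimator r_k = sum_{i=0}^{n-k-1} (x_i - m)(x_{i+k} - m) / sum_{i=0}^{n-1} (x_i - m)^2, where m is the sample mean. Other definitions exist.

```java
/** Hypothetical standalone statistic: lag-k sample autocorrelation. */
class AutoCorrelationSketch {
    private final int lag;

    AutoCorrelationSketch(int lag) {
        if (lag < 0) {
            throw new IllegalArgumentException("lag must be non-negative");
        }
        this.lag = lag;
    }

    /**
     * Returns r_k for the given series, or NaN when the series is empty,
     * shorter than or equal to the lag, or has zero variance.
     */
    public double evaluate(double[] values) {
        int n = values.length;
        if (n == 0 || lag >= n) {
            return Double.NaN;
        }
        double mean = 0.0;
        for (int i = 0; i < n; i++) {
            mean += values[i];
        }
        mean /= n;

        // Denominator: total sum of squared deviations from the mean.
        double denom = 0.0;
        for (int i = 0; i < n; i++) {
            double d = values[i] - mean;
            denom += d * d;
        }
        if (denom == 0.0) {
            return Double.NaN;
        }

        // Numerator: products of deviations that are `lag` steps apart.
        double numer = 0.0;
        for (int i = 0; i + lag < n; i++) {
            numer += (values[i] - mean) * (values[i + lag] - mean);
        }
        return numer / denom;
    }
}
```

For the series {1, 2, 3, 4} this estimator gives 1.0 at lag 0 and 0.25 at lag 1.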
I disagree. Extending a class or adding a method to an interface is no harder than adding a new class (actually easier). It seems ridiculous to me to add a new class for each univariate statistic that we want to support.
So far, any time the Univariate interface is modified, it usually results in a disagreement from someone that the change was not appropriate. Usually this is based on opinion and not on the functional capabilities of the particular method. This is not easily extendable, because any new experimental development is in a constant battle with the conservation of the interface.
Unfortunately we come from different schools. I have to maintain that I will always find adding a class to be much easier than "redefining an interface", on any day of the week.
If the stats
are going to be meaningfully integrated, they will have to be used/defined by
the core univariate classes any way, unless your idea is to eliminate these and
force users think about statistics one at a time instead of as part of a
univariate statistical summary. This may be the crux of our disagreement. I
see the statistics as natural properties of a set of data, not meaningful
objects in their own right.
This is not my intent at all. I continue a defense of this statement below; just keep in mind that a statistic can be both a functional object in its own right and part of a "bean-like interface". Simply look at Univariate delegating methods to the static methods in StatUtils: here statistical methods are both bean properties and "objects" of a sort. I just more clearly defined the "object" characteristics of the statistics. If you look back at the version of StatUtils I rolled back from, you can clearly see this dualistic state of methods as "objects", and that it does work well.
I would like to propose the following compromise solution that allows the kind of flexibility that you want without breaking things apart as much.
1. Rename StoreUnivariate to ExtendedUnivariate and change all other "Store"
names to "Extended".
The naming is a trivial aspect of what is going on here.
2. Make the methods in StatUtils non-static. Continue to use these for basic
computational methods shared by Univariate and ExtendedUnivariate
implementations and for direct use by applications and elsewhere in
commons-math. These methods do not have to be used by all Univariate
implementation strategies.
This is exactly what I have accomplished in the UnivariateStatistic package I have developed. These classes can easily be delegated to from within UnivariateImpl, StoreUnivariateImpl, StatUtils or any other interface of your choosing.
3. Add addValues methods to Univariate that accept double[], List and Collection with property name and eliminate ListUnivariate and BeanListUnivariate.
This is the difficult point. The current examples act as "wrappers" around a specific Collection/data structure. The type of that structure is independent of the statistics. I don't see how adding methods that support a particular Object type is going to benefit us if the underlying data structure is not already capable of supporting Objects vs. double[]. This is where providing different implementations of Univariates that polymorphically support different internal data structures becomes critical.
I have examples of all the Univariates implemented to support both the UnivariateStatistic approach and to support the various internal Collection types, and to support the "Transformation" of the objects stored in these collections to double primitive values in such a way that the statistical implementations do not need modification to support so many different input types.
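A small sketch of the "transformation" idea: the backing list holds arbitrary objects, a transformer maps each one to a double, and the statistic itself only ever sees double values. The names below (ValueTransformer, NumberValueTransformer, ListBackedSummary) are hypothetical stand-ins for illustration, not the NumberTransformer and DefaultTransformer classes referenced in the attached code.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical: maps an arbitrary object to a double. */
interface ValueTransformer {
    double transform(Object o);
}

/** Handles Numbers directly and falls back to parsing the string form. */
class NumberValueTransformer implements ValueTransformer {
    public double transform(Object o) {
        if (o instanceof Number) {
            return ((Number) o).doubleValue();
        }
        // Throws NumberFormatException if the text is not numeric.
        return Double.parseDouble(o.toString());
    }
}

/** Summary backed by a List of objects; the statistic code sees only doubles. */
class ListBackedSummary {
    private final List<?> list;
    private final ValueTransformer transformer;

    ListBackedSummary(List<?> list, ValueTransformer transformer) {
        this.list = list;
        this.transformer = transformer;
    }

    /** Converts the backing list to primitive values on demand. */
    public double[] getValues() {
        double[] values = new double[list.size()];
        for (int i = 0; i < values.length; i++) {
            values[i] = transformer.transform(list.get(i));
        }
        return values;
    }

    public double getMean() {
        double[] values = getValues();
        if (values.length == 0) {
            return Double.NaN;
        }
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }
}

class TransformerDemo {
    public static void main(String[] args) {
        List<Object> mixed = new ArrayList<Object>();
        mixed.add(Integer.valueOf(1));
        mixed.add(Double.valueOf(2.5));
        mixed.add("4");
        ListBackedSummary summary =
            new ListBackedSummary(mixed, new NumberValueTransformer());
        System.out.println(summary.getMean());
    }
}
```

Supporting a new element type in this sketch means supplying another transformer; neither the summary nor the mean calculation changes.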
4. Rename UnivariateImpl to SimpleUnivariate and add a UnivariateFactory with factory methods to create Simple, Extended and whatever other sorts of Univariates we may define.
I feel the same about factories as I do about renaming: I don't feel it's part of this topic.
To add new statistics or computational strategies in this environment, we can
a) add to the Univariate interface if we think that they are really basic -- I
think that t-based confidence interval half-width for the mean is a basic stat
that is now missing, for example b) add to the ExtendedUnivariate interface
Here we are again. I really do believe that in an ideal design, one should be able to add a particular statistical approach to the project without having to "modify" interfaces, and thus incur the cost of argument between conservative and experimental viewpoints. Plus, altering interfaces leads to problems down the road, with different versions having different methods in the interface. Imagine if you picked up JDK 1.5 and the Collection interface had been altered to remove a method. You would be very frustrated at having to rewrite all your current code. No, I don't think altering interfaces is a viable means of extensibility. That will create headaches for our users.
c) extend an existing Univariate implementation to add the new statistic or d) create a new Univariate including the new statistic or computational strategy.
This is all somewhat messy; it doesn't lend itself well to organized extensibility. I'm trying to provide a solid framework for extending the statistical capabilities of the project without this constant interface expansion, (1) because it is not scalable, (2) because it limits development to the LCD (least common denominator) of what the group can actually agree upon, and (3) because without a framework and a few rules for implementation of a statistic, the resulting codebase will grow in a disorganized fashion.
Lastly, I do not see how having an instantiable version of StatUtils as a monolithic class of methods, and having the Univariate and StoreUnivariate facades delegate to it, is of any benefit over having each statistic implemented separately and having the methods in the Univariates delegate to the individual stats.
To show you the benefit of this approach I've attached my new AbstractStoreUnivariate and AbstractUnivariate implementations, which do delegate to the framework. I've also added my new UnivariateImpl, StoreUnivariateImpl and other Univariate impls to show how easy it is to extend the Univariates to create implementations that support different data structures at the core.
If you look through the classes you can see the benefits of using the Facades as polymorphic implementations on top of various data structures. Separating the Statistical implementations further releases these algorithms from being restricted to a specific implementation.
-Mark
/* ====================================================================
 * The Apache Software License, Version 1.1
 *
 * Copyright (c) 2003 The Apache Software Foundation. All rights
 * reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 *
 * 1. Redistributions of source code must retain the above copyright
 * notice, this list of conditions and the following disclaimer.
 *
 * 2. Redistributions in binary form must reproduce the above copyright
 * notice, this list of conditions and the following disclaimer in
 * the documentation and/or other materials provided with the
 * distribution.
 *
 * 3. The end-user documentation included with the redistribution, if
 * any, must include the following acknowlegement:
 * "This product includes software developed by the
 * Apache Software Foundation (http://www.apache.org/)."
 * Alternately, this acknowlegement may appear in the software itself,
 * if and wherever such third-party acknowlegements normally appear.
 *
 * 4. The names "The Jakarta Project", "Commons", and "Apache Software
 * Foundation" must not be used to endorse or promote products derived
 * from this software without prior written permission. For written
 * permission, please contact [EMAIL PROTECTED]
 *
 * 5. Products derived from this software may not be called "Apache"
 * nor may "Apache" appear in their names without prior written
 * permission of the Apache Software Foundation.
 *
 * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED
 * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
 * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
 * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR
 * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
 * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
 * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF
 * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
 * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
 * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
 * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
 * SUCH DAMAGE.
 * ====================================================================
 *
 * This software consists of voluntary contributions made by many
 * individuals on behalf of the Apache Software Foundation. For more
 * information on the Apache Software Foundation, please see
 * <http://www.apache.org/>.
 */
package org.apache.commons.math.stat;
import java.util.List;

import org.apache.commons.math.util.DefaultTransformer;
import org.apache.commons.math.util.NumberTransformer;

/**
 * @author <a href="mailto:[EMAIL PROTECTED]">Tim O'Brien</a>
 */
public class ListUnivariateImpl
    extends AbstractStoreUnivariate
    implements StoreUnivariate {

    /**
     * Holds a reference to a list - GENERICs are going to make
     * our lives easier here as we could only accept List<Number>
     */
    protected List list;

    /** Number Transformer maps Objects to Number for us. */
    protected NumberTransformer transformer;

    /**
     * Construct a ListUnivariate with a specific List.
     * @param list The list that will back this Univariate
     */
    public ListUnivariateImpl(List list) {
        super();
        this.list = list;
        transformer = new DefaultTransformer();
    }

    /**
     * @see org.apache.commons.math.StoreUnivariate#getValues()
     */
    public double[] getValues() {

        int length = list.size();

        // If the window size is not INFINITE_WINDOW AND
        // the current list is larger than the window size, we need to
        // take into account only the last n elements of the list
        // as defined by windowSize
        if (windowSize != Univariate.INFINITE_WINDOW
            && windowSize < list.size()) {
            length = list.size() - Math.max(0, list.size() - windowSize);
        }

        // Create an array to hold all values
        double[] copiedArray = new double[length];

        for (int i = 0; i < copiedArray.length; i++) {
            copiedArray[i] = getElement(i);
        }
        return copiedArray;
    }

    /**
     * @see org.apache.commons.math.StoreUnivariate#getElement(int)
     */
    public double getElement(int index) {

        double value = Double.NaN;

        // With a finite window, index 0 refers to the first element
        // that is still inside the window.
        int calcIndex = index;

        if (windowSize != Univariate.INFINITE_WINDOW
            && windowSize < list.size()) {
            calcIndex = (list.size() - windowSize) + index;
        }

        try {
            value = transformer.transform(list.get(calcIndex));
        } catch (Exception e) {
            e.printStackTrace();
        }

        return value;
    }

    /**
     * @see org.apache.commons.math.Univariate#getN()
     */
    public int getN() {
        int n = 0;

        if (windowSize != Univariate.INFINITE_WINDOW) {
            if (list.size() > windowSize) {
                n = windowSize;
            } else {
                n = list.size();
            }
        } else {
            n = list.size();
        }
        return n;
    }

    /**
     * @see org.apache.commons.math.Univariate#addValue(double)
     */
    public void addValue(double v) {
        list.add(new Double(v));
    }

    /**
     * @see org.apache.commons.math.Univariate#clear()
     */
    public void clear() {
        super.clear();
        list.clear();
    }

    /**
     * @see org.apache.commons.math.stat.AbstractUnivariate#internalValues()
     */
    protected double[] internalValues() {
        return getValues();
    }

    /**
     * @see org.apache.commons.math.stat.AbstractUnivariate#start()
     */
    protected int start() {
        return 0;
    }

    /**
     * @see org.apache.commons.math.stat.AbstractUnivariate#size()
     */
    protected int size() {
        return getN();
    }
}
package org.apache.commons.math.stat;

import org.apache.commons.math.stat.univariate.moment.GeometricMean;
import org.apache.commons.math.stat.univariate.moment.Kurtosis;
import org.apache.commons.math.stat.univariate.moment.Mean;
import org.apache.commons.math.stat.univariate.moment.Skewness;
import org.apache.commons.math.stat.univariate.moment.Variance;
import org.apache.commons.math.stat.univariate.rank.Max;
import org.apache.commons.math.stat.univariate.rank.Min;
import org.apache.commons.math.stat.univariate.summary.Sum;
import org.apache.commons.math.stat.univariate.summary.SumOfLogs;
import org.apache.commons.math.stat.univariate.summary.SumOfSquares;

/**
 * Provides univariate measures for an array of doubles.
 *
 * @author <a href="mailto:[EMAIL PROTECTED]">Tim O'Brien</a>
 * @author <a href="mailto:[EMAIL PROTECTED]">Mark R. Diggory</a>
 * @author <a href="mailto:[EMAIL PROTECTED]">Phil Steitz</a>
 */
public abstract class AbstractUnivariate implements Univariate {

    /** hold the window size **/
    protected int windowSize = Univariate.INFINITE_WINDOW;

    /** count of values that have been added */
    protected int n = 0;

    /** sum of values that have been added */
    protected Sum sum = new Sum();

    /** sum of the square of each value that has been added */
    protected SumOfSquares sumsq = new SumOfSquares();

    /** min of values that have been added */
    protected Min min = new Min();

    /** max of values that have been added */
    protected Max max = new Max();

    /** sumLog of values that have been added */
    protected SumOfLogs sumLog = new SumOfLogs();

    /** geoMean of values that have been added */
    protected GeometricMean geoMean = new GeometricMean();

    /** mean of values that have been added */
    protected Mean mean = new Mean();

    /** variance of values that have been added */
    protected Variance variance = new Variance();

    /** skewness of values that have been added */
    protected Skewness skewness = new Skewness();

    /** kurtosis of values that have been added */
    protected Kurtosis kurtosis = new Kurtosis();

    /**
     * Construct an AbstractUnivariate
     */
    public AbstractUnivariate() {
        super();
    }

    /**
     * Construct an AbstractUnivariate with a window
     * @param window The Window Size
     */
    public AbstractUnivariate(int window) {
        super();
        setWindowSize(window);
    }

    /**
     * Returns the internalValues array.
     * @return the array
     */
    protected abstract double[] internalValues();

    /**
     * Returns the start index of the array
     * @return start index
     */
    protected abstract int start();

    /**
     * Returns the size of the array appropriate for doing calculations.
     * @return Usually this is just numElements.
     */
    protected abstract int size();

    /**
     * If windowSize is set to Infinite,
     * statistics are calculated using the following
     * <a href="http://www.spss.com/tech/stat/Algorithms/11.5/descriptives.pdf">
     * recursive strategy
     * </a>.
     * @see org.apache.commons.math.stat.Univariate#addValue(double)
     */
    public abstract void addValue(double value);

    /**
     * @see org.apache.commons.math.stat.Univariate#getN()
     */
    public int getN() {
        return n;
    }

    /**
     * @see org.apache.commons.math.stat.Univariate#getSum()
     */
    public double getSum() {
        // Stored values, when present, are evaluated directly;
        // otherwise the running value of the statistic is returned.
        double[] v = internalValues();
        if (v != null) {
            return sum.evaluate(v, this.start(), this.size());
        }
        return sum.getValue();
    }

    /**
     * @see org.apache.commons.math.stat.Univariate#getSumsq()
     */
    public double getSumsq() {
        double[] v = internalValues();
        if (v != null) {
            return sumsq.evaluate(v, this.start(), this.size());
        }
        return sumsq.getValue();
    }

    /**
     * @see org.apache.commons.math.stat.Univariate#getMean()
     */
    public double getMean() {
        double[] v = internalValues();
        if (v != null) {
            return mean.evaluate(v, this.start(), this.size());
        }
        return mean.getValue();
    }

    /**
     * Returns the standard deviation for this collection of values
     * @see org.apache.commons.math.stat.Univariate#getStandardDeviation()
     */
    public double getStandardDeviation() {
        double stdDev = Double.NaN;
        if (getN() > 0) {
            if (getN() > 1) {
                stdDev = Math.sqrt(getVariance());
            } else {
                stdDev = 0.0;
            }
        }
        return (stdDev);
    }

    /**
     * Returns the variance of the values that have been added via West's
     * algorithm as described by
     * <a href="http://doi.acm.org/10.1145/359146.359152">Chan, T. F. and
     * J. G. Lewis 1979, <i>Communications of the ACM</i>,
     * vol. 22 no. 9, pp. 526-531.</a>.
     *
     * @return The variance of a set of values.
     *         Double.NaN is returned for an empty
     *         set of values and 0.0 is returned for
     *         a <= 1 value set.
     */
    public double getVariance() {
        double[] v = internalValues();
        if (v != null) {
            return variance.evaluate(v, this.start(), this.size());
        }
        return variance.getValue();
    }

    /**
     * Returns the skewness of the values that have been added as described by
     * <a href="http://mathworld.wolfram.com/k-Statistic.html">
     * Equation (6) for k-Statistics</a>.
     * @return The skew of a set of values. Double.NaN is returned for
     *         an empty set of values and 0.0 is returned for a
     *         <= 2 value set.
     */
    public double getSkewness() {
        double[] v = internalValues();
        if (v != null) {
            return skewness.evaluate(v, this.start(), this.size());
        }
        return skewness.getValue();
    }

    /**
     * Returns the kurtosis of the values that have been added as described by
     * <a href="http://mathworld.wolfram.com/k-Statistic.html">
     * Equation (7) for k-Statistics</a>.
     *
     * @return The kurtosis of a set of values. Double.NaN is returned for
     *         an empty set of values and 0.0 is returned for a <= 3
     *         value set.
     */
    public double getKurtosis() {
        double[] v = internalValues();
        if (v != null) {
            return kurtosis.evaluate(v, this.start(), this.size());
        }
        return kurtosis.getValue();
    }

    /**
     * @see org.apache.commons.math.stat.StoreUnivariate#getKurtosisClass()
     */
    public int getKurtosisClass() {
        int kClass = StoreUnivariate.MESOKURTIC;

        double kurtosis = getKurtosis();
        if (kurtosis > 0) {
            kClass = StoreUnivariate.LEPTOKURTIC;
        } else if (kurtosis < 0) {
            kClass = StoreUnivariate.PLATYKURTIC;
        }
        return (kClass);
    }

    /**
     * @see org.apache.commons.math.stat.Univariate#getMax()
     */
    public double getMax() {
        double[] v = internalValues();
        if (v != null) {
            return max.evaluate(v, this.start(), this.size());
        }
        return max.getValue();
    }

    /**
     * @see org.apache.commons.math.stat.Univariate#getMin()
     */
    public double getMin() {
        double[] v = internalValues();
        if (v != null) {
            return min.evaluate(v, this.start(), this.size());
        }
        return min.getValue();
    }

    /**
     * @see org.apache.commons.math.stat.Univariate#getGeometricMean()
     */
    public double getGeometricMean() {
        double[] v = internalValues();
        if (v != null) {
            return geoMean.evaluate(v, this.start(), this.size());
        }
        return geoMean.getValue();
    }

    /**
     * Generates a text report displaying
     * univariate statistics from values that
     * have been added.
     * @return String with line feeds displaying statistics
     */
    public String toString() {
        // Report through the accessors rather than the raw fields:
        // n, min and max are overridden or computed by subclasses, and
        // min/max are statistic objects, not plain numbers.
        StringBuffer outBuffer = new StringBuffer();
        outBuffer.append("UnivariateImpl:\n");
        outBuffer.append("n: " + getN() + "\n");
        outBuffer.append("min: " + getMin() + "\n");
        outBuffer.append("max: " + getMax() + "\n");
        outBuffer.append("mean: " + getMean() + "\n");
        outBuffer.append("std dev: " + getStandardDeviation() + "\n");
        outBuffer.append("skewness: " + getSkewness() + "\n");
        outBuffer.append("kurtosis: " + getKurtosis() + "\n");
        return outBuffer.toString();
    }

    /**
     * @see org.apache.commons.math.Univariate#clear()
     */
    public void clear() {
        this.n = 0;
        min.clear();
        max.clear();
        sum.clear();
        sumLog.clear();
        sumsq.clear();
        geoMean.clear();
        mean.clear();
        variance.clear();
        skewness.clear();
        kurtosis.clear();
    }

    /**
     * @see org.apache.commons.math.Univariate#getWindowSize()
     */
    public int getWindowSize() {
        return windowSize;
    }

    /**
     * @see org.apache.commons.math.Univariate#setWindowSize(int)
     */
    public void setWindowSize(int windowSize) {
        clear();
        this.windowSize = windowSize;
    }
}
package org.apache.commons.math.stat;

import java.util.List;

import org.apache.commons.math.util.BeanTransformer;

/**
 * This implementation of StoreUnivariate uses commons-beanutils to gather
 * univariate statistics for a List of Java Beans by property. This
 * implementation uses beanutils' PropertyUtils to get a simple, nested,
 * indexed, mapped, or combined property from an element of a List.
 *
 * @author <a href="mailto:[EMAIL PROTECTED]">Tim O'Brien</a>
 */
public class BeanListUnivariateImpl extends ListUnivariateImpl {

    /**
     * propertyName of the property to get from the bean
     */
    private String propertyName;

    /**
     * Construct a BeanListUnivariate with specified
     * backing list
     * @param list Backing List
     */
    public BeanListUnivariateImpl(List list) {
        super(list);
    }

    /**
     * Construct a BeanListUnivariate with specified
     * backing list and propertyName
     * @param list Backing List
     * @param propertyName Bean propertyName
     */
    public BeanListUnivariateImpl(List list, String propertyName) {
        super(list);
        setPropertyName(propertyName);
        this.transformer = new BeanTransformer(propertyName);
    }

    /**
     * @return propertyName
     */
    public String getPropertyName() {
        return propertyName;
    }

    /**
     * @param propertyName Name of Property
     */
    public void setPropertyName(String propertyName) {
        System.out.println("Set prop name; " + propertyName);
        this.propertyName = propertyName;
        this.transformer = new BeanTransformer(propertyName);
    }

    /**
     * @see org.apache.commons.math.Univariate#addValue(double)
     */
    public void addValue(double v) {
        // The message names addObject, which is the method this class
        // actually provides for adding beans.
        String msg =
            "The BeanListUnivariateImpl does not accept values "
                + "through the addValue method. Because elements of this list "
                + "are JavaBeans, one must be sure to set the 'propertyName' "
                + "property and add new Beans to the underlying list via the "
                + "addObject(Object bean) method";
        throw new UnsupportedOperationException(msg);
    }

    /**
     * Adds a bean to this list.
     *
     * @param bean Bean to add to the list
     */
    public void addObject(Object bean) {
        list.add(bean);
    }
}
/* ====================================================================
 * The Apache Software License, Version 1.1
 *
 * Copyright (c) 2003 The Apache Software Foundation.  All rights
 * reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 *
 * 1. Redistributions of source code must retain the above copyright
 *    notice, this list of conditions and the following disclaimer.
 *
 * 2. Redistributions in binary form must reproduce the above copyright
 *    notice, this list of conditions and the following disclaimer in
 *    the documentation and/or other materials provided with the
 *    distribution.
 *
 * 3. The end-user documentation included with the redistribution, if
 *    any, must include the following acknowlegement:
 *       "This product includes software developed by the
 *        Apache Software Foundation (http://www.apache.org/)."
 *    Alternately, this acknowlegement may appear in the software itself,
 *    if and wherever such third-party acknowlegements normally appear.
 *
 * 4. The names "The Jakarta Project", "Commons", and "Apache Software
 *    Foundation" must not be used to endorse or promote products derived
 *    from this software without prior written permission. For written
 *    permission, please contact [EMAIL PROTECTED]
 *
 * 5. Products derived from this software may not be called "Apache"
 *    nor may "Apache" appear in their names without prior written
 *    permission of the Apache Software Foundation.
 *
 * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED
 * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
 * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
 * DISCLAIMED.  IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR
 * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
 * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
 * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF
 * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
 * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
 * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
 * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
 * SUCH DAMAGE.
 * ====================================================================
 *
 * This software consists of voluntary contributions made by many
 * individuals on behalf of the Apache Software Foundation.  For more
 * information on the Apache Software Foundation, please see
 * <http://www.apache.org/>.
 */
package org.apache.commons.math.stat;

import java.util.Arrays;

import org.apache.commons.math.stat.univariate.rank.Percentile;

/**
 * Provides univariate measures for an array of doubles.
 *
 * @author <a href="mailto:[EMAIL PROTECTED]">Tim O'Brien</a>
 * @author <a href="mailto:[EMAIL PROTECTED]">Mark R. Diggory</a>
 * @author <a href="mailto:[EMAIL PROTECTED]">Phil Steitz</a>
 */
public abstract class AbstractStoreUnivariate
    extends AbstractUnivariate
    implements StoreUnivariate {

    /** Percentile */
    protected Percentile percentile = new Percentile(50);

    /**
     * Create an AbstractStoreUnivariate
     */
    public AbstractStoreUnivariate() {
        super();
    }

    /**
     * Create an AbstractStoreUnivariate with a specific Window
     * @param window Window size for stat calculation
     */
    public AbstractStoreUnivariate(int window) {
        super(window);
    }

    /**
     * @see org.apache.commons.math.stat.StoreUnivariate#getPercentile(double)
     */
    public double getPercentile(double p) {
        percentile.setPercentile(p);
        return percentile.evaluate(this.getValues(), this.start(), this.size());
    }

    /**
     * @see org.apache.commons.math.stat.StoreUnivariate#getSortedValues()
     */
    public double[] getSortedValues() {
        double[] sort = getValues();
        Arrays.sort(sort);
        return sort;
    }

    /**
     * @see org.apache.commons.math.stat.Univariate#addValue(double)
     */
    public abstract void addValue(double value);

    /**
     * @see org.apache.commons.math.stat.StoreUnivariate#getValues()
     */
    public abstract double[] getValues();

    /**
     * @see org.apache.commons.math.stat.StoreUnivariate#getElement(int)
     */
    public abstract double getElement(int index);
}
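As a rough illustration of the facade idea discussed above (getters that delegate to individual statistic objects, as getPercentile does with Percentile), here is a minimal self-contained sketch. The names Statistic, MeanStatistic, MaxStatistic and StatFacade are hypothetical and not part of commons-math; only the evaluate(values, begin, length) shape is taken from the call in getPercentile.

```java
/** Hypothetical pluggable statistic; the signature mirrors the evaluate(...) call above. */
interface Statistic {
    double evaluate(double[] values, int begin, int length);
}

/** Arithmetic mean over values[begin .. begin + length - 1]. */
class MeanStatistic implements Statistic {
    public double evaluate(double[] values, int begin, int length) {
        if (length == 0) {
            return Double.NaN; // no data, no mean
        }
        double sum = 0.0;
        for (int i = begin; i < begin + length; i++) {
            sum += values[i];
        }
        return sum / length;
    }
}

/** Maximum over values[begin .. begin + length - 1]. */
class MaxStatistic implements Statistic {
    public double evaluate(double[] values, int begin, int length) {
        double max = Double.NaN;
        for (int i = begin; i < begin + length; i++) {
            if (Double.isNaN(max) || values[i] > max) {
                max = values[i];
            }
        }
        return max;
    }
}

/** Hypothetical facade: each getter delegates to a replaceable statistic. */
class StatFacade {
    private final double[] data;
    private Statistic mean = new MeanStatistic();
    private final Statistic max = new MaxStatistic();

    StatFacade(double[] data) {
        this.data = data.clone(); // defensive copy
    }

    /** Swap in a different strategy for the mean. */
    void setMeanStatistic(Statistic s) {
        this.mean = s;
    }

    double getMean() {
        return mean.evaluate(data, 0, data.length);
    }

    double getMax() {
        return max.evaluate(data, 0, data.length);
    }
}
```

A caller can use a statistic directly (new MeanStatistic().evaluate(...)) or go through the facade. Replacing the strategy does not change the facade's interface.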
package org.apache.commons.math.stat;

import java.util.List;

import org.apache.commons.math.util.TransformerMap;

/**
 * A mixed List Univariate that accepts a TransformerMap to map
 * objects to doubles.
 * @author <a href="mailto:[EMAIL PROTECTED]">Mark R. Diggory</a>
 */
public class MixedListUnivariateImpl
    extends ListUnivariateImpl
    implements StoreUnivariate {

    /**
     * Construct a MixedListUnivariate backed by an existing list
     * and using a specified TransformerMap
     *
     * @param list Existing List
     * @param transformer TransformerMap
     */
    public MixedListUnivariateImpl(List list, TransformerMap transformer) {
        super(list);
        this.setTransformerMap(transformer);
    }

    /**
     * Adds an object to this list.
     * @param o Object to add to the list
     */
    public void addObject(Object o) {
        list.add(o);
    }

    /**
     * Get the TransformerMap
     * @return TransformerMap
     */
    public TransformerMap getTransformerMap() {
        return (TransformerMap) transformer;
    }

    /**
     * Set the TransformerMap
     * @param map TransformerMap
     */
    public void setTransformerMap(TransformerMap map) {
        transformer = map;
    }
}
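The TransformerMap class itself is not included in this message, so the following is only a guess at the general idea: a lookup keyed by class that turns mixed objects into doubles. ClassKeyedTransformers and its methods are hypothetical names, and the fallback for Number instances is an assumption.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.ToDoubleFunction;

/** Hypothetical stand-in for a class-keyed object-to-double mapping. */
class ClassKeyedTransformers {
    private final Map<Class<?>, ToDoubleFunction<Object>> byClass = new HashMap<>();

    /** Register a conversion for one concrete class. */
    <T> void register(Class<T> type, ToDoubleFunction<? super T> fn) {
        byClass.put(type, o -> fn.applyAsDouble(type.cast(o)));
    }

    /** Convert an object to a double, or fail loudly if no conversion is known. */
    double transform(Object o) {
        if (o == null) {
            throw new IllegalArgumentException("Cannot transform null");
        }
        if (o instanceof Number) {
            return ((Number) o).doubleValue(); // assumed default for numeric types
        }
        ToDoubleFunction<Object> fn = byClass.get(o.getClass());
        if (fn == null) {
            throw new IllegalArgumentException(
                "No transformer registered for " + o.getClass().getName());
        }
        return fn.applyAsDouble(o);
    }
}
```

With something like this, a list holding both Integer and String elements could feed one set of statistics once a String conversion is registered.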
package org.apache.commons.math.stat;

import org.apache.commons.math.util.ContractableDoubleArray;

/**
 * @author <a href="mailto:[EMAIL PROTECTED]">Tim O'Brien</a>
 */
public class StoreUnivariateImpl extends AbstractStoreUnivariate {

    /**
     * A contractable double array is used. Memory is reclaimed when
     * the storage of the array becomes too empty.
     */
    protected ContractableDoubleArray eDA;

    /**
     * Construct a StoreUnivariateImpl
     */
    public StoreUnivariateImpl() {
        eDA = new ContractableDoubleArray();
    }

    /**
     * @see org.apache.commons.math.stat.StoreUnivariate#getValues()
     */
    public double[] getValues() {
        double[] copiedArray = new double[eDA.getNumElements()];
        System.arraycopy(
            eDA.getElements(),
            0,
            copiedArray,
            0,
            eDA.getNumElements());
        return copiedArray;
    }

    /**
     * @see org.apache.commons.math.stat.StoreUnivariate#getElement(int)
     */
    public double getElement(int index) {
        return eDA.getElement(index);
    }

    /**
     * @see org.apache.commons.math.stat.Univariate#getN()
     */
    public int getN() {
        return eDA.getNumElements();
    }

    /**
     * @see org.apache.commons.math.stat.Univariate#addValue(double)
     */
    public synchronized void addValue(double v) {
        if (windowSize != Univariate.INFINITE_WINDOW) {
            if (getN() == windowSize) {
                // Window is full: drop the oldest value, append the new one
                eDA.addElementRolling(v);
            } else if (getN() < windowSize) {
                eDA.addElement(v);
            } else {
                String msg =
                    "A window Univariate had more elements than "
                        + "the windowSize.  This is an inconsistent state.";
                throw new RuntimeException(msg);
            }
        } else {
            eDA.addElement(v);
        }
    }

    /**
     * @see org.apache.commons.math.stat.Univariate#clear()
     */
    public synchronized void clear() {
        eDA.clear();
    }

    /**
     * @see org.apache.commons.math.stat.Univariate#setWindowSize(int)
     */
    public synchronized void setWindowSize(int windowSize) {
        this.windowSize = windowSize;

        // If the new windowSize is less than the current number of
        // elements, discard the surplus from the front of the array.
        if (windowSize < eDA.getNumElements()) {
            eDA.discardFrontElements(eDA.getNumElements() - windowSize);
        }
    }

    /**
     * @see org.apache.commons.math.stat.AbstractUnivariate#internalValues()
     */
    protected double[] internalValues() {
        return eDA.getValues();
    }

    /**
     * @see org.apache.commons.math.stat.AbstractUnivariate#start()
     */
    protected int start() {
        return eDA.start();
    }

    /**
     * @see org.apache.commons.math.stat.AbstractUnivariate#size()
     */
    protected int size() {
        return eDA.getNumElements();
    }
}
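The windowing rule in addValue above (append until the window is full, then roll) can be tried out without ContractableDoubleArray. The sketch below swaps in java.util.ArrayDeque for the backing store. RollingWindow is a hypothetical name, and the value -1 for INFINITE_WINDOW is an assumption, since the constant's value does not appear in this message.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical rolling store; ArrayDeque stands in for ContractableDoubleArray. */
class RollingWindow {
    /** Assumed sentinel for "no window"; the real constant's value is not shown here. */
    static final int INFINITE_WINDOW = -1;

    private final int windowSize;
    private final Deque<Double> values = new ArrayDeque<>();

    RollingWindow(int windowSize) {
        if (windowSize != INFINITE_WINDOW && windowSize < 1) {
            throw new IllegalArgumentException("windowSize must be positive or INFINITE_WINDOW");
        }
        this.windowSize = windowSize;
    }

    /** Append a value; once the window is full, the oldest value is dropped first. */
    synchronized void addValue(double v) {
        if (windowSize != INFINITE_WINDOW && values.size() == windowSize) {
            values.removeFirst();
        }
        values.addLast(v);
    }

    /** Returns a copy, oldest value first, so callers cannot alter the store. */
    synchronized double[] getValues() {
        double[] copy = new double[values.size()];
        int i = 0;
        for (double d : values) {
            copy[i++] = d;
        }
        return copy;
    }

    synchronized int getN() {
        return values.size();
    }
}
```

Adding 1 through 5 to a window of size 3 leaves 3, 4, 5 in the store, which is the behaviour addElementRolling is relied on for above.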
package org.apache.commons.math.stat;

import java.io.Serializable;

import org.apache.commons.math.util.FixedDoubleArray;

/**
 *
 * Accumulates univariate statistics for values fed in
 * through the addValue() method.  Does not store raw data values.
 * All data are represented internally as doubles.
 * Integers, floats and longs can be added, but they will be converted
 * to doubles by addValue().
 *
 * @author Phil Steitz
 * @author <a href="mailto:[EMAIL PROTECTED]">Tim O'Brien</a>
 * @author <a href="mailto:[EMAIL PROTECTED]">Mark Diggory</a>
 * @author Brent Worden
 * @author <a href="mailto:[EMAIL PROTECTED]">Albert Davidson Chou</a>
 * @version $Revision: 1.1 $ $Date: 2003/07/04 05:58:16 $
 *
 */
public class UnivariateImpl
    extends AbstractUnivariate
    implements Univariate, Serializable {

    /** fixed storage */
    private FixedDoubleArray storage = null;

    /** Creates new univariate with an infinite window */
    public UnivariateImpl() {
        super();
    }

    /**
     * Creates a new univariate with a fixed window
     * @param window Window Size
     */
    public UnivariateImpl(int window) {
        super(window);
        storage = new FixedDoubleArray(window);
    }

    /**
     * If windowSize is set to Infinite, moments
     * are calculated using the following
     * <a href="http://www.spss.com/tech/stat/Algorithms/11.5/descriptives.pdf">
     * recursive strategy
     * </a>.
     * Otherwise, stat methods delegate to StatUtils.
     * @see org.apache.commons.math.stat.Univariate#addValue(double)
     */
    public void addValue(double value) {

        if (storage != null) {
            /* then all getters delegate to StatUtils
             * and this clause simply adds/rolls a value in the storage array
             */
            if (getWindowSize() == n) {
                storage.addElementRolling(value);
            } else {
                n++;
                storage.addElement(value);
            }

        } else {
            /* If the windowSize is infinite don't store any values and there
             * is no need to discard the influence of any single item.
             */
            n++;
            min.increment(value);
            max.increment(value);
            sum.increment(value);
            sumsq.increment(value);
            sumLog.increment(value);
            mean.increment(value);
            geoMean.increment(value);
            variance.increment(value);
            skewness.increment(value);
            kurtosis.increment(value);
        }
    }

    /**
     * Generates a text report displaying
     * univariate statistics from values that
     * have been added.
     * @return String with line feeds displaying statistics
     */
    public String toString() {
        StringBuffer outBuffer = new StringBuffer();
        outBuffer.append("UnivariateImpl:\n");
        outBuffer.append("n: " + getN() + "\n");
        outBuffer.append("min: " + getMin() + "\n");
        outBuffer.append("max: " + getMax() + "\n");
        outBuffer.append("mean: " + getMean() + "\n");
        outBuffer.append("std dev: " + getStandardDeviation() + "\n");
        outBuffer.append("skewness: " + getSkewness() + "\n");
        outBuffer.append("kurtosis: " + getKurtosis() + "\n");
        return outBuffer.toString();
    }

    /**
     * @see org.apache.commons.math.stat.Univariate#clear()
     */
    public void clear() {
        super.clear();
        if (getWindowSize() != INFINITE_WINDOW) {
            storage = new FixedDoubleArray(getWindowSize());
        }
    }

    /**
     * @see org.apache.commons.math.stat.AbstractUnivariate#internalValues()
     */
    protected double[] internalValues() {
        return storage == null ? null : storage.getValues();
    }

    /**
     * @see org.apache.commons.math.stat.AbstractUnivariate#start()
     */
    protected int start() {
        return storage.start();
    }

    /**
     * @see org.apache.commons.math.stat.AbstractUnivariate#size()
     */
    protected int size() {
        return storage.getNumElements();
    }
}
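To make the storeless (infinite window) branch of addValue more concrete, here is a small sketch of an accumulator whose increment(value) updates mean and variance without keeping the data. It uses a Welford-style recurrence, which is a standard technique and not necessarily the exact formula behind the mean and variance objects used above. RunningMoments is a hypothetical name.

```java
/** Hypothetical storeless accumulator using a Welford-style update. */
class RunningMoments {
    private long n;
    private double mean;
    private double m2; // running sum of squared deviations from the mean

    /** Fold one value into the running statistics; the value itself is not stored. */
    void increment(double x) {
        n++;
        double delta = x - mean;
        mean += delta / n;
        m2 += delta * (x - mean);
    }

    long getN() {
        return n;
    }

    double getMean() {
        return n == 0 ? Double.NaN : mean;
    }

    /** Sample variance with the n - 1 divisor; 0 for one value, NaN for none. */
    double getVariance() {
        if (n == 0) {
            return Double.NaN;
        }
        return n == 1 ? 0.0 : m2 / (n - 1);
    }
}
```

Memory use stays constant no matter how many values are added, which is why this style only works for the infinite window: the influence of an old value cannot be removed once it has been folded in.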
