+1 from me, on all counts! -Greg
On Fri, Aug 12, 2011 at 11:30 PM, Phil Steitz <[email protected]> wrote:
> On 8/12/11 7:16 PM, Greg Sterijevski wrote:
> > Hello All,
> >
> > Before I chum the water with more JIRA tickets, I thought I would see
> > whether people thought this was important.
> >
> > In the SimpleRegression you have two methods:
> >
> > public void addData(double x, double y) {
> > ...some code that is not germane to discussion......
> >
> > if (n > 2) {
> > distribution = new TDistributionImpl(n - 2);
> > }
> > }
> >
> > public void removeData(double x, double y) {
> > ...some code that is not germane......
> >
> > if (n > 2) {
> > distribution = new TDistributionImpl(n - 2);
> > }
> > }
> > }
> >
> > >From the perspective of a user, you are likely to call add/remove
> repeatedly
> > before you ever need to check for statistical significance. Wouldn't it
> be
> > better to instantiate the TDistribution when it is needed?
> >
> > So you would have to make the following two methods a bit more
> complicated:
> >
> > public double getSlopeConfidenceInterval(double alpha)
> > throws MathException {
> > if (alpha >= 1 || alpha <= 0) {
> > throw new
> > OutOfRangeException(LocalizedFormats.SIGNIFICANCE_LEVEL,
> > alpha, 0, 1);
> > }
> > if( distribution == null || distribution.getDegreesOfFreedom() !=
> > n-2){
> > distribution = new TDistributionImpl(n - 2);
> > }
> > return getSlopeStdErr() *
> > distribution.inverseCumulativeProbability(1d - alpha / 2d);
> > }
> >
> > Similarly getSignificance() would have to be modified with the check of
> the
> > degrees of freedom of the distribution.
> >
> > There is nothing wrong with the current code, but making the change means
> > faster updates.
>
> Slightly, yes. There is not much code at all in the distribution
> constructor, but you are correct. Moreover, I can see now that the
> "immutability-everywhere" changes in trunk have made the code in the
> class a little funny. In versions <=2.2, the TDistribution used by
> the class was pluggable - i.e., there was a constructor that took at
> TDistribution as an argument, so if for some reason you wanted to
> use an impl different from TDistributionImpl, you could do that.
> There was also a setter for the distribution. In addition,
> TDistributionImpl itself was mutable, exposing a setDegreesOfFreedom
> method. So the distribution member was set at construction time and
> the data update methods called the setter for DF on the
> distribution. We decided to make the distributions immutable in 3.0
> (search the archives for discussion), so the current mods were done
> to basically accomplish that. But the code should be cleaned up.
> The constructor taking an int is meaningless and should be
> deprecated or removed (unfortunately, we added that in 2.2 and
> advertised it as a deprecation replacement for the version that took
> a distribution as parameter. We should have realized then that it
> was meaningless. My bad for missing that. I would favor dropping it
> in 3.0, since even if anyone is using it, it isn't doing anything
> meaningful for them.) Given that constructing a TDistributionImpl
> is trivial, we might as well eliminate the member field altogether
> and just create one when needed. If you agree and no one else
> objects, I will make these changes. Thanks for reviewing the code.
>
> Phil
> >
> > Thoughts?
> >
> > -Greg
> >
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
>
>