Opened a ticket. Submitted patches. -Greg

On Sat, Aug 20, 2011 at 4:47 PM, Phil Steitz <phil.ste...@gmail.com> wrote:

> On 8/12/11 9:30 PM, Phil Steitz wrote:
> > On 8/12/11 7:16 PM, Greg Sterijevski wrote:
> >> Hello All,
> >>
> >> Before I chum the water with more JIRA tickets, I thought I would see
> >> whether people thought this was important.
> >>
> >> In the SimpleRegression you have two methods:
> >>
> >> public void addData(double x, double y) {
> >>   ...some code that is not germane to discussion......
> >>
> >>         if (n > 2) {
> >>             distribution = new TDistributionImpl(n - 2);
> >>         }
> >>     }
> >>
> >>     public void removeData(double x, double y) {
> >>   ...some code that is not germane......
> >>
> >>             if (n > 2) {
> >>                 distribution = new TDistributionImpl(n - 2);
> >>             }
> >>         }
> >>     }
> >>
> >> >From the perspective of a user, you are likely to call add/remove
> repeatedly
> >> before you ever need to check for statistical significance. Wouldn't it
> be
> >> better to instantiate the TDistribution when it is needed?
> >>
> >> So you would have to make the following two methods a bit more
> complicated:
> >>
> >>  public double getSlopeConfidenceInterval(double alpha)
> >>         throws MathException {
> >>         if (alpha >= 1 || alpha <= 0) {
> >>             throw new
> >> OutOfRangeException(LocalizedFormats.SIGNIFICANCE_LEVEL,
> >>                                           alpha, 0, 1);
> >>         }
> >>         if( distribution == null || distribution.getDegreesOfFreedom()
> !=
> >> n-2){
> >>           distribution = new TDistributionImpl(n - 2);
> >> }
> >>         return getSlopeStdErr() *
> >>             distribution.inverseCumulativeProbability(1d - alpha / 2d);
> >>     }
> >>
> >> Similarly getSignificance() would have to be modified with the check of
> the
> >> degrees of freedom of the distribution.
> >>
> >> There is nothing wrong with the current code, but making the change
> means
> >> faster updates.
> > Slightly, yes.  There is not much code at all in the distribution
> > constructor, but you are correct.  Moreover, I can see now that the
> > "immutability-everywhere" changes in trunk have made the code in the
> > class a little funny.  In versions <=2.2, the TDistribution used by
> > the class was pluggable - i.e., there was a constructor that took at
> > TDistribution as an argument, so if for some reason you wanted to
> > use an impl different from TDistributionImpl, you could do that.
> > There was also a setter for the distribution. In addition,
> > TDistributionImpl itself was mutable, exposing a setDegreesOfFreedom
> > method.  So the distribution member was set at construction time and
> > the data update methods called the setter for DF on the
> > distribution.  We decided to make the distributions immutable in 3.0
> > (search the archives for discussion), so the current mods were done
> > to basically accomplish that.  But the code should be cleaned up.
> > The constructor taking an int is meaningless and should be
> > deprecated or removed (unfortunately, we added that in 2.2 and
> > advertised it as a deprecation replacement for the version that took
> > a distribution as parameter.  We should have realized then that it
> > was meaningless.  My bad for missing that. I would favor dropping it
> > in 3.0, since even if anyone is using it, it isn't doing anything
> > meaningful for them.)  Given that constructing a TDistributionImpl
> > is trivial, we might as well eliminate the member field altogether
> > and just create one when needed.  If you agree and no one else
> > objects, I will make these changes.  Thanks for reviewing the code.
>
> Tracked as MATH-648, fixed in r1159918.
>
> Phil
> >
> > Phil
> >> Thoughts?
> >>
> >> -Greg
> >>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
>
>

Reply via email to