OK.

I wanted to understand the advantage of having a container class for all
storeless stats (just as DescriptiveStatistics is for univariate stats). I
could open another email thread for that.
I also wanted to understand what the abstract interface problem is that you
were referring to.

thanks
murthy

On Tue, Oct 14, 2014 at 9:47 AM, Phil Steitz <phil.ste...@gmail.com> wrote:

> On 10/13/14 8:55 PM, venkatesha murthy wrote:
> > On Tue, Oct 14, 2014 at 6:05 AM, Phil Steitz <phil.ste...@gmail.com> wrote:
> >
> >> On 10/13/14 1:04 PM, venkatesha murthy wrote:
> >>> Adding a bit more on this:
> >>> a) The DescriptiveStatisticalSummary actually handles the rest of the
> >>> functions such as addValue, getPercentile etc.
> >>> b) I have added addValue() as it is important to see either storeless
> >>> or stored variants as interfaces.
> >>> c) A case in point for (b): I was actually trying out lock-based and
> >>> lock-free variants of the descriptive statistical summary, and it was
> >>> very concise and consistent to use an interface that has all the common
> >>> functions across all variants.
> >>> d) The lock-based and lock-free variants are not a part of this patch,
> >>> as I am still working through them.
> >>>
> >>> However, I feel that getPercentile can definitely add value. Please let
> >>> me know if I could turn all the relevant methods of
> >>> DescriptiveStorelessStatistics (such as kurtosis, skewness etc.) into
> >>> statistical summary methods, and then we could just use
> >>> SummaryStatistics.
> >> I am not sure I understand what you are proposing.  Currently, we
> >> have two statistical "aggregates" for descriptive univariate stats:
> >> SummaryStatistics - aggregates "storeless" statistics over a stream
> >> of data that is not stored in memory
> >> DescriptiveStatistics - provides an extended set of statistics, some
> >> of which require that the full set of data be stored in memory
> >>
> > OK. I am sorry for the confusion here. I understand the intent now.
> > However, what I wanted to convey was that all the statistics that
> > are supported in the current DescriptiveStatistics can be supported in a
> > storeless variant as well (e.g. skewness, kurtosis, percentile).
>
> No. For example, exact percentiles, or even arbitrary percentiles
> (without the quantile, e.g. a quartile, specified in advance), can't
> be computed without storing the data.  Also, DescriptiveStatistics
> supports a rolling window, and the stats it implements can make use of
> multi-pass algorithms.
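
[Editor's sketch] The distinction drawn above can be illustrated in plain Java. This is only a minimal sketch under stated assumptions: StorelessSketch, RunningMoments and exactPercentile are hypothetical names, not Commons Math classes, and the nearest-rank percentile definition is chosen purely for brevity. The mean and variance are updated one value at a time (Welford's method) without keeping the data, whereas an exact percentile for an arbitrary p needs the whole sample.

```java
import java.util.Arrays;

/** Hypothetical illustration; none of these types exist in Commons Math. */
final class StorelessSketch {

    /** Storeless: mean and variance updated per value (Welford's method). */
    static final class RunningMoments {
        private long n;
        private double mean;
        private double m2; // running sum of squared deviations from the mean

        void addValue(double x) {
            n++;
            double delta = x - mean;
            mean += delta / n;
            m2 += delta * (x - mean);
        }

        long getN() { return n; }

        double getMean() { return n == 0 ? Double.NaN : mean; }

        /** Bias-corrected sample variance; NaN when empty, 0 for one value. */
        double getVariance() {
            if (n == 0) {
                return Double.NaN;
            }
            return n == 1 ? 0.0 : m2 / (n - 1);
        }
    }

    /** Stored: an exact percentile for arbitrary p needs the full sample. */
    static double exactPercentile(double[] data, double p) {
        if (data.length == 0 || p <= 0 || p > 100) {
            throw new IllegalArgumentException("need data and 0 < p <= 100");
        }
        double[] sorted = data.clone(); // the whole data set is held in memory
        Arrays.sort(sorted);
        // Nearest-rank definition, an assumption made for simplicity
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[rank - 1];
    }
}
```

RunningMoments uses constant memory however many values are added, while exactPercentile copies and sorts the full array on every call. An estimator such as PSquarePercentile sits in between: it avoids storing the data, but only for a quantile fixed in advance.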
>
> >
> > Therefore, what I was proposing is to have a common interface that can
> > have all these methods too, e.g. (we can change the name if needed):
> >
> > public interface DescriptiveStatisticalSummary<S extends UnivariateStatistics>
> >         extends StatisticalSummary {
> >     double getKurtosis();
> >     double getPercentile(double p);
> >     double getSkewness();
> >     // Add mutation methods as well
> >     void addValue(double d);
> >     // Provide additional builder methods for injecting custom
> >     // percentile, kurtosis, skewness, variance etc.
> >     DescriptiveStatisticalSummary<S> withPercentile(S percentile);
> >     DescriptiveStatisticalSummary<S> withKurtosis(S kurtosis);
> > }
>
> Per comments above, the contracts of these aggregates are
> different.  We have also moved away from defining abstract
> interfaces as these end up creating problems when we want to add
> things (as in the subject of this thread).
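
[Editor's sketch] One way to read the interface concern above, shown as a minimal plain-Java sketch. All names here (InterfaceEvolutionSketch, Summary, UserSummary, ConcreteSummary) are hypothetical and chosen only for illustration; this is an assumption about the kind of problem meant, not a statement of the project's policy.

```java
/** Hypothetical types for illustration; none of these exist in Commons Math. */
final class InterfaceEvolutionSketch {

    /** Version 1 of a published interface. */
    interface Summary {
        double getMean();
        // Adding "double getMedian();" here in a later release would stop
        // every existing implementation (such as UserSummary below) from
        // compiling until it is updated.
    }

    /** An implementation written by a user against version 1. */
    static final class UserSummary implements Summary {
        @Override
        public double getMean() {
            return 0.0;
        }
    }

    /** A concrete class can gain methods without breaking its callers. */
    static class ConcreteSummary {
        private long n;
        private double sum;

        void addValue(double x) {
            n++;
            sum += x;
        }

        double getMean() {
            return n == 0 ? Double.NaN : sum / n;
        }

        // Imagine this was added in a later release: code written against
        // the earlier version of the class still compiles unchanged.
        long getN() {
            return n;
        }
    }
}
```

With the interface, every new statistic is a breaking change for anyone who implemented it; with the concrete class, a new accessor is purely additive.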
>
> Phil
> >
> >> The subject of this thread was a proposal to add quartiles to
> >> SummaryStatistics, as the new(ish) PSquarePercentile allows those
> >> statistics to be computed without storing the data.
> >>
> > Agreed. I was just adding points on how we can bring both
> > DescriptiveStatistics and SummaryStatistics under a common interface for
> > all the stats.
> >
> >> Phil
> >>> On Tue, Oct 14, 2014 at 1:15 AM, venkatesha murthy <
> >>> venkateshamurth...@gmail.com> wrote:
> >>>
> >>>> Hi Phil,
> >>>>
> >>>> Though I did not add to StatisticalSummary, I was actually working on a
> >>>> DescriptiveStatisticalSummary for all the storeless variants, inclusive
> >>>> of PSquarePercentile. Would it help if you implemented
> >>>> SummaryStatistics with an extended interface such as
> >>>> DescriptiveStatisticalSummary, described below?
> >>>>
> >>>> That said, I actually wanted to discuss the new storeless variant of
> >>>> descriptive statistics:
> >>>> a) DescriptiveStatisticalSummary - an extended interface for
> >>>> StatisticalSummary (adds a generic type that can cater for both stored
> >>>> and storeless variants)
> >>>> b) DescriptiveStorelessStatistics - storeless variant of
> >>>> DescriptiveStatistics
> >>>> c) SynchronizedDescriptiveStorelessStatistics - a synchronized wrapper.
> >>>>
> >>>> Test case classes are added for the same.
> >>>>
> >>>> Please let me know about this; I could also accommodate the changes to
> >>>> summary stats based on this change.
> >>>> Also, please let me know if this could be raised as a JIRA ticket to
> >>>> pursue.
> >>>> Thanks
> >>>> Murthy
> >>>>
> >>>> On Sat, Oct 11, 2014 at 1:10 AM, Phil Steitz <phil.ste...@gmail.com>
> >>>> wrote:
> >>>>
> >>>>> Now that we have a "storeless" percentile estimator, we can add
> >>>>> quartile computation to SummaryStatistics.  Any objections to my
> >>>>> adding this?  I could optionally add a boolean constructor argument
> >>>>> to avoid the overhead of maintaining these stats.  Or more
> >>>>> generally, add a bitfield encoding the exact set of stats the user
> >>>>> wants to maintain.  If there are no objections to the addition, I
> >>>>> will open a JIRA.
> >>>>>
> >>>>> Phil
> >>>>>
> >>>>>
> >>>>> ---------------------------------------------------------------------
> >>>>> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> >>>>> For additional commands, e-mail: dev-h...@commons.apache.org
> >>>>>
> >>>>>
