+1
--

Michael W. Dusenberry
GitHub: github.com/dusenberrymw
LinkedIn: linkedin.com/in/mikedusenberry

On Sat, Feb 18, 2017 at 10:04 PM, Niketan Pansare <npan...@us.ibm.com> wrote:

> +1
>
> Thanks,
>
> Niketan
>
> > On Feb 18, 2017, at 10:01 PM, Arvind Surve <ac...@yahoo.com.INVALID> wrote:
> >
> > +1
> >
> > ------------------
> > Arvind Surve
> > Spark Technology Center
> > http://www.spark.tc/
> >
> > From: Felix Schüler <fschue...@posteo.de>
> > To: dev@systemml.incubator.apache.org
> > Sent: Saturday, February 18, 2017 9:42 PM
> > Subject: Re: Weighted Statistical Estimates
> >
> > Sounds good!
> >
> > -Felix
> >
> >> On 18.02.2017 21:20, Matthias Boehm wrote:
> >> Going toward our 1.0 release, I'd like to create consistency across our
> >> weighted statistics. Conceptually, these weights represent frequency
> >> counts, i.e., multiplicities of input values.
> >>
> >> So far, our documentation does not state any restrictions on these weights
> >> but some runtime operations require integer data (I), while others allow
> >> arbitrary floating point data as indicated below:
> >>
> >> * moment
> >> * cov
> >> * aggregate
> >> * table
> >> * median (I)
> >> * quantile (I)
> >> * interQuartileMean (I)
> >>
> >> This can lead to unexpected errors as shown by recent issues such as
> >> SYSTEMML-1265. Looking back to R and its packages like Hmisc or reldist, it
> >> turns out that they all allow arbitrary weights.
> >>
> >> So, relaxing any restrictions of integer weights seems like the right
> >> choice. As this changes the external behavior - albeit in a generalizing
> >> manner - we should make this change now. If you have any concerns, let me
> >> know.
> >>
> >> Regards,
> >> Matthias
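
For illustration only, here is a rough Python sketch of what a quantile over arbitrary non-negative weights could look like. It is not SystemML's implementation, and it does not claim to match the exact definitions used by Hmisc or reldist. The function name weighted_quantile and the "smallest value whose cumulative weight reaches p * total" rule are assumptions made for this example. With integer weights, the rule gives the same answer as repeating each value by its count and taking the lower empirical quantile, so integer weights remain a special case.

```python
def weighted_quantile(values, weights, p):
    """Return the smallest value whose cumulative weight reaches p * total.

    Hypothetical helper for illustration; weights may be any
    non-negative floats, not only integer frequency counts.
    """
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")

    # Sort by value so that weights can be accumulated in value order.
    pairs = sorted(zip(values, weights))
    total = sum(w for _, w in pairs)
    if total <= 0:
        raise ValueError("total weight must be positive")

    threshold = p * total
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        # Skip zero-weight entries: they carry no mass in the distribution.
        if weight > 0 and cumulative >= threshold:
            return value
    # Fallback for floating-point round-off near p == 1.
    return pairs[-1][0]


# Fractional weights are accepted just like integer counts.
median_fractional = weighted_quantile([10, 20, 30], [0.5, 1.5, 2.0], 0.5)
median_integer = weighted_quantile([1, 2, 3], [1, 2, 1], 0.5)
```

The only change from an integer-only version is that nothing in the loop assumes the weights are whole numbers; the cumulative sum and the threshold are both plain floating-point values.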