Re: [Numpy-discussion] Adding weights to cov and corrcoef

Sebastian Berg Thu, 06 Mar 2014 16:33:11 -0800

On Thu, 2014-03-06 at 19:51 +0000, Nathaniel Smith wrote:
> On Wed, Mar 5, 2014 at 4:45 PM, Sebastian Berg
> <[email protected]> wrote:
> >
> > Hi all,
> >
> > in Pull Request https://github.com/numpy/numpy/pull/3864 Noel Dawe
> > suggested adding new parameters to our `cov` and `corrcoef` functions to
> > implement weights, which already exist for `average` (the PR still
> > needs to be adapted).
> >
> > The idea right now would be to add `weights` and `frequencies`
> > keyword arguments to these functions.
> >
> > In more detail: The situation is a bit more complex for `cov` and
> > `corrcoef` than `average`, because there are different types of weights.
> > The current plan would be to add two new keyword arguments:
> >   * weights: Uncertainty weights, which cause `N` to be recalculated
> >     accordingly (this is R's `cov.wt` default, I believe).
> >   * frequencies: When given, `N = sum(frequencies)` and the values
> >     are weighted by their frequency.
> 
> I don't understand this description at all. One of them recalculates N,
> and the other sets N according to some calculation?
> 
> Is there a standard reference on how these are supposed to be
> interpreted? When you talk about per-value uncertainties, I start
> imagining that we're trying to estimate a population covariance given
> a set of samples each corrupted by independent measurement noise, and
> then there's some natural hierarchical Bayesian model one could write
> down and get an ML estimate of the latent covariance via empirical
> Bayes or something. But this requires a bunch of assumptions and is
> that really what we want to do? (Or maybe it collapses down into
> something simpler if the measurement noise is gaussian or something?)
>


I had really hoped someone who knows this stuff very well would show
up ;).

I think these weights were meant as uncertainties under a Gaussian
assumption, and the other type of weights is something different; see
`aweights` here:
http://www.stata.com/support/faqs/statistics/weights-and-summary-statistics/
I have not checked a statistics book, though, and do not have one here
right now (Wikipedia, for example, is less than helpful).
Frankly, unless there is some "obviously right" thing (for a
statistician), I would be careful about adding such new features. And
while I thought before that this might be the case, it isn't clear to me
now.
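
To make the two readings of the weights concrete, here is a minimal
sketch, not what the PR implements. `weighted_cov` is a hypothetical
helper, and the normalization used for "weights" (sum(w) minus
sum(w**2)/sum(w)) is only one common convention for
reliability-style weights; whether it is the right one is exactly the
open question. The "frequencies" branch uses `N = sum(frequencies)`
as described in the proposal.

```python
import numpy as np

def weighted_cov(x, w, kind):
    """Hypothetical helper: covariance of `x` (rows are variables,
    columns are observations) with one weight per observation."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    v1 = w.sum()
    # Weighted mean of each variable.
    mean = (x * w).sum(axis=1, keepdims=True) / v1
    d = x - mean
    # Weighted scatter matrix, shape (n_variables, n_variables).
    scatter = (d * w) @ d.T
    if kind == "frequencies":
        # N = sum(frequencies); same as repeating each observation.
        norm = v1 - 1.0
    elif kind == "weights":
        # Assumed convention: invariant to rescaling the weights.
        norm = v1 - (w ** 2).sum() / v1
    else:
        raise ValueError("kind must be 'frequencies' or 'weights'")
    return scatter / norm

x = np.array([[1.0, 2.0, 4.0, 7.0],
              [2.0, 1.0, 5.0, 6.0]])
f = np.array([1, 3, 2, 1])

# Frequencies behave like physically repeated observations.
cov_freq = weighted_cov(x, f, "frequencies")
cov_repeat = np.cov(np.repeat(x, f, axis=1))

# Under the assumed convention, only the relative size of the
# weights matters.
cov_w = weighted_cov(x, [0.5, 1.0, 2.0, 1.0], "weights")
cov_w_scaled = weighted_cov(x, [5.0, 10.0, 20.0, 10.0], "weights")
```

With all weights equal to one, both branches reduce to the ordinary
`np.cov` result, so the two only differ in how `N` is derived from
unequal weights.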

- Sebastian


> -n
> 


_______________________________________________
NumPy-Discussion mailing list
[email protected]
http://mail.scipy.org/mailman/listinfo/numpy-discussion
