Hi Dan,
I would have thought that it is the relative scaling that is important, not
the overall scaling. I.e. each feature of your data set should have zero
mean and unit variance.
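
A minimal sketch of that per-feature standardization, assuming plain NumPy and
a hypothetical helper called standardize (not part of your script or of
sklearn). It also shows that an overall rescaling of the input drops out once
every feature has been standardized:

```python
import numpy as np

def standardize(X):
    """Rescale each column (feature) to zero mean and unit variance."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # leave constant features as-is rather than divide by zero
    return (X - mean) / std

# Illustrative data only: five features with very different spreads.
rng = np.random.RandomState(0)
data = rng.randn(1000, 5) * [0.01, 1, 100, 10, 0.1]

z = standardize(data)
# Multiplying the raw data by a constant gives the same standardized data,
# up to floating-point rounding.
z_scaled = standardize(data * 100)
```
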
Martin
On 31 October 2012 16:09, bthirion <[email protected]> wrote:
> On 10/31/2012 04:50 PM, Dan Stowell wrote:
> > Hi all,
> >
> > I'm still getting odd results using mixture.GMM depending on data
> > scaling. In the following code example, I change the overall scaling but
> > I do NOT change the relative scaling of the dimensions. Yet under the
> > three different scaling settings I get completely different results:
> >
> > ------------
> > from sklearn.mixture import GMM
> > from numpy import array, shape
> > from numpy.random import randn
> > from random import choice
> >
> > # centroids will be normally-distributed around zero:
> > truelumps = randn(20, 5) * 10
> >
> > # data randomly sampled from the centroids:
> > data = array([choice(truelumps) + randn(5) for _ in xrange(1000)])
> >
> > for scaler in [0.01, 1, 100]:
> >     scdata = data * scaler
> >     thegmm = GMM(n_components=10)
> >     thegmm.fit(scdata, n_iter=1000)
> >     ll = thegmm.score(scdata)
> >     print sum(ll)
> > ------------
> >
> > Here's the output I get:
> >
> > GMM(cvtype='diag', n_components=10)
> > 7094.87886779
> > GMM(cvtype='diag', n_components=10)
> > -14681.566456
> > GMM(cvtype='diag', n_components=10)
> > -37576.4496656
> >
> >
> > In principle, I don't think the overall data scaling should matter, but
> > maybe there's an implementation issue I'm overlooking?
> >
> > Thanks
> > Dan
> Hi Dan,
>
> But even if the solution is the same, you expect the likelihood value to
> change, i.e. it is offset by something like 0.5 * n_dim * n_samples *
> log(scale). I'm not surprised by your result.
>
> Bertrand
>
>
> ------------------------------------------------------------------------------
> Everyone hates slow websites. So do we.
> Make your web apps faster with AppDynamics
> Download AppDynamics Lite for free today:
> http://p.sf.net/sfu/appdyn_sfd2d_oct
> _______________________________________________
> Scikit-learn-general mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>
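
Bertrand's point about the likelihood offset can be checked numerically. The
sketch below is an assumption-laden stand-in, not the GMM code from the thread:
it fits a single diagonal Gaussian by maximum likelihood (the hypothetical
helper diag_gaussian_loglik) instead of a 10-component mixture, because the
closed-form fit is exactly scale-equivariant. If scale is taken to be the
factor s applied to the data, the total log-likelihood shifts by
-n_samples * n_dim * log(s); written in terms of the variance factor s**2,
the magnitude is 0.5 * n_samples * n_dim * log(s**2).

```python
import numpy as np

def diag_gaussian_loglik(X):
    """Total log-likelihood of X under one diagonal Gaussian fitted by MLE."""
    n, d = X.shape
    var = X.var(axis=0)  # maximum-likelihood variance of each dimension
    # At the MLE the squared standardized residuals sum to n per dimension,
    # which contributes the trailing "+ d" term.
    return -0.5 * n * (d * np.log(2 * np.pi) + np.log(var).sum() + d)

# Illustrative data with the same shape as in the thread (1000 samples, 5 dims).
rng = np.random.RandomState(0)
data = rng.randn(1000, 5)
n, d = data.shape
base = diag_gaussian_loglik(data)

for s in [0.01, 1, 100]:
    shift = diag_gaussian_loglik(data * s) - base
    expected = -n * d * np.log(s)
    print(s, shift, expected)
```

With 1000 samples and 5 dimensions, a factor of 100 gives an expected shift of
about 5000 * log(100), roughly 23026. The totals Dan printed differ by about
21776 and 22895 between successive scalings, which is close to that figure.
For a mixture fitted by EM the match need not be exact, since it relies on the
fit converging to the equivalent solution at every scale.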