Hi sklearn community,

I'm new to this list, a Python user of many years, and maybe an advanced beginner with scikit-learn, which I used for a previous project. I'll just jump in with my question.

I'm trying to use sklearn.mixture.GMM to fit (fairly) bimodal scalar data. The data values can theoretically vary between 0 and 1. They're represented as a float32 NumPy array. The scalar value is in fact calculated from a combination of two spectral bands (infrared remote sensing), and I'm trying to find the band combination that produces an index that best separates the two modes.

I find that for some band combinations (and therefore histograms), GMM very nicely fits two Gaussians.

Example plots of very good fits:
https://dl.dropboxusercontent.com/u/372734/IMG/boundary_HFDI_GMM_193_216.png
https://dl.dropboxusercontent.com/u/372734/IMG/boundary_HFDI_GMM_191_219.png

Examples of bad fits (that is, one Gaussian dominates with a weight of approx. 99%, and the other one is flat):
https://dl.dropboxusercontent.com/u/372734/IMG/boundary_HFDI_GMM_192_216.png
https://dl.dropboxusercontent.com/u/372734/IMG/boundary_HFDI_GMM_193_212.png

I'm calling the model as follows. The scalar index is called hfdi, and it lives on a 2D grid.

> from sklearn.mixture import GMM
> ...
> g = GMM(n_components=2)
> g.fit(hfdi.flatten())

g.converged_ is nearly always True. I also tried to play with some of the arguments:

> g = GMM(n_components=2, thresh=0.0001, n_init=5, n_iter=1000)

... but with no improvement. The only effect I saw is that if I reduce the threshold too much, I get division-by-zero errors (I think).

I only have about 200 samples. Maybe that's not enough.

Any advice?

Thanks,

Chris Waigl

--
Chris Waigl - cwa...@alaska.edu - +1-907-474-5483 - Skype: cwaigl_work
Geophysical Institute, UAF, 903 Koyukuk Drive, Fairbanks, AK 99775-7320, USA
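
For anyone who wants to reproduce the kind of call described in the message above, here is a minimal sketch. It rests on several assumptions that are not in the original post: the data are synthetic (a hypothetical stand-in for hfdi, about 200 well-separated bimodal values on a 10 x 20 grid), and it uses GaussianMixture, which replaced GMM in later scikit-learn releases, with tol and max_iter standing in for the older thresh and n_iter arguments. The values are reshaped into a single column because scikit-learn estimators expect input of shape (n_samples, n_features). It is only meant to show how to inspect the fitted weights, means, and spreads, not to diagnose the bad fits.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Hypothetical stand-in for the hfdi grid: ~200 bimodal values in [0, 1].
# The real index comes from two spectral bands and is not reproduced here.
rng = np.random.default_rng(0)
values = np.concatenate([
    rng.normal(0.3, 0.05, 100),
    rng.normal(0.7, 0.05, 100),
])
hfdi = np.clip(values, 0.0, 1.0).astype(np.float32).reshape(10, 20)

# One column of scalar samples: shape (n_samples, 1).
X = hfdi.reshape(-1, 1).astype(np.float64)

# n_init restarts EM from several initializations and keeps the best fit.
g = GaussianMixture(n_components=2, n_init=5, max_iter=1000,
                    tol=1e-4, random_state=0)
g.fit(X)

print("converged:", g.converged_)
print("weights:  ", g.weights_)
print("means:    ", g.means_.ravel())
print("std devs: ", np.sqrt(g.covariances_.ravel()))
```

Printing the weights next to the means and standard deviations makes it easy to spot the situation described in the message, where one component takes almost all of the weight and the other is very broad.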
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general