Yes, thank you! As you stated, it should be possible to fit data in a single column with n components (given enough data samples, of course). From what I can see, having your fix would allow me to port this code over.
Thx!

________________________________
From: Alex Herbert <alex.d.herb...@gmail.com>
Sent: Monday, March 4, 2024 5:06 PM
To: Commons Users List <user@commons.apache.org>
Subject: [External] - Re: MultivariateNormalMixtureExpectationMaximization only 1 dimension

Hi,

I think this is a bug in the MultivariateNormalMixtureExpectationMaximization class. When I update the code to allow 1 column in the rows it outputs a similar fit to Matlab.

Here's an example from Matlab:

X = [normrnd(0, 1, 100, 1); normrnd(2, 2, 100, 1)]
GMModel = fitgmdist(X,2);

>> GMModel.mu
ans =
    0.0737
    3.0914

>> GMModel.ComponentProportion
ans =
    0.6750    0.3250

>> GMModel.Sigma
ans(:,:,1) =
    1.0505
ans(:,:,2) =
    1.6593

I pasted the same X data into a test for MultivariateNormalMixtureExpectationMaximization that had been updated to allow data with a single column and get the following fit:

MultivariateNormalMixtureExpectationMaximization fitter =
    new MultivariateNormalMixtureExpectationMaximization(data);
MixtureMultivariateNormalDistribution initialMix =
    MultivariateNormalMixtureExpectationMaximization.estimate(data, 2);
fitter.fit(initialMix);
MixtureMultivariateNormalDistribution fittedMix = fitter.getFittedModel();
List<Pair<Double, MultivariateNormalDistribution>> components = fittedMix.getComponents();
for (Pair<Double, MultivariateNormalDistribution> component : components) {
    final double weight = component.getFirst();
    final MultivariateNormalDistribution mvn = component.getSecond();
    final double[] mean = mvn.getMeans();
    final RealMatrix covMat = mvn.getCovariances();
    System.out.printf("%s : %s : %s%n", weight, Arrays.toString(mean), covMat.toString());
}

0.6420433138817465 : [0.016942587744259194] : Array2DRowRealMatrix{{0.9929681356}}
0.3579566861182536 : [2.9152176347671754] : Array2DRowRealMatrix{{1.8940290549}}

The numbers are close enough to indicate that the fit is valid.

I think the error has been in assuming that because you require 2 components to have a mixture model then you must have 2 columns in the input data. However this is not true. You can fit single dimension data with a mixture of single Gaussians.

Is this the functionality that you are expecting?

Regards,

Alex

On Mon, 4 Mar 2024 at 20:48, Craig Brautigam <cbrauti...@icr-team.com> wrote:

> Forgive me if this comes in twice... I did not subscribe first before
> sending the message below.
>
>
> ________________________________
> From: Craig Brautigam
> Sent: Monday, March 4, 2024 1:33 PM
> To: user@commons.apache.org <user@commons.apache.org>
> Subject: MultivariateNormalMixtureExpectationMaximization only 1 dimension
>
> Hi,
>
> Full disclosure, I'm not a mathematician so I cannot go into the weeds
> of the math. However I am tasked with porting some Matlab code that is
> doing Gaussian mixture modelling to Java. I really want to use Apache
> Commons Math if possible. However the code that I'm porting has 1
> dimension (a single variable/attribute/property) that GMMs are being
> created from.
>
> MultivariateNormalMixtureExpectationMaximization looks to be a pretty
> close drop-in replacement for the Matlab functions
> https://www.mathworks.com/help/stats/fitgmdist.html and
> http://www.mathworks.com/help/stats/gmdistribution.html, however the
> constructor for MultivariateNormalMixtureExpectationMaximization clearly
> states that the number of columns in the double[][] data array MUST be no
> less than 2. I'm completely baffled as to why this is the case if I want
> to try to fit data with 1 dimension in it. Is there a workaround I can use,
> like providing a dummy column of data with all 0s to pacify the constructor?
> Is there another class I should be using?
>
> Any help would be greatly appreciated.
>
> Thx!
>
>
> ________________________________
> The information contained in this e-mail and any attachments from ICR,
> Inc. may contain confidential and/or proprietary information, and is
> intended only for the named recipient to whom it was originally addressed.
> If you are not the intended recipient, any disclosure, distribution, or
> copying of this e-mail or its attachments is strictly prohibited. If you
> have received this e-mail in error, please notify the sender immediately by
> return e-mail and permanently delete the e-mail and any attachments.
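
For readers following the thread: the point that single-dimension data can be fitted with a mixture of single Gaussians can be illustrated without the library. The following is a minimal sketch in plain Java, not the Commons Math implementation. The class name UnivariateGmmSketch, the sorted equal-bin initialisation, the fixed iteration count and the variance floor are all assumptions made for illustration only.

```java
import java.util.Arrays;

/** Illustrative EM fit of a k-component mixture of univariate Gaussians (hypothetical helper). */
final class UnivariateGmmSketch {

    /** Assumed floor on the variance so a component cannot collapse onto one point. */
    private static final double MIN_VARIANCE = 1e-9;

    /** Density of the normal distribution N(mean, variance) at x. */
    static double normalPdf(double x, double mean, double variance) {
        double d = x - mean;
        return Math.exp(-d * d / (2 * variance)) / Math.sqrt(2 * Math.PI * variance);
    }

    /**
     * Runs a fixed number of EM iterations and returns {weights, means, variances}.
     * Initial guess (an assumption of this sketch): sort the data, split it into
     * k equal-sized bins, and use each bin's mean and variance with equal weights.
     * There is no convergence test and no protection against numerical underflow.
     */
    static double[][] fit(double[] data, int k, int iterations) {
        int n = data.length;
        if (k < 1 || n < k) {
            throw new IllegalArgumentException("need at least one sample per component");
        }
        double[] sorted = data.clone();
        Arrays.sort(sorted);

        double[] w = new double[k];
        double[] mu = new double[k];
        double[] var = new double[k];
        for (int j = 0; j < k; j++) {
            int from = j * n / k;
            int to = (j + 1) * n / k;
            double sum = 0;
            for (int i = from; i < to; i++) sum += sorted[i];
            mu[j] = sum / (to - from);
            double ss = 0;
            for (int i = from; i < to; i++) ss += (sorted[i] - mu[j]) * (sorted[i] - mu[j]);
            var[j] = Math.max(ss / (to - from), MIN_VARIANCE);
            w[j] = 1.0 / k;
        }

        double[][] r = new double[n][k]; // r[i][j]: responsibility of component j for sample i
        for (int it = 0; it < iterations; it++) {
            // E-step: posterior probability of each component for each sample.
            for (int i = 0; i < n; i++) {
                double total = 0;
                for (int j = 0; j < k; j++) {
                    r[i][j] = w[j] * normalPdf(data[i], mu[j], var[j]);
                    total += r[i][j];
                }
                for (int j = 0; j < k; j++) r[i][j] /= total;
            }
            // M-step: re-estimate weight, mean and variance from the responsibilities.
            for (int j = 0; j < k; j++) {
                double nj = 0;
                double sum = 0;
                for (int i = 0; i < n; i++) {
                    nj += r[i][j];
                    sum += r[i][j] * data[i];
                }
                mu[j] = sum / nj;
                double ss = 0;
                for (int i = 0; i < n; i++) ss += r[i][j] * (data[i] - mu[j]) * (data[i] - mu[j]);
                var[j] = Math.max(ss / nj, MIN_VARIANCE);
                w[j] = nj / n;
            }
        }
        return new double[][] {w, mu, var};
    }

    public static void main(String[] args) {
        // Toy single-column data: one cluster around 0 and one around 10.
        double[] data = {-1, -0.5, 0, 0.5, 1, 9, 9.5, 10, 10.5, 11};
        double[][] result = fit(data, 2, 50);
        for (int j = 0; j < 2; j++) {
            System.out.printf("%s : %s : %s%n", result[0][j], result[1][j], result[2][j]);
        }
    }
}
```

With well-separated toy data like this, the two fitted means should settle near 0 and 10 with roughly equal weights. Real data would need a convergence check and more careful numerics, which is what the library class is for once it accepts a single column.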