On Mon, 11 Mar 2024 at 18:06, Craig Brautigam <cbrauti...@icr-team.com>
wrote:

> Just bumping this up...Would it be possible to get a fix for this?
>

You are requesting the estimate method of a mixture to support 1 component.
This is not a mixture. IIUC this is the equivalent of fitting a normal
distribution to your data with maximum likelihood estimation.

Validated in Matlab:

>> X = normrnd(4.125, 0.25, 100, 1);
>> fitdist(X,'norm')
ans =
  NormalDistribution

  Normal distribution
       mu = 4.13821   [4.08781, 4.1886]
    sigma = 0.25396   [0.222979, 0.29502]

>> GMModel = fitgmdist(X,1)
GMModel =
Gaussian mixture distribution with 1 components in 1 dimensions
Component 1:
Mixing proportion: 1.000000
Mean:    4.1382
>> sqrt(GMModel.Sigma)
ans =
    0.2527

I've tried this with a few different inputs X and the fitted sigma differs
slightly each time, so Matlab is not simply calling fitdist from within
fitgmdist (at least not with the defaults).

So you can get around this issue by fitting the data with a Normal
distribution. It will be a lot faster than the
MultivariateNormalMixtureExpectationMaximization class.
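
A minimal sketch of that single-Gaussian fit in plain Java, with no Commons
Math dependency. The class and method names (NormalMleSketch, fitNormal) are
made up for illustration. It returns both the maximum likelihood sigma (divide
by n) and the sample sigma (divide by n - 1). As far as I can tell the small
gap between fitdist and fitgmdist above is consistent with that choice of
divisor, since 0.25396 * sqrt(99/100) is about 0.2527, but I have not
confirmed this against the Matlab internals.

```java
import java.util.Random;

final class NormalMleSketch {

    /**
     * Fits a single normal distribution to 1-D data.
     * Returns {mean, sigma (ML, divisor n), sigma (sample, divisor n - 1)}.
     */
    static double[] fitNormal(double[] x) {
        final int n = x.length;
        if (n < 2) {
            throw new IllegalArgumentException("Need at least 2 values");
        }
        // First pass: the mean is the ML estimate of mu.
        double sum = 0;
        for (final double v : x) {
            sum += v;
        }
        final double mean = sum / n;
        // Second pass: sum of squared deviations from the mean.
        double ss = 0;
        for (final double v : x) {
            final double d = v - mean;
            ss += d * d;
        }
        return new double[] {mean, Math.sqrt(ss / n), Math.sqrt(ss / (n - 1))};
    }

    public static void main(String[] args) {
        // Hypothetical data, similar in spirit to normrnd(4.125, 0.25, 100, 1).
        final Random rng = new Random(42);
        final double[] x = new double[100];
        for (int i = 0; i < x.length; i++) {
            x[i] = 4.125 + 0.25 * rng.nextGaussian();
        }
        final double[] fit = fitNormal(x);
        System.out.printf("mu=%.4f sigma(ML)=%.4f sigma(n-1)=%.4f%n",
            fit[0], fit[1], fit[2]);
    }
}
```

This is two passes over the data and needs no iteration or initial estimate,
which is why it should be much cheaper than running EM.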

You could try this:

org.apache.commons.math4.legacy.fitting.GaussianCurveFitter

However that class does fit a normalisation factor in addition to the mean
and standard deviation. If you only wish to fit mean and standard deviation
then you could create your own fitter based on the CurveFitter class that
is extended by GaussianCurveFitter.

See if this works for you.

Regards,

Alex



>
>
> Thx!
>
>
> ________________________________
> From: Craig Brautigam <cbrauti...@icr-team.com>
> Sent: Thursday, March 7, 2024 2:47 PM
> To: Commons Users List <user@commons.apache.org>
> Subject: Re: [External] - Re:
> MultivariateNormalMixtureExpectationMaximization only 1 dimension
>
> Alex,
>
> Your fix seems to be working; however, there is a similar problem in
> MultivariateNormalMixtureExpectationMaximization.estimate().  The number of
> components must be at least 2.  I think that you should be able to
> estimate with 1 component if you want to.  The Matlab function fitgmdist
> does allow for 1 component, and much of our data does in fact best fit to
> only 1 component.
>
> Thoughts on fixing that restriction as well?
>
>
> Thx!
> Craig
>
>
> ________________________________
> From: Alex Herbert <alex.d.herb...@gmail.com>
> Sent: Tuesday, March 5, 2024 11:35 AM
> To: Commons Users List <user@commons.apache.org>
> Subject: [External] - Re: MultivariateNormalMixtureExpectationMaximization
> only 1 dimension
>
> I have updated the master branch with a change to allow fitting a mixture
> with 1-column data.
>
> You should be able to pick up the 4.0-SNAPSHOT from the ASF snapshots repo
> if you configure your build to add the snapshot repository (see [1]).
>
> Let us know if this works for you. Note that if you only require fitting 1
> column data then you would be able to optimise the implementation as it
> will no longer require matrix inversion to compute the mixture probability
> distribution. The CM implementation can act as a reference point for your
> own implementation if desired.
>
> Regards,
>
> Alex
>
> [1]
>
> https://repository.apache.org/content/repositories/snapshots/org/apache/commons/commons-math4-legacy/4.0-SNAPSHOT/
>
> On Tue, 5 Mar 2024 at 00:06, Alex Herbert <alex.d.herb...@gmail.com>
> wrote:
>
> > Hi,
> >
> > I think this is a bug in the
> > MultivariateNormalMixtureExpectationMaximization class. When I update the
> > code to allow 1 column in the rows it outputs a similar fit to Matlab.
> > Here's an example from Matlab:
> >
> > X = [normrnd(0, 1, 100, 1); normrnd(2, 2, 100, 1)]
> > GMModel = fitgmdist(X,2);
> >
> > >> GMModel.mu
> > ans =
> >     0.0737
> >     3.0914
> > >> GMModel.ComponentProportion
> > ans =
> >     0.6750    0.3250
> > >> GMModel.Sigma
> > ans(:,:,1) =
> >     1.0505
> > ans(:,:,2) =
> >     1.6593
> >
> > I pasted the same X data into a test for
> > MultivariateNormalMixtureExpectationMaximization that had been updated to
> > allow data with a single column and get the following fit:
> >
> > MultivariateNormalMixtureExpectationMaximization fitter
> >     = new MultivariateNormalMixtureExpectationMaximization(data);
> >
> > MixtureMultivariateNormalDistribution initialMix
> >     = MultivariateNormalMixtureExpectationMaximization.estimate(data, 2);
> > fitter.fit(initialMix);
> > MixtureMultivariateNormalDistribution fittedMix = fitter.getFittedModel();
> > List<Pair<Double, MultivariateNormalDistribution>> components =
> >     fittedMix.getComponents();
> >
> > for (Pair<Double, MultivariateNormalDistribution> component : components) {
> >     final double weight = component.getFirst();
> >     final MultivariateNormalDistribution mvn = component.getSecond();
> >     final double[] mean = mvn.getMeans();
> >     final RealMatrix covMat = mvn.getCovariances();
> >     System.out.printf("%s : %s : %s%n", weight, Arrays.toString(mean),
> >         covMat.toString());
> > }
> >
> > 0.6420433138817465 : [0.016942587744259194] : Array2DRowRealMatrix{{0.9929681356}}
> > 0.3579566861182536 : [2.9152176347671754] : Array2DRowRealMatrix{{1.8940290549}}
> >
> > The numbers are close enough to indicate that the fit is valid.
> >
> > I think the error has been in assuming that because you require 2
> > components to have a mixture model then you must have 2 columns in the
> > input data. However this is not true. You can fit single dimension data
> > with a mixture of single Gaussians.
> >
> > Is this the functionality that you are expecting?
> >
> > Regards,
> >
> > Alex
> >
> >
> > On Mon, 4 Mar 2024 at 20:48, Craig Brautigam <cbrauti...@icr-team.com>
> > wrote:
> >
> >> Forgive me if this comes in twice... I did not subscribe first before
> >> sending the message below.
> >>
> >>
> >> ________________________________
> >> From: Craig Brautigam
> >> Sent: Monday, March 4, 2024 1:33 PM
> >> To: user@commons.apache.org <user@commons.apache.org>
> >> Subject: MultivariateNormalMixtureExpectationMaximization only 1 dimension
> >>
> >> Hi,
> >>
> >> Full disclosure, I'm not a mathematician so I cannot go into the weeds
> >> of the math.  However I am tasked with porting some Matlab code that
> >> does Gaussian mixture modelling to Java.  I really want to use Apache
> >> Commons Math if possible.  However the code that I'm porting has 1
> >> dimension (a single variable/attribute/property) that GMMs are being
> >> created from.
> >>
> >> MultivariateNormalMixtureExpectationMaximization looks to be a pretty
> >> close drop-in replacement for the Matlab functions
> >> https://www.mathworks.com/help/stats/fitgmdist.html and
> >> http://www.mathworks.com/help/stats/gmdistribution.html, however the
> >> constructor for MultivariateNormalMixtureExpectationMaximization clearly
> >> states that the number of columns in the double[][] data array MUST be
> >> no less than 2.  I'm completely baffled as to why this is the case if
> >> I want to try to fit data with 1 dimension in it.  Is there a workaround
> >> I can use, like providing a dummy column of data with all 0s to pacify
> >> the constructor? Is there another class I should be using?
> >>
> >> Any help would be greatly appreciated.
> >>
> >> Thx!
> >>
> >>
> >> ________________________________
> >> The information contained in this e-mail and any attachments from ICR,
> >> Inc. may contain confidential and/or proprietary information, and is
> >> intended only for the named recipient to whom it was originally
> >> addressed. If you are not the intended recipient, any disclosure,
> >> distribution, or copying of this e-mail or its attachments is strictly
> >> prohibited. If you have received this e-mail in error, please notify
> >> the sender immediately by return e-mail and permanently delete the
> >> e-mail and any attachments.
> >>
> >
>