Hi Craig,

In general, the fastest way to address an issue you care about is to
provide a PR on GitHub, with a unit test 😉

Gary


On Mon, Mar 11, 2024, 2:06 PM Craig Brautigam <cbrauti...@icr-team.com>
wrote:

> Just bumping this up... Would it be possible to get a fix for this?
>
>
> Thx!
>
>
> ________________________________
> From: Craig Brautigam <cbrauti...@icr-team.com>
> Sent: Thursday, March 7, 2024 2:47 PM
> To: Commons Users List <user@commons.apache.org>
> Subject: Re: [External] - Re:
> MultivariateNormalMixtureExpectationMaximization only 1 dimension
>
> Alex,
>
> Your fix seems to be working. However, there is a similar problem in
> MultivariateNormalMixtureExpectationMaximization.estimate(): the number of
> components must be at least 2.  I think you should be able to estimate
> with 1 component if you want to.  The MATLAB function fitgmdist does allow
> 1 component, and much of our data does in fact fit best with only 1
> component.
>
> Thoughts on fixing that restriction as well?
>
>
> Thx!
> Craig
>
>
> ________________________________
> From: Alex Herbert <alex.d.herb...@gmail.com>
> Sent: Tuesday, March 5, 2024 11:35 AM
> To: Commons Users List <user@commons.apache.org>
> Subject: [External] - Re: MultivariateNormalMixtureExpectationMaximization
> only 1 dimension
>
> I have updated the master branch with a change to allow fitting a mixture
> with 1-column data.
>
> You should be able to pick up the 4.0-SNAPSHOT from the ASF snapshots repo
> if you configure your build to add the snapshot repository (see [1]).
>
> Let us know if this works for you. Note that if you only require fitting 1
> column data then you would be able to optimise the implementation as it
> will no longer require matrix inversion to compute the mixture probability
> distribution. The CM implementation can act as a reference point for your
> own implementation if desired.
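
The point above about 1-column data can be illustrated with a minimal sketch. This is not the Commons Math implementation: the class name UnivariateGmmSketch, the equal-sized-bin initialisation, the variance floor, and the fixed iteration count are all assumptions made for illustration. It only shows that in one dimension each covariance matrix collapses to a scalar variance, so the component density needs a division rather than a matrix inversion.

```java
import java.util.Arrays;
import java.util.Random;

/** Hypothetical sketch (not Commons Math code): EM for a mixture of 1-D Gaussians. */
final class UnivariateGmmSketch {

    /** Fitted parameters; index j refers to component j. */
    static final class Fit {
        final double[] weights;
        final double[] means;
        final double[] variances;

        Fit(double[] weights, double[] means, double[] variances) {
            this.weights = weights;
            this.means = means;
            this.variances = variances;
        }
    }

    /** Density of N(mean, var) at x; a scalar division replaces the matrix inverse. */
    static double pdf(double x, double mean, double var) {
        final double d = x - mean;
        return Math.exp(-0.5 * d * d / var) / Math.sqrt(2 * Math.PI * var);
    }

    /** Runs a fixed number of EM iterations for k components (k = 1 is allowed). */
    static Fit fit(double[] data, int k, int iterations) {
        final int n = data.length;
        if (k < 1 || n < k) {
            throw new IllegalArgumentException("need 1 <= k <= data.length");
        }
        final double minVar = 1e-9; // assumed floor to avoid degenerate components
        final double[] sorted = data.clone();
        Arrays.sort(sorted);
        final double[] w = new double[k];
        final double[] mu = new double[k];
        final double[] var = new double[k];
        // Assumed initialisation: split the sorted data into k equal-sized bins.
        for (int j = 0; j < k; j++) {
            final int from = j * n / k;
            final int to = (j + 1) * n / k;
            double sum = 0;
            for (int i = from; i < to; i++) {
                sum += sorted[i];
            }
            mu[j] = sum / (to - from);
            double ss = 0;
            for (int i = from; i < to; i++) {
                ss += (sorted[i] - mu[j]) * (sorted[i] - mu[j]);
            }
            var[j] = Math.max(ss / (to - from), minVar);
            w[j] = (double) (to - from) / n;
        }
        final double[][] r = new double[n][k];
        for (int iter = 0; iter < iterations; iter++) {
            // E-step: responsibility of component j for observation i.
            for (int i = 0; i < n; i++) {
                double total = 0;
                for (int j = 0; j < k; j++) {
                    r[i][j] = w[j] * pdf(data[i], mu[j], var[j]);
                    total += r[i][j];
                }
                for (int j = 0; j < k; j++) {
                    r[i][j] = total > 0 ? r[i][j] / total : 1.0 / k;
                }
            }
            // M-step: responsibility-weighted mean, variance and weight.
            for (int j = 0; j < k; j++) {
                double nj = 0;
                double sx = 0;
                for (int i = 0; i < n; i++) {
                    nj += r[i][j];
                    sx += r[i][j] * data[i];
                }
                if (nj <= 0) {
                    w[j] = 0; // empty component: keep its old mean and variance
                    continue;
                }
                mu[j] = sx / nj;
                double ss = 0;
                for (int i = 0; i < n; i++) {
                    ss += r[i][j] * (data[i] - mu[j]) * (data[i] - mu[j]);
                }
                var[j] = Math.max(ss / nj, minVar);
                w[j] = nj / n;
            }
        }
        return new Fit(w, mu, var);
    }

    public static void main(String[] args) {
        // Synthetic data shaped like the MATLAB example: N(0, 1) and N(2, 2^2).
        final Random rng = new Random(1L);
        final double[] x = new double[200];
        for (int i = 0; i < 100; i++) {
            x[i] = rng.nextGaussian();
            x[100 + i] = 2 + 2 * rng.nextGaussian();
        }
        final Fit f = fit(x, 2, 100);
        System.out.println(Arrays.toString(f.weights) + " : "
            + Arrays.toString(f.means) + " : " + Arrays.toString(f.variances));
    }
}
```

A real implementation would normally stop on a log-likelihood convergence threshold rather than a fixed iteration count; the CM class remains the reference for that.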
>
> Regards,
>
> Alex
>
> [1]
> https://repository.apache.org/content/repositories/snapshots/org/apache/commons/commons-math4-legacy/4.0-SNAPSHOT/
>
> On Tue, 5 Mar 2024 at 00:06, Alex Herbert <alex.d.herb...@gmail.com>
> wrote:
>
> > Hi,
> >
> > I think this is a bug in the
> > MultivariateNormalMixtureExpectationMaximization class. When I update the
> > code to allow 1 column in the rows it outputs a similar fit to matlab.
> > Here's an example of Matlab:
> >
> > X = [normrnd(0, 1, 100, 1); normrnd(2, 2, 100, 1)]
> > GMModel = fitgmdist(X,2);
> >
> > >> GMModel.mu
> > ans =
> >     0.0737
> >     3.0914
> > >> GMModel.ComponentProportion
> > ans =
> >     0.6750    0.3250
> > >> GMModel.Sigma
> > ans(:,:,1) =
> >     1.0505
> > ans(:,:,2) =
> >     1.6593
> >
> > I pasted the same X data into a test for
> > MultivariateNormalMixtureExpectationMaximization that had been updated to
> > allow data with a single column and got the following fit:
> >
> > MultivariateNormalMixtureExpectationMaximization fitter
> >     = new MultivariateNormalMixtureExpectationMaximization(data);
> >
> > MixtureMultivariateNormalDistribution initialMix
> >     = MultivariateNormalMixtureExpectationMaximization.estimate(data, 2);
> > fitter.fit(initialMix);
> > MixtureMultivariateNormalDistribution fittedMix = fitter.getFittedModel();
> > List<Pair<Double, MultivariateNormalDistribution>> components =
> >     fittedMix.getComponents();
> >
> > for (Pair<Double, MultivariateNormalDistribution> component : components) {
> >     final double weight = component.getFirst();
> >     final MultivariateNormalDistribution mvn = component.getSecond();
> >     final double[] mean = mvn.getMeans();
> >     final RealMatrix covMat = mvn.getCovariances();
> >     System.out.printf("%s : %s : %s%n", weight, Arrays.toString(mean),
> >         covMat.toString());
> > }
> >
> > 0.6420433138817465 : [0.016942587744259194] : Array2DRowRealMatrix{{0.9929681356}}
> > 0.3579566861182536 : [2.9152176347671754] : Array2DRowRealMatrix{{1.8940290549}}
> >
> > The numbers are close enough to indicate that the fit is valid.
> >
> > I think the error has been in assuming that because you require 2
> > components to have a mixture model then you must have 2 columns in the
> > input data. However, this is not true: you can fit single-dimension data
> > with a mixture of univariate Gaussians.
> >
> > Is this the functionality that you are expecting?
> >
> > Regards,
> >
> > Alex
> >
> >
> > On Mon, 4 Mar 2024 at 20:48, Craig Brautigam <cbrauti...@icr-team.com>
> > wrote:
> >
> >> Forgive me if this comes in twice... I did not subscribe first before
> >> sending the message below.
> >>
> >>
> >> ________________________________
> >> From: Craig Brautigam
> >> Sent: Monday, March 4, 2024 1:33 PM
> >> To: user@commons.apache.org <user@commons.apache.org>
> >> Subject: MultivariateNormalMixtureExpectationMaximization only 1 dimension
> >>
> >> Hi,
> >>
> >> Full disclosure: I'm not a mathematician, so I cannot go into the weeds
> >> of the math.  However, I am tasked with porting some MATLAB code that
> >> fits Gaussian mixture models to Java.  I really want to use Apache
> >> Commons Math if possible.  However, the code that I'm porting builds
> >> GMMs from 1 dimension (a single variable/attribute/property).
> >>
> >> MultivariateNormalMixtureExpectationMaximization looks to be a pretty
> >> close drop-in replacement for the MATLAB functions
> >> https://www.mathworks.com/help/stats/fitgmdist.html and
> >> http://www.mathworks.com/help/stats/gmdistribution.html. However, the
> >> constructor for MultivariateNormalMixtureExpectationMaximization clearly
> >> states that the number of columns in the double[][] data array MUST be
> >> no less than 2.  I'm completely baffled as to why this is the case if
> >> I want to fit data with 1 dimension.  Is there a workaround I can use,
> >> such as providing a dummy column of all 0s to pacify the constructor?
> >> Is there another class I should be using?
> >>
> >> Any help would be greatly appreciated.
> >>
> >> Thx!
> >>
> >>
> >> ________________________________
> >> The information contained in this e-mail and any attachments from ICR,
> >> Inc. may contain confidential and/or proprietary information, and is
> >> intended only for the named recipient to whom it was originally
> >> addressed. If you are not the intended recipient, any disclosure,
> >> distribution, or copying of this e-mail or its attachments is strictly
> >> prohibited. If you have received this e-mail in error, please notify the
> >> sender immediately by return e-mail and permanently delete the e-mail
> >> and any attachments.
> >>
> >
>
