Hi, I had almost the same problem. I had a subset of features that I believed to be the true features, but with unknown coefficients; the other features might or might not be involved. One way to handle this is to reduce the penalty on the selected features, or even make it close to zero, which means using a different penalty term for each coefficient. This is not yet available in scikit-learn, but it is available in the R package glmnet. The approach has already been published in the bioinformatics literature as a way to add prior knowledge to gene regulatory network inference.
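To illustrate the per-coefficient penalty idea, here is a minimal sketch that emulates it with the stock scikit-learn Lasso by rescaling columns. This is not the code from the pull request. The helper name weighted_lasso and its penalty_factor argument are made up for this example (the name loosely mirrors glmnet's penalty.factor), and the data are synthetic. The trick needs strictly positive factors, so a "nearly unpenalized" feature gets a small factor rather than exactly zero.

```python
import numpy as np
from sklearn.linear_model import Lasso


def weighted_lasso(X, y, penalty_factor, alpha=0.1):
    """Fit a lasso where feature j is penalized by alpha * penalty_factor[j].

    Hypothetical helper: dividing column j by penalty_factor[j], fitting an
    ordinary Lasso, and dividing the fitted coefficient by the same factor
    is equivalent to penalizing the original coefficient j with that weight.
    """
    penalty_factor = np.asarray(penalty_factor, dtype=float)
    if np.any(penalty_factor <= 0):
        raise ValueError("penalty_factor entries must be strictly positive")
    X_scaled = X / penalty_factor          # broadcasts over columns
    model = Lasso(alpha=alpha, max_iter=10000).fit(X_scaled, y)
    coef = model.coef_ / penalty_factor    # map back to the original scale
    return coef, model.intercept_


# Synthetic data: features 0 and 1 have small but real effects.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
true_coef = np.array([0.3, -0.2, 0.0, 0.0, 1.5, 0.0])
y = X @ true_coef + 0.1 * rng.normal(size=200)

# Features 0 and 1 are the ones believed to be true: give them a tiny penalty.
factors = np.array([1e-3, 1e-3, 1.0, 1.0, 1.0, 1.0])
coef_weighted, _ = weighted_lasso(X, y, factors, alpha=0.5)

# Ordinary lasso with the same alpha, for comparison.
coef_plain = Lasso(alpha=0.5).fit(X, y).coef_

print("weighted:", np.round(coef_weighted, 3))
print("plain:   ", np.round(coef_plain, 3))
```

In this setup the lightly penalized coefficients should stay near their unpenalized values, while the plain lasso shrinks them much harder. Setting every factor to 1 reduces the helper to the ordinary Lasso.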
I implemented this feature for the elastic net in scikit-learn and opened a pull request to add it. You can find the code in the pull request if you would like to try it.

On Jan 3, 2016 12:23 PM, <scikit-learn-general-requ...@lists.sourceforge.net> wrote:

> Send Scikit-learn-general mailing list submissions to
>         scikit-learn-general@lists.sourceforge.net
>
> To subscribe or unsubscribe via the World Wide Web, visit
>         https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
> or, via email, send a message with subject or body 'help' to
>         scikit-learn-general-requ...@lists.sourceforge.net
>
> You can reach the person managing the list at
>         scikit-learn-general-ow...@lists.sourceforge.net
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of Scikit-learn-general digest..."
>
> Today's Topics:
>
>    1. Re: BIC/AIC for Feature Selection (Gael Varoquaux)
>    2. LASSO, Constrained coefficient matrix with some independent
>       elements (Guoqiang Lan, Mr)
>    3. Re: LASSO, Constrained coefficient matrix with some
>       independent elements (Michael Eickenberg)
>    4. Re: LASSO, Constrained coefficient matrix with some
>       independent elements (Guoqiang Lan, Mr)
>    5. Re: LASSO, Constrained coefficient matrix with some
>       independent elements (Alexandre Gramfort)
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Sat, 2 Jan 2016 13:39:11 +0100
> From: Gael Varoquaux <gael.varoqu...@normalesup.org>
> Subject: Re: [Scikit-learn-general] BIC/AIC for Feature Selection
>
> On Fri, Jan 01, 2016 at 08:41:56PM +0100, Marco De Nadai wrote:
> > I would expose it through a score function. In this way it can be
> > called to evaluate 2 models (let's say model A with 4 params and
> > model B with 10).
> > Moreover, this could also be called by feature_selection.RFECV.
>
> OK, but BIC is defined for a specific likelihood. I guess that what you
> want is the likelihood associated to linear model with Gaussian
> distributions?
>
> Gaël
>
> ------------------------------
>
> Message: 2
> Date: Sat, 2 Jan 2016 21:19:01 +0000
> From: "Guoqiang Lan, Mr" <guoqiang....@mail.mcgill.ca>
> Subject: [Scikit-learn-general] LASSO, Constrained coefficient matrix
>         with some independent elements
>
> Dear all,
>
> I am using the LASSO model to optimize a huge sparse coefficient-matrix,
> W. Luckily, I know how many independent elements there are and how they
> are distributed in the coefficient matrix. What I want to obtain now is
> just the values of these independent elements. Is there a way to define
> such a constrained coefficient matrix (only constructed from some
> independent elements) and use it to do the optimization with LASSO method
> in 'sklearn'? Or is there any suggestion to figure out this problem?
>
> Happy new years.
>
> Best
>
> Guoqiang
>
> ------------------------------
>
> Message: 3
> Date: Sat, 2 Jan 2016 22:34:24 +0100
> From: Michael Eickenberg <michael.eickenb...@gmail.com>
> Subject: Re: [Scikit-learn-general] LASSO, Constrained coefficient
>         matrix with some independent elements
>
> Dear Guoqiang,
>
> it sounds as though you could just throw all the irrelevant variables
> away and then do an ordinary least squares or ridge regression on what
> you keep. That is if I understand correctly that you have already
> successfully identified the support.
> If this is not the case, could you try re-explaining, detailing exactly
> the nature of the information you have given for your problem?
>
> Michael
>
> ------------------------------
>
> Message: 4
> Date: Sun, 3 Jan 2016 05:25:11 +0000
> From: "Guoqiang Lan, Mr" <guoqiang....@mail.mcgill.ca>
> Subject: Re: [Scikit-learn-general] LASSO, Constrained coefficient
>         matrix with some independent elements
>
> Dear Michael,
>
> Thanks for your reply. In my case, the original dimension of the
> coefficient matrix is very large, including ~10,000 elements, but
> actually there are only several hundred independent elements in the
> coefficient matrix, based on the symmetric nature of my data.
>
> I know how to build the coefficient matrix with independent elements and
> do the ordinary least-squares fitting. However, an over-fitting issue may
> arise unless the number of individual reference data is fairly large
> compared with the number of parameters. So I am wondering if there is a
> way to use LASSO method to deal with this problem. And I think the
> efficiency would also increase if we can define a constrained coefficient
> matrix (only constructed from some independent elements) for LASSO method.
>
> But it seems not to be possible to define such a constrained coefficient
> matrix in "sklearn". Am I right?
>
> Best
>
> Guoqiang
>
> ------------------------------
>
> Message: 5
> Date: Sun, 3 Jan 2016 18:22:17 +0100
> From: Alexandre Gramfort <alexandre.gramf...@m4x.org>
> Subject: Re: [Scikit-learn-general] LASSO, Constrained coefficient
>         matrix with some independent elements
>
> On Sun, Jan 3, 2016 at 6:25 AM, Guoqiang Lan, Mr
> <guoqiang....@mail.mcgill.ca> wrote:
> > But it seems not to be possible to define such a constrained
> > coefficient matrix in "sklearn". Am I right?
>
> indeed. You'll need to recode. sklearn lasso only works with in memory
> ndarray or sparse matrices.
>
> A
>
> ------------------------------
>
> End of Scikit-learn-general Digest, Vol 72, Issue 2
> ***************************************************