Re: [Scikit-learn-general] Planning on implementing sample_weight option for PLSRegression.fit()

Deepak Subburam Mon, 13 Jul 2015 13:19:44 -0700

I've gone ahead with the sample_weight implementation, along with some
amount of code refactoring (mostly where I had to touch it).
I have submitted a pull request.


Let me know if I can help fix any specific other concerns -- I've spent
some time with the referenced paper, and the code; so less ramp
up time to understand what's going on. The code is working fine for
my real-world use case (weighted, 140 rows and 12 columns of data).

Regards,
Deepak

Date: Fri, 10 Jul 2015 14:23:16 -0500
From: Andreas Mueller<t3k...@gmail.com>
Subject: Re: [Scikit-learn-general] Planning on implementing
        sample_weight option for PLSRegression.fit()
To:scikit-learn-general@lists.sourceforge.net
Message-ID:<55a01ba4.1050...@gmail.com>
Content-Type: text/plain; charset=windows-1252; format=flowed

I think there are also some concerns about the actual implementation of
the algorithm, and
whether the NIPALS implementation is fast and stable enough, and the
appropriate choice in all cases.


On 07/07/2015 04:51 PM, Deepak Subburam wrote:

> Cool. I spent some time inspecting the pls_.py source file. There are a
> whole lot of if conditions prevalent in the code (on
> deflation_mode="regression"|"canonical", mode="A"|"B",
> algorithm="nipals"|"svd", Y.ndim==1|>1, etc.), and I wonder if that's
> what you are thinking could be refactored.
>
> I can minimize adding another set of if statements ("if sample_weight is
> not None") in my implementation. Or I can take a shot at refactoring the
> code first, though that would involve significant changes you might
> worry about a new contributor making. For example, I think the
> _nipals_twoblocks...() function should be a class method attached to the
> object, so you don't have to pass all the self.XYZ parameters to it and
> it can update self.n_iter within itself, ca. line 286.
>
> Other comments I have on the code are that it can use more empty lines
> (between separate blocks of logic) for readability, and that the
> mode="A"|"B" parameter can be made more descriptive.
>
> Let me know your views on how I might proceed.
>
> Best,
> Deepak
>
>> Date: Tue, 7 Jul 2015 02:03:41 +0200
>> From: Gael Varoquaux<gael.varoqu...@normalesup.org>
>> Subject: Re: [Scikit-learn-general] Planning on implementing
>>       sample_weight option for PLSRegression.fit()
>> To:scikit-learn-general@lists.sourceforge.net
>> Message-ID:<20150707000341.ga2996...@phare.normalesup.org>
>> Content-Type: text/plain; charset=us-ascii
>>
>> On Mon, Jul 06, 2015 at 04:55:07PM -0400, Deepak Subburam wrote:
>>
>>> I have some time to contribute, and would like to implement a
>>> sample_weight option to the fit() method of
>>> sklearn.cross_decomposition.PLSRegression, handling any other knock-on
>>> effects. Let me know if you think this is not a good idea, or have any
>>> other thoughts to guide me.
>> In general, I think that this is a great idea. One question is: is the
>> quality of the PLS code ready for these changes?
>>
>> G


------------------------------------------------------------------------------
Don't Limit Your Business. Reach for the Cloud.
GigeNET's Cloud Solutions provide you with the tools and support that
you need to offload your IT needs and focus on growing your business.
Configured For All Businesses. Start Your Cloud Today.
https://www.gigenetcloud.com/
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general

Re: [Scikit-learn-general] Planning on implementing sample_weight option for PLSRegression.fit()

Reply via email to