[Scikit-learn-general] Theil-Sen estimator for a multiple linear regression problem

2014-01-10 Thread florian.wilh...@gmail.com
Hi,

I'd like to add a Theil-Sen estimator for a multiple linear regression
problem to Scikit-Learn as described in the paper:
http://home.olemiss.edu/~xdang/papers/MTSE.pdf
Is anyone already working on this or are there any objections
regarding the inclusion of a Theil-Sen estimator into Scikit-Learn?

Best regards,

Florian Wilhelm

--
CenturyLink Cloud: The Leader in Enterprise Cloud Services.
Learn Why More Businesses Are Choosing CenturyLink Cloud For
Critical Workloads, Development Environments & Everything In Between.
Get a Quote or Start a Free Trial Today. 
http://pubads.g.doubleclick.net/gampad/clk?id=119420431&iu=/4140/ostg.clktrk
___
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general


Re: [Scikit-learn-general] Theil-Sen estimator for a multiple linear regression problem

2014-01-11 Thread florian.wilh...@gmail.com
Hi,

at Blue Yonder we often use Scikit-Learn but are sometimes missing
more robust regression methods that are not based on the L2 norm.
So far I only knew Theil-Sen as a linear regression method with only a
single explanatory variable. The work of Xin Dang, Hanxiang Peng,
Xueqin Wang and Heping Zhang extends the method to n explanatory
variables, so I think it should fit well into the sklearn.linear_model
subpackage. Where is the line drawn between functionality that
should go into StatsModels and into Scikit-Learn with respect to
regression methods?

Florian
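
For readers unfamiliar with the single-variable case mentioned above, here is a minimal sketch of the classic Theil-Sen fit: the slope is the median of the slopes through all pairs of points. The function name theil_sen_1d, the median-of-offsets intercept, and the toy data are assumptions made for illustration; this is not the multivariate estimator from the linked paper and not code from any library.

```python
from itertools import combinations
from statistics import median


def theil_sen_1d(x, y):
    """Return (slope, intercept) of a single-variable Theil-Sen fit (illustrative sketch)."""
    # Slopes of the lines through every pair of points with distinct x values.
    slopes = [
        (y[j] - y[i]) / (x[j] - x[i])
        for i, j in combinations(range(len(x)), 2)
        if x[j] != x[i]
    ]
    if not slopes:
        raise ValueError("need at least two points with distinct x values")
    slope = median(slopes)
    # One common choice for the intercept: the median of the offsets y - slope * x.
    intercept = median(yi - slope * xi for xi, yi in zip(x, y))
    return slope, intercept


# Toy data (assumed): points on y = 2x + 1 with one gross outlier at x = 4.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.0, 3.0, 5.0, 7.0, 100.0, 11.0, 13.0]
slope, intercept = theil_sen_1d(x, y)
print(slope, intercept)
```

Because only 6 of the 21 pairwise slopes involve the outlier, the median slope is unaffected by it, which is the robustness property that motivates the request in this thread.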

On 10 January 2014 19:18, Skipper Seabold wrote:
> Hi,
>
> There have been some implementations of Theil-Sen floating around for 
> inclusion in statsmodels, but no PRs yet. IMO it might fit in a little better 
> in statsmodels.robust than sklearn unless there are some aspects of Theil-Sen
> I'm not familiar with.
>
> Skipper
>
> Sent from my mobile



Re: [Scikit-learn-general] Theil-Sen estimator for a multiple linear regression problem

2014-01-13 Thread florian.wilh...@gmail.com
@Alexandre, @Mathieu: Thanks for these hints. I'll give it a try.

So setting epsilon=0 and C to a large value should result in a
regression in the L1 norm with almost no regularization of w, right?
One more thing just crossed my mind: would it be possible in a linear
SVR setting to put the norm(w) term [in the primal objective function]
in the L1 norm in order to get some sparsity, as in Lasso?

Florian
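
To make the question above concrete, here is the primal objective of a linear SVR in its usual textbook form. The notation (w, b, C, epsilon, n) is an assumption chosen for illustration and is not taken from the thread or from lightning's code.

```latex
% Linear SVR primal with the epsilon-insensitive loss
\min_{w,\,b}\ \frac{1}{2}\lVert w\rVert_2^2
  + C \sum_{i=1}^{n} \max\bigl(0,\ \lvert y_i - w^\top x_i - b\rvert - \varepsilon\bigr)

% With \varepsilon = 0 the loss reduces to the absolute residual
\min_{w,\,b}\ \frac{1}{2}\lVert w\rVert_2^2
  + C \sum_{i=1}^{n} \lvert y_i - w^\top x_i - b\rvert

% Hypothetical variant asked about above: L1 penalty on w instead of L2
\min_{w,\,b}\ \lVert w\rVert_1
  + C \sum_{i=1}^{n} \lvert y_i - w^\top x_i - b\rvert
```

Dividing the second objective by C shows that the penalty on w carries weight 1/(2C), which vanishes as C grows, so the fit approaches plain least absolute deviations. The third line is only a sketch of the L1-penalized formulation being asked about; whether a given solver supports it is not established in this thread.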

On 13 January 2014 08:15, Mathieu Blondel wrote:
> Here's an example that illustrates the use of LinearSVR for doing robust
> regression with lightning:
> https://github.com/mblondel/lightning/blob/master/examples/plot_robust_regression.py
>
> Regarding epsilon=0, it is a good choice for LinearSVR but less so for
> (kernel) SVR. epsilon=0 leads to completely dense solutions in the dual and
> so the kernel expansion (used in the predict method) might be slow to
> evaluate for large datasets.
>
> Mathieu
>
>
>
> On Sun, Jan 12, 2014 at 3:33 AM, Alexandre Gramfort wrote:
>>
>> hi,
>>
>> did you try SVR? possibly with epsilon set to 0?
>>
>> if it's too slow, have a look at lightning's new LinearSVR estimator.
>>
>> Alex