[Scikit-learn-general] Theil-Sen estimator for a multiple linear regression problem
Hi,

I'd like to add a Theil-Sen estimator for a multiple linear regression problem to Scikit-Learn as described in the paper: http://home.olemiss.edu/~xdang/papers/MTSE.pdf

Is anyone already working on this or are there any objections regarding the inclusion of a Theil-Sen estimator into Scikit-Learn?

Best regards,
Florian Wilhelm

___
Scikit-learn-general mailing list [email protected] https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
Re: [Scikit-learn-general] Theil-Sen estimator for a multiple linear regression problem
Hi,

At Blue Yonder we often use Scikit-Learn but sometimes miss more robust regression methods that are not based on the L2 norm. So far I only knew Theil-Sen as a linear regression method with a single explanatory variable. The work of Xin Dang, Hanxiang Peng, Xueqin Wang and Heping Zhang extends the method to n explanatory variables, so I think it should fit well into the sklearn.linear_model subpackage.

Where is the line drawn between functionality that should go into StatsModels and into Scikit-Learn with respect to regression methods?

Florian

On 10 January 2014 19:18, Skipper Seabold wrote:
> Hi,
>
> There have been some implementations of Theil-Sen floating around for
> inclusion in statsmodels, but no PRs yet. IMO it might fit in a little better
> in statsmodels.robust than sklearn unless there are some aspects of Theil-Sen
> I'm not familiar with.
>
> Skipper
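For readers unfamiliar with the single-variable case mentioned above, it can be sketched as the median of all pairwise slopes. This is only a minimal illustration and not the multivariate method from the linked paper; the function name `theil_sen_1d` and the median-of-offsets intercept are choices made for this sketch.

```python
from itertools import combinations
from statistics import median


def theil_sen_1d(x, y):
    """Single-variable Theil-Sen fit: median of pairwise slopes (illustrative)."""
    slopes = [
        (y[j] - y[i]) / (x[j] - x[i])
        for i, j in combinations(range(len(x)), 2)
        if x[j] != x[i]  # skip pairs with identical x (slope undefined)
    ]
    if not slopes:
        raise ValueError("need at least two points with distinct x values")
    slope = median(slopes)
    # One common choice for the intercept: median of the offsets y_i - slope * x_i.
    intercept = median(yi - slope * xi for xi, yi in zip(x, y))
    return slope, intercept


# Points on y = 2x + 1, with one gross outlier at x = 4.
x = [0, 1, 2, 3, 4, 5, 6]
y = [1, 3, 5, 7, 100, 11, 13]
slope, intercept = theil_sen_1d(x, y)
print(slope, intercept)
```

Because 15 of the 21 pairwise slopes come from the uncorrupted points, the median ignores the outlier, which is the robustness property the thread is after. The O(n^2) pair enumeration is fine for a sketch but would need subsampling for large datasets.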
Re: [Scikit-learn-general] Theil-Sen estimator for a multiple linear regression problem
@Alexandre, @Mathieu: Thanks for these hints. I'll give it a try. So setting epsilon=0 and C to a large value should result in a regression in the L1 norm with almost no regularization of w, right?

One thing just crossed my mind: would it be possible in a linear SVR setting to put the norm(w) term [in the primal objective function] in the L1 norm in order to get some sparsity, as in Lasso?

Florian

On 13 January 2014 08:15, Mathieu Blondel wrote:
> Here's an example that illustrates the use of LinearSVR for doing robust
> regression with lightning:
> https://github.com/mblondel/lightning/blob/master/examples/plot_robust_regression.py
>
> Regarding epsilon=0, it is a good choice for LinearSVR but less so for
> (kernel) SVR. epsilon=0 leads to completely dense solutions in the dual and
> so the kernel expansion (used in the predict method) might be slow to
> evaluate for large datasets.
>
> Mathieu
>
> On Sun, Jan 12, 2014 at 3:33 AM, Alexandre Gramfort wrote:
>> Hi,
>>
>> Did you try SVR, possibly setting epsilon to 0?
>>
>> If it's too slow, have a look at lightning's new LinearSVR estimator.
>>
>> Alex
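To make the question about epsilon=0 and an L1 penalty concrete, the objectives can be written out. This is a sketch that assumes the usual primal form of linear SVR with the epsilon-insensitive loss and no intercept term; the symbols x_i, y_i and n (training samples and their count) are introduced here for illustration and do not come from the thread.

```latex
% Linear SVR primal with epsilon-insensitive loss (assumed standard form):
\min_{w}\; \frac{1}{2}\lVert w \rVert_2^2
  + C \sum_{i=1}^{n} \max\bigl(0,\; \lvert y_i - w^\top x_i \rvert - \epsilon\bigr)

% With \epsilon = 0 the loss becomes the absolute error:
\min_{w}\; \frac{1}{2}\lVert w \rVert_2^2
  + C \sum_{i=1}^{n} \lvert y_i - w^\top x_i \rvert

% The variant asked about: L1 penalty on w instead of the squared L2 penalty:
\min_{w}\; \lVert w \rVert_1
  + C \sum_{i=1}^{n} \lvert y_i - w^\top x_i \rvert
```

In the second form, a large C makes the penalty term negligible relative to the sum of absolute residuals, so the solution approaches least-absolute-deviations regression. The third form combines an absolute-error loss with a Lasso-style penalty; whether a particular solver supports it is not something this sketch asserts.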
