[ 
https://issues.apache.org/jira/browse/MATH-278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Kirpichov updated MATH-278:
----------------------------------

    Attachment: loess.patch.v2

Attached a patch that leaves the AbstractIntegrator class unchanged; the $Date$ 
argument is replaced with '???', and the parameters are made final and 
initialized in two constructors. Tests and Javadocs are updated accordingly.

Actually, I don't know what the $Revision$ and $Date$ keywords are for or where 
they come from. Are they filled in automatically by a pre-commit hook? If so, 
should I leave them as '???' in the patch?
If I omit them altogether, I get a checkstyle error about the missing @version 
tag.

> Robust locally weighted regression (Loess / Lowess)
> ---------------------------------------------------
>
>                 Key: MATH-278
>                 URL: https://issues.apache.org/jira/browse/MATH-278
>             Project: Commons Math
>          Issue Type: New Feature
>            Reporter: Eugene Kirpichov
>         Attachments: loess.patch, loess.patch.v2
>
>
> Attached is a patch that implements the robust Loess procedure for smoothing 
> univariate scatterplots with local linear regression ( 
> http://en.wikipedia.org/wiki/Local_regression) described by William Cleveland 
> in http://www.math.tau.ac.il/~yekutiel/MA%20seminar/Cleveland%201979.pdf , 
> with tests.
> (Also, the patch fixes one missing-javadoc checkstyle warning in the 
> AbstractIntegrator class: I wanted the code with my patch applied to generate 
> no checkstyle warnings at all.)
> I propose to include the procedure in commons-math because commons-math, as 
> of now, has no method for robust smoothing of noisy data: there is 
> interpolation (which virtually can't be used for noisy data at all), and 
> there's regression, which has quite different goals.
> Loess allows one to build a smooth curve with a controllable degree of 
> smoothness that approximates the overall shape of the data.
> I tried to follow the code requirements as strictly as possible: the tests 
> cover the code completely, there are no checkstyle warnings, etc. The code is 
> completely written by myself from scratch, with no borrowings of third-party 
> licensed code.
> The method is pretty computationally intensive (10000 points with a bandwidth 
> of 0.3 and 4 robustness iterations take about 3.7 s on my machine; generally 
> the complexity is O(robustnessIters * n^2 * bandwidth)), but I don't know how 
> to optimize it further; all implementations that I have found use exactly the 
> same algorithm as mine for the unidimensional case.
> Some TODOs, in vastly increasing order of complexity:
>  - Make the weight function customizable: according to Cleveland, this is 
> needed only in exotic cases, e.g. where the desired approximation is 
> non-continuous.
>  - Make the degree of the locally fitted polynomial customizable: currently 
> the algorithm performs only linear local regression; it might be useful to 
> support quadratic regression as well. Higher degrees are not worth it, 
> according to Cleveland.
>  - Generalize the algorithm to the multidimensional case: this will require A 
> LOT of hard work.
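For readers unfamiliar with the procedure, the shape of the algorithm described above (local weighted linear fits with tricube weights, repeated with bisquare robustness weights to down-weight outliers) can be sketched as below. This is an illustrative standalone sketch, not the code in the attached patch; the class and method names here are hypothetical.

```java
import java.util.Arrays;

// Illustrative sketch of robust unidimensional Loess (Cleveland 1979).
// Not the patch's code; names and structure are hypothetical.
public class LoessSketch {

    // Tricube weight: (1 - |x|^3)^3 for |x| < 1, else 0.
    static double tricube(double x) {
        double ax = Math.abs(x);
        if (ax >= 1) return 0;
        double t = 1 - ax * ax * ax;
        return t * t * t;
    }

    // Smooths (xs, ys), xs assumed sorted ascending. bandwidth is the
    // fraction of points used in each local fit; robustnessIters repeats
    // the whole pass with bisquare robustness weights.
    static double[] smooth(double[] xs, double[] ys,
                           double bandwidth, int robustnessIters) {
        int n = xs.length;
        int k = Math.max(2, (int) Math.ceil(bandwidth * n)); // window size
        double[] fit = ys.clone();
        double[] rw = new double[n];           // robustness weights
        Arrays.fill(rw, 1.0);

        for (int iter = 0; iter <= robustnessIters; iter++) {
            for (int i = 0; i < n; i++) {
                // shrink a symmetric window down to the k nearest points
                int lo = Math.max(0, i - k + 1), hi = Math.min(n - 1, i + k - 1);
                while (hi - lo + 1 > k) {
                    if (xs[i] - xs[lo] > xs[hi] - xs[i]) lo++; else hi--;
                }
                double h = Math.max(xs[i] - xs[lo], xs[hi] - xs[i]);
                // weighted linear least squares over the window
                double sw = 0, swx = 0, swy = 0, swxx = 0, swxy = 0;
                for (int j = lo; j <= hi; j++) {
                    double w = tricube(h == 0 ? 0 : (xs[j] - xs[i]) / h) * rw[j];
                    sw += w; swx += w * xs[j]; swy += w * ys[j];
                    swxx += w * xs[j] * xs[j]; swxy += w * xs[j] * ys[j];
                }
                double denom = sw * swxx - swx * swx;
                if (Math.abs(denom) < 1e-12) {
                    fit[i] = sw == 0 ? ys[i] : swy / sw; // degenerate: weighted mean
                } else {
                    double slope = (sw * swxy - swx * swy) / denom;
                    double intercept = (swy - slope * swx) / sw;
                    fit[i] = intercept + slope * xs[i];
                }
            }
            // bisquare robustness weights from the median absolute residual
            double[] absRes = new double[n];
            for (int i = 0; i < n; i++) absRes[i] = Math.abs(ys[i] - fit[i]);
            double[] sorted = absRes.clone();
            Arrays.sort(sorted);
            double s = sorted[n / 2];
            for (int i = 0; i < n; i++) {
                double u = s == 0 ? 0 : absRes[i] / (6 * s);
                double b = u >= 1 ? 0 : 1 - u * u;
                rw[i] = b * b;
            }
        }
        return fit;
    }

    public static void main(String[] args) {
        double[] xs = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
        double[] ys = {1.1, 2.9, 5.2, 6.8, 9.1, 10.9, 13.2, 14.8, 17.1, 18.9};
        System.out.println(Arrays.toString(smooth(xs, ys, 0.5, 2)));
    }
}
```

The double loop over points and window makes the O(robustnessIters * n^2 * bandwidth) cost mentioned above visible directly.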

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
