[jira] [Commented] (MATH-1482) Pull request for GLSMultipleLinearRegression

Gilles (JIRA) Wed, 08 May 2019 15:22:35 -0700


    [ 
https://issues.apache.org/jira/browse/MATH-1482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16835942#comment-16835942
 ]


Gilles commented on MATH-1482:
------------------------------

Thanks for your proposal.
It just happens that this part of the code is going to be refactored and ported 
to a new ["Commons 
Statistics"|http://commons.apache.org/proper/commons-statistics/] component.
This is a GSoC project being discussed right now on the "dev" mailing list, and 
your use case is certainly welcome in order to shape the new design.  You can 
make sure that it will be taken into account by subscribing to the ML and 
starting a discussion there.

> Pull request for GLSMultipleLinearRegression
> --------------------------------------------
>
>                 Key: MATH-1482
>                 URL: https://issues.apache.org/jira/browse/MATH-1482
>             Project: Commons Math
>          Issue Type: Improvement
>            Reporter: Elena Kartysheva
>            Priority: Trivial
>
> I would like to propose a pull request implementing an option to use a 
> variance vector instead of a covariance matrix. It lets users avoid 
> unnecessary memory usage and excessive computation in the case of 
> uncorrelated but heteroscedastic errors, thus making it possible to work with 
> huge input matrices. Using a variance vector in such cases reduces time 
> complexity from O(n^2) to just O(n) (where n is the number of observations) 
> and dramatically reduces memory usage. For example, in my practice the need 
> arose to train a generalized linear model. The iteratively reweighted least 
> squares (IRLS) algorithm requires weighted regression with more than a 
> million observations. The current implementation would require approximately 
> 12 terabytes of memory, while the patched version needs only 8 megabytes. 
> Since IRLS is an iterative algorithm, a million-fold complexity reduction is 
> also pretty handy.
> https://github.com/apache/commons-math/pull/106
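
To illustrate the point made in the issue description above: when the errors 
are uncorrelated, the covariance matrix is diagonal, so GLS reduces to weighted 
least squares with weights 1/variance, and only a length-n vector has to be 
stored. The following is a minimal sketch of that idea for a single regressor 
plus intercept. It is not the code from the pull request and does not use the 
Commons Math API; the class name DiagonalGlsSketch and the method fit are 
hypothetical names chosen for this example.

```java
// Hypothetical illustration, not the code from the pull request.
class DiagonalGlsSketch {

    /**
     * Fits y = intercept + slope * x by minimizing
     * sum_i (y_i - intercept - slope * x_i)^2 / variance_i.
     * Only the n per-observation variances are needed, never an n x n matrix.
     *
     * @return a two-element array {intercept, slope}
     */
    static double[] fit(double[] x, double[] y, double[] variance) {
        if (x.length != y.length || x.length != variance.length) {
            throw new IllegalArgumentException("arrays must have equal length");
        }
        // Weighted sums accumulated in a single O(n) pass.
        double sw = 0, swx = 0, swy = 0, swxx = 0, swxy = 0;
        for (int i = 0; i < x.length; i++) {
            double w = 1.0 / variance[i]; // weight = inverse variance
            sw += w;
            swx += w * x[i];
            swy += w * y[i];
            swxx += w * x[i] * x[i];
            swxy += w * x[i] * y[i];
        }
        // Closed-form solution of the 2 x 2 weighted normal equations.
        double det = sw * swxx - swx * swx;
        if (det == 0.0) {
            throw new IllegalArgumentException("x values are degenerate");
        }
        double slope = (sw * swxy - swx * swy) / det;
        double intercept = (swy - slope * swx) / sw;
        return new double[] {intercept, slope};
    }

    public static void main(String[] args) {
        double[] x = {0, 1, 2, 3};
        double[] y = {1, 3, 5, 7};            // lies exactly on y = 1 + 2x
        double[] variance = {1, 4, 0.25, 9};  // heteroscedastic, uncorrelated
        double[] beta = fit(x, y, variance);
        System.out.println("intercept=" + beta[0] + ", slope=" + beta[1]);
    }
}
```

For scale: a vector of 10^6 doubles occupies 8 * 10^6 bytes (8 megabytes), 
whereas a dense 10^6 x 10^6 matrix of doubles would need 8 * 10^12 bytes.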



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
