[ 
https://issues.apache.org/jira/browse/MATH-449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13088719#comment-13088719
 ] 

Patrick Meyer commented on MATH-449:
------------------------------------

I like all of these ideas. When I wrote the patch, I didn't know if forcing a 
square matrix was preferred, so I wrote it more generally. A square matrix is 
fine with me. 

Incrementing the full vector of new values is definitely the safest way to do 
it. However, it forces the user into listwise deletion if a case has any 
missing data. The more granular version allows a user to implement pairwise 
deletion. Neither option is a great way to handle missing data, but do we want 
to force one approach on the user? Is there a way to increment the full vector of 
values and account for missing data on one or more variables?
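
One possible answer to that last question, as a rough sketch only: keep one 
bivariate accumulator per pair of variables, take the full vector in a single 
increment call, and skip only the pairs that involve a missing value. The 
class name PairwiseCovarianceSketch, its methods, and the use of Double.NaN 
as the missing-value marker are assumptions made for illustration; none of 
this is in the attached patch or in Commons Math. The per-pair update is the 
same one used in the StorelessCovariance class from the issue description.

```java
/** Hypothetical sketch: storeless covariance matrix with pairwise deletion. */
class PairwiseCovarianceSketch {

    /** Running covariance for one pair of variables. */
    private static final class Pair {
        long n;
        double meanX;
        double meanY;
        double numerator;

        void increment(double x, double y) {
            n++;
            double deltaX = x - meanX;
            double deltaY = y - meanY;
            meanX += deltaX / n;
            meanY += deltaY / n;
            numerator += ((n - 1.0) / n) * deltaX * deltaY;
        }
    }

    /** Lower triangle only; the covariance matrix is symmetric. */
    private final Pair[][] pairs;

    PairwiseCovarianceSketch(int dimension) {
        pairs = new Pair[dimension][];
        for (int i = 0; i < dimension; i++) {
            pairs[i] = new Pair[i + 1];
            for (int j = 0; j <= i; j++) {
                pairs[i][j] = new Pair();
            }
        }
    }

    /** Adds one full case; Double.NaN marks a missing value (an assumption). */
    void increment(double[] row) {
        if (row.length != pairs.length) {
            throw new IllegalArgumentException("expected " + pairs.length + " values");
        }
        for (int i = 0; i < row.length; i++) {
            if (Double.isNaN(row[i])) {
                continue; // variable i is missing: skip every pair involving it
            }
            for (int j = 0; j <= i; j++) {
                if (!Double.isNaN(row[j])) {
                    pairs[i][j].increment(row[i], row[j]);
                }
            }
        }
    }

    /** Number of cases in which both variables were observed. */
    long getCount(int i, int j) {
        return pairs[Math.max(i, j)][Math.min(i, j)].n;
    }

    /** Unbiased covariance for the pair, or NaN with fewer than two cases. */
    double getCovariance(int i, int j) {
        Pair p = pairs[Math.max(i, j)][Math.min(i, j)];
        return p.n < 2 ? Double.NaN : p.numerator / (p.n - 1.0);
    }
}
```

Each cell ends up based on a different number of cases, so a caller would 
probably want the per-pair counts as well as the covariances. A matrix built 
this way is also not guaranteed to be positive semidefinite. Listwise deletion 
remains available to the caller by simply not passing in rows that contain a 
missing value.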

Thanks,
Patrick

> Storeless covariance
> --------------------
>
>                 Key: MATH-449
>                 URL: https://issues.apache.org/jira/browse/MATH-449
>             Project: Commons Math
>          Issue Type: Improvement
>            Reporter: Patrick Meyer
>            Assignee: Phil Steitz
>             Fix For: 3.1
>
>         Attachments: MATH-449.patch
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Currently there is no storeless version for computing the covariance. 
> However, Pebay (2008) describes algorithms for on-line covariance 
> computations, [http://infoserve.sandia.gov/sand_doc/2008/086212.pdf]. I have 
> provided a simple class for implementing this algorithm. It would be nice to 
> have this integrated into org.apache.commons.math.stat.correlation.Covariance.
> {code}
> // This code is granted for inclusion in the Apache Commons under the terms of
> // the ASL.
> public class StorelessCovariance{
>     private double deltaX = 0.0;
>     private double deltaY = 0.0;
>     private double meanX = 0.0;
>     private double meanY = 0.0;
>     private double N=0;
>     private Double covarianceNumerator=0.0;
>     private boolean unbiased=true;
>     public StorelessCovariance(boolean unbiased){
>       this.unbiased = unbiased;
>     }
>     public void increment(Double x, Double y){
>         if(x!=null && y!=null){
>             N++;
>             deltaX = x - meanX;
>             deltaY = y - meanY;
>             meanX += deltaX/N;
>             meanY += deltaY/N;
>             covarianceNumerator += ((N-1.0)/N)*deltaX*deltaY;
>         }
>         
>     }
>     public Double getResult(){
>         if(unbiased){
>             return covarianceNumerator/(N-1.0);
>         }else{
>             return covarianceNumerator/N;
>         }
>     }   
> }
> {code}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
