[ 
https://issues.apache.org/jira/browse/MATH-789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13273620#comment-13273620
 ] 

Thomas Neidhart commented on MATH-789:
--------------------------------------

Thanks for the test.

My first investigation shows the following:

In the RectangularCholeskyDecomposition class, the following code does not 
actually find the maximal diagonal element:

{noformat}
   // find maximal diagonal element
   swap[r] = r;
   for (int i = r + 1; i < order; ++i) {
       int ii = index[i];
       int isi = index[swap[i]];
       if (c[ii][ii] > c[isi][isi]) {
         swap[r] = i;
       }
   }
{noformat}

As a result, the columns are ordered incorrectly, the loop finishes too early, 
and the rank of the matrix is computed wrongly. This can be fixed quite easily 
by changing index[swap[i]] to index[swap[r]], so that each candidate is 
compared against the best element found so far.
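
For illustration, here is a minimal standalone sketch of the pivot search with 
that change applied. The class and method names (PivotSearchSketch, findPivot) 
are hypothetical and not part of Commons Math; the sketch keeps the running 
maximum in a local variable instead of swap[r] and is not the actual patch.

```java
final class PivotSearchSketch {

    /**
     * Returns the position p in [r, index.length) whose permuted diagonal
     * entry c[index[p]][index[p]] is largest; ties keep the earliest position.
     */
    static int findPivot(double[][] c, int[] index, int r) {
        int best = r;
        for (int i = r + 1; i < index.length; ++i) {
            int ii = index[i];
            int ib = index[best]; // compare against the running maximum
            if (c[ii][ii] > c[ib][ib]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Diagonal of the second covariance matrix quoted in the report.
        double[] diag = {0.013445532, 0.023006616, 0.0, 0.019023866, 0.019107243};
        double[][] c = new double[diag.length][diag.length];
        for (int k = 0; k < diag.length; ++k) {
            c[k][k] = diag[k];
        }
        int[] index = {0, 1, 2, 3, 4};
        System.out.println(findPivot(c, index, 0));
    }
}
```

For the diagonal of the second matrix in the report, the search starting at 
r = 0 selects position 1 (0.023006616), which is the largest diagonal entry.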

The increment of r also seems to be wrong when the diagonal element is 
smaller than the user-defined limit.

With these changes the rank is correct, but the resulting root matrix is 
still wrong (root * root.transpose() != covariance), so the transformation 
of the matrix needs further review (I have not figured it out yet).
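
The root * root.transpose() check can be sketched on plain arrays as below. 
This is only an illustration: RootCheckSketch and maxReconstructionError are 
hypothetical names, not Commons Math API, and the tolerance to compare the 
result against is left to the caller.

```java
final class RootCheckSketch {

    /** Returns the largest absolute entry of root * root^T - cov. */
    static double maxReconstructionError(double[][] root, double[][] cov) {
        double worst = 0.0;
        for (int i = 0; i < cov.length; ++i) {
            for (int j = 0; j < cov.length; ++j) {
                // Entry (i, j) of root * root^T is the dot product of rows i and j.
                double sum = 0.0;
                for (int k = 0; k < root[i].length; ++k) {
                    sum += root[i][k] * root[j][k];
                }
                worst = Math.max(worst, Math.abs(sum - cov[i][j]));
            }
        }
        return worst;
    }

    public static void main(String[] args) {
        // Toy example: root is an exact factor of cov, so the error is zero.
        double[][] root = {{1.0, 0.0}, {2.0, 3.0}};
        double[][] cov = {{1.0, 2.0}, {2.0, 13.0}};
        System.out.println(maxReconstructionError(root, cov));
    }
}
```

A unit test could feed in the root matrix from the decomposition together 
with the original covariance matrix and require the returned error to stay 
below a small tolerance.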

Unfortunately there is no unit test for the RectangularCholeskyDecomposition 
yet, so this should be added in the process of fixing this issue. 
                
> Correlated random vector generator fails (silently) when faced with zero rows 
> in covariance matrix
> --------------------------------------------------------------------------------------------------
>
>                 Key: MATH-789
>                 URL: https://issues.apache.org/jira/browse/MATH-789
>             Project: Commons Math
>          Issue Type: Bug
>    Affects Versions: 3.0
>         Environment: JDK 1.6 / Eclipse Indigo on Ubuntu 10.04
>            Reporter: Gert van Valkenhoef
>         Attachments: MultivariateGaussianGeneratorTest.java
>
>
> The following three matrices (which are basically permutations of each other) 
> produce different results when sampling a multi-variate Gaussian with the 
> help of CorrelatedRandomVectorGenerator (sample covariances calculated in R, 
> based on 10,000 samples):
> Array2DRowRealMatrix{
> {0.0,0.0,0.0,0.0,0.0},
> {0.0,0.013445532,0.01039469,0.009881156,0.010499559},
> {0.0,0.01039469,0.023006616,0.008196856,0.010732709},
> {0.0,0.009881156,0.008196856,0.019023866,0.009210099},
> {0.0,0.010499559,0.010732709,0.009210099,0.019107243}}
> > cov(data1)
>    V1 V2 V3 V4 V5
> V1 0 0.000000000 0.00000000 0.000000000 0.000000000
> V2 0 0.013383931 0.01034401 0.009913271 0.010506733
> V3 0 0.010344006 0.02309479 0.008374730 0.010759306
> V4 0 0.009913271 0.00837473 0.019005488 0.009187287
> V5 0 0.010506733 0.01075931 0.009187287 0.019021483
> Array2DRowRealMatrix{
> {0.013445532,0.01039469,0.0,0.009881156,0.010499559},
> {0.01039469,0.023006616,0.0,0.008196856,0.010732709},
> {0.0,0.0,0.0,0.0,0.0},
> {0.009881156,0.008196856,0.0,0.019023866,0.009210099},
> {0.010499559,0.010732709,0.0,0.009210099,0.019107243}}
> > cov(data2)
>             V1 V2 V3 V4 V5
> V1 0.006922905 0.010507692 0 0.005817399 0.010330529
> V2 0.010507692 0.023428918 0 0.008273152 0.010735568
> V3 0.000000000 0.000000000 0 0.000000000 0.000000000
> V4 0.005817399 0.008273152 0 0.004929843 0.009048759
> V5 0.010330529 0.010735568 0 0.009048759 0.018683544 
> Array2DRowRealMatrix{
> {0.013445532,0.01039469,0.009881156,0.010499559},
> {0.01039469,0.023006616,0.008196856,0.010732709},
> {0.009881156,0.008196856,0.019023866,0.009210099},
> {0.010499559,0.010732709,0.009210099,0.019107243}}
> > cov(data3)
>             V1          V2          V3          V4
> V1 0.013445047 0.010478862 0.009955904 0.010529542
> V2 0.010478862 0.022910522 0.008610113 0.011046353
> V3 0.009955904 0.008610113 0.019250975 0.009464442
> V4 0.010529542 0.011046353 0.009464442 0.019260317
> I've traced this back to the RectangularCholeskyDecomposition, which does not 
> seem to handle the second matrix very well (decompositions in the same order 
> as the matrices above):
> CorrelatedRandomVectorGenerator.getRootMatrix() = 
> Array2DRowRealMatrix{{0.0,0.0,0.0,0.0,0.0},{0.0759577418122063,0.0876125188474239,0.0,0.0,0.0},{0.07764443622513505,0.05132821221460752,0.11976381821791235,0.0,0.0},{0.06662930527909404,0.05501661744114585,0.0016662506519307997,0.10749324207653632,0.0},{0.13822895138139477,0.0,0.0,0.0,0.0}}
> CorrelatedRandomVectorGenerator.getRank() = 5
> CorrelatedRandomVectorGenerator.getRootMatrix() = 
> Array2DRowRealMatrix{{0.0759577418122063,0.034512751379448724,0.0},{0.07764443622513505,0.13029949164628746,0.0},{0.0,0.0,0.0},{0.06662930527909404,0.023203936694855674,0.0},{0.13822895138139477,0.0,0.0}}
> CorrelatedRandomVectorGenerator.getRank() = 3
> CorrelatedRandomVectorGenerator.getRootMatrix() = 
> Array2DRowRealMatrix{{0.0759577418122063,0.034512751379448724,0.033913748226348225,0.07303890149947785},{0.07764443622513505,0.13029949164628746,0.0,0.0},{0.06662930527909404,0.023203936694855674,0.11851573313229945,0.0},{0.13822895138139477,0.0,0.0,0.0}}
> CorrelatedRandomVectorGenerator.getRank() = 4
> Clearly, the rank of each of these matrices should be 4. The first matrix 
> does not lead to incorrect results, but the second one does. Unfortunately, I 
> don't know enough about the Cholesky decomposition to find the flaw in the 
> implementation, and I could not find documentation for the "rectangular" 
> variant (also not at the links provided in the javadoc).


        
