Hello Toby -
I couldn't follow your formulae in detail. There is a good explanation
of how the covariance matrices are used to compute canonical correlations in
   "The Foundations of Factor Analysis", Stanley Mulaik [1972 (?)]
From that I derived a method which essentially uses only loadings matrices
and rotations to find the coefficients of the canonical correlations. I have
always found the 1:1 correspondence between the covariance approach and my
rotational approach a bit complicated, and I don't want to work it out
didactically here - but maybe you can manage to find it yourself; my
procedure is rather simple once the idea is understood. See the protocol
at the bottom of this posting.
It's a bit lengthy - I hope it helps, anyway...
Gottfried Helms
=================================================
Protocol:
--------------------------------------------------
First, I start with an orthogonal factor-loadings matrix in which all
items are involved. This is simply a Cholesky decomposition of
their correlation matrix R, consisting of the four blocks
   [Rxx,Rxy]
   [Ryx,Ryy]
LAD is then the Cholesky loadings matrix, containing the common loadings
and the partial variances (a small numpy sketch of this step follows the
table below):
   [Lxx,Lxy]   [Lxx,  0    ]
   [Lyx,Lyy] = [Lyx, Lyy.yx]
LAD:
X1 | 1.000 . . . . |
X2 | 0.218 0.976 . . . |
X3 | 0.623 -0.029 0.782 . . |
--------+------------------------------------------------------------------------+
Y1 | 0.236 0.268 0.153 0.921 . |
Y2 | 0.673 0.553 0.354 0.123 0.319 |
--------+------------------------------------------------------------------------+
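(If you want to check this step outside of my matrix program: here is a small
numpy sketch, only to show the matrix algebra. It rebuilds R from the rounded
blocks of the LAD above and verifies that the Cholesky factor of R is exactly
this block-triangular loadings matrix.)

  import numpy as np

  # blocks of LAD, copied (rounded to 3 digits) from the table above
  Lxx  = np.array([[1.000, 0.000, 0.000],
                   [0.218, 0.976, 0.000],
                   [0.623,-0.029, 0.782]])   # loadings of X1..X3
  Lyx  = np.array([[0.236, 0.268, 0.153],
                   [0.673, 0.553, 0.354]])   # common loadings of Y1,Y2 on the X-factors
  Lyyx = np.array([[0.921, 0.000],
                   [0.123, 0.319]])          # partial (residual) loadings of Y1,Y2

  LAD = np.block([[Lxx, np.zeros((3, 2))],
                  [Lyx, Lyyx]])

  R = LAD @ LAD.T                            # R = [[Rxx,Rxy],[Ryx,Ryy]]
  print(np.round(R, 3))
  print(np.allclose(np.linalg.cholesky(R), LAD))   # Cholesky gives LAD back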
;h Overview
;  Set-X (in the notation of MULAIK ("[MU]"): Z)
;  Set-Y ([MU]: Y)
;i Find orthogonal components
;i and work with these components instead of the original items:
;i    OX as orthogonal components of X  ([MU]: ->X)
;i    OY as orthogonal components of Y  ([MU]: ->W)
;i Also find the parts of the original (and of these orthogonal) items'
;i (co)variance which are predictable by the other set:
;i    PX as the predictable part of OX (the first 2 columns/factors)
;i    PY as the predictable part of OY (the first 3 columns/factors; in fact
;i       only 2 of them are meaningful)
;i PX and PY, rotated to PC-position, then mark the canonical factors
;i CX and CY of the original items X and Y.
;i
;i To have the loadings of the original items on these factors, all of this
;i must be done in one common vector space; thus when the new orthogonal
;i components are created, they have to be appended to the given loadings
;i matrix (a small numpy illustration of this follows the next table).
;t find orthogonal components and append them as new composite items
;i Finding OX: since the initial Cholesky position has the variance
;i of X in the first three factors, we only need three new rows for the
;i loadings of the new variables:
[62] l2 = insertzl(l2,einh(3),6) // add three new rows representing the
                                 // three new orthogonal OX-items as
                                 // representatives of X
l2 :
X1 | 1.000 . . . . |
X2 | 0.218 0.976 . . . |
X3 | 0.623 -0.029 0.782 . . |
--------+------------------------------------------------------------------------+
Y1 | 0.236 0.268 0.153 0.921 . |
Y2 | 0.673 0.553 0.354 0.123 0.319 |
--------+------------------------------------------------------------------------+
OX1 | 1.000 . . . . |
OX2 | . 1.000 . . . |
OX3 | . . 1.000 . . |
--------+------------------------------------------------------------------------+
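(Again only as a numpy illustration of what the appended unit rows mean: the
new rows stand for the whitened components OX = inv(Lxx)*X, and multiplying
the extended loadings matrix with its transpose shows that the correlations
of the X-items with OX are just the entries of Lxx, while the OX themselves
are mutually uncorrelated.)

  import numpy as np

  Lxx  = np.array([[1., 0, 0], [.218, .976, 0], [.623, -.029, .782]])
  Lyx  = np.array([[.236, .268, .153], [.673, .553, .354]])
  Lyyx = np.array([[.921, 0], [.123, .319]])

  # the loadings matrix LAD with three unit rows (OX1..OX3) appended
  l2 = np.block([[Lxx, np.zeros((3, 2))],
                 [Lyx, Lyyx],
                 [np.eye(3), np.zeros((3, 2))]])

  C = l2 @ l2.T                  # correlations among X1..X3, Y1, Y2, OX1..OX3
  print(np.round(C[:3, 5:], 3))  # corr(X, OX)  - reproduces Lxx
  print(np.round(C[5:, 5:], 3))  # corr(OX, OX) - the identity: OX are orthogonal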
;i find OY
;i For finding OY as orthogonal components of Y we have to rotate the vector
;i space into triangular position, with Y1 and Y2 in the first factors
;i (a numpy emulation of this rotation follows the next table).
[63] l2 = rot(l2,"drei",4..5)    // rotate Y-variance into first factors
[66] l2 = insertzl(l2,einh(2),9) // add two new rows representing the
                                 // two new orthogonal OY-items as
                                 // representatives of Y
l2 :
X1 | 0.236 0.637 -0.276 -0.679 0.038 |
X2 | 0.313 0.611 -0.234 0.425 -0.542 |
X3 | 0.259 0.633 0.563 -0.460 0.060 |
--------+------------------------------------------------------------------------+
Y1 | 1.000 . . . . |
Y2 | 0.475 0.880 . . . |
--------+------------------------------------------------------------------------+
OX1 | 0.236 0.637 -0.276 -0.679 0.038 |
OX2 | 0.268 0.483 -0.178 0.587 -0.564 |
OX3 | 0.153 0.320 0.934 -0.026 0.025 |
--------+------------------------------------------------------------------------+
OY1 | 1.000 . . . . |
OY2 | . 1.000 . . . |
--------+------------------------------------------------------------------------+
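(I don't spell out the internals of the "drei"-rotation here; but its effect -
an orthogonal rotation which brings the Y-rows into leading triangular
position - can be emulated in numpy with a completed QR-decomposition, in
case someone wants to reproduce this step:)

  import numpy as np

  Lxx  = np.array([[1., 0, 0], [.218, .976, 0], [.623, -.029, .782]])
  Lyx  = np.array([[.236, .268, .153], [.673, .553, .354]])
  Lyyx = np.array([[.921, 0], [.123, .319]])
  l2 = np.block([[Lxx, np.zeros((3, 2))],
                 [Lyx, Lyyx],
                 [np.eye(3), np.zeros((3, 2))]])      # rows X1..X3, Y1, Y2, OX1..OX3

  # orthogonal rotation that makes the Y-rows (rows 3,4) triangular-leading
  Q, Rq = np.linalg.qr(l2[3:5, :].T, mode='complete') # Q is a 5x5 rotation matrix
  Q[:, :2] *= np.sign(np.diag(Rq[:2, :2]))            # fix signs: positive Y-diagonal
  l2rot = l2 @ Q

  print(np.round(l2rot[3:5, :2], 3))  # Y1,Y2 block -> approx. [[1 0],[0.475 0.880]]
  print(np.round(l2rot[5:8, :2], 3))  # OX1..OX3: their part predictable from Y (cf. table)

(The remaining three factor-columns of Q are an arbitrary orthonormal
completion, so they need not coincide with columns 3..5 of the table below;
only the leading, Y-related part is determined.)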
;h After orthogonal representatives for X and Y have been found, the canonical
;h factors can be found as the principal components of the mutually predicted
;h variance of these orthogonal representatives.
;h For that we have to identify the "predicted part" of OX and OY and then find
;h its principal components.
;i This predicted part relates to the Rxy-part of the whole correlation matrix,
;i and in a loadings matrix it is located in the first factors after an
;i appropriate triangular/Cholesky rotation.
In the above loadings matrix one of these parts is already visible: since it
is in triangular position with Y leading, the first two factors show the
predicted parts of X as well as of OX in the respective rows (a numpy sketch
after the excerpt below computes this block directly from the Cholesky blocks).
l2 : (cut from the l2 above):
X1 | 0.236 0.637
X2 | 0.313 0.611
X3 | 0.259 0.633
--------+-----------------------
Y1 | 1.000 .
Y2 | 0.475 0.880
--------+-----------------------
OX1 | 0.236 0.637
OX2 | 0.268 0.483
OX3 | 0.153 0.320
--------+-----------------------
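(This block can also be computed directly from the Cholesky blocks, without
any rotation: since Rxy = Lxx*Lyx', the predicted part of OX equals
Lyx'*inv(Ly)' = inv(Lxx)*Rxy*inv(Ly)', where Ly is the Cholesky factor of Ryy.
A small numpy sketch:)

  import numpy as np

  Lxx  = np.array([[1., 0, 0], [.218, .976, 0], [.623, -.029, .782]])
  Lyx  = np.array([[.236, .268, .153], [.673, .553, .354]])
  Lyyx = np.array([[.921, 0], [.123, .319]])

  Ryy = Lyx @ Lyx.T + Lyyx @ Lyyx.T       # correlation block of the Y-items
  Ly  = np.linalg.cholesky(Ryy)           # approx. [[1, 0], [0.475, 0.880]]

  # predicted part of OX, expressed on the factors that span Y
  P_OX = Lyx.T @ np.linalg.inv(Ly).T
  print(np.round(P_OX, 3))        # approx. the OX1..OX3 rows of the excerpt above
  print(np.round(Lxx @ P_OX, 3))  # approx. the X1..X3 rows (small rounding deviations)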
The canonical factors are the principal components of the mutually
predicted parts.
So we have to rotate *this part* of the orthogonal representations of the
original items into principal-components position; thus we do a PC-rotation
on the first two factors only, and for the OX-items only. We add new rows
(the new items PX1, PX2) as markers for further use.
[68] l2 = rot(l2,"pca",6..8,1..2) // PCs of predicted part of OX
l2 :
X1 | 0.678 0.044 -0.276 -0.679 0.038 |
X2 | 0.685 -0.037 -0.234 0.425 -0.542 |
X3 | 0.683 0.021 0.563 -0.460 0.060 |
--------+------------------------------------------------------------------------+
Y1 | 0.407 -0.913 . . . |
Y2 | 0.997 -0.075 . . . |
--------+------------------------------------------------------------------------+
OX1 | 0.678 0.044 -0.276 -0.679 0.038 |
OX2 | 0.551 -0.048 -0.178 0.587 -0.564 |
OX3 | 0.354 -0.009 0.934 -0.026 0.025 |
--------+------------------------------------------------------------------------+
OY1 | 0.407 -0.913 . . . |
OY2 | 0.913 0.407 . . . |
--------+------------------------------------------------------------------------+
PX1 | 1.000 . . . . |
PX2 | . 1.000 . . . |
--------+------------------------------------------------------------------------+
The loadings of Y1 and Y2 on these factor positions are already their loadings
on the canonical factors of X, and likewise the loadings of X1, X2 and X3 are
their loadings on their own canonical factors.
Note: appending PX1 and PX2 and using them instead of OX1 etc. "wraps" the
scaling of the components which you mention in your post, since it means that
only parts of OX1, OX2, OX3 are used; their partial variances/covariances,
which sum to a value < 1, are implicitly treated as rescaled to sum to 1 when
PX1 etc. are used.
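(In matrix terms this PC-rotation of the predicted block amounts to its
singular value decomposition, and the singular values are already the
canonical correlations; a numpy sketch, continuing with the blocks from
above:)

  import numpy as np

  Lxx  = np.array([[1., 0, 0], [.218, .976, 0], [.623, -.029, .782]])
  Lyx  = np.array([[.236, .268, .153], [.673, .553, .354]])
  Lyyx = np.array([[.921, 0], [.123, .319]])
  Ly   = np.linalg.cholesky(Lyx @ Lyx.T + Lyyx @ Lyyx.T)

  P_OX = Lyx.T @ np.linalg.inv(Ly).T     # predicted part of OX (see sketch above)

  U, s, Vt = np.linalg.svd(P_OX, full_matrices=False)
  print(np.round(s, 3))                  # [0.942 0.066] - the canonical correlations

  # loadings after the PC-rotation, up to the signs of the two columns:
  print(np.round(P_OX @ Vt.T, 3))        # OX1..OX3 on the two new factors (cf. table)
  print(np.round(Ly @ Vt.T, 3))          # Y1, Y2 on the same factors      (cf. table)
  print(np.round(Lxx @ P_OX @ Vt.T, 3))  # X1..X3                          (cf. table)

(The printed values should agree with the table above up to the column signs
and small deviations coming from the 3-digit rounding of the loadings.)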
To do the same the other way round, for the Y-canonical factors, we find the
common variance of Y that is explained by X, and again mark it with new rows.
This relates to the Ryx-part of the correlation matrix, or the Lyx-part of the
Cholesky loadings.
First the loadings matrix must be rotated to the appropriate triangular
position, so that the X-variance is in the first three factors; then we can
work specifically on that part of the variance/covariance of Y which is
common with / predicted by X.
The PCs of this part are the canonical factors of Y, so we have to rotate
this part to PC-position (and mark the resulting factors with new rows as new
items PY).
;t Set PY :
;i   the part of OY predictable by set OX
[72] l2 = rot(l2,"drei",6..8) // triangular with OX-loadings in first 3 factors
;i bring it to PC-position
[73] l2 = rot(l2,"pca",9..10,1..3) // pc of predictable part of OY
[76] l2 = insertzl(l2,einh(2),13) // add rows marking these factors
l2 :
X1 | 0.719 -0.668 -0.191 . . |
X2 | 0.727 0.567 -0.387 . . |
X3 | 0.725 -0.325 0.607 . . |
--------+------------------------------------------------------------------------+
Y1 | 0.384 0.060 . 0.076 -0.918 |
Y2 | 0.940 0.005 . 0.328 -0.097 |
--------+------------------------------------------------------------------------+
OX1 | 0.719 -0.668 -0.191 . . |
OX2 | 0.584 0.730 -0.354 . . |
OX3 | 0.376 0.143 0.916 . . |
--------+------------------------------------------------------------------------+
OY1 | 0.384 0.060 . 0.076 -0.918 |
OY2 | 0.861 -0.027 . 0.332 0.385 |
--------+------------------------------------------------------------------------+
PX1 | 0.942 . . 0.334 -0.022 |
PX2 | . -0.066 . 0.066 0.996 |
--------+------------------------------------------------------------------------+
PY1 | 1.000 . . . . |
PY2 | . 1.000 . . . |
--------+------------------------------------------------------------------------+
The loadings of Y1 and Y2 on these factor positions are now already their
loadings on the canonical factors of Y, and likewise the loadings of X1, X2
and X3 are their loadings on the Y-canonical factors.
The canonical correlations are the correlations of the canonical factors with
each other, which can be found as the sum of the products of their loadings
along the rows, for instance for the loadings of PX1 and PY1:
      0.942*1.000 + 0*0 + 0*0 + 0.334*0 + (-0.022)*0
    = 0.942*1.000
    = 0.942
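(The mirrored computation for the Y-side in numpy: the predicted part of OY is
just the transposed block; its SVD gives the same singular values 0.942 and
0.066, and the item loadings come out as in the table above, again up to
column signs and rounding:)

  import numpy as np

  Lxx  = np.array([[1., 0, 0], [.218, .976, 0], [.623, -.029, .782]])
  Lyx  = np.array([[.236, .268, .153], [.673, .553, .354]])
  Lyyx = np.array([[.921, 0], [.123, .319]])
  Ly   = np.linalg.cholesky(Lyx @ Lyx.T + Lyyx @ Lyyx.T)

  P_OY = np.linalg.inv(Ly) @ Lyx          # predicted part of OY on the OX-factors
  U, s, Vt = np.linalg.svd(P_OY, full_matrices=False)

  print(np.round(s, 3))                   # [0.942 0.066] again
  print(np.round(Lxx @ Vt.T, 3))          # X1..X3  (cf. 0.719, 0.727, 0.725 above;
                                          #  the 2nd column is sensitive to rounding)
  print(np.round(P_OY @ Vt.T, 3))         # OY1, OY2 (cf. 0.384/0.060 and 0.861/-0.027)
  print(np.round(Ly @ P_OY @ Vt.T, 3))    # Y1, Y2   (cf. 0.384 and 0.940)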
Results:--------------------------------------------------------------------
;h The loadings of orthogonal factors on given items are already their
;  correlations with them, so listing the loadings also shows
;  the correlations of the items with the canonical factors
lx_SetY : (crosscorrelations)
             PY1     PY2
--------+-----------------------
X1 | 0.719 -0.668 |
X2 | 0.727 0.567 |
X3 | 0.725 -0.325 |
ly_SetY
Y1 | 0.384 0.060 |
Y2 | 0.940 0.005 |
;i loadings on first 2 canonical factors of X
lx_SetX :
X1 | 0.678 0.044 |
X2 | 0.685 -0.037 |
X3 | 0.683 0.021 |
--------+-----------------------
ly_SetX (crosscorrelations):
Y1 | 0.407 -0.913 |
Y2 | 0.997 -0.075 |
;i canonical correlations (loadings of canonical factors on each other)
PX1 | 0.942 . |
PX2 | . -0.066 |
--------+------------------------+
PY1 | 1.000 . |
PY2 | . 1.000 |
--------+------------------------+
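Finally, to connect this back to your question about the scaling of the
eigenvectors: the weight vectors have to be normed such that the canonical
variates get unit variance (B'*Rxx*B = I resp. A'*Ryy*A = I); then Rxx*B and
Ryy*A are the structure loadings, and Rxy*A, Ryx*B the cross-loadings.
Here is a small numpy cross-check (not SAS/IML), using the correlation blocks
implied by the loadings above; up to column signs, and depending on which set
one attributes the factors to, it reproduces the numbers of the protocol
(0.942/0.066 and the loadings 0.719/0.727/0.725, 0.678/0.685/0.683,
0.407/0.997, 0.384/0.940, within rounding):

  import numpy as np

  # correlation blocks rebuilt from the Cholesky loadings of the protocol
  Lxx  = np.array([[1., 0, 0], [.218, .976, 0], [.623, -.029, .782]])
  Lyx  = np.array([[.236, .268, .153], [.673, .553, .354]])
  Lyyx = np.array([[.921, 0], [.123, .319]])
  Rxx = Lxx @ Lxx.T
  Rxy = Lxx @ Lyx.T
  Ryy = Lyx @ Lyx.T + Lyyx @ Lyyx.T

  # the eigen-approach of the quoted SAS/IML code
  val, W = np.linalg.eig(np.linalg.inv(Rxx) @ Rxy @ np.linalg.inv(Ryy) @ Rxy.T)
  val, W = val.real, W.real
  keep = np.argsort(val)[::-1][:2]          # the two non-zero eigenvalues
  val, B = val[keep], W[:, keep]
  print(np.round(np.sqrt(val), 3))          # canonical correlations: [0.942 0.066]

  # the scaling in question: norm the weights so that the canonical variates
  # U = Xs*B get unit variance, i.e. diag(B'*Rxx*B) = 1
  B = B / np.sqrt(np.diag(B.T @ Rxx @ B))
  A = np.linalg.inv(Ryy) @ Rxy.T @ B / np.sqrt(val)   # matching Y-side weights

  # structure- and cross-loadings (columns may come out with flipped signs)
  print(np.round(Rxx @ B, 3))      # corr of X1..X3 with the X-side canonical variates
  print(np.round(Ryy @ A, 3))      # corr of Y1,Y2  with the Y-side canonical variates
  print(np.round(Rxy @ A, 3))      # corr of X1..X3 with the Y-side canonical variates
  print(np.round(Rxy.T @ B, 3))    # corr of Y1,Y2  with the X-side canonical variates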
===========================================================================
toby989 wrote:
>
> Hi All
>
> For a few days I have been trying to figure out what in my derivation of the
> loadings matrix does not work. The question is most likely how the matrix of
> eigenvectors needs to be scaled (and how to scale them) in order that the
> loadings matrix comes out right. In the code below, the eigenvalues seem
> correct, since SAS's proc cancorr also gives me these values that I get from
> calculating them manually with SAS/IML.
>
> read all var {PENET PCYCLE PRICE PVTSH PURHH} into X;  /*the IV*/
> read all var {FEAT DISP PCUT SCOUP MCOUP} into Y;      /*the dependents*/
> n = nrow (X);
> o = nrow (Y);                                          /*same as n in this dataset*/
> p = ncol (X);
> q = ncol (Y);                                          /*same as p in this dataset*/
> J = shape (1, n, n);
> K = shape (1, o, o);
> Xd = X - (J / n)` * X;
> Sx = (1 / (n - 1)) * Xd` * Xd;
> Xs = Xd * (sqrt (diag (Sx))` ** (-1))`;                /*n x p standardized IV matrix*/
> Yd = Y - (J / o)` * Y;
> Sy = (1 / (o - 1)) * Yd` * Yd;
> Ys = Yd * (sqrt (diag (Sy))` ** (-1))`;                /*o x q standardized DV matrix*/
> reset print;
> Rxx = (1 / (n - 1)) * Xs` * Xs;
> Rxy = (1 / (n - 1)) * Xs` * Ys;
> Ryy = (1 / (n - 1)) * Ys` * Ys;
> Ryx = (1 / (n - 1)) * Ys` * Xs;                        /*p x p correlation matrices*/
> A = eigvec (Rxx ** (-1) * Rxy * Ryy ** (-1) * Ryx);
> B = eigvec (Ryy ** (-1) * Ryx * Rxx ** (-1) * Rxy);    /*p x p matrix of eigenvectors*/
> aa = eigval (Rxx ** (-1) * Rxy * Ryy ** (-1) * Ryx);
> Da = sqrt (diag (aa[1:p, 1]));
> bb = eigval (Ryy ** (-1) * Ryx * Rxx ** (-1) * Rxy);
> Db = sqrt (diag (bb[1:p, 1]));       /*p x p diag matrix of canonical correlations, same as Da*/
> reset noprint;
> U = Xs * B;
> T = Ys * A;                                            /*n x p matrix of canonical variates*/
> reset print;
> F = Rxx * B;
> G = Ryy * A;         /*p x p canonical loadings where A and B probably need to be rescaled*/
> /*F = (1 / (n - 1)) * Xs` * U;*/
> /*G = (1 / (n - 1)) * Ys` * T;*/                       /*where U and T probably need to be rescaled*/
> /*F = (1 / (n - 1)) * Xs` * Xs * B;*/
> /*G = (1 / (n - 1)) * Ys` * Ys * A;*/                  /*F and G can also be computed this way*/
>
> I really would appreciate your help. Thanks in advance.
>
> Bye Toby
>
> John describes canonical correlation also, but does not mention how the
> eigenvectors are scaled to yield the canonical weights.
> http://www.biostat.wustl.edu/archives/html/s-news/1999-01/msg00104.html
--
----------------------------------------------------------------
Gottfried Helms Soz.Paed./Soz.Arb.
Universitaet Kassel
FB04 (Sozialwesen) und FG Praevention & Rehabilitation
D-34109 Kassel Moenchebergstr. 19 B
----------------------------------------------------------------
email: mailto:[EMAIL PROTECTED]
www: http://www.uni-kassel.de/~helms
================================================================