I have some simple questions regarding PLS (Partial Least Squares) Linear Prediction.
Say that X is an n by k matrix, where k is the number of dimensions of the data and n is the number of trials. (Have I transposed the standard convention?) Say that Y is an n-dimensional column vector that encodes one value per trial; we will only be predicting a single value below. We want to produce a predictor P, a k-dimensional row vector. Say PLS(X,Y) produces the predictor P using the PLS algorithm. Say A is an n by n orthogonal matrix and B is a k by k orthogonal matrix.

1. Is it true that PLS(AX, AY) = PLS(X,Y)? Is it true that PLS(XB, Y) = PLS(X,Y) B (matrix multiplication; since P is a row vector, B acts on the right)? Or are these group-invariance properties not exactly true because of the approximations and iterative estimation of the algorithm? If not, please explain. (A numerical sketch probing these identities follows the questions.)

2. Do we have to mean-center the data first?

3. From the most abstract point of view, what is the correct understanding of the relation of PLS to Principal Component Analysis?

4. Is there an easy formula for PLS(X,Y) if X is diagonal? (This would be the case if someone applied PCA to X first and the left and right invariance properties above were true.)

5. How do I write PLS(X,Y) recursively, using the PLS algorithm, in a simple way that relates it to PLS of other, simpler X's and Y's?

6. Is there an n-dimensional vector Y' such that PLS(X,Y) = ExactPredictor(X,Y')? Is there an easy formula for Y' if X is diagonal?

7. Why isn't the average of the true values the same as the average of the predicted values? Is there some L2-like average relating them? What is the formula relating them?

8. In what sense is PLS the optimal predictor?

9. Is PLS stable with respect to noise in X and Y?

10. If Y1 and Y2 are Y-like vectors, is PLS(X, Y1+Y2) = PLS(X,Y1) + PLS(X,Y2)? If not, what is the correct relation between them? Do you have to center both Y1 and Y2 first? (This is also probed in the sketch below.)

11. If you apply PCA to the n by k+1 matrix made up of X and Y, is there any relation between its eigenvalues and new coordinates and PLS(X,Y)? That is, can the predictor be thought of as a new coordinate of this PCA decomposition? If not, what is the formula? The same question is in order with ExactPredictor(X,Y) in place of PLS(X,Y).

12. If PLS(X,Y) = 0, what does this mean conceptually? Is PLS(X,0) = 0? If not, why?

13. If W is a k by k2 matrix (possibly with some restrictions, such as being sub-orthogonal), is PLS(XW, Y) = PLS(X,Y) W? If not, what is the relation? Is there any special relation if W is a projection matrix, i.e., if some of the data is zeroed? Here k2 may be bigger or smaller than k!

14. What is the best book explaining PCA and PLS from the point of view of the kind of abstract questions and conceptual thought processes in the spirit of these questions? Who are the world's leading authorities, in both theory and practice, on this material?
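To make questions 1 and 10 concrete, here is a minimal numerical sketch, not an authoritative answer. It assumes scikit-learn's PLSRegression as the PLS implementation, fitted with scale=False so that the algorithm mean-centers but does not rescale columns; the helper name pls_predictor and the test dimensions are illustrative choices, not anything fixed by the questions above.

import numpy as np
from scipy.stats import ortho_group
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(0)
n, k, ncomp = 50, 8, 3
X = rng.normal(size=(n, k))
Y = rng.normal(size=(n, 1))

def pls_predictor(X, Y):
    # Fit PLS and return the coefficient vector P as a flat array.
    # scale=False: mean-centering only, no per-column rescaling.
    model = PLSRegression(n_components=ncomp, scale=False).fit(X, Y)
    return model.coef_.ravel()  # coef_'s shape varies across sklearn versions

A = ortho_group.rvs(n, random_state=1)  # random n x n orthogonal matrix
B = ortho_group.rvs(k, random_state=2)  # random k x k orthogonal matrix
P = pls_predictor(X, Y)

# Question 1a: PLS(AX, AY) =? PLS(X, Y).  Expect failure in general:
# mean-centering does not commute with left multiplication by A unless
# A preserves the all-ones direction.
print("left invariance:  ", np.allclose(pls_predictor(A @ X, A @ Y), P))

# Question 1b: PLS(XB, Y) =? PLS(X, Y) B.  Centering acts across rows,
# so it commutes with right multiplication by B; expect this to hold
# up to numerical error.
print("right equivariance:", np.allclose(pls_predictor(X @ B, Y), P @ B))

# Question 10: PLS(X, Y1+Y2) =? PLS(X, Y1) + PLS(X, Y2).  The PLS weights
# depend nonlinearly on Y (they are normalized), so expect failure.
Y1 = rng.normal(size=(n, 1))
Y2 = rng.normal(size=(n, 1))
print("additivity in Y:  ",
      np.allclose(pls_predictor(X, Y1 + Y2),
                  pls_predictor(X, Y1) + pls_predictor(X, Y2)))

The point of the sketch is only to separate identities that can fail for algorithmic reasons (centering, weight normalization) from those that appear exact; a real answer would have to argue from the NIPALS recursion itself.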
