[R] assessment of validity of PLS predictions

dx 5212 Wed, 18 Dec 2013 17:18:34 -0800

Hello,

I've built a PLSR model with the 'pls' package to predict the
concentrations of mixture components from experimental data. For each
sample used to build the model, I calculated the Q residual (lack of fit)
and the T squared value to assess how well the model describes that sample.
This is straightforward for these samples because the 'plsr' function
returns their X scores along with the X loadings of the model, and both are
needed to calculate the Q residual and T squared value.


I'd like to calculate the Q residual and T squared values for prediction
samples whose concentrations aren't known. However, the 'plsr' function
doesn't calculate the X scores for predictions. I can obtain them by
solving the equation (in R code):

X = T %*% t(P)

for T:

t(T) = Pinv %*% t(X)

where X is the data matrix from the prediction samples, T is the X scores
matrix for the prediction samples, P is the X loadings matrix from the
model, and Pinv is the pseudo-inverse of P (for example, MASS::ginv(P)).
From these calculated X scores of the prediction samples and the X loadings
of the model, I can calculate the T squared and Q residual values.
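
To make the calculation above concrete, here is a minimal base-R sketch of it.
It is only an illustration under stated assumptions, not the 'pls' package's
own method: the helper name t2_q_new and all of its arguments are
hypothetical, the new data are assumed to need centering with the training
column means, P is assumed to have full column rank, and T squared is taken
as the Mahalanobis distance of the new scores relative to the training
scores. The demo data are simulated.

```r
# Hypothetical helper: T squared and Q residuals for new samples.
# X_new   : n x p matrix of new measurements
# P       : p x A matrix of X loadings from the model
# x_means : length-p vector of training column means (for centering)
# T_train : n_train x A matrix of training X scores
t2_q_new <- function(X_new, P, x_means, T_train) {
  Xc <- sweep(X_new, 2, x_means, "-")      # center with the training means
  P_pinv <- solve(crossprod(P), t(P))      # (P'P)^-1 P', pseudo-inverse of P
  T_new <- t(P_pinv %*% t(Xc))             # n x A scores: t(T) = Pinv %*% t(X)
  E <- Xc - T_new %*% t(P)                 # part of X not reproduced by the model
  Q <- rowSums(E^2)                        # Q residual: squared residual norm per sample
  T2 <- mahalanobis(T_new,
                    center = colMeans(T_train),
                    cov = cov(T_train))    # distance in score space
  list(scores = T_new, T2 = T2, Q = Q)
}

# Simulated demo: 4 samples that lie in the loading space, 1 that does not.
set.seed(1)
P       <- matrix(rnorm(20 * 3), 20, 3)
T_train <- matrix(rnorm(50 * 3), 50, 3)
x_means <- rnorm(20)
T_true  <- matrix(rnorm(5 * 3), 5, 3)
X_new   <- sweep(T_true %*% t(P), 2, x_means, "+")
X_new[5, ] <- X_new[5, ] + rnorm(20, sd = 2)   # sample 5 deviates from the model

res <- t2_q_new(X_new, P, x_means, T_train)
print(round(res$Q, 3))
print(round(res$T2, 3))
```

With a fitted model called, say, fit, the inputs would presumably be
unclass(loadings(fit)), fit$Xmeans and unclass(scores(fit)). Note that
predict(fit, newdata, type = "scores") in the 'pls' package also returns
scores for new samples, and those may differ from the least-squares
projection sketched here.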

My main question: Is my approach a reasonable one to identify samples that
may not be well described by a given model? If not, can anyone direct me to
a resource that describes better methods?

Thanks


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
