Dear R-helpers,
I have performed a PLS regression with the mvr function from the pls.pcr package, and I
have two questions:
1- Do you know whether mvr automatically centers the data? It seems to me that it does.
2- Why, in the situation below, does the output say that the optimal number of latent
variables is 4? In my humble opinion it is 2, because the cross-validated RMS increases
(12.15 to 13.81) and the R2 decreases (0.51 to 0.41) when a third LV is added:
> summary(maturityCondor.raw.mvr)
Data:   X dimension: 8 1050
        Y dimension: 8 1
Method: SIMPLS
Number of latent variables considered: 1-7

TRAINING:
RMS table:
           [,1]
1 LV's 1.23e+01
2 LV's 6.79e+00
3 LV's 5.00e+00
4 LV's 2.17e+00
5 LV's 1.93e+00
6 LV's 7.79e-01
7 LV's 1.01e-09
Cumulative fraction of variance explained:
           X     Y
1 LV's 0.848 0.499
2 LV's 0.930 0.846
3 LV's 0.979 0.917
4 LV's 0.992 0.984
5 LV's 0.999 0.988
6 LV's 1.000 0.998
7 LV's 1.000 1.000

VALIDATION
Optimal number of latent variables: 4
RMS table (10-fold crossvalidation):
        [,1]
1 LV's 16.21
2 LV's 12.15
3 LV's 13.81
4 LV's  6.68
5 LV's  6.38
6 LV's  5.91
7 LV's 13.38
Coefficient of multiple determination (R2):
       [,1]
1 LV's 0.20
2 LV's 0.51
3 LV's 0.41
4 LV's 0.88
5 LV's 0.87
6 LV's 0.90
7 LV's 0.77
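
The two simplest readings of the cross-validation RMS table can be checked with a few lines of base R. This is only a sketch: cv_rms is a hypothetical name holding the values re-typed from the table above, and the code does not try to reproduce whatever rule mvr itself applies when it reports the optimum.

```r
# Cross-validated RMS values re-typed from the summary above (1 to 7 LVs).
# cv_rms is a hypothetical name used only for this illustration.
cv_rms <- c(16.21, 12.15, 13.81, 6.68, 6.38, 5.91, 13.38)

# Rule A: number of LVs with the smallest cross-validated RMS overall.
global_min <- which.min(cv_rms)

# Rule B: first local minimum, i.e. the first number of LVs after which
# the cross-validated RMS goes up when one more LV is added.
first_local_min <- which(diff(cv_rms) > 0)[1]

cat("Overall minimum of CV RMS at", global_min, "LVs\n")
cat("First local minimum of CV RMS at", first_local_min, "LVs\n")
```

On these values the overall minimum is at 6 LVs and the first local minimum is at 2 LVs, so neither of these two simple rules gives the reported value of 4.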
Thanks for your help,
Arnaud
*************************
Arnaud DOWKIW
Department of Primary Industries
J. Bjelke-Petersen Research Station
KINGAROY, QLD 4610
Australia
T : + 61 7 41 600 700
T : + 61 7 41 600 728 (direct)
F : + 61 7 41 600 760
**************************