Re: [R] PCA IN R

2007-09-10 Thread Bjørn-Helge Mevik
prcomp() in stats handles matrices with n < p well, IMO.
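
For instance, a minimal sketch with made-up data (object names are just illustrative):

set.seed(1)
X <- matrix(rnorm(10 * 50), nrow = 10, ncol = 50)  # 10 observations, 50 variables
pca <- prcomp(X, scale. = TRUE)
dim(pca$rotation)  # 50 x 10: at most min(n, p) components for a wide matrix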

-- 
Bjørn-Helge Mevik

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] xeon processor and ATLAS

2007-08-31 Thread Bjørn-Helge Mevik
Jeffrey J. Hallman wrote:

 I've been doing econometrics for nearly 20 years, and have not yet run across
 a situation that called for looking at a 1000 x 1000 matrix.  I tend not to
 believe analyses with more than a dozen explanatory variables.

In NIR spectroscopy, it is common to have at least 1000 variables, and
in NMR or MS, you easily get spectra with many thousands of variables.

You seldom get 1000 spectra, though. :-)  10 to 100 is more common.

-- 
B/H

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Streamlining Prcomp Data

2007-08-03 Thread Bjørn-Helge Mevik
Try this

result <- summary(prcomp(USArrests))
names(result)
M <- result$importance
M[2,]

The labels are the dimnames of the importance matrix.  They only
show up when the matrix is printed.  If you wish, you can remove
them with dimnames(M) <- NULL.
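
For example, continuing the code above:

dimnames(M) <- NULL
M[2,]  # same numbers, but without the PC1, PC2, ... labels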

-- 
Bjørn-Helge Mevik

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] about R, RMSEP, R2, PCR

2007-07-06 Thread Bjørn-Helge Mevik
Nitish Kumar Mishra wrote:

 I want to calculate PLS package in R. Now I want to calculate R, MSEP,
 RMSEP and R2 of PLSR and PCR using this.
 I also add this in library of R. How I can calculate R, MSEP, RMSEP and R2
 of PLSR and PCR in R.
 I s any other method then please also suggest me. Simply I want to
 calculate these value.

I'm not entirely sure what you are asking about, but if you want to
calculate R, MSEP, RMSEP and R2 for PLSRs and PCRs with the pls
package, this should work:

library(pls)
data(yarn)
mymodel <- plsr(density ~ NIR, ncomp = 10, data = yarn) # or pcr()

See ?plsr for further options, especially 'validation' for using
cross-validation.

## R2:
R2(mymodel)

## MSEP:
MSEP(mymodel)

## RMSEP:
RMSEP(mymodel)

See ?R2, etc. for further arguments, especially 'estimate' for
selecting the estimator (test set, CV, or train).
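
A hedged sketch combining the two (the object name 'cvmodel' is just
illustrative; the argument values are taken from the pls help pages):

cvmodel <- plsr(density ~ NIR, ncomp = 10, data = yarn, validation = "CV")
RMSEP(cvmodel, estimate = "CV")  # cross-validated RMSEP per component
R2(cvmodel, estimate = "CV")     # cross-validated R2 per component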

The objects returned by these functions have a plot method, so
plot(RMSEP(...)) does what you'd expect.  See ?R2 for details about
the returned objects.

To get R (I presume you mean the correlation between predicted and
measured values), you can use

sqrt(R2(mymodel)$val)


-- 
HTH,
Bjørn-Helge Mevik

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R-About PLSR

2007-05-29 Thread Bjørn-Helge Mevik
Nitish Kumar Mishra wrote:

 I have installed PLS package in R and use it for princomp & prcomp
 commands for calculating PCA using its example file(USArrests example).

Uhm.  These functions and data sets are not in the pls package; they
are in the stats and datasets packages that come with R.

 But How I can use PLS for Partial least square, R square, mvrCv one more
 think how i can import external file in R. When I use plsr, R2, RMSEP it
 show error could not find function plsr, RMSEP etc.
 How I can calculate PLS, R2, RMSEP, PCR, MVR using pls package in R.

There is an Rnews article describing the package¹, and a paper in
Journal of Statistical Software².

¹Mevik, B.-H. (2006); The pls package; R News 6(3), 12--17.
http://cran.r-project.org/doc/Rnews

²Mevik, B.-H., Wehrens, R. (2007); The pls Package: Principal
Component and Partial Least Squares Regression in R; Journal of
Statistical Software  18(2), 1--24.
http://www.jstatsoft.org/v18/i02/v18i02.pdf

-- 
Bjørn-Helge Mevik

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] About PLR

2007-04-17 Thread Bjørn-Helge Mevik
From within R, you can give the command

 install.packages("pls")

and R will download and install it for you (as long as you have access
to the Internet).

To install an already downloaded package, you can use

R CMD INSTALL pls_2.0-0.tar.gz

in a terminal window.

-- 
Bjørn-Helge Mevik

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] PCA (prcomp) details info.

2007-01-14 Thread Bjørn-Helge Mevik
Francesco Savorani wrote:

 I'm handling a matrix dataset composed by a number of variables much
 higher than the objects (900 vs 100) and performing a prcomp
 (centered and scaled) PCA on it. What I get is a Loadings (rotation)
 matrix limited by my lower number of objects and thus 900x100
 instead of 900x900. If I try to manually calculate the matrix scores
 multiplying the original variables (centered and scaled) for such a
 loadings matrix I cannot obtain the same values calculated by R and
 stored on the prcomp$x matrix (100x100).

This works for me:

M <- matrix(rnorm(900*100), ncol = 900)
pca <- prcomp(M, scale = TRUE)
S <- scale(M) %*% pca$rotation
all.equal(S, pca$x) ## => TRUE

-- 
Bjørn-Helge Mevik

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] [R-pkgs] pls version 2.0-0

2007-01-02 Thread Bjørn-Helge Mevik and Ron Wehrens
Version 2.0-0 of the pls package is now available on CRAN.

The pls package implements partial least squares regression (PLSR) and
principal component regression (PCR).  Features of the package include

- Several plsr algorithms: orthogonal scores, kernel pls and simpls
- Flexible cross-validation
- A formula interface, with traditional methods like predict, coef,
  plot and summary
- Functions for extraction of scores and loadings, and calculation of
  (R)MSEP and R2
- Functions for plotting predictions, validation statistics,
  coefficients, scores, loadings, biplots and correlation loadings.

The main changes since 1.2-0 are

- There is now an options mechanism for selecting default fit algorithms.
  See ?pls.options.
- loadingplot() and coefplot() now try to be more intelligent when plotting
  x axis labels.
- The handling of factors in X has been improved, by changing the way the
  intercept is removed from the model matrix.
- All PLSR and PCR algorithms, as well as mvrCv(), have been optimised.
  Depending on the algorithm used, the size of the matrices, and the number
  of components used, one can expect from 5% to 65% reduction in
  computation time.
- Scaling of scores and loadings of kernel PLS and svd PCR algorithm has
  changed.  They are now scaled using the `classic' scaling found in
  oscorespls.
- The argument `ncomp' now always means the number of components, and `comps'
  always means a component number.  The argument `cumulative' has been
  removed.
- A new data set 'gasoline' has been included.
- The 'NIR' and 'sensory' data sets have been renamed to 'yarn' and 'oliveoil'.


See the file CHANGES in the sources for all changes.

-- 
Bjørn-Helge Mevik

___
R-packages mailing list
R-packages@stat.math.ethz.ch
https://stat.ethz.ch/mailman/listinfo/r-packages

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Why the factor levels returned by cut() are not ordered?

2006-11-29 Thread Bjørn-Helge Mevik
Wolfram Fischer wrote:

 What is the reason, that the levels of the factor
 returned by cut() are not marked as ordered levels?

I don't know, but you can always make it ordered with

ordered(cut(breaks = 3, sample(10)))
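
For instance, a minimal check (made-up data) that the wrapper does mark the
levels as ordered:

x <- sample(10)
f <- cut(x, breaks = 3)
is.ordered(f)            ## FALSE: cut() returns an unordered factor
is.ordered(ordered(f))   ## TRUE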

 help(factor)
 ...
 If 'ordered' is 'TRUE', the factor levels are assumed to be ordered.
 ...

The help file for factor() probably doesn't tell you much about how
cut() works. :-)

-- 
Bjørn-Helge Mevik

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Look for a R Package to locate peak data point in one dimensional data set

2006-11-07 Thread Bjørn-Helge Mevik
I think the Bioconductor package PROcess has functions for that.

-- 
Bjørn-Helge Mevik

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] can lm() automatically take in the independent variables without knowing the names in advance

2006-10-09 Thread Bjørn-Helge Mevik
HelponR wrote:

 I am trying to use lm to do a simple regression but on a batch of
 different files.

 Each file has different column names.

 I know the first column is the dependent variable and all the rest are
 explanatory variables.

I believe

 lm(data = thedataframe)

(i.e. with no formula!) will use the first column as response and the
rest as predictors.

-- 
Bjørn-Helge Mevik

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] A statement over multiple lines (i.e. the ... feature in Matlab)

2006-10-05 Thread Bjørn-Helge Mevik
Robin Hankin wrote:

 For the line breaking, R deals with incomplete lines by not
 executing the statement until you finish it.

Beware, however, that syntactically valid lines do get executed
immediately (at least at the prompt).  So

1 + 2
- 3

will be interpreted as two commands (returning 3 and -3,
respectively), while

1 + 2 -
3

will be interpreted as a single command (returning 0).

-- 
Bjørn-Helge Mevik

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] [R-pkgs] New package ffmanova for 50-50 MANOVA released

2006-08-31 Thread Bjørn-Helge Mevik and Øyvind Langsrud
Version 0.1-0 of a new package `ffmanova' is now available on CRAN.

Comments, suggestions, etc. are welcome.  Please use the email address
ffmanova (at) mevik.net.

The package implements 50-50 MANOVA (Langsrud, 2002) with p-value
adjustment based on rotation testing (Langsrud, 2005).

The 50-50 MANOVA method is a modified variant of classical MANOVA made
to handle several highly correlated responses.  Classical MANOVA
performs poorly in such cases and it collapses when the number of
responses exceeds the number of observations.  The 50-50 MANOVA method
is suggested as a general method that will handle all types of data.
Principal component analysis is an integrated part of the algorithm.

The single response special case is ordinary general linear modeling.
Type II sums of squares are used to handle unbalanced designs
(Langsrud, 2003).  Furthermore, the Type II philosophy is extended to
continuous design variables.  This means that the method is invariant
to scale changes.  Centering of design variables is not needed.  The
Type II approach ensures that common pitfalls are avoided.

A univariate F-test p-value for each response can be reported when
several responses are present.  However, with a large number of
response variables, these results are questionable since we will
expect a lot of type I errors (incorrect significance).  Therefore
the p-values need to be adjusted.

By using rotation testing it is possible to adjust the single response
p-values according to the familywise error rate criterion in an exact
and non-conservative (unlike Bonferroni) way.

It is also possible to adjust p-values according to a false discovery
rate criterion.  Our method is based on rotation testing and allows
any kind of dependence among the responses (Moen et al., 2005).

Note that rotation testing is closely related to permutation testing.
One difference is that rotation testing relies on the multinormal
assumption.  All the classical tests (t-test, F-test, Hotelling T^2
test, ...) can be viewed as special cases of rotation testing.


REFERENCES 

Langsrud, Ø. (2002), 50-50 Multivariate Analysis of Variance for
Collinear Responses, Journal of the Royal Statistical Society SERIES D
- The Statistician, 51, 305-317.

Langsrud, Ø. (2003), ANOVA for Unbalanced Data: Use Type II Instead of
Type III Sums of Squares, Statistics and Computing, 13, 163-167.

Langsrud, Ø. (2005), Rotation Tests, Statistics and Computing, 15, 53-60. 

Moen, B., Oust, A., Langsrud, Ø., Dorrell, N., Gemma, L., Marsden,
G.L., Hinds, J., Kohler, A., Wren, B.W. and Rudi, K. (2005), An
explorative multifactor approach for investigating global survival
mechanisms of Campylobacter jejuni under environmental conditions,
Applied and Environmental Microbiology, 71, 2086-2094.


-- 
Bjørn-Helge Mevik and Øyvind Langsrud

___
R-packages mailing list
R-packages@stat.math.ethz.ch
https://stat.ethz.ch/mailman/listinfo/r-packages

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Adding Grid lines

2006-08-22 Thread Bjørn-Helge Mevik
Perhaps ?grid will help you.

-- 
Bjørn-Helge Mevik

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] princomp and eigen

2006-07-15 Thread Bjørn-Helge Mevik
Murray Jorgensen wrote:

  set.seed(160706)
  X <- matrix(rnorm(40),nrow=10,ncol=4)
  Xpc <- princomp(X,cor=FALSE)
  summary(Xpc,loadings=TRUE, cutoff=0)
 Importance of components:
Comp.1Comp.2Comp.3 Comp.4
 Standard deviation 1.2268300 0.9690865 0.7918504 0.55295970
[...]

 I would have expected the princomp component standard deviations to be 
 the square roots of the eigen() $values and they clearly are not.

It's a 1/n vs. 1/(n-1) thing:

> eX <- eigen(var(X))
> sqrt(eX$values)
[1] 1.2931924 1.0215069 0.8346836 0.5828707
> sqrt(9/10 * eX$values)
[1] 1.2268300 0.9690865 0.7918504 0.5529597
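
Reusing X and Xpc from the quoted code above, a hedged restatement of the same
relation (princomp() divides by n, while var()/eigen() divide by n - 1):

n <- nrow(X)
all.equal(unname(Xpc$sdev), sqrt((n - 1) / n * eigen(var(X))$values))  ## TRUE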

-- 
Bjørn-Helge Mevik

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] problem with code/documentation mismatch

2006-06-26 Thread Bjørn-Helge Mevik
Just an idea:  how about using the \usage for the formal syntax, and
\synopsis for the user syntax, i.e. x/y ?

Not sure it will work, but it might be worth a try... :-)

-- 
Bjørn-Helge Mevik

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] R ``literal'' comand

2006-06-14 Thread Bjørn-Helge Mevik
capture.output(...)

If you want a single string, with newlines: 
paste(capture.output(...), collapse = "\n")
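
A small, hedged example of the idea (the names 'out' and 'one_string' are
just illustrative; 'cars' is a base dataset):

out <- capture.output(summary(cars))        # character vector, one element per line
one_string <- paste(out, collapse = "\n")   # single string with embedded newlines
cat(one_string)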

-- 
Bjørn-Helge Mevik

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Status of data.table package

2006-06-05 Thread Bjørn-Helge Mevik
Liaw, Andy wrote:

 I don't see it on CRAN, either, nor could I find mention of it in the R News
 you cited.

p. 66

-- 
Bjørn-Helge Mevik

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] PLS

2006-05-16 Thread Bjørn-Helge Mevik
jivan parab wrote:

 can u please give me the code written in matlab for partial least square
 regression

This is an email list about R, not about Matlab.

 will you pls provide me the code written in matlab and also the exlanatiion
 to each step

You not only want someone to do the work for you, but explain what
they did as well? :-)

My suggestion is: read a good introduction to PLSR (for instance the
one found in Martens & Næs (1989) Multivariate Calibration, Wiley).
If you are not able to implement the algorithm from the description
there, you are probably better off using an existing implementation,
for instance Barry Wise's chemometric toolbox.

-- 
Bjørn-Helge Mevik

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Keeping scientific format on assignment

2006-05-11 Thread Bjørn-Helge Mevik
Joe Byers wrote:

 For example, the box.test object has a p-value of 2e-14 when I do
 a <- box.test.object$p-value;
 a;
 the value of a is 0 not 2e-14.

The _value_ is still 2e-14 (up to machine precision).

 How do I keep the precision and format of the p-value.

You can format p-values in different ways using format.pval(), which
returns a string with the formatted value.  E.g.,

> format.pval(2e-14)
[1] "2e-14"

-- 
Bjørn-Helge Mevik

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] [R-pkgs] pls package: bugfix release 1.2-1

2006-04-27 Thread Bjørn-Helge Mevik
Version 1.2-1 of the pls package is now available on CRAN.

This is mainly a bugfix-release.  If you fit multi-response models,
you are strongly encouraged to upgrade!


The main changes since 1.2-0 are

- Fixed bug in kernelpls.fit() that resulted in incorrect results when fitting
  multi-response models with fewer responses than predictors
- Changed default radii in corrplot()
- It is now possible to select the radii of the circles in corrplot

See the file CHANGES in the sources for all changes.


The pls package implements partial least squares regression (PLSR) and
principal component regression (PCR).  Features of the package include

- Several plsr algorithms: orthogonal scores, kernel pls and simpls
- Flexible cross-validation
- A formula interface, with traditional methods like predict, coef,
  plot and summary
- Functions for extraction of scores and loadings, and calculation of
  (R)MSEP and R2
- A simple multiplicative scatter correction (msc) implementation
- Functions for plotting predictions, validation statistics,
  coefficients, scores, loadings, biplots and correlation loadings.

-- 
Bjørn-Helge Mevik

___
R-packages mailing list
[EMAIL PROTECTED]
https://stat.ethz.ch/mailman/listinfo/r-packages

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Question about PLS regression

2006-04-19 Thread Bjørn-Helge Mevik
Andris Jankevics wrote:

 But I have a sligthy different results with my real data. Where can the 
 problem be?

I think you have to supply some details for anyone to be able to
answer.  At least what you did (the code), what you got (the results)
and what you expected to get.

-- 
Bjørn-Helge Mevik

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] question about Principal Component Analysis in R?

2006-02-28 Thread Bjørn-Helge Mevik
Michael wrote:

 pca=prcomp(training_data, center=TRUE, scale=FALSE, retx=TRUE);

 Then I want to rotate the test data set using the

 d1=scale(test_data, center=TRUE, scale=FALSE) %*% pca$rotation;
 d2=predict(pca, test_data, center=TRUE, scale=FALSE);

 these two values are different

 min(d2-d1)
 [1] -1.976152
 max(d2-d1)
 [1] 1.535222

This is because you have subtracted a different mean vector.  You
should use the column means of the training data (as predict does;
see the last line of stats:::predict.prcomp):

d1=scale(test_data, center=pca$center, scale=FALSE) %*% pca$rotation;
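
A self-contained, hedged sketch of the same point with made-up data (the
names 'train', 'test', 'v1'..'v5' are only for illustration):

train <- matrix(rnorm(50 * 5), 50, 5)
test  <- matrix(rnorm(10 * 5), 10, 5)
colnames(train) <- colnames(test) <- c("v1", "v2", "v3", "v4", "v5")
pca <- prcomp(train, center = TRUE, scale. = FALSE)
d1 <- scale(test, center = pca$center, scale = FALSE) %*% pca$rotation
d2 <- predict(pca, test)
all.equal(d1, d2)  ## TRUE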


-- 
Bjørn-Helge Mevik

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] [R-pkgs] pls version 1.2-0

2006-02-23 Thread Bjørn-Helge Mevik and Ron Wehrens
Version 1.2-0 of the pls package is now available on CRAN.

The pls package implements partial least squares regression (PLSR) and
principal component regression (PCR).  Features of the package include

- Several plsr algorithms: orthogonal scores, kernel pls and simpls
- Flexible cross-validation
- A formula interface, with traditional methods like predict, coef,
  plot and summary
- Functions for extraction of scores and loadings, and calculation of
  (R)MSEP and R2
- A simple multiplicative scatter correction (msc) implementation
- Functions for plotting predictions, validation statistics,
  coefficients, scores, loadings, biplots and correlation loadings.

The main changes since 1.1-0 are

- predict() now handles missing data like the `lm' method does (the
  default is to predict `NA').
- fitted() and residuals() now return NA for observations with missing
  values, if na.action is na.exclude.
- `ncomp' is now reduced when it is too large for the requested 
  cross-validation.
- Line plot parameter arguments have been added to predplotXy(), so
  one can control the properties of the target line in predplot().
- MSEP(), RMSEP(), loadings(), loadingplot() and scoreplot() are now
  generic.


See the file CHANGES in the sources for all changes.

-- 
Ron Wehrens and Bjørn-Helge Mevik

___
R-packages mailing list
[EMAIL PROTECTED]
https://stat.ethz.ch/mailman/listinfo/r-packages

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] difference between rnorm(1000, 0, 1) and running rnorm(500, 0, 1) twice

2006-02-08 Thread Bjørn-Helge Mevik
Why don't you test it yourself?

E.g.,

set.seed(42)
bob1 <- rnorm(1000,0,1)
set.seed(42)
bob2 <- rnorm(500,0,1)
bob3 <- rnorm(500,0,1)
identical(bob1, c(bob2, bob3))

I won't tell you the answer. :-)

-- 
Bjørn-Helge Mevik

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Using R to process spectroscopic data

2006-02-08 Thread Bjørn-Helge Mevik
Dirk De Becker wrote:

 * Determine the range of the spectrum to be used - For this, I should 
 be able to calculate the regression coefficients

You can get the regression coefficients from a PLSR/PCR with the
coef() function.  See ?coef.mvr  However, using the regression
coefficients alone for selecting variables/regions, can be 'dangerous'
because the variables are highly correlated.
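
A hedged sketch of extracting the coefficients (uses the 'yarn' NIR data
shipped with pls; the names 'fit' and 'b' are just illustrative):

library(pls)
data(yarn)
fit <- plsr(density ~ NIR, ncomp = 5, data = yarn)
b <- coef(fit, ncomp = 5)     # array: predictors x responses x length(ncomp)
plot(b[, 1, 1], type = "l")   # regression coefficient "spectrum" across wavelengths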

One alternative is 'variable importance' measures, e.g. VIP (variable
importance in projections) as described in Chong, Il-Gyo & Jun,
Chi-Hyuck, 2005, Performance of some variable selection methods when
multicollinearity is present, Chemometrics and Intelligent Laboratory
Systems 78, 103--112.  A crude implementation of VIP can be found in
http://mevik.net/work/software/pls.html

Another alternative is to use jackknife-estimated uncertainties of the
regression coefficients in significance tests.  (I don't have any
reference or implementation, sorry. :-)

The correlation loadings can also give valuable information about
which variables that might be important for the regression.  See
?corrplot in the pls package.

-- 
Bjørn-Helge Mevik

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] 'all' inconsistent?

2006-01-31 Thread Bjørn-Helge Mevik
Seth Falcon wrote:

 On 29 Jan 2006, [EMAIL PROTECTED] wrote:

 On Sun, 29 Jan 2006, Elizabeth Purdom wrote:

 I came across the following behavior, which seems illogical to me.

 What did you expect and why?

 I don't know if it is a bug or if I'm missing something:

 all(logical(0))
 [1] TRUE

 All the values are true, all none of them.

 I thought all the values are false, all none of them, because there
 aren't any that are true:

 any(logical(0))
 [1] FALSE

But they are, all none of them:

> all(!logical(0))
[1] TRUE

:-)

And there aren't any FALSE values either:

> any(!logical(0))
[1] FALSE

so it is only logical that all none of them are TRUE.  I love the
empty set! :-)

-- 
Bjørn-Helge Mevik

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] [R-pkgs] New package lspls

2006-01-16 Thread Bjørn-Helge Mevik
Dear R useRs,

A new package `lspls' is now available on CRAN.  It implements the
LS-PLS (least squares--partial least squares) regression method,
described in for instance

Jørgensen, K., Segtnan, V. H., Thyholt, K., Næs, T. (2004) A Comparison of
Methods for Analysing Regression Models with Both Spectral and Designed
Variables; Journal of Chemometrics 18(10), 451--464.

The current version of the package (0.1-0) should probably be
considered `alpha software'.  Nothing is cast in iron yet, and
especially the formula interface and internal structure are apt to
change in future versions.  The software should, however, be fully
usable in its present form.

`lspls' currently includes fit and cross-validation functions with
formula interfaces, a predict method, and plots of scores, loadings
and (R)MSEP values.

Suggestions, bug reports and other comments are very welcome.

-- 
Sincerely,
Bjørn-Helge Mevik

___
R-packages mailing list
[EMAIL PROTECTED]
https://stat.ethz.ch/mailman/listinfo/r-packages

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] modifying code in contributed libraries - changes from versions 1.* to 2.*

2005-11-23 Thread Bjørn-Helge Mevik
Seth Falcon wrote:

 Actually, R source packages are also mangled.  While the source is
 readable, it is not in the form used to develop the package.

I haven't seen this behaviour.  At least for the simple package I'm
maintaining (pls), the only file in the source package that is changed
by R CMD build, is DESCRIPTION.  All .R and .Rd files are untouched
(even the modification dates are unchanged).  (This is on a Linux
system, I don't know how it works on MS/Mac.)

-- 
Bjørn-Helge Mevik

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] changing figure size in Sweave

2005-11-18 Thread Bjørn-Helge Mevik
TEMPL Matthias wrote:

 Use \setkeys{Gin} to modify figure sizes or use explicit
 \includegraphics commands in combination with Sweave option
 include=FALSE.

Or use \documentclass[nogin,...]{...}.  Then the 'Gin' will have no
effect, and the size of the plots in the document will not be changed
from the size given as <<...,height=??,width=??>>= (i.e. the size
produced by R).

-- 
Bjørn-Helge Mevik

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] changing figure size in Sweave

2005-11-18 Thread Bjørn-Helge Mevik
Bjørn-Helge Mevik wrote:

 Or use \documentclass[nogin,...]{...}.  Then the 'Gin' will have no
 effect, and the size of the plots in the document will not be changed
 from the size given as <<...,height=??,width=??>>= (i.e. the size
 produced by R).

A small correction:  using the 'nogin' doesn't make LaTeX ignore
settings of 'Gin', but prevents Sweave.sty from setting it.

-- 
Bjørn-Helge Mevik

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Insightful Announces: R and S-PLUS- Panel Discussion at 9th Annual 2005 User Conference

2005-10-18 Thread Bjørn-Helge Mevik
Michael O'Connell wrote:

 tools to make it easy to convert R packages to S-PLUS.

Not the other way around as well?

-- 
Bjørn-Helge Mevik

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] ISO R-programming docs/refs

2005-10-17 Thread Bjørn-Helge Mevik
[A lot of polite and constructive critique deleted]

 Is my impression correct that R is simply not well-documented enough
 for serious programming?

No.

 Have I missed a key reference to programming R?

Yes.

How about reading the text that R displays when it starts (and follow
its suggestions)?  Or visiting the canonical web site for R
(http://www.r-project.org/)?  Or consulting question 2.7 in the R FAQ
(2.7 "What documentation exists for R?")  Or reading the posting guide
for the list (http://www.R-project.org/posting-guide.html)?

All four methods would (presumably) quickly have led you to manuals
such as the "R Language Definition", "Writing R Extensions" and "R
Data Import/Export", and given references to books on programming S
and/or R.

-- 
Bjørn-Helge Mevik

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] [R-pkgs] pls version 1.1-0

2005-10-16 Thread Ron Wehrens and Bjørn-Helge Mevik
Version 1.1-0 of the pls package is now available on CRAN.

The pls package implements partial least squares regression (PLSR) and
principal component regression (PCR).  Features of the package include

- Several plsr algorithms: orthogonal scores, kernel pls and simpls
- Flexible cross-validation
- A formula interface, with traditional methods like predict, coef,
  plot and summary
- Functions for extraction of scores and loadings, and calculation of
  (R)MSEP and R2
- A simple multiplicative scatter correction (msc) implementation
- Functions for plotting predictions, validation statistics,
  coefficients, scores, loadings, biplots and correlation loadings.

The main changes since 1.0-3 are

- mvr, mvrCv and predict.mvr now has builtin support for scaling of X.
- A new function stdize for explicit centering and/or scaling.
- Correlation loadings plot (corrplot).
- New argument `varnames' in coefplot, to label the x tick marks with the
  variable names.
- loadingplot, coefplot and plot.mvrVal can now display legends, with the
  argument 'legendpos'.

See CHANGES in the sources for all changes.


-- 
Bjørn-Helge Mevik and Ron Wehrens

___
R-packages mailing list
[EMAIL PROTECTED]
https://stat.ethz.ch/mailman/listinfo/r-packages

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Help: PLSR

2005-09-05 Thread Bjørn-Helge Mevik
Shengzhe Wu writes:

 I have a data set with 15 variables (first one is the response) and
 1200 observations. Now I use pls package to do the plsr as below.

[...]

 Because the trainSet has been scaled before training, I think Xtotvar
 should be equal to 14, but unexpectedly Xtotvar = 16562,

Because the Xtotvar is the total X variation, measured by sum(X^2)
(where X has been centered).  With 14 variables, scaled to sd == 1,
and 1200 observations, you should get Xtotvar == 14*(1200-1) ==
16786.  (Maybe you have 1184 observations: 14*1183 == 16562.)
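
A quick, hedged check of that arithmetic with simulated data of the same shape:

X <- scale(matrix(rnorm(1200 * 14), 1200, 14))  # centered, unit-sd columns
sum(X^2)         # approximately 14 * (1200 - 1) = 16786
14 * (1184 - 1)  # 16562, matching the reported Xtotvar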

-- 
Bjørn-Helge Mevik

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Reference manual is not available in the help menu of the rgui

2005-09-02 Thread Bjørn-Helge Mevik
Sean O'Riordain writes:

 Actually, I've started reading the reference manual... :-)

 I printed it out 2-to-a-page and I'm working my way through it,

Ah!  This reminds me of the `good old days', reading the Emacs manual,
Emacs lisp manual, Gnu C library manual, ...  The payoff came in the
section giving the meaning of the C library error codes: EGREGIOUS
means `You did *what*?'.  :-)

-- 
Bjørn-Helge Mevik

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] PLSR: model notation and reliabilities

2005-08-29 Thread Bjørn-Helge Mevik
I.Ioannou writes:

 I have a model with 2 latent constructs (D1 and D2)
 each one made by 3 indicators (D1a, D1b, D1c etc).
 Also I have 2 moderating indicators (factors, m1, m2).
 The response (Y) is also a latent construct, with 3 
 indicators (Y1,Y2,Y3).

[...]

It seems to me that what you are looking for is some sort of
structural equation model (à la LISREL).  The pls package implements
partial least squares regression and principal component regression,
which is something different.  I guess you could still use plsr for the
outer model (path model), but you would have to build the inner
model (the constructs) with other tools, such as prcomp/princomp or
other factor analyses (see e.g. ?factanal and ?varimax).

Alternatively, there is an R package sem that implements structural
equation models.  You might want to take a look at that.

-- 
Bjørn-Helge Mevik

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] help: pls package

2005-07-22 Thread Bjørn-Helge Mevik
wu sz writes:

 trainSet = as.data.frame(scale(trainSet, center = T, scale = T))
 trainSet.plsr = mvr(formula, ncomp = 14, data = trainSet, method = 
 "kernelpls",
 CV = TRUE, validation = "LOO", model = TRUE, x = TRUE,
 y = TRUE)

[Two side notes here:
 1) scaling of the data (with its sd) should be performed inside the
 cross-validation.  In the current version of 'pls', one can use
 cvplsr <- crossval(plsr(y ~ scale(X), ncomp = 14, data = mydata),
length.seg = 1)
 (However, 'crossval' is slower than the built-in cross-validation on
 'mvr'/'plsr'.  In the development version of the package, scaling
 within the cross-validation has been implemented in the built-in
 cross-validation.  This will hopefully be published shortly.)

 2) The 'CV' argument is from the earlier 'pls.pcr' package, and is no
 longer used.  It is silently ignored.]

 i = 1; msep_element = c()
 while(i <= length(p)){
msep_element[,i] = (p[i]-y)^2
i = i + 1
 }

Hmm...  I don't see how you got that code to run.  This should work, though:

msep_element <- (p - y)^2

 msep = colMeans(msep_element)
 msep_sd = sd(msep_element)

You will get much closer to the true value with

sd(msep_element) / sqrt(length(y))

However, this will not produce an unbiased estimate of the sd of the
estimated MSEP, because it ignores the dependencies between the
residuals.  E.g., the residual when sample 1 is predicted is not
independent of the residual when sample 2 is predicted.  In general, I
think, it will produce underestimated sds.  The effect should be
largest for small data sets.

This is the reason the pls package currently doesn't estimate the se of
cross-validated MSEPs.  There is also the question of what the
estimate should be conditioned on: for leave-one-out
cross-validation, sd(MSEP | trainData) = 0.

[If someone knows how to calculate unbiased estimates of
cross-validated MSEPs, please let me know. :-)]

-- 
Bjørn-Helge Mevik

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] PLS: problem transforming scores to variable space

2005-07-15 Thread Bjørn-Helge Mevik
rainer grohmann writes:

 However, when I try to map the scores back to variable space, I ran into
 problems:
[...]
 cbind(t$scores[,1],(t$scores%*%(t$loadings)%*%t$projection)[,1])

You need to transpose the loadings:

> all.equal(unclass(t$scores),
+   t$scores %*% t(t$loadings) %*% t$projection)
[1] TRUE

(A tip: Since 't' is used for transposing, it is usually a Good
Thing(TM) to avoid using it as a variable name.)

-- 
Bjørn-Helge Mevik

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] add transformed columns into a matrix

2005-06-28 Thread Bjørn-Helge Mevik
Supposing 'inmatrix' is a matrix with column names 'x1', 'x2' and
'x3'; how about something like

model.matrix(~ (x1 + x2 + x3)^2 + log(x1) + log(x2) + log(x3) +
 sqrt(x1) + sqrt(x2) + sqrt(x3) - 1,
 as.data.frame(inmatrix))
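
A self-contained, hedged example of the same call with made-up data (the
name 'out' is just illustrative):

inmatrix <- matrix(runif(30, 1, 10), ncol = 3,
                   dimnames = list(NULL, c("x1", "x2", "x3")))
out <- model.matrix(~ (x1 + x2 + x3)^2 + log(x1) + log(x2) + log(x3) +
                      sqrt(x1) + sqrt(x2) + sqrt(x3) - 1,
                    as.data.frame(inmatrix))
colnames(out)  # main effects, log and sqrt terms, and the two-way interactions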

-- 
Bjørn-Helge Mevik

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] mvr function

2005-06-03 Thread Bjørn-Helge Mevik
McGehee, Robert writes:

 dataSet <- data.frame(y = vol[, 12])
 dataSet$X <- data.matrix(vol[, 1:11])

 ans.pcr <- pcr(y ~ X, 6, data = dataSet, validation = "CV")

 If there's a more elegant way of doing this without using data frames of
 matrices, I'd be interested as well.

I actually find using data frames with matrices the most elegant
way. :-)  Especially if you have several matrices.

Alternatively, to regress one variable of a data frame on the rest of
the variables, one can use

 ans.pcr <- pcr(y ~ ., 6, data = vol, validation = "CV")

(assuming the response variable is called `y' in the data frame; see
names(vol).)

One does not _have_ to store the data in a data frame (although I
would recommend it, because it is then easier to specify test data
sets and alternative data sets).  One can simply store the variables
in the global environment, and skip the `data' argument of `pcr',

-- 
HTH,
Bjørn-Helge Mevik

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] dot in formula

2005-06-03 Thread Bjørn-Helge Mevik
Adrian Baddeley writes:

 I want to manipulate a formula object, containing the name .
 so that . is replaced by a desired (arbitrary) expression.

How about

myf <- y ~ .
update(myf, . ~ -. + X)

-- 
HTH,
Bjørn-Helge Mevik

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] [R-pkgs] pls version 1.0-3

2005-05-26 Thread Bjørn-Helge Mevik
Version 1.0-3 of the pls package is now available on CRAN.

The pls package implements partial least squares regression (PLSR) and
principal component regression (PCR).  Features of the package include

- Several plsr algorithms: orthogonal scores, kernel pls and simpls
- Flexible cross-validation
- A formula interface, with traditional methods like predict, coef,
  plot and summary
- Functions for extraction of scores and loadings, and calculation of
  (R)MSEP and R2
- A simple multiplicative scatter correction (msc) implementation
- Functions for plotting predictions, validation statistics,
  coefficients, scores, loadings and biplots.

(The pls package is meant to supersede the pls.pcr package.)

-- 
Ron Wehrens and Bjørn-Helge Mevik

___
R-packages mailing list
[EMAIL PROTECTED]
https://stat.ethz.ch/mailman/listinfo/r-packages

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] How to break an axis?

2005-05-25 Thread Bjørn-Helge Mevik
What about simply using a log scale on the y axis? I.e. plot(..., log="y")

-- 
Bjørn-Helge Mevik

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Can't reproduce clusplot princomp results.

2005-05-24 Thread Bjørn-Helge Mevik
Thomas M. Parris writes:

 clusplot reports that the first two principal components explain
 99.7% of the variability.
[...]

 loadings(pca)
[...]
Comp.1 Comp.2 Comp.3 Comp.4
 SS loadings  1.00   1.00   1.00   1.00
 Proportion Var   0.25   0.25   0.25   0.25
 Cumulative Var   0.25   0.50   0.75   1.00

This has nothing to do with how much of the variability of the
original data that is captured by each component; it merely measures
the variability in the coefficients of the loading vectors (and they
are standardised to length one in princomp).

What you want to look at is pca$sdev, for instance something like

totvar <- sum(pca$sdev^2)
rbind("explained var" = pca$sdev^2,
  "prop. expl. var" = pca$sdev^2/totvar,
  cum.prop.expl.var = cumsum(pca$sdev^2)/totvar)
                     Comp.1    Comp.2      Comp.3       Comp.4
explained var     3.4093746 0.5785399 0.011560142 0.0005252824
prop. expl. var   0.8523437 0.1446350 0.002890036 0.0001313206
cum.prop.expl.var 0.8523437 0.9969786 0.999868679 1.0000000000

And as you can see, two comps explain 99.7%. :-)

-- 
Bjørn-Helge Mevik

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Pca loading plot lables

2005-04-25 Thread Bjørn-Helge Mevik
One way is to use the loadingplot() function in the package `pls':

molprop.pc <- princomp(whatever)

library(pls)
loadingplot(molprop.pc, scatter = TRUE, labels = "names")

(If you want comp 2 vs. comp 1:
loadingplot(molprop.pc, comps = 2:1, scatter = TRUE, labels = "names") )


-- 
Bjørn-Helge Mevik

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Thanks! (Was: Re: [R] R-2.1.0 is released)

2005-04-20 Thread Bjørn-Helge Mevik
I'd like to thank the developers in the Core Team for their great
work!  R has become an invaluable and indispensible tool for (at least)
me, much thanks to the hard and good work of the Core Team.

-- 
Bjørn-Helge Mevik

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Journal Quality R Graphs?

2005-03-01 Thread Bjørn-Helge Mevik
Werner Wernersen writes:

 the graphs look nice on the screen but when printed in black and
 white every color apart from black doesn't look very nice.

My advice is: If you want a black-and-white or grayscale printout, don't
plot in colors.

-- 
Bjørn-Helge Mevik

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] How to set up number of prin comp.

2005-02-25 Thread Bjørn-Helge Mevik
 I am trying to use PrinComp to do principle component analysis. I 
 would like to know how to set the number of principle components.

I assume you mean the function princomp (case _does_ matter in R) in
package stats (which is loaded by default).  This function has no way
of specifying how many components to calculate; it always gives you
all components.  You have to select the components you want
afterwards.  See help(princomp) for details.  E.g.

X <- <some matrix>
pc <- princomp(X)
pc$scores[,1:4]    # The four first score vectors
pc$loadings[,1:4]  # The four first loadings

(The loadings can also be extracted with loadings(pc)[,1:4] .)

-- 
Bjørn-Helge Mevik

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] using 'nice' with R

2005-02-23 Thread Bjørn-Helge Mevik
Roger D. Peng writes:

 On a Unix like system you can do `nice +19 R' or perhaps `nice +19 R
 CMD BATCH commands.R'.

At least on Suse (9.1) and Debian (3.0) Linux, the syntax is
`nice -19 R' (i.e. with `-', not `+'.)

-- 
Bjørn-Helge Mevik

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Setting the width and height of a Sweave figure

2005-01-17 Thread Bjørn-Helge Mevik
I haven't found any other solution than using

<<fig=TRUE,height=7,width=14>>=
theCode()
@

(but of course that doesn't have any effect when theCode() is used
interactively).

-- 
Bjørn-Helge Mevik

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] http://bugs.r-project.org down?

2004-12-09 Thread Bjørn-Helge Mevik
I haven't been able to connect to http://bugs.r-project.org the last
few days.  Is there a problem with the site (or am I having a
problem :-) ?

-- 
Bjørn-Helge Mevik

__
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Combined variable names

2004-12-02 Thread Bjørn-Helge Mevik
Peter Dalgaard writes:

 There are irregularities, e.g. the fact that you do help(foo), not
 help("foo"), but they tend to get a pain in the long run (How do you
 get help on a name contained in a variable?

v <- "lm"; help(v)
works for me :-)

(But I totally agree that the regularity of R (or S) is part of what
makes it so much better than for instance Matlab.  At least for me.)

-- 
Bjørn-Helge Mevik

__
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] LDA with previous PCA for dimensionality reduction

2004-11-25 Thread Bjørn-Helge Mevik
Torsten Hothorn writes:

 as long as one does not use the information in the response (the class
 variable, in this case) I don't think that one ends up with an
 optimistically biased estimate of the error

I would be a little careful, though.  The left-out sample in the
LDA-cross-validation, will still have influenced the PCA used to build
the LDA on the rest of the samples.  The sample will have a tendency
to lie closer to the centre of the complete PCA than of a PCA on the
remaining samples.  Also, if the sample has a high leverage on the
PCA, the directions of the two PCAs can be quite different.  Thus, the
LDA is built on data that fits better to the left-out sample than if
the sample was a completely new sample.

I have no proofs or numerical studies showing that this gives
over-optimistic error rates, but I would not recommend placing the PCA
outside the cross-validation.  (The same for any resampling-based
validation.)

-- 
Bjørn-Helge Mevik

__
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Recovering R Workspace

2004-11-05 Thread Bjørn-Helge Mevik
You probably don't need to re-install R; just remove or rename the
file .RData (it is probably located in your home directory, or in My
Documents on MSWin).  Then R should start without problems.

As for recovering the workspace, I believe that is a lost cause (unless
you study the file format and use a binary editor to extract/repair
objects in the file -- and even then, if the file was compressed, you
may be out of luck).

-- 
Bjørn-Helge Mevik

__
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] R-2.0: roadmap? release statements? plans?

2004-10-01 Thread Bjørn-Helge Mevik
Well, you could download the latest beta-release and look in the NEWS
file there.

-- 
Bjørn-Helge Mevik

__
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Sparse Matrices in R

2004-09-01 Thread Bjørn-Helge Mevik
Wolski [EMAIL PROTECTED] writes:

 Hi!
 help.search("sparse matrix")


 graph2SparseM(graph)    Coercion methods between graphs and sparse
 matrices
 tripletMatrix-class(Matrix)
 Class tripletMatrix sparse matrices in
 triplet form
 SparseM.hb(SparseM) Harwell-Boeing Format Sparse Matrices
 image,matrix.csr-method(SparseM)
 Image Plot for Sparse Matrices
 etc .

Which of course assumes that you already have packages such as
SparseM, Matrix and graph installed on your system.  If you don't,
help.search("sparse matrix") returns no matches.  :-)

-- 
Bjørn-Helge Mevik

__
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Precision in R

2004-07-21 Thread Bjørn-Helge Mevik
Since you didn't say anything about _what_ you did, either in SAS or
R, my first thought was:  Have you checked that you use the same
parametrization of the models in R and SAS?

-- 
Bjørn-Helge Mevik

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] center or scale before analyzing using pls.pcr

2004-02-05 Thread Bjørn-Helge Mevik
Jinsong Zhao [EMAIL PROTECTED] writes:

 I found pls.pcr package will give different results if the data are
 centered and scaled using scale().

Centering is done automatically by all implementations of PLSR I am
aware of (including pls.pcr, afaics).

 I am not sure about when I should scale my data,

There are no fixed rules about this.  Many practitioners live more or
less by the rule that unless the variables are `of the same type' or have
equal or comparable scales, they are scaled.  One example of data that
is typically not scaled (at least to begin with) is spectroscopic data.

 and whether the dependent variable should be scaled.

There is no need for scaling the dependent variable.

-- 
Bjørn-Helge Mevik

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Stepwise Regression and PLS

2004-02-03 Thread Bjørn-Helge Mevik
Liaw, Andy [EMAIL PROTECTED] writes:

 one needs to be lucky to have the first few PCs correlate well to
 the response in case of PCR.

Which is one reason PLSR is often preferred over PCR in at least the
field of chemometrics.  Since the components of PLSR maximise the
covariance with the response, the first few components are usually
more correlated to the response than PCs.  For spectroscopists, the
PLSR loadings are often very interpretable, and are much used to
qualitatively validate the model.

-- 
Bjørn-Helge Mevik

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] MATLAB to R

2004-02-02 Thread Bjørn-Helge Mevik
[EMAIL PROTECTED] writes:

 In MATLAB, I can write:

 for J=1:M
 Y(J+1)=Y(J)+ h * feval(f,T(J),Y(J));
 ...

 In R, I can write above as:

 for (J in 2:M)
 {
  y = y + h * f(t,y)
 ...
 }

Are you sure this gives the same result?  If Y and T in Matlab are
vectors, I believe

for (J in 1:M)
{
  y[J+1] <- y[J] + h * f(tt[J], y[J])
  ...
}

is what you want.  (Don't use `t' as a variable; t() is the function
to transpose a matrix.)

 for J=1:M
 k1 = feval(f,T(J),Y(J));
 k2 = feval(f,T(J+1),Y(J)+ h * k1

I assume you mean k1(J) = ... and k2(J) = ...

 How do I write k2 in R?
 k1 = f(t,y)
 k2 = ?

## If f can take vector arguments:
k1 <- f(tt[-M],y)
k2 <- f(tt[-1], y+h*k1)
## Otherwise:
for (J in 1:M) {
  k1[J] <- f(tt[J], y[J])
  k2[J] <- f(tt[J+1], y[J] + h*k1[J])
}

-- 
Hth,
Bjørn-Helge Mevik

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] nameless functions in R

2003-12-03 Thread Bjørn-Helge Mevik
Rajarshi Guha [EMAIL PROTECTED] writes:

 apply(x, c(2), funtion(v1,v2){ identical(v1,v2) }, v2=c(1,4,2))

 The above gives me a syntax error. I also tried:

No wonder!  Try with `function' instead of `funtion'.

-- 
Bjørn-Helge Mevik

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help


Re: [R] /usr/lib/R/library vs /usr/local/lib/R/site-library

2003-10-30 Thread Bjørn-Helge Mevik
Dirk Eddelbuettel [EMAIL PROTECTED] writes:

 On Wed, Oct 29, 2003 at 07:51:16AM -0800, A.J. Rossini wrote:

 /usr/lib/R/site-library is for
 apt-installed R packages (from CRAN, or Jim Lindsey's works), and

 It is not yet fully implemented as not all apt-get'able Debian
 packages of CRAN, Omegahat, ...

Forgive my ignorance; I just switched to Debian.  Are there R packages
(such as `car' or `vegan') that are apt-get'able?  How can I find out
which ones, and how do I set up apt to get them?

(I've added `deb http://cran.r-project.org/bin/linux/debian woody
main' to /etc/apt/sources.list, to get the latest version of R.)

-- 
Bjørn-Helge Mevik

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help


Re: [R] disappointed (card model)

2003-10-24 Thread Bjørn-Helge Mevik
Don't be disappointed, be glad:  It gives you the opportunity to
contribute by writing one yourself!

(Remember, R is developed by volunteers.)

-- 
Sincerely,
Bjørn-Helge Mevik

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help


Re: [R] princomp with more coloumns than rows: why not?

2003-10-20 Thread Bjørn-Helge Mevik
Thanks for good suggestions for alternatives to princomp!

My original question, though, was /why/ it was decided to disallow
more columns than rows in princomp.  (And also whether it would be
possible to augment the result from prcomp with the column means.)

-- 
Bjørn-Helge Mevik

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help


Re: [R] How to flip image?

2003-10-16 Thread Bjørn-Helge Mevik
Ernie Adorio [EMAIL PROTECTED] writes:

 If not possible, is there any built-in R command to reverse the rows of a 
 matrix?

How about Face[nrow(Face):1, ] ?

-- 
Bjørn-Helge Mevik

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help


[R] princomp with more coloumns than rows: why not?

2003-10-16 Thread Bjørn-Helge Mevik
As of R 1.7.0, princomp no longer accepts matrices with more columns
than rows.  I'm curious:  Why was this decision made?

I work a lot with data where more columns than rows is more of a rule
than an exception (for instance spectroscopic data).  To me, princomp
has two advantages over prcomp: 1) It has a predict method, and 2)
it has a biplot method.

A biplot method shouldn't be too difficult to implement (I believe
I've seen one on R-help).

A predict method seems to be more difficult, because the prcomp object
doesn't include the means that need to be subtracted from the new
data.  Would it break conformance with S to let prcomp return the
means as well?

-- 
Sincerely,
Bjørn-Helge Mevik

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help


Re: [R] anova(lme object)

2003-08-21 Thread Bjørn-Helge Mevik
It is documented in ?anova.lme:

 anova(res1, type="marginal")

and

 anova(res2, type="marginal")

should give equivalent tables.

-- 
Bjørn-Helge Mevik

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help


Re: [R] Marginal (type II) SS for powers of continuous variables ina linear model?

2003-08-14 Thread Bjørn-Helge Mevik
Prof Brian Ripley [EMAIL PROTECTED] writes:

 drop1 is the part of R that does type II sum of squares, and it works in 
 your example.  So does Anova in the current car:

I'm sorry, I should have included an example to clarify what I meant
(or point out my misunderstandings :-).  I'll do that below, but first
a comment:

 And in summary.aov() those *are* marginal SS, as balance is assumed
 for aov models. (That is not to say the software does not work otherwise, 
 but the interpretability depends on balance.)

Maybe I've misunderstood, but in the documentation for aov, it says
(under Details):
 This provides a wrapper to `lm' for fitting linear models to
 balanced or unbalanced experimental designs.

Also, is this example (lm(y~x+I(x^2), Df)) really balanced?  I think
of balance as the property that there is an equal number of
observations for every combination of the factors.  With x and x^2,
this doesn't happen.  For instance, x=1 and x^2=1 occurs once, but x=1
and x^2=4 never occurs (naturally).  Or have I misunderstood something?

Now, the example:

> Df2 <- expand.grid (A=factor(1:2), B=factor(1:2), x=1:5)
> Df2$y <- codes(Df2$A) + 2*codes(Df2$B) + 0.05*codes(Df2$A)*codes(Df2$B) +
+   Df2$x + 0.1*Df2$x^2 + 0.1*(0:4)
> Df2 <- Df2[-1,]   # Remove one observation to make it unbalanced

> ABx2.lm <- lm(y~A*B + x + I(x^2), data=Df2)

The SSs I call marginal are R(A | B, x, x^2), R(B | A, x, x^2),
R(A:B | A, B, x, x^2), R(x | A, B, A:B) and R(x^2 | A, B, A:B, x).

(Here, for instance, R(x | A, B, A:B) means the reduction of SSE due
to including x in a model when A, B and A:B (and the mean) are already
in the model. I've omitted the mean from the notation.)

> anova(ABx2.lm)
Analysis of Variance Table

Response: y
          Df Sum Sq Mean Sq   F value    Pr(>F)    
A          1  1.737   1.737   66.5700 1.801e-06 ***
B          1 13.647  13.647  523.0292 6.953e-12 ***
x          1 93.677  93.677 3590.1703 < 2.2e-16 ***
I(x^2)     1  0.583   0.583   22.3302 0.0003966 ***
A:B        1  0.011   0.011    0.4238 0.5263772    
Residuals 13  0.339   0.026
---
Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1 

This gives SSs of the form R(A), R(B | A), R(x | A, B) etc.  (If the
design had been balanced (in A, B and x), this would have been the
same as the marginal SSs above.)

> drop1(ABx2.lm)
Single term deletions

Model:
y ~ A * B + x + I(x^2)
       Df Sum of Sq   RSS     AIC
<none>              0.339 -64.486
x       1     1.188 1.527 -37.901
I(x^2)  1     0.592 0.931 -47.294
A:B     1     0.011 0.350 -65.877

This gives the SSs R(x | A, B, A:B, x^2), R(x^2 | A, B, A:B, x) and
R(A:B | A, B, x, x^2).  The SS for x is not marginal as defined
above.

> library(car)
> Anova(ABx2.lm)
Anova Table (Type II tests)

Response: y
          Sum Sq Df  F value    Pr(>F)    
A         5.1806  1 198.5470 2.979e-09 ***
B        19.6610  1 753.5074 6.778e-13 ***
x         1.1879  1  45.5245 1.368e-05 ***
I(x^2)    0.5922  1  22.6970 0.0003699 ***
A:B       0.0111  1   0.4238 0.5263772    
Residuals 0.3392 13
---
Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1 

This gives marginal SSs for A, B, x^2 and A:B, but as with drop1, the
SS for x is R(x | A, B, A:B, x^2).

The only way I've figured out to give the `correct' SS for x, i.e.,
R(x | A, B, A:B), is: 

> AB.lm <- lm(y~A*B, data=Df2)
> ABx.lm <- lm(y~A*B + x, data=Df2)
> anova(AB.lm, ABx.lm, ABx2.lm)
Analysis of Variance Table

Model 1: y ~ A * B
Model 2: y ~ A * B + x
Model 3: y ~ A * B + x + I(x^2)
  Res.Df    RSS Df Sum of Sq        F    Pr(>F)    
1     15 93.760                                     
2     14  0.931  1    92.829 3557.651 < 2.2e-16 ***
3     13  0.339  1     0.592   22.697 0.0003699 ***
---
Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1 

(The ABx2.lm is included to give the same error term to test against
as in the ANOVAs above.)

The bottom line of all this is that I think it would be nice if a
function like Anova in the car package returned R(x | A, B, A:B)
instead of R(x | A, B, A:B, x^2) as SS for x in a model such as the
above.

(I hope I've made myself clearer, and not insulted anyone by
oversimplifying. :-)

-- 
Bjørn-Helge Mevik

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help


[R] Marginal (type II) SS for powers of continuous variables in alinear model?

2003-08-14 Thread Bjørn-Helge Mevik
I've used Anova() from the car package to get marginal (aka type II)
sum-of-squares and tests for linear models with categorical
variables.  Is it possible to get marginal SSs also for continuous
variables, when the model includes powers of the continuous variables?

For instance, if A and B are categorical (factors) and x is
continuous (numeric),

Anova (lm (y ~ A*B + x, ...))

will produce marginal SSs for all terms (A, B, A:B and x).  However,
with 

Anova (lm (y ~ A*B + x + I(x^2), ...))

the SS for 'x' is calculated with I(x^2) present in the model, i.e. it
is no longer marginal.

Using poly (x, 2) instead of x + I(x^2), one gets a marginal SS for
the total effect of x, but not for the linear and quadratic effects
separately.  (summary.aov() has a 'split' argument that can be used to
get separate SSs, but these are not marginal.)


-- 
Bjørn-Helge Mevik

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help


Re: [R] Marginal (type II) SS for powers of continuous variables ina linear model?

2003-08-14 Thread Bjørn-Helge Mevik
Prof Brian D Ripley [EMAIL PROTECTED] writes:

 On Tue, 12 Aug 2003, [iso-8859-1] Bjørn-Helge Mevik wrote:

 Also, is this example (lm(y~x+I(x^2), Df)) really balanced?  I think

 No, and I did not use summary,aov on it!

And I didn't say you did!

 This gives the SSs R(x | A, B, A:B, x^2), R(x^2 | A, B, A:B, x) and
 R(A:B | A, B, x, x^2).  The SS for x is not marginal as defined
 above.

 But that *is* how `marginal' is usually defined.

Ok.

 Why should I(x^2) be regarded as subservient to x?

In polynomial regression, it is usual to first consider a linear
model, then a quadratic, and so forth.  The interesting tests are usually
then the effect of a power of x with all lower-degree terms of x in the
model.  I thought it would be natural to treat polynomials of
continuous variables similarly in models with categorical variables as
well.

-- 
Bjørn-Helge Mevik

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help


Re: [R] Generating .R and .Rd files with Sweave/noweb?

2003-01-09 Thread Bjørn-Helge Mevik
Paul

You're right.  My primary goal was to write all the code and
documentation in one file, and split this into one .R file and
multiple .Rd files.  I got the idea of using Sweave/noweb because I'm
using Emacs with ESS, and I'd like to be in R-mode when I'm in a code
part of the file, and in Rd-mode in a documentation part.  I guess
using two files and a shell script, as you do, might be the best
solution.

-- 
Bjørn-Helge Mevik

__
[EMAIL PROTECTED] mailing list
http://www.stat.math.ethz.ch/mailman/listinfo/r-help