>>>>> Duncan Murdoch .... >>>>> on Thu, 17 May 2018 12:13:01 -0400 writes:
> On 17/05/2018 11:53 AM, Martin Maechler wrote: >>>>>>> Kevin Coombes ... on Thu, 17 >>>>>>> May 2018 11:21:23 -0400 writes: >> [..................] >> > [3] Should the documentation (man page) for "eigen" or >> > "mvrnorm" include a warning that the results can change >> > from machine to machine (or between things like 32-bit and >> > 64-bit R on the same machine) because of difference in >> > linear algebra modules? (Possibly including the statement >> > that "set.seed" won't save you.) >> The problem is that most (young?) people do not read help >> pages anymore. >> >> help(eigen) has contained the following text for years, >> and in spite of your good analysis of the problem you >> seem to not have noticed the last semi-paragraph: >> >>> Value: >>> >>> The spectral decomposition of ‘x’ is returned as a list >>> with components >>> >>> values: a vector containing the p eigenvalues of ‘x’, >>> sorted in _decreasing_ order, according to ‘Mod(values)’ >>> in the asymmetric case when they might be complex (even >>> for real matrices). For real asymmetric matrices the >>> vector will be complex only if complex conjugate pairs >>> of eigenvalues are detected. >>> >>> vectors: either a p * p matrix whose columns contain the >>> eigenvectors of ‘x’, or ‘NULL’ if ‘only.values’ is >>> ‘TRUE’. The vectors are normalized to unit length. >>> >>> Recall that the eigenvectors are only defined up to a >>> constant: even when the length is specified they are >>> still only defined up to a scalar of modulus one (the >>> sign for real matrices). >> >> It's not a warning but a "recall that" .. maybe because >> the author already assumed that only thorough users would >> read that and for them it would be a recall of something >> they'd have learned *and* not entirely forgotten since >> ;-) >> > I don't think you're really being fair here: the text in > ?eigen doesn't make clear that eigenvector values are not > reproducible even within the same version of R, and > there's nothing in ?mvrnorm to suggest it doesn't give > reproducible results. Ok, I'm sorry ... I definitely did not want to be unfair. I've always thought the remark in eigen was sufficient, but I'm probably wrong and we should add text explaining that it practically means that eigenvectors are only defined up to sign switches (in the real case) and hence results depend on the underlying {Lapack + BLAS} libraries and therefore are platform dependent. Even further, we could consider (optionally, by default FALSE) using defining a deterministic scheme for postprocessing the current output of eigen such that at least for the good cases where all eigenspaces are 1-dimensional, the postprocessing would result in reproducible signs, by e.g., ensuring the first non-zero entry of each eigenvector to be positive. MASS::mvrnorm() and mvtnorm::rmvnorm() both use "eigen", whereas mvtnorm::rmvnorm() *does* have method = "chol" which AFAIK does not suffer from such problems. OTOH, the help page of MASS::mvrnorm() mentions the Cholesky alternative but prefers eigen for better stability (without saying more). In spite of that, my personal recommendation would be to use mvtnorm::rmvnorm(.., method = "chol") { or the 2-3 lines of R code to the same thing without an extra package, just using rnorm(), chol() and simple matrix operations } because in simulations I'd expect the var-cov matrix Sigma to be far enough away from singular for chol() to be stable. Martin ______________________________________________ R-package-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-package-devel