[R] extract t-values from pairwise.t.test
Hi,

how can I extract the t-values after running a pairwise.t.test? The output just lists the p-values.

Many thanks for your help.

Cheers
Guido

Guido J. Parra
School of Tropical Environment Studies and Geography
James Cook University
Townsville Queensland 4811
Phone: 61 7 47815824
Fax: 61 7 47814020
Mobile: 0437630843
e-mail: [EMAIL PROTECTED]

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] R-help Digest, Vol 30, Issue 6
On Fri, 5 Aug 2005 Julia Reid wrote:
> Subject: [R] GAP pointer
>
> I am trying to do a simple segregation analysis using the GAP package. I have the documentation for pointer but I desperately need an example so that I can see how to format the datfile and the jobfile. For each individual, I have FamilyId, SubjectId, FatherId, MotherId, and AffectedStatus (0/1). I would like to obtain the likelihood ratio statistic for transmission. I would greatly appreciate any help on this subject.
>
> Best to all,
> Julia Reid

I wouldn't use Pointer myself (there are lots of more recent packages*), but look at the examples in

http://cedar.genetics.soton.ac.uk/pub/PROGRAMS/pointer/pointer.tar.Z

and the manual, which is in the book:

Morton N.E., Rao D.C., Lalouel J-M. (1983). Methods in Genetic Epidemiology. Karger, PO Box, CH-4009 Basel (Switzerland). ISBN 3-8055-3668-2

which you will find in many academic libraries.

David Duffy.

* Don't you use Pap or JPap at Myriad?
[R] chisq.test
Hi

I am trying to use this function. Can anyone show me how I would input the following example?

Chi-Squared = (40-30)^2/30 + (20-30)^2/30 + (30-30)^2/30
            = 3.333 + 3.333 + 0
            = 6.666 (p value = 0.036)

I want to be able to use different denominators, so can you show me how I can do it to accommodate these rather than assuming they are all the same.

Thanks.

Stephen
[R] Virus Alert
The mail sent to [EMAIL PROTECTED] on Monday 08 August contains a virus.
Re: [R] extract t-values from pairwise.t.test
Hallo

My output lists more than p-values

    > ttt <- t.test(rnorm(10), rnorm(10), paired=T)
    > ttt

            Paired t-test

    data:  rnorm(10) and rnorm(10)
    t = 1.7508, df = 9, p-value = 0.1139
    alternative hypothesis: true difference in means is not equal to 0
    95 percent confidence interval:
     -0.1750263  1.3735176
    sample estimates:
    mean of the differences
                  0.5992456

    > str(ttt)
    List of 9
     $ statistic  : Named num 1.75
      ..- attr(*, "names")= chr "t"
     $ parameter  : Named num 9
      ..- attr(*, "names")= chr "df"
     $ p.value    : num 0.114
     $ conf.int   : atomic [1:2] -0.175 1.374
      ..- attr(*, "conf.level")= num 0.95
     $ estimate   : Named num 0.599
      ..- attr(*, "names")= chr "mean of the differences"
     $ null.value : Named num 0
      ..- attr(*, "names")= chr "difference in means"
     $ alternative: chr "two.sided"
     $ method     : chr "Paired t-test"
     $ data.name  : chr "rnorm(10) and rnorm(10)"
     - attr(*, "class")= chr "htest"

    > ttt$statistic
           t
    1.750790

The output is a list, and you can access any part of it by its name or with [ ] brackets.

HTH
Petr

On 8 Aug 2005 at 16:26, Guido Parra Vergara wrote:
> Hi, how can I extract the t-values after running a pairwise.t.test? The output just list the p-values. Many thanks for your help.
>
> Cheers
> Guido

Petr Pikal
[EMAIL PROTECTED]
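To illustrate the point that the result of t.test() is an ordinary list, here is a minimal sketch that pulls the statistic and degrees of freedom out of the htest object and recomputes the two-sided p-value from them. The object names (tt, t_value, df_value, p_manual) and the seed are assumptions made for this illustration only.

```r
# Illustrative data; the seed is arbitrary
set.seed(123)
tt <- t.test(rnorm(10), rnorm(10), paired = TRUE)

# Components can be extracted by name; unname() drops the "t" / "df" labels
t_value  <- unname(tt$statistic)
df_value <- unname(tt$parameter)

# The two-sided p-value follows from the t statistic and its degrees of freedom
p_manual <- 2 * pt(-abs(t_value), df_value)
```

The same `$` extraction applies to any other component listed by str(), such as tt$conf.int or tt$estimate.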
Re: [R] extract t-values from pairwise.t.test
I think the questioner was interested in pairwise.t.test. See ?pairwise.t.test.

From looking at the source, pairwise.t.test calls t.test if the SDs are not pooled, or calculates its own t.val if the SDs are pooled. It looks very easy to hack to return the t values instead of the p values.

Simon.

At 04:46 PM 8/08/2005, Petr Pikal wrote:
> Hallo
>
> My output lists more than p-values
> [t.test() example and str() output trimmed]
>
> The output is list and you can call any part of it by its name or by [] braces.
>
> HTH
> Petr
Re: [R] extract t-values from pairwise.t.test
Guido Parra Vergara [EMAIL PROTECTED] writes:

> Hi, how can I extract the t-values after running a pairwise.t.test? The output just list the p-values. Many thanks for your help.

It's not a very complicated function. Why not just modify it to your needs?

-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - ([EMAIL PROTECTED])                         FAX: (+45) 35327907
Re: [R] chisq.test
Stephen Choularton [EMAIL PROTECTED] writes:

> Hi I am trying to use this function. Can anyone show me how I would input the following example?
>
> Chi-Squared = (40-30)^2/30 + (20-30)^2/30 + (30-30)^2/30
>             = 3.333 + 3.333 + 0 = 6.666 (p value = 0.036)

    > chisq.test(c(40,30,20))

            Chi-squared test for given probabilities

    data:  c(40, 30, 20)
    X-squared = 6.6667, df = 2, p-value = 0.03567

> I want to be able to use different denominators so can you show me how I can do it to accommodate these rather than assuming they are all the same.

No. You want to test different *hypotheses* about the distribution on the three groups. E.g. for 2:1:1 split:

    > chisq.test(c(40,30,20),p=c(.5,.25,.25))

            Chi-squared test for given probabilities

    data:  c(40, 30, 20)
    X-squared = 3.3333, df = 2, p-value = 0.1889

-- 
Peter Dalgaard
Dept. of Biostatistics, University of Copenhagen
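The link between the "different denominators" in the question and the p argument in the answer can be made explicit: the denominators are the expected counts, which are the total count times the hypothesised proportions. The following is a minimal sketch under that reading; the object names (obs, p, expected, stat, pval, res) are assumptions for illustration.

```r
obs <- c(40, 30, 20)
p   <- c(0.5, 0.25, 0.25)        # hypothesised 2:1:1 split

# Expected counts under the hypothesis: these are the "denominators"
expected <- sum(obs) * p          # 45, 22.5, 22.5

# Pearson chi-squared statistic and its p-value, computed by hand
stat <- sum((obs - expected)^2 / expected)
pval <- pchisq(stat, df = length(obs) - 1, lower.tail = FALSE)

# The same test via chisq.test()
res <- chisq.test(obs, p = p)
```

If only expected counts are at hand rather than proportions, they can be passed as p together with rescale.p = TRUE, which rescales them to sum to 1.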
[R] Searchable Mailing List Archives Down
Hi,

I was a user of the searchable mail archives which you have linked from somewhere on your homepage:

http://maths.newcastle.edu.au/~rking/R/

This link is down. I know there is a search at GMANE and MARC, but the results were not that nice. Is it possible to recreate this searchable archive somewhere? I think it used Google.

Thank you very much
Ido Tamir
Re: [R] extract t-values from pairwise.t.test
Hi,

I am not familiar with changing R functions. I can see in the code that the t-values get calculated as t.val; however, when I modified the code to include t.val under ans and then ran the modified function, I get "Object t.val not found". How do I properly modify the function to list t.val in the output?

Thanks
Guido

At 05:20 PM 8/08/2005, Peter Dalgaard wrote:
> Guido Parra Vergara [EMAIL PROTECTED] writes:
>
> > Hi, how can I extract the t-values after running a pairwise.t.test? The output just list the p-values. Many thanks for your help.
>
> It's not a very complicated function. Why not just modify it to your needs?
[R] nested logit with latent classes
Hi Everybody,

I am interested in estimating a nested logit with latent classes at the lower level. I have seen code for conditional logit and latent class analysis, but I haven't found anything about nested logit, let alone nested logit with latent classes. Could you help me with appropriate code?

Thanks for your help

Best regards,
Agnieszka Prokopowicz

M.Sc. Agnieszka Prokopowicz
Department of Electronic Commerce
Goethe-University
Mertonstrasse 17
60054 Frankfurt/Main
Germany
Phone: ++49 69 798 28862
Fax: ++49 69 798 28973
Email: [EMAIL PROTECTED]
[R] installing problems about randomForest
Hi all,

When I tried to install package randomForest, it gave out the following error message:

    > install.packages("randomForest", dependencies = TRUE)
    trying URL 'http://www.lmbe.seu.edu.cn/CRAN/src/contrib/randomForest_4.5-12.tar.gz'
    Content type 'application/x-gzip' length 82217 bytes
    opened URL
    ==
    downloaded 80Kb

    Cannot create directory : No such file or directory
    * Installing *source* package 'randomForest' ...
    ** libs
    gcc -I/user_data2/jfxiao/local/lib/R/include -I/usr/freeware/include -g -O2 -c classTree.c -o classTree.o
    gcc -I/user_data2/jfxiao/local/lib/R/include -I/usr/freeware/include -g -O2 -c regTree.c -o regTree.o
    gcc -I/user_data2/jfxiao/local/lib/R/include -I/usr/freeware/include -g -O2 -c regrf.c -o regrf.o
    gcc -I/user_data2/jfxiao/local/lib/R/include -I/usr/freeware/include -g -O2 -c rf.c -o rf.o
    f77 -OPT:IEEE_NaN_inf=ON-O2 -c rfsub.f -o rfsub.o
    rfsub.f, line 90: error(2346): expression must have logical or integer type
          if (decsplit < 0.0) decsplit = 0.0
          ^
    rfsub.f, line 90: error(2051): expected a )
          if (decsplit < 0.0) decsplit = 0.0
          ^
    2 errors detected in the compilation of rfsub.f.
    gmake: *** [rfsub.o] Error 2
    ERROR: compilation failed for package 'randomForest'

Can somebody help me?

Thanks in advance.

Xiao Jianfeng
Re: [R] installing problems about randomForest
Xiao Jianfeng wrote:
> Hi all,
>
> When I tried to install package randomForest, it gave out the following error message:
>
> > install.packages("randomForest", dependencies = TRUE)
> [...]
> f77 -OPT:IEEE_NaN_inf=ON-O2 -c rfsub.f -o rfsub.o
> rfsub.f, line 90: error(2346): expression must have logical or integer type
>       if (decsplit < 0.0) decsplit = 0.0
> rfsub.f, line 90: error(2051): expected a )
>       if (decsplit < 0.0) decsplit = 0.0
> 2 errors detected in the compilation of rfsub.f.
> gmake: *** [rfsub.o] Error 2
> ERROR: compilation failed for package 'randomForest'
>
> Can somebody help me?

Which OS/platform and compiler are we talking about?

Uwe Ligges

> Thanks in advance.
>
> Xiao Jianfeng
Re: [R] extract t-values from pairwise.t.test
Hallo

I am not sure, but this could be what you want. You have to change the function compare.levels, not only add t.val in ans. If you want t-values AND p-values together in one table, it is probably not so simple.

    my.pairded.t.test <- function (x, g, p.adjust.method = p.adjust.methods,
                                   pool.sd = TRUE, ...)
    {
        # NB: call with p.adjust.method = "none"; pairwise.table() below
        # applies the p-value adjustment to whatever compare.levels() returns
        DNAME <- paste(deparse(substitute(x)), "and", deparse(substitute(g)))
        g <- factor(g)
        p.adjust.method <- match.arg(p.adjust.method)
        if (pool.sd) {
            METHOD <- "t tests with pooled SD"
            xbar <- tapply(x, g, mean, na.rm = TRUE)
            s <- tapply(x, g, sd, na.rm = TRUE)
            n <- tapply(!is.na(x), g, sum)
            degf <- n - 1
            total.degf <- sum(degf)
            pooled.sd <- sqrt(sum(s^2 * degf)/total.degf)
            compare.levels <- function(i, j) {
                dif <- xbar[i] - xbar[j]
                se.dif <- pooled.sd * sqrt(1/n[i] + 1/n[j])
                t.val <- dif/se.dif
                # 2 * pt(-abs(t.val), total.degf)    this is commented out
                t.val                                # this is added
            }
        }
        else {
            METHOD <- "t tests with non-pooled SD"
            compare.levels <- function(i, j) {
                xi <- x[as.integer(g) == i]
                xj <- x[as.integer(g) == j]
                t.test(xi, xj, ...)$statistic   # this is changed in case pool.sd=F
            }
        }
        # the element is still called p.value, but it now holds the t values
        PVAL <- pairwise.table(compare.levels, levels(g), p.adjust.method)
        ans <- list(method = METHOD, data.name = DNAME, p.value = PVAL,
                    p.adjust.method = p.adjust.method)
        class(ans) <- "pairwise.htest"
        ans
    }

HTH
Petr

On 8 Aug 2005 at 18:28, Guido Parra Vergara wrote:
> Hi, I am not familiar with changing R functions. I can see in the code that t-values get calculated as t.val, however when I modified the code to include t.val under ans and then run the modified function I get Object t.val not found. How do I properly modify the function to list t.val in the output?
>
> Thanks
> Guido
>
> At 05:20 PM 8/08/2005, Peter Dalgaard wrote:
> > It's not a very complicated function. Why not just modify it to your needs?

Petr Pikal
[EMAIL PROTECTED]
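As an alternative to editing a copy of pairwise.t.test, the pooled-SD t values can also be computed directly, which sidesteps the p-value adjustment that pairwise.table() applies to whatever it is given. The following is a minimal sketch that follows the same formulas as the pooled branch of the function above; the function name pairwise_t_values and the simulated data are assumptions for illustration, not part of the stats package.

```r
# Hypothetical helper: lower-triangular matrix of pooled-SD t values,
# laid out like the p-value table printed by pairwise.t.test()
pairwise_t_values <- function(x, g) {
  g    <- factor(g)
  xbar <- tapply(x, g, mean, na.rm = TRUE)
  s    <- tapply(x, g, sd, na.rm = TRUE)
  n    <- tapply(!is.na(x), g, sum)
  degf <- n - 1
  pooled.sd <- sqrt(sum(s^2 * degf) / sum(degf))
  lev <- levels(g)
  k   <- length(lev)
  tmat <- matrix(NA_real_, k - 1, k - 1,
                 dimnames = list(lev[-1], lev[-k]))
  for (i in 2:k) {
    for (j in 1:(i - 1)) {
      se.dif <- pooled.sd * sqrt(1 / n[[i]] + 1 / n[[j]])
      tmat[i - 1, j] <- (xbar[[i]] - xbar[[j]]) / se.dif
    }
  }
  list(t = tmat, df = sum(degf))
}

# Simulated example: three groups of ten observations
set.seed(1)
g <- rep(c("a", "b", "c"), each = 10)
x <- rnorm(30) + rep(0:2, each = 10)
res <- pairwise_t_values(x, g)
```

Unadjusted two-sided p-values can be recovered from the result with 2 * pt(-abs(res$t), res$df), which gives a way to check the t values against pairwise.t.test(x, g, p.adjust.method = "none").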
Re: [R] Searchable Mailing List Archives Down
The site by Robert King is now working again. I admit that I had problems accessing it too, a few times last week.

By adding the phrase "site:https://stat.ethz.ch/pipermail/r-help/" to my Google search, I got something that resembled that output, but it did specify from which archive and month the results came.

You might also find the search engine by Jonathan Baron (http://finzi.psych.upenn.edu/nmz.html) useful; it searches the documentation and functions as well as the mail archives.

Regards, Adai

On Mon, 2005-08-08 at 10:11 +0200, Ido M. Tamir wrote:
> Hi, I was a user of the searchable Mail Archives which you have linked from somewhere on your homepage. http://maths.newcastle.edu.au/~rking/R/ This link is out of order. I know there is a search at GMANE and MARC, but the results were not that nice. Is it possible to recreate this searchable archive somewhere? I think it used google.
>
> Thank you very much
> Ido Tamir
Re: [R] installing problems about randomForest
On Mon, 8 Aug 2005, Uwe Ligges wrote:

> Xiao Jianfeng wrote:
> > When I tried to install package randomForest, it gave out the following error message:
> >
> > > install.packages("randomForest", dependencies = TRUE)
> > [...]
> > Cannot create directory : No such file or directory

I have no idea what that is about.

> > * Installing *source* package 'randomForest' ...
> > [...]
> > f77 -OPT:IEEE_NaN_inf=ON-O2 -c rfsub.f -o rfsub.o
> > rfsub.f, line 90: error(2346): expression must have logical or integer type
> >       if (decsplit < 0.0) decsplit = 0.0
> > rfsub.f, line 90: error(2051): expected a )
> >       if (decsplit < 0.0) decsplit = 0.0
> > 2 errors detected in the compilation of rfsub.f.
> > gmake: *** [rfsub.o] Error 2
> > ERROR: compilation failed for package 'randomForest'
> >
> > Can somebody help me?
>
> Which OS/platform and compiler are we talking about?

From his multitudinous recent postings, a peculiar IRIX setup. However, that line is not valid Fortran: replace < by .LT.

Please note the posting guide asks you to discuss problems in packages with the maintainer (and to state your platform and R version).

-- 
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
[R] help on regression by subsets in dataset
Hi Everyone

May I ask for a little help with regression analysis? I would like to know whether it is possible to run a regression separately for different subsets of the data (in the same data file), defined by a grouping variable. The alternative is to run the regression n times by hand, which, as you all know, is quite cumbersome.

Thank you for your kind attention and help

rgds
krishna
[R] R: cbind
hi all

are we able to combine column vectors of different lengths such that the result appears in matrix form?

e.g.

    a=1
    b=1:3
    d=1:4

then

    z=CBIND(a,b,d)

    1 1 1
      2 2
      3 3
        4

i still want the following!

    z[,1]=1
    z[,2]=1:3
    z[,3]=1:4

i made up the name of this function. we could use cbind but it does not seem to allow this!

thanking you in advance.

/ allan
[R] R: matrix sizes
hi all

assume that one is doing a simulation. in each iteration one produces a vector of results. this vector's length might change from one iteration to the next.

how can one construct a matrix that contains all of the iteration results, where each column is the output from a different iteration?

how would one have to define the output matrix initially?

/ thanking you in advance
Re: [R] R: matrix sizes
Clark Allan wrote:
> hi all
>
> assume that one is doing a simulation. in each iteration one produces a vector of results. this vectors length might change for each different iteration.
>
> how can one construct a matrix that contains all of the interation results in a matrix where each of the columns are the outputs from the different interations.
>
> how would have to define the output matrix initally?

Of course, you define it to the maximal_length x number_iterations, but in fact you probably want a list rather than a matrix.

Uwe Ligges

> / thanking you in advance
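The list suggestion can be sketched as follows: collect each iteration's result in a list element, and only pad to a common length at the end if a matrix is really needed. The iteration count and the per-iteration lengths below are made-up values for illustration.

```r
set.seed(1)
n_iter  <- 5
lens    <- c(1, 3, 4, 2, 3)            # assumed result length per iteration
results <- vector("list", n_iter)      # pre-allocated list, one slot per iteration

for (i in seq_len(n_iter)) {
  results[[i]] <- rnorm(lens[i])       # stand-in for the real simulation output
}

# Optional: convert to a maximal_length x number_iterations matrix, NA-padded
max_len <- max(lengths(results))
res_mat <- vapply(results, function(v) {
  length(v) <- max_len                 # extending a vector fills with NA
  v
}, numeric(max_len))
```

Keeping the list avoids having to know the maximal length in advance; the padded matrix has one column per iteration.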
Re: [R] R: cbind
Clark Allan wrote:
> hi all
>
> are we able to combine column vectors of different lengths such that the result appears in matrix form?
>
> e.g. a=1 b=1:3 d=1:4, then z=CBIND(a,b,d)
> [...]
> i made up the name of this function. we could use cbind but it does not seem to allows this!

See my other message: You probably want a list. If not, the sparse matrix classes provided by package Matrix might be worth considering.

Uwe Ligges

> thanking you in advance.
>
> / allan
Re: [R] R: cbind
Clark Allan wrote:
> hi all
>
> are we able to combine column vectors of different lengths such that the result appears in matrix form?
>
> e.g. a=1 b=1:3 d=1:4, then z=CBIND(a,b,d)
> [...]
> i made up the name of this function. we could use cbind but it does not seem to allows this!
>
> thanking you in advance.
>
> / allan

Hi, Allan,

How about the following:

    cbind.all <- function(..., fill.with = NA) {
      args <- list(...)
      len <- sapply(args, NROW)
      if(diff(rng <- range(len)) > 0) {
        maxlen <- rng[2]
        pad <- function(x, n) c(x, rep(fill.with, n))
        for(j in seq(along = args)) {
          if(maxlen == len[j]) next
          if(is.data.frame(args[[j]])) {
            args[[j]] <- lapply(args[[j]], pad, maxlen - len[j])
            args[[j]] <- as.data.frame(args[[j]])
          } else if(is.matrix(args[[j]])) {
            args[[j]] <- apply(args[[j]], 2, pad, maxlen - len[j])
          } else if(is.vector(args[[j]])) {
            args[[j]] <- pad(args[[j]], maxlen - len[j])
          } else {
            stop("... must only contain data.frames or arrays.")
          }
        }
      }
      do.call("cbind", args)
    }

    cbind.all(data.frame(a=1), data.frame(a=c(2,1)), x = 1, y = matrix(1:4,2,2))
    cbind.all(a = 1, b = 1:3, d = 1:4)

HTH,

--sundar
[R] Groups in histograms?
Dear list,

I would like to create histograms for up to three groups, with distinctive colour/pattern, in a trellis panel. However, I have not been able to find a way to do this. histogram does not seem to have a group argument.

Please help.

/Fredrik
Re: [R] R: cbind
On 8/8/05, Clark Allan [EMAIL PROTECTED] wrote:
> hi all
>
> are we able to combine column vectors of different lengths such that the result appears in matrix form?
>
> e.g. a=1 b=1:3 d=1:4, then z=CBIND(a,b,d)
> [...]
> i made up the name of this function. we could use cbind but it does not seem to allows this!

There are a number of alternatives:

    # 1. just create a list
    x1 <- list(a = 1, b = 1:3, c = 1:4)

    # 2. create a ts object:
    x2 <- do.call(cbind, lapply(x1, ts))

    # 3. create a matrix from the ts object
    x3 <- unclass(do.call(cbind, lapply(d, ts)))
    tsp(x3) <- NULL
Re: [R] R: cbind
On 8/8/05, Gabor Grothendieck [EMAIL PROTECTED] wrote:
> There are a number of alternatives:
>
>     # 1. just create a list
>     x1 <- list(a = 1, b = 1:3, c = 1:4)
>
>     # 2. create a ts object:
>     x2 <- do.call(cbind, lapply(x1, ts))
>
>     # 3. create a matrix from the ts object
>     x3 <- unclass(do.call(cbind, lapply(d, ts)))
>     tsp(x3) <- NULL

That last one should have been:

    x3 <- unclass(x2)
    tsp(x3) <- NULL
Re: [R] help on regression by subsets in dataset
you could use function lmList() from the nlme package, i.e.,

    dat <- data.frame(y = rnorm(120), x = runif(120, -3, 3), g = rep(1:3, each = 40))

    library(nlme)
    m <- lmList(y ~ x | g, data = dat)
    m
    summary(m)

I hope it helps.

Best,
Dimitris

Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/16/336899
Fax: +32/16/337015
Web: http://www.med.kuleuven.be/biostat/
     http://www.student.kuleuven.be/~m0390867/dimitris.htm

----- Original Message -----
From: Krishna [EMAIL PROTECTED]
To: r-help@stat.math.ethz.ch
Sent: Monday, August 08, 2005 12:20 PM
Subject: [R] help on regression by subsets in dataset

> Hi Everyone May I request for a small help while performing the regression analysis. I would like to know is there any possibility of conducting the regression for different data subsets (in the same data file), classified on the basis of grouping variable. The alternative for this is running the regression for n number of times which you all know is quite cumbersome.
>
> Thank you for your kind attention and help
> rgds krishna
Re: [R] help on regression by subsets in dataset
On 8/8/05, Krishna [EMAIL PROTECTED] wrote:
> Hi Everyone May I request for a small help while performing the regression analysis. I would like to know is there any possibility of conducting the regression for different data subsets (in the same data file), classified on the basis of grouping variable. The alternative for this is running the regression for n number of times which you all know is quite cumbersome.

This defines a model which has a separate intercept and slope for each value of the grouping variable g:

    # sample data
    > x <- 1:12
    > g <- gl(4,3)
    > g
     [1] 1 1 1 2 2 2 3 3 3 4 4 4
    Levels: 1 2 3 4
    > set.seed(1)
    > y <- rnorm(12)

    # now define the model and run the regression
    > lm(y ~ g/x - 1)

    Call:
    lm(formula = y ~ g/x - 1)

    Coefficients:
          g1        g2        g3        g4      g1:x      g2:x      g3:x      g4:x
    -0.21697   6.40748   0.24710  -3.29170  -0.10459  -1.20787   0.04418   0.34762
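The nested formula above gives the same intercepts and slopes as fitting a separate regression to each subset. A minimal sketch of that equivalence, using split() and lapply() on the same sample data, is shown below; the names dat, fits, per_group and nested are assumptions for illustration.

```r
# Same sample data as in the message above, collected in a data frame
set.seed(1)
dat <- data.frame(x = 1:12, g = gl(4, 3), y = rnorm(12))

# One lm() fit per level of the grouping variable
fits <- lapply(split(dat, dat$g), function(d) lm(y ~ x, data = d))

# Rows: groups; columns: "(Intercept)" and "x" (the slope)
per_group <- t(sapply(fits, coef))

# Single model with a separate intercept and slope per group
nested <- coef(lm(y ~ g/x - 1, data = dat))
```

The per-group route keeps a full lm object for each subset, so summary(fits[["2"]]) gives standard errors based on that group's own residual variance, whereas the single nested model pools the residual variance across groups.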
Re: [R] Groups in histograms?
Hi Gary,

I have found this, but it is not exactly what I am looking for. What I need is the groups to be inside a single panel, not in different panels. Kind of like a histogram version of the

    xyplot(Y ~ X1, groups=X2, panel=panel.superpose)

command. (I hope this is correct.)

/Fredrik

On 8/8/05, Gary Collins [EMAIL PROTECTED] wrote:
> Have a look at the histogram function in the Lattice package. if x are your data to be displayed and y is your grouping variable you can just do
>
>     histogram(~x|y)
>
> HTH
> Gary
>
> On 08/08/05, Fredrik Karlsson [EMAIL PROTECTED] wrote:
> > Dear list, I would like to create histograms for up to three groups, with distincive colour/pattern, in a trellis panel. However, I have not been able to find a way to do this. histogram does not seem to have a group argument. Please help.
> >
> > /Fredrik

-- 
My Gentoo + PVR-350 + IVTV + MythTV blog is on http://gentoomythtv.blogspot.com/
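One possible way to get several groups into a single panel, sketched here with base graphics rather than lattice, is to bin every group on a common set of breaks and draw the counts as side-by-side bars. The helper name grouped_hist_counts, the default of 10 bins and the simulated data are assumptions for illustration; the sketch also assumes x contains no missing values.

```r
# Hypothetical helper: per-group bin counts on shared breaks
# (rows = groups, columns = bins)
grouped_hist_counts <- function(x, g, nbins = 10) {
  g <- factor(g)
  breaks <- seq(min(x), max(x), length.out = nbins + 1)
  counts <- sapply(split(x, g), function(v)
    hist(v, breaks = breaks, plot = FALSE)$counts)
  t(counts)
}

# Simulated example: three groups of 30 observations
set.seed(42)
x <- rnorm(90)
g <- rep(c("a", "b", "c"), each = 30)
counts <- grouped_hist_counts(x, g)
```

Passing the result to barplot(counts, beside = TRUE, legend.text = TRUE) draws one cluster of bars per bin with one colour per group. Within lattice itself, densityplot(~ x, groups = g) accepts a groups argument and overlays one density curve per group in a single panel, which may be an acceptable substitute for overlaid histograms.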
Re: [R] installing problems about randomForest
Prof Brian Ripley wrote:

> On Mon, 8 Aug 2005, Uwe Ligges wrote:
> > Xiao Jianfeng wrote:
> > > When I tried to install package randomForest, it gave out the following error message:
> > >
> > > > install.packages("randomForest", dependencies = TRUE)
> > > [...]
> > > Cannot create directory : No such file or directory
>
> I have no idea what that is about.

In my .cshrc, I set R_LIBS like this: ' setenv R_LIBS=$HOME/local/lib/R/library ', is it OK?

> > > * Installing *source* package 'randomForest' ...
> > > [...]
> > > f77 -OPT:IEEE_NaN_inf=ON-O2 -c rfsub.f -o rfsub.o
> > > rfsub.f, line 90: error(2346): expression must have logical or integer type
> > >       if (decsplit < 0.0) decsplit = 0.0
> > > rfsub.f, line 90: error(2051): expected a )
> > >       if (decsplit < 0.0) decsplit = 0.0
> > > 2 errors detected in the compilation of rfsub.f.
> > > gmake: *** [rfsub.o] Error 2
> > > ERROR: compilation failed for package 'randomForest'
> >
> > Which OS/platform and compiler are we talking about?

SGI IRIX 6.5, gcc 3.3, and the f77 shipped with IRIX.

> From his multitudinous recent postings, a peculiar IRIX setup. However, that line is not valid Fortran: replace < by .LT.
>
> Please note the posting guide asks you to discuss problems in packages with the maintainer (and to state your platform and R version).

Thanks, I will try to contact the maintainer.
[R] selecting outliers
Hi everybody, I'd like to know if there's an easy way to extract outlier records from a dataset, in order to perform further analysis on them. Thanks Alessandro
Re: [R] linkage disequilibrium
Date: Thu, 4 Aug 2005 19:36:35 +0200 From: Cristian [EMAIL PROTECTED] Subject: [R] linkage disequilibrium To: r-help@stat.math.ethz.ch Message-ID: [EMAIL PROTECTED] Content-Type: text/plain; charset=ISO-8859-1 I'm using the package Genetics, and I'm interested in the computation of D' statistics for Linkage Disequilibrium, for which the LD() command has been realised. Unfortunately I don't find any reference on how the D' is computed by the LD() function. In the package documentation it is generally referred as MLE estimation, but references are not provided. Does anybody knows how it is obtained or, at least, some references? Are there any other R package performing the D' computation both for phased and unphased genotype? Thanks! Cristian You need to look at the code: getAnywhere(LD.genotype) See any standard reference such as Bruce Weir's _Genetic Data Analysis_ (Sinauer Associates) or Pak Sham's book on statistical genetics for the background to the algorithm. The chi-square testing D=0 from LD() is twice what it should be, and you may be confused (I know I was) by the fact that the marginal allele frequencies are estimated using non-missing data for each locus in turn. This means the bounds (pmin and pmax) for the AB haplotype frequency are different from that in the actual table used to maximize the likelihood. So, you will get different answers from programs using jointly complete observations only. Several other packages for haplotype analysis are on CRAN. Package haplo.stats has the haplo.em() function to give the MLEs for the haplotype frequencies. From these you can easily calculate D etc. Package hwde estimates nonstandard disequilibrium coefficients in a loglinear framework, and can be used to compare different sample disequilibria. Note that haplo.stats and hapassoc are aimed specifically at comparing groups or testing for association to other traits. 
My package gllm is not as easy to use but can combine phased and unphased data in loglinear models -- you could probably use cat in the same way. David Duffy. | David Duffy (MBBS PhD) ,-_|\ | email: [EMAIL PROTECTED] ph: INT+61+7+3362-0217 fax: -0101 / * | Epidemiology Unit, Queensland Institute of Medical Research \_,-._/ | 300 Herston Rd, Brisbane, Queensland 4029, Australia GPG 4D0B994A v __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
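The remark above that D can be calculated easily from haplotype frequency estimates can be sketched as follows. This is only an illustration using the textbook definition of D and D'; the function name ld_dprime is hypothetical and is not part of the genetics or haplo.stats packages, and the input frequencies are made up rather than produced by haplo.em().

```r
# Hypothetical helper: D and D' for two biallelic loci, given the estimated
# frequency of the AB haplotype (pAB) and the allele frequencies pA and pB.
ld_dprime <- function(pAB, pA, pB) {
  D <- pAB - pA * pB
  # The bound on D depends on its sign (standard definition).
  Dmax <- if (D >= 0) {
    min(pA * (1 - pB), (1 - pA) * pB)
  } else {
    min(pA * pB, (1 - pA) * (1 - pB))
  }
  c(D = D, Dprime = if (Dmax > 0) D / Dmax else NA_real_)
}

# Made-up frequencies, for illustration only.
res <- ld_dprime(pAB = 0.3, pA = 0.5, pB = 0.5)
res
```

Note that the result depends on which data are used to estimate pA and pB, which is exactly the point made above about per-locus versus jointly complete observations.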
Re: [R] selecting outliers
Hi Alessandro, On Mon, 8 Aug 2005, alessandro carletti wrote: Hi everybody, I'd like to know if there's an easy way for extracting outliers record from a dataset, in order to perform further analysis on them. The answer is no. The reasons are not technical. There are some quite easy outlier detection approaches around (e.g., compute robust Mahalanobis distances with cov.mcd/mahalanobis and call the points with too large distances outliers). But the main problem is that the term outlier has no objective, unique meaning. It depends crucially on your aims and on the assumptions you want to make about the non-outliers in the dataset (which should be elliptically distributed and homogeneously close to a multivariate normal distribution for the Mahalanobis approach). Best, Christian *** NEW ADDRESS! *** Christian Hennig University College London, Department of Statistical Science Gower St., London WC1E 6BT, phone +44 207 679 1698 [EMAIL PROTECTED], www.homepages.ucl.ac.uk/~ucakche __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
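A minimal sketch of the robust Mahalanobis approach mentioned above, assuming the data are numeric and roughly elliptical apart from the outliers. The simulated data, the planted outliers and the 97.5% chi-square cut-off are all assumptions made for illustration; the cut-off in particular is a convention, not an objective definition of "outlier".

```r
library(MASS)  # recommended package shipped with R; provides cov.mcd()

set.seed(1)
x <- rbind(matrix(rnorm(200), ncol = 2),            # 100 regular points
           matrix(rnorm(10, mean = 10), ncol = 2))  # 5 planted outliers

rob <- cov.mcd(x)                            # robust centre and covariance
d2  <- mahalanobis(x, rob$center, rob$cov)   # squared robust distances
cutoff <- qchisq(0.975, df = ncol(x))        # assumed cut-off (a convention)

flagged  <- which(d2 > cutoff)               # row numbers of suspicious records
outliers <- x[flagged, , drop = FALSE]       # extracted for further analysis
```

Whether the flagged rows are outliers in any meaningful sense still depends on the aims of the analysis, as the reply stresses.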
Re: [R] selecting outliers
Perhaps what Alessandro is after is simpler than that: making a plot of data in a data frame, being able to click on 'suspicious points', and getting the corresponding rows of the data out in a new data frame (for further inspection) while keeping the 'good points' in the plot (and perhaps redoing some calculations on the basis of the good points only). This could then go on in an iterative way. That would be a perfectly sensible thing to do. How difficult it is technically I don't know, but it seems that it would require a call-back mechanism from a plot window to R (and a more 'advanced' one than provided by 'locator()'). Best regards Søren
-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On behalf of Christian Hennig Sent: 8 August 2005 14:45 To: alessandro carletti Cc: rHELP Subject: Re: [R] selecting outliers
Hi Alessandro, On Mon, 8 Aug 2005, alessandro carletti wrote: Hi everybody, I'd like to know if there's an easy way for extracting outliers record from a dataset, in order to perform further analysis on them. The answer is no. The reasons are not technical. There are some quite easy outlier detection approaches around (e.g., compute robust Mahalanobis distances with cov.mcd/mahalanobis and call the points with too large distances outliers). But the main problem is that the term outlier has no objective, unique meaning. It depends crucially on your aims and on the assumptions you want to make about the non-outliers in the dataset (which should be elliptically distributed and homogeneously close to a multivariate normal distribution for the Mahalanobis approach). Best, Christian *** NEW ADDRESS! *** Christian Hennig University College London, Department of Statistical Science Gower St., London WC1E 6BT, phone +44 207 679 1698 [EMAIL PROTECTED], www.homepages.ucl.ac.uk/~ucakche
Re: [R] selecting outliers
Hi, if Søren is right, why not take a look at the identify() help page? Christian
On Mon, 8 Aug 2005, Søren Højsgaard wrote: Perhaps what Alessandro is after is simpler than that: Making a plot of data in a data frame, being able to click on 'suspicious points', getting the corresponding rows of a data out in a new data frame (for further inspection) while keeping the 'good points' in the plot (and perhaps redoing some calculations on the basis of the good points only). This could then go on in an iterative way. That would be a perfectly sensible thing to do. How difficult it is technically I don't know, but it seems that it would require a call-back mechanism from a plot window to R (and a more 'advanced' one than provided by 'locator()'). Best regards Søren
*** NEW ADDRESS! *** Christian Hennig University College London, Department of Statistical Science Gower St., London WC1E 6BT, phone +44 207 679 1698 [EMAIL PROTECTED], www.homepages.ucl.ac.uk/~ucakche
[R] coefficient of polynomial expansion
Hi, I would like to get the coefficient of polynomial expansion. For example, (1+ x)^2 = 1 + 2x + x^2, and the coefficients are 1, 2 and 1. (1 + x + x^2)^3 = 1 + 3*x + 6*x^2 + 7*x^3 + 6*x^4 + 3*x^5 + x^6, and the coefficients are 1, 3, 6, 7, 6, 3, and 1. I know that we can use polynom library. Is there any other way to do it without loading a library. Thanks a lot for your help. Peter __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] installing problems about randomForest
From: Prof Brian Ripley On Mon, 8 Aug 2005, Uwe Ligges wrote: Xiao Jianfeng wrote: Hi all, When I tried to install package randomForest, it gave the following error message:
install.packages("randomForest", dependencies = TRUE)
trying URL 'http://www.lmbe.seu.edu.cn/CRAN/src/contrib/randomForest_4.5-12.tar.gz'
Content type 'application/x-gzip' length 82217 bytes
opened URL
==
downloaded 80Kb
Cannot create directory : No such file or directory
I have no idea what that is about.
* Installing *source* package 'randomForest' ...
** libs
gcc -I/user_data2/jfxiao/local/lib/R/include -I/usr/freeware/include -g -O2 -c classTree.c -o classTree.o
gcc -I/user_data2/jfxiao/local/lib/R/include -I/usr/freeware/include -g -O2 -c regTree.c -o regTree.o
gcc -I/user_data2/jfxiao/local/lib/R/include -I/usr/freeware/include -g -O2 -c regrf.c -o regrf.o
gcc -I/user_data2/jfxiao/local/lib/R/include -I/usr/freeware/include -g -O2 -c rf.c -o rf.o
f77 -OPT:IEEE_NaN_inf=ON-O2 -c rfsub.f -o rfsub.o
rfsub.f, line 90: error(2346): expression must have logical or integer type
if (decsplit < 0.0) decsplit = 0.0
^
rfsub.f, line 90: error(2051): expected a )
if (decsplit < 0.0) decsplit = 0.0
^
2 errors detected in the compilation of rfsub.f.
gmake: *** [rfsub.o] Error 2
ERROR: compilation failed for package 'randomForest'
Can somebody help me?
Which OS/platform and compiler are we talking about?
From his multitudinous recent postings, a peculiar IRIX setup. However, that line is not valid Fortran: replace < by .LT. Please note the posting guide asks you to discuss problems in packages with the maintainer (and to state your platform and R version).
Thanks for catching that! It must have slipped when I was going back and forth between C and Fortran... Update should appear on CRAN this week, I hope. Best, Andy
-- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595
[R] xyplot with 2 y-axes
Dear R-helpers, I'm trying to get xyplot to plot 2 y-axes. I looked at the examples and googled around. This is how far I have got.
test <- data.frame(a=rnorm(100),b=rnorm(100)*10,ind=rep(1:10,10),con=rep(1:10,rep(10,10)))
xyplot(a+b~con|ind,data=test,allow.multiple=T)
This however puts a+b on one axis. I'd need e.g. a as left and b as right axis. Best wishes, Nathan
-- $platform [1] powerpc-apple-darwin7.9.0 $arch [1] powerpc $os [1] darwin7.9.0 $system [1] powerpc, darwin7.9.0 $status [1] $major [1] 2 $minor [1] 1.1 $year [1] 2005 $month [1] 06 $day [1] 20 $language [1] R
Re: [R] installing problems about randomForest
Liaw, Andy wrote: Thanks for catching that! It must have slipped when I was going back and forth between C and Fortran... Update should appear on CRAN this week, I hope. Best, Andy
Thanks for your quick reply. I just want you to know that I have tried R 2.1.0 on my PC running Debian etch. I used 'apt-get r-base' to install R, and 'install("randomForest", dependencies = TRUE)' from within R to install randomForest, but it failed again. Regards, Xiao Jianfeng
[R] heatmap -- invisible list?
Hi all, In heatmap's documentation, it mentions that the output value is actually an invisible list...how would one access this list? Thanks, Jake
Re: [R] coefficient of polynomial expansion
On 8/8/05, Peter Yang [EMAIL PROTECTED] wrote: Hi, I would like to get the coefficient of polynomial expansion. For example, (1+ x)^2 = 1 + 2x + x^2, and the coefficients are 1, 2 and 1. (1 + x + x^2)^3 = 1 + 3*x + 6*x^2 + 7*x^3 + 6*x^4 + 3*x^5 + x^6, and the coefficients are 1, 3, 6, 7, 6, 3, and 1. Use symbolic differentiation as in: https://stat.ethz.ch/pipermail/r-help/2004-December/060797.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] R: matrix sizes
1) If the output at each iteration gives a fixed number of elements, then you can pre-define the matrix. For example
mat <- matrix( NA, nr=6, nc=500 )
for(i in 1:500 ){ x <- rnorm(13); mat[ , i] <- summary(x) }
2) If the length of the output varies at each iteration, then it is probably best to use a list.
mylist <- list(NULL)
for(i in 1:500){ x <- rpois(1, lambda=10) + 1; y <- rnorm(x); mylist[[ i ]] <- y }
Regards, Adai
On Mon, 2005-08-08 at 12:34 +0200, Clark Allan wrote: hi all assume that one is doing a simulation. in each iteration one produces a vector of results. this vector's length might change for each different iteration. how can one construct a matrix that contains all of the iteration results, where each of the columns is the output from a different iteration? how would one have to define the output matrix initially? / thanking you in advance
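If a matrix is still wanted in case 2), one option is to collect the results in a list first and pad the shorter vectors with NA afterwards. The sketch below assumes that NA padding is acceptable for the later analysis; the object names and the choice of 5 iterations are made up for illustration.

```r
set.seed(42)
results <- vector("list", 5)             # one slot per iteration
for (i in seq_along(results)) {
  n <- rpois(1, lambda = 10) + 1         # output length varies by iteration
  results[[i]] <- rnorm(n)
}

max_len <- max(lengths(results))
# Pad every vector to the longest length, one column per iteration.
mat <- sapply(results, function(v) c(v, rep(NA_real_, max_len - length(v))))
dim(mat)
```

Each column of mat then holds one iteration, with trailing NA where that iteration produced fewer values.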
Re: [R] heatmap -- invisible list?
On 8/8/05 9:45 AM, Jacob Michaelson [EMAIL PROTECTED] wrote: Hi all, In heatmap's documentation, it mentions that the output value is actually an invisible list...how would one access this list?
Mylist <- heatmap()
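To make the one-line answer above concrete, here is a small sketch using the built-in mtcars data (an arbitrary choice). "Invisible" only means the value is not printed automatically; assigning it captures the list. The pdf(NULL) call is there only so the example also runs without a screen device.

```r
x <- as.matrix(mtcars[1:10, 1:5])

pdf(NULL)                           # null device; omit this when working interactively
hm <- heatmap(x, scale = "column")  # assignment captures the invisible list
dev.off()

names(hm)                 # components such as rowInd and colInd
hm$rowInd                 # row order used in the plot
x[hm$rowInd, hm$colInd]   # the data reordered as displayed
```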
[R] INDVAL and mvpart
Hi, I'd like to perform Dufrene-Legendre Indicator Species Analysis for a multivariate regression tree. However I have problems with arguments of duleg(veg,class,numitr=1000)function. How to obtain a vector of numeric class memberships for samples, or a classification object returned from mvpart? thanks in advance -- Best regards, Agnieszka Strzelczak mailto:[EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] using <- with a changing variable name (substitute?)
I have a matrix r and a scalar d, and I would like to apply the following functions to each of its elements: 1. if r < 0, no change 2. if 0 <= r < d, replace element by zero 3. if d <= r, replace element by r-d I wrote a small function for this
m <- function(b) {sapply(b, function(bb) { if (bb < 0) {bb} else {if (bb > d) {bb-d} else 0} })}
so I can simply say r <- m(r). The problem is that the matrix r is huge and only one of them fits in the memory, and I don't need the original r, so I would like to do this memory-efficiently. Moreover, there are matrices with various names (not only r) so I need a generic function (they don't fit in memory at the same time, I load, save and rm them). I tried various combinations of <-, assign, substitute etc. but could not get it working. Could somebody please help me? Tamas
Re: [R] coefficient of polynomial expansion
Peter == Peter Yang [EMAIL PROTECTED] on Mon, 8 Aug 2005 09:11:47 -0400 writes: Peter Hi, I would like to get the coefficient of polynomial Peter expansion. For example, Peter (1+ x)^2 = 1 + 2x + x^2, and the coefficients are 1, Peter 2 and 1. (1 + x + x^2)^3 = 1 + 3*x + 6*x^2 + 7*x^3 + Peter 6*x^4 + 3*x^5 + x^6, and the coefficients are 1, 3, Peter 6, 7, 6, 3, and 1. Peter I know that we can use polynom library. Is there any Peter other way to do it without loading a library. yes, load the polynom *package* (from the library where the - *package* is installed) What's bad about using polynom? IMO it is very nice, very useful package for the purpose it was written. Using packages for tasks they were written is one of the strengths of the R project, so why are you reluctant? Martin __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
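For readers who do want the coefficients without a package, one possibility is to multiply coefficient vectors directly. This is a plain base-R sketch, not what polynom does internally; the function names poly_mult and poly_pow are made up, and coefficients are stored in increasing order of power.

```r
# Multiply two polynomials given as coefficient vectors (constant term first).
poly_mult <- function(a, b) {
  out <- numeric(length(a) + length(b) - 1)
  for (i in seq_along(a)) {
    idx <- i:(i + length(b) - 1)
    out[idx] <- out[idx] + a[i] * b
  }
  out
}

# Raise a polynomial to a non-negative integer power by repeated multiplication.
poly_pow <- function(a, n) {
  out <- 1
  for (k in seq_len(n)) out <- poly_mult(out, a)
  out
}

poly_pow(c(1, 1), 2)      # (1 + x)^2
poly_pow(c(1, 1, 1), 3)   # (1 + x + x^2)^3
```

The second call reproduces the coefficients 1, 3, 6, 7, 6, 3, 1 quoted in the question.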
Re: [R] using <- with a changing variable name (substitute?)
On Mon, 8 Aug 2005, Tamas K Papp wrote: I have a matrix r and a scalar d, and I would like to apply the following functions to each of its elements: 1. if r < 0, no change 2. if 0 <= r < d, replace element by zero 3. if d <= r, replace element by r-d I wrote a small function for this
m <- function(b) {sapply(b, function(bb) { if (bb < 0) {bb} else {if (bb > d) {bb-d} else 0} })}
Why use sapply?
r[] <- ifelse(r >= d, r-d, ifelse(r >= 0, 0, r))
is one more efficient way.
so I can simply say r <- m(r). The problem is that the matrix r is huge and only one of them fits in the memory,
If that is literally true, you cannot do this at R level AFAICS. The only way I can see that you can do this with standard semantics is m(r) <- d with a replacement function "m<-"() written using .Call. Using
"m<-" <- function(r, d) ifelse(r >= d, r-d, ifelse(r >= 0, 0, r))
is going to make several copies.
and I don't need the original r, so I would like to do this memory-efficiently. Moreover, there are matrices with various names (not only r) so I need a generic function (they don't fit in memory at the same time, I load, save and rm them). I tried various combinations of <-, assign, substitute etc. but could not get it working. Could somebody please help me?
-- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595
Re: [R] make error: X11/Intrinsic.h: No such,,,
Jake == Jake Michaelson [EMAIL PROTECTED] on Fri, 05 Aug 2005 14:39:49 -0600 writes: Jake Thanks for the help -- this morning someone (on the Jake Ubuntu boards) was kind enough to point this out to Jake me. Now if there were only a decent Linux front Jake end/gui for R... is ESS (http://ESS.r-project.org/) indecent to you ? Martin Maechler, ETH Zurich __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] p-values
Spencer, Thank you for referring me to your other email on Exact goodness-of-fit test. However, I'm not entirely sure if what you mentioned is the same for my case. I'm not a statistician and it would help me if you could explain what you meant in a little more detail. Perhaps I need to explain the problem in more detail. I am looking for a way to calculate exact p-values by Monte Carlo simulation for Durbin's test. Durbin's test statistic is similar to Friedman's statistic, but considers the case of balanced incomplete block designs. I have found a function written by Felipe de Mendiburu for calculating Durbin's statistic, which gives the chi-squared p-value. I have also read an article by Torsten Hothorn, On exact rank tests in R (R News 1(1), 11–12), in which he shows how to calculate Monte Carlo p-values using pperm. In the article he gives: R> pperm(W, ranks, length(x)) He compares his method to that of StatXact, which is the program Rayner and Best suggested using. Is there a way to do this, for example, for the Friedman test? A paper by Joachim Rohmel discusses the permutation distribution for the Friedman test (Computational Statistics & Data Analysis 1997, 26: 83-99). This seems to be along the lines of what I need, although I am not quite sure. Has anyone tried to recode his APL program for R? I have tried a number of things, all unsuccessful. Searching through previous postings has not been very successful either. It seems that pperm is the way to go, but I would need help from someone on this. Any hints on how to continue would be much appreciated. Peter Spencer Graves wrote: Hi, Peter: Please see my reply of a few minutes ago subject: exact goodness-of-fit test. I don't know Rayner and Best, but the same method, I think, should apply. spencer graves Peter Ho wrote: HI R-users, I am trying to repeat an example from Rayner and Best A contingency table approach to nonparametric testing (Chapter 7, Ice cream example). 
In their book they calculate Durbin's statistic, D1, a dispersion statistics, D2, and a residual. P-values for each statistic is calculated from a chi-square distribution and also Monte Carlo p-values. I have found similar p-values based on the chi-square distribution by using: pchisq(12, df= 6, lower.tail=F) [1] 0.0619688 pchisq(5.1, df= 6, lower.tail=F) [1] 0.5310529 Is there a way to calculate the equivalent Monte Carlo p-values? The values were 0.02 and 0.138 respectively. The use of the approximate chi-square probabilities for Durbin's test are considered not good enough according to Van der Laan (The American Statistician 1988,42,165-166). Peter ESTG-IPVC __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
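As a rough illustration of what a Monte Carlo p-value means here, the sketch below does it for the Friedman test on a complete block design, using only base R. It is not Durbin's test, not the pperm approach and not the Rohmel algorithm: the function name mc_friedman, the simulated data and B = 999 are assumptions for illustration. The idea is that, under the null hypothesis, treatment labels can be shuffled within each block, so the observed statistic is compared with the statistics from shuffled copies of the data.

```r
# Chi-square approximation quoted in the thread, for comparison.
p_chisq <- pchisq(12, df = 6, lower.tail = FALSE)   # 0.0619688 as quoted

# Monte Carlo p-value for the Friedman statistic.
# y: numeric matrix, rows = blocks, columns = treatments (complete design).
mc_friedman <- function(y, B = 999) {
  obs <- unname(friedman.test(y)$statistic)
  perm <- replicate(B, {
    yp <- t(apply(y, 1, sample))          # shuffle treatments within each block
    unname(friedman.test(yp)$statistic)
  })
  # Small tolerance guards against floating-point ties.
  (1 + sum(perm >= obs - 1e-8)) / (B + 1)
}

set.seed(123)
# Simulated data with a strong treatment effect: 10 blocks, 3 treatments.
y <- matrix(rnorm(30), nrow = 10) + rep(c(0, 5, 10), each = 10)
p_mc <- mc_friedman(y)
p_mc
```

For a balanced incomplete block design the same within-block shuffling idea would apply to the treatments actually present in each block, with Durbin's statistic in place of Friedman's; that step is not shown here.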
Re: [R] using <- with a changing variable name (substitute?)
On Mon, Aug 08, 2005 at 04:07:10PM +0100, Prof Brian Ripley wrote: On Mon, 8 Aug 2005, Tamas K Papp wrote:
m <- function(b) {sapply(b, function(bb) { if (bb < 0) {bb} else {if (bb > d) {bb-d} else 0} })}
Why use sapply?
r[] <- ifelse(r >= d, r-d, ifelse(r >= 0, 0, r))
is one more efficient way.
Thank you very much, this sped things up considerably. I need this operation to ignore the first column of the matrix; is there a more efficient way than
r[,-1] <- ifelse(r[,-1] >= d, r[,-1]-d, ifelse(r[,-1] >= 0, 0, r[,-1]))
? Tamas
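One way to avoid subsetting r[,-1] four times is to take the sub-matrix once and update only its non-negative entries. This is a sketch under the assumption that d >= 0; the function name shrink and its skip argument are made up for illustration. It still copies data at R level, so it does not address the memory concern raised earlier in the thread.

```r
# Apply the rule (r < 0: unchanged; 0 <= r < d: 0; r >= d: r - d)
# to every column except those listed in 'skip'. Assumes d >= 0.
shrink <- function(r, d, skip = 1) {
  cols <- setdiff(seq_len(ncol(r)), skip)
  sub <- r[, cols, drop = FALSE]          # subset once
  nonneg <- which(sub >= 0)               # which() drops NA safely
  sub[nonneg] <- pmax(sub[nonneg] - d, 0)
  r[, cols] <- sub
  r
}

r <- matrix(c(9, 9, 9,  -1, 0.5, 3,  4, 0, 2), nrow = 3)
shrink(r, d = 1)
```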
[R] vector vs array
Hi! OK, I'm trying to select some useful outliers from my dataset: I defined 11 threshold values (one for each level of a variable, the sampling site) as follows:
tresholds <- function(x) { tapply(x,mm$NAME,FUN=mean ,simplify = T, na.rm=T) -> med; tapply(x,mm$NAME,FUN=sd ,simplify = T, na.rm=T) -> standev; standev+med }
tresholds(mm$chl)
Now I'd like to select those values from vector mm$chl that are higher than the threshold value for their site, but how can I compare a vector with 1885 elements with one with 11? Sorry for this (probably) stupid question... and thanks in advance. Alessandro
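One way to line up the 11 per-site thresholds with the 1885 observations is to index the threshold vector by each observation's site name, which expands it to one threshold per row. The sketch below uses a small simulated stand-in for mm (three sites, 20 values each), so the data are an assumption; only the column names NAME and chl come from the question.

```r
set.seed(7)
# Simulated stand-in for the real data frame 'mm'.
mm <- data.frame(NAME = rep(c("A", "B", "C"), each = 20),
                 chl  = c(rnorm(20, 1), rnorm(20, 5), rnorm(20, 10)))

# One threshold per site: mean + sd, as in the question.
thr <- tapply(mm$chl, mm$NAME, mean, na.rm = TRUE) +
       tapply(mm$chl, mm$NAME, sd,   na.rm = TRUE)

# Expand to one threshold per row by looking up each row's site name.
thr_per_row <- as.vector(thr[as.character(mm$NAME)])

above <- mm$chl > thr_per_row
high  <- mm[which(above), ]     # records exceeding their own site's threshold
```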
Re: [R] make error: X11/Intrinsic.h: No such,,,
I use Mac OS X at home and Linux at work, so the R Aqua GUI has spoiled me. I have not seen its equal so far (on Windows or Linux). The most important thing to me is how easily accessible the help and documentation is. I like how when I begin typing a function, the form and arguments to the function automatically appear at the bottom bar, refreshing my memory. I like that all plots are output to on-screen PDF. I could go on and on, but I hope that someday we'll see something on Linux with the same polish and ease-of-use. Maybe when Cairo is integrated into Gnome it might make PDF plot display more feasible... On Mon, 2005-08-08 at 17:17 +0200, Martin Maechler wrote: Jake == Jake Michaelson [EMAIL PROTECTED] on Fri, 05 Aug 2005 14:39:49 -0600 writes: Jake Thanks for the help -- this morning someone (on the Jake Ubuntu boards) was kind enough to point this out to Jake me. Now if there were only a decent Linux front Jake end/gui for R... is ESS (http://ESS.r-project.org/) indecent to you ? Martin Maechler, ETH Zurich __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] filled.contour help
Hello, I plot with filled.contour and have this problem: there is an area that I want to cover with angled shading lines to represent NA in my data. I would very much appreciate help. Thanks, Mark
pal <- palette(gray(seq(1.,0.,len=8)))
filled.contour(fvec,qvec,etsarray, levels=c(-.6,-.4,-.2,0.,.2,.4,.6,.8,1.),zlim=c(-.6,1), xlab=xstring,ylab=ystring, col=pal,key.axes=axis(4,c(-.6,-.4,-.2,0.,.2,.4,.6,.8,1.)))
[R] get the wald chi square in binary logistic regression
Hello, I have been working with R for a short time and I would like to know how to obtain the Wald chi-square value when fitting a binary logistic regression. I have the z value and its significance, but is there a way to get the value of the Wald chi-square? You can see my model below. Best regards, Séverine Erhel
[Previously saved workspace restored]
m3 = glm(reponse2 ~ form + factor(critere2) ,family=binomial,data=mes.donnees)
summary (m3)
Call: glm(formula = reponse2 ~ form + factor(critere2), family = binomial, data = mes.donnees)
Deviance Residuals:
    Min      1Q  Median      3Q     Max
-2.5402  0.2064  0.3354  0.4833  1.4177
Coefficients:
                         Estimate Std. Error z value Pr(>|z|)
(Intercept)                0.5482     0.3930   1.395   0.1631
form Illustration          3.2904     0.6478   5.080 3.78e-07 ***
form Texte+illustration    2.6375     0.4746   5.557 2.74e-08 ***
factor(critere2)2         -1.0973     0.5103  -2.150   0.0315 *
factor(critere2)3         -0.9891     0.5107  -1.937   0.0528 .
---
Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 227.76 on 218 degrees of freedom
Residual deviance: 162.11 on 214 degrees of freedom
AIC: 172.11
Number of Fisher Scoring iterations: 5
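For a single coefficient, the Wald chi-square reported by many other packages is simply the square of the z value shown by summary(), referred to a chi-square distribution with 1 degree of freedom. The sketch below shows this on the built-in mtcars data, which is an arbitrary stand-in for the poster's data set.

```r
# Stand-in model: any binomial glm would do.
fit <- glm(am ~ wt, family = binomial, data = mtcars)
tab <- summary(fit)$coefficients

wald   <- tab[, "z value"]^2                       # Wald chi-square, 1 df each
p_wald <- pchisq(wald, df = 1, lower.tail = FALSE) # same p-value as Pr(>|z|)

cbind(tab, "Wald chi-sq" = wald, "Pr(>Chi)" = p_wald)
```

Applied to the output above, the row with z = 5.080 would have a Wald chi-square of about 25.8. A joint Wald test for a factor with several levels needs the coefficient covariance matrix from vcov() and is not covered by this sketch.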
[R] Building R 2.1.0 on AIX 5.2.0.0
Hello, The set of messages below reports a successful build of 2.0.0 on AIX 5.2 using GCC 3.3.2 I've been trying for a while now to build 2.1.0 and 2.1.1 and have been unsuccessful. I've tried: - GCC 3.3.2 and 4.0.1 - default AIX make and GNU make - 32-bit and 64-bit builds A typical configure output is: == R is now configured for powerpc-ibm-aix5.2.0.0 Source directory: . Installation directory:/db2blast/R/2.1.0 C compiler:gcc -mno-fp-in-toc -maix32 -g -O2 C++ compiler: g++ -g -O2 Fortran compiler: g77 -maix32 -O2 Interfaces supported: X11 External libraries: Additional capabilities: PNG, JPEG, MBCS, NLS Options enabled: R profiling Recommended packages: yes == In all cases make crashes with: == make[5]: Leaving directory `/db2blast/R/R-2.1.0/src/library/tools/src' make[4]: Leaving directory `/db2blast/R/R-2.1.0/src/library/tools/src' Error in dyn.load(x, as.logical(local), as.logical(now)) : unable to load shared library '/db2blast/R/R-2.1.0/library/tools/libs/tools.so': Execution halted make[3]: *** [all] Error 1 make[3]: Leaving directory `/db2blast/R/R-2.1.0/src/library/tools' make[2]: *** [R] Error 1 make[2]: Leaving directory `/db2blast/R/R-2.1.0/src/library' make[1]: *** [R] Error 1 make[1]: Leaving directory `/db2blast/R/R-2.1.0/src' make: *** [R] Error 1 == Any ideas on what might have changed between 2.0.1 and 2.1.0 to cause this, or maybe any suggestions on what I could try next? Paul -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Sent: Monday, February 28, 2005 11:19 PM To: R-help@stat.math.ethz.ch Subject: Problems Building Ron AIX 5.2.0.0 (Solved) Happily I got this to work, largely by trial-and-error. In hopes that this will help somebody else, my config.site ended up being: OBJECT_MODE=64 R_PAPERSIZE=letter CC=/usr/local/bin/gcc MAIN_LDFLAGS=-Wl,-brtl SHLIB_LDFLAGS=-Wl,-G Which is virtually identical to that recommended in R-admin: one of my problems was using -W1,brtl rather than -W1,-brtl. 
This was R 2.0.1 on AIX 5.2.0.0 with GCC 3.3.2. The previous messages I'd posted on this issue are included below.

Paul

My apologies -- I had believed that by linking the source message I had made the detailed context available. I will be more careful in the future to correctly give full context.

Paul

-----Original Message-----
From: Prof Brian Ripley [mailto:ripley at stats.ox.ac.uk]
Sent: Saturday, February 26, 2005 5:23 AM
To: paul.boutros at utoronto.ca
Cc: r-help at stat.math.ethz.ch
Subject: Re: [R] Problems Building R on AIX 5.2.0.0 (Update)

Quotes from messages about Solaris 9 are not necessarily applicable to AIX, and in omitting the context you have misrepresented me. Please do bear in mind the `moral rights' on quoting given at http://www.jiscmail.ac.uk/help/policy/copyright.htm (Perhaps such a reference is needed in the posting guide?)

On Fri, 25 Feb 2005 paul.boutros at utoronto.ca wrote:

Hi,

My previous message is appended: I'm still struggling with building on AIX. I updated my config.site to follow the suggestions from R-admin:
  MAIN_LDFLAGS=-Wl,brtl
  SHLIB_LDFLAGS=-Wl,-G
This led to an error during configure:
  checking whether mixed C/Fortran code can be run... configure: WARNING: cannot run mixed C/Fortan code
  configure: error: Maybe check LDFLAGS for paths to Fortran libraries?
This confused me a bit, because before adding the MAIN_LDFLAGS and SHLIB_LDFLAGS to config.site this step of configure did not show an error. When I googled this I found a previous message from last year: http://tolstoy.newcastle.edu.au/R/help/04/04/1622.html

At the end of this message Professor Ripley says: You need wherever libg2c.so is installed in your LD_LIBRARY_PATH.

So... I went looking for this file and could not find it! In /usr/local/lib I have:
  $ ls -al libg2c*
  -rw-r--r-- 1 freeware staff 7751224 Jan 09 2004 libg2c.a
  -rwxr-xr-x 1 freeware staff     714 Jan 09 2004 libg2c.la
But no libg2c.so appears to be on my system. Does this indicate a bad install of gcc, or could anybody offer any suggestions on where to go from here?

Paul

---
From: Paul Boutros Paul.Boutros_at_utoronto.ca
Date: Thu 24 Feb 2005 - 02:43:52 EST

Hello,

I am trying to build R 2.0.1 on an AIX 5.2.0.0 machine using gcc 3.3.2:
  $ oslevel
  5.2.0.0
  $ gcc -v
  Reading specs from /usr/local/lib/gcc-lib/powerpc-ibm-aix5.2.0.0/3.3.2/specs
  Configured with: ../gcc-3.3.2/configure :
[R] computationally singular
Hi,

I have a dataset which has around 138 variables and 30,000 cases. I am trying to calculate a Mahalanobis distance matrix for them, and my procedure is like this. Suppose my data is stored in mymatrix:

    S <- cov(mymatrix)  # this is fine
    D <- sapply(1:nrow(mymatrix), function(i) mahalanobis(mymatrix, mymatrix[i,], S))
    Error in solve.default(cov, ...) : system is computationally singular: reciprocal condition number = 1.09501e-25

I understand the error message, but I don't know how to track down which variables caused this so that I can sacrifice them if there are not a lot. Again, I am not sure whether it is due to some variables, and not sure whether dropping variables is a good idea either.

Thanks for help,
weiwei

--
Weiwei Shi, Ph.D
Did you always know? No, I did not. But I believed... ---Matrix III
[R] Help with doing overlays plots...
I have a data frame with three columns: type (a factor with two values, Monolithic and Compositional), size (numeric), and states (numeric). I want to create a plot where size goes on the x-axis and states goes on the y-axis. In this plot, I want two lines, one where the type is Monolithic and one where the type is Compositional.

I think this can be done by using the plot command to plot the line for one of the two types (setting the xlim and ylim parameters to ensure the plot area is large enough to hold all of the points). Then, I can use the lines and points commands to add the second line onto the plot. However, I don't want to have to specify the legend manually. I want something in R that does what can be done in SAS by using plot states*size=type in proc gplot.

Here is a dump of my data set:

    tmp <- structure(list(type = structure(as.integer(c(2, 2, 2, 1, 1, 1, 1, 1)),
        .Label = c("Compositional", "Monolithic"), class = "factor"),
        size = as.integer(c(2, 3, 4, 2, 3, 4, 5, 6)),
        states = as.integer(c(4910, 336026, 37526650, 4016, 44941, 310553, 8260254, 144145585))),
        .Names = c("type", "size", "states"),
        row.names = c("1", "2", "3", "4", "5", "6", "7", "8"), class = "data.frame")
Re: [R] get the wald chi square in binary logistic regression
[EMAIL PROTECTED] wrote:
> hello,
> I have been working with R for a short time and I wanted to know how to obtain the Wald chi-square value when fitting a binary logistic regression. I have the z value and the significance, but is there a script to see the value of the Wald chi-square? You can see my model below.
> Best regards,
> Séverine Erhel

If you want a global test for several coefficients associated with the same variable (e.g., form or critere2 in your example), you can fit the model without the variable and compare the two models with a likelihood ratio test (function anova): it is safer than the Wald test. If you really want the Wald test, it is available in different packages: see for example the function wald.test in package aod.

Best,
Renaud

> [Previously saved workspace restored]
>
> m3 = glm(reponse2 ~ form + factor(critere2), family=binomial, data=mes.donnees)
> summary(m3)
>
> Call:
> glm(formula = reponse2 ~ form + factor(critere2), family = binomial, data = mes.donnees)
>
> Deviance Residuals:
>     Min       1Q   Median       3Q      Max
> -2.5402   0.2064   0.3354   0.4833   1.4177
>
> Coefficients:
>                          Estimate Std. Error z value Pr(>|z|)
> (Intercept)                0.5482     0.3930   1.395   0.1631
> form Illustration          3.2904     0.6478   5.080 3.78e-07 ***
> form Texte+illustration    2.6375     0.4746   5.557 2.74e-08 ***
> factor(critere2)2         -1.0973     0.5103  -2.150   0.0315 *
> factor(critere2)3         -0.9891     0.5107  -1.937   0.0528 .
> ---
> Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
>
> (Dispersion parameter for binomial family taken to be 1)
>
>     Null deviance: 227.76 on 218 degrees of freedom
> Residual deviance: 162.11 on 214 degrees of freedom
> AIC: 172.11
>
> Number of Fisher Scoring iterations: 5

--
Dr Renaud Lancelot, vétérinaire
Projet FSP régional épidémiologie vétérinaire
C/0 Ambassade de France - SCAC
BP 834 Antananarivo 101 - Madagascar
e-mail: [EMAIL PROTECTED]
tel.: +261 32 40 165 53 (cell)
      +261 20 22 665 36 ext. 225 (work)
      +261 20 22 494 37 (home)
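For a single coefficient, the Wald chi-square is simply the square of the z value that summary() already prints (1 degree of freedom), so base R is enough; for instance, z = 5.080 for form Illustration corresponds to a Wald chi-square of about 25.8. The sketch below is only an illustration on simulated data, not the poster's model: the data frame d, its columns and the seed are assumptions invented for the example. It also builds a multi-coefficient Wald statistic from coef() and vcov(), next to the likelihood ratio test suggested in the reply.

```r
# Simulated stand-in for the poster's data (hypothetical names and values)
set.seed(42)
n <- 200
d <- data.frame(form = factor(sample(c("A", "B", "C"), n, replace = TRUE)),
                x    = rnorm(n))
d$y <- rbinom(n, 1, plogis(0.5 + 1.0 * (d$form == "B") + 0.8 * (d$form == "C")))

fit <- glm(y ~ form + x, family = binomial, data = d)
co  <- summary(fit)$coefficients

# Per-coefficient Wald chi-square: square of the z value, 1 df
wald_chisq <- co[, "z value"]^2
wald_p     <- pchisq(wald_chisq, df = 1, lower.tail = FALSE)

# Joint Wald test for all coefficients of 'form': b' V^{-1} b
idx <- grep("^form", names(coef(fit)))
b   <- coef(fit)[idx]
V   <- vcov(fit)[idx, idx]
W   <- as.numeric(t(b) %*% solve(V) %*% b)
W_p <- pchisq(W, df = length(idx), lower.tail = FALSE)

# Likelihood ratio test for the same hypothesis
fit0 <- update(fit, . ~ . - form)
lrt  <- anova(fit0, fit, test = "Chisq")
print(lrt)
```

The per-coefficient p-values obtained this way coincide with the Pr(>|z|) column, since squaring a standard normal gives a chi-square with 1 df.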
Re: [R] computationally singular
Once I had a situation where the reason was that the variables were scaled to extremely different magnitudes. 1e-25 is a *very* small number, but there is still some chance that it helps to look up the standard deviations and to multiply the variable with the smallest st.dev. by 1e20 or something.

Best,
Christian

On Mon, 8 Aug 2005, Weiwei Shi wrote:
> Hi,
> I have a dataset which has around 138 variables and 30,000 cases. I am trying to calculate a mahalanobis distance matrix for them and my procedure is like this: Suppose my data is stored in mymatrix
>     S <- cov(mymatrix)  # this is fine
>     D <- sapply(1:nrow(mymatrix), function(i) mahalanobis(mymatrix, mymatrix[i,], S))
>     Error in solve.default(cov, ...) : system is computationally singular: reciprocal condition number = 1.09501e-25
> I understand the error message but I don't know how to trace down which variables caused this so that I can sacrifice them if there are not a lot. Again, not sure if it is due to some variables and not sure if dropping variables is a good idea either.
> Thanks for help,
> weiwei

*** NEW ADDRESS! ***
Christian Hennig
University College London, Department of Statistical Science
Gower St., London WC1E 6BT, phone +44 207 679 1698
[EMAIL PROTECTED], www.homepages.ucl.ac.uk/~ucakche
Re: [R] get the wald chi square in binary logistic regression
Thanks for your help. I don't have this package in my R installation; do you know another package that has this test? Thanks.

Quoting Renaud Lancelot [EMAIL PROTECTED]:
> If you want a global test for several coefficients associated with the same variable (e.g., form or critere2 in your example), you can fit the model without the variable and compare the two models with a likelihood ratio test (function anova): it is safer than the Wald test. If you really want the Wald test, it is available in different packages: see for example the function wald.test in package aod.
> Best,
> Renaud
[R] How to insert a certain model in SVM regarding to fixed kernels ?
Dear R Users,

Suppose that we want to fit a certain autoregressive model using SVM regression. We have our data and also some fixed kernels in libSVM, which sits behind the e1071 front end. The question: where can we insert our autoregressive model? While creating the data frame? Or perhaps we can set up a relationship between our variables that leads to the desired autoregressive model?

Thanks a lot for your help.
Amir Safari
Re: [R] Help with doing overlays plots...
    library(lattice)
    trellis.device()
    xyplot(states ~ size, groups=type, data=tmp, panel = panel.superpose,
           panel.groups = panel.linejoin, auto.key=TRUE)

Jamieson Cobleigh wrote:
> I have a data frame with three columns, type (a factor with two values: Monolithic and Compositional), size (numeric), and states (numeric). I want to create a plot where size goes on the x-axis and states goes on the y-axis. In this plot, I want two lines, one where the type is Monolithic and one where the type is Compositional.
> I think this can be done by using the plot command to plot the line for one of the two types (setting the xlim and ylim parameters to ensure the plot area is large enough to hold all of the points). Then, I can use the lines and points commands to add the second line onto the plot. However, I don't want to have to specify the legend manually. I want something in R that does what can be done in SAS by using plot states*size=type in proc gplot.
> Here is a dump of my data set:
>     tmp <- structure(list(type = structure(as.integer(c(2, 2, 2, 1, 1, 1, 1, 1)),
>         .Label = c("Compositional", "Monolithic"), class = "factor"),
>         size = as.integer(c(2, 3, 4, 2, 3, 4, 5, 6)),
>         states = as.integer(c(4910, 336026, 37526650, 4016, 44941, 310553, 8260254, 144145585))),
>         .Names = c("type", "size", "states"),
>         row.names = c("1", "2", "3", "4", "5", "6", "7", "8"), class = "data.frame")

--
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 452-1424 (M, W, F)
fax: (917) 438-0894
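The same overlay can also be approximated in base graphics by looping over the factor levels, so that the legend is derived from the data instead of being typed by hand. This is a minimal sketch, not the lattice solution given in the reply: the data frame is rebuilt from the posted dump, while the log-scaled y-axis, the colours and the plotting symbols are assumptions chosen for the example.

```r
# Data rebuilt from the posted dump (first three rows are Monolithic)
tmp <- data.frame(
  type   = factor(c(rep("Monolithic", 3), rep("Compositional", 5)),
                  levels = c("Compositional", "Monolithic")),
  size   = c(2L, 3L, 4L, 2L, 3L, 4L, 5L, 6L),
  states = c(4910L, 336026L, 37526650L, 4016L, 44941L, 310553L,
             8260254L, 144145585L)
)

lv   <- levels(tmp$type)   # one line per level
sty  <- seq_along(lv)      # reuse the level index as colour and symbol

# Empty frame spanning all points; log y-axis is an assumption made
# because 'states' covers several orders of magnitude
plot(states ~ size, data = tmp, type = "n", log = "y",
     xlab = "size", ylab = "states")
for (i in seq_along(lv)) {
  d_i <- tmp[tmp$type == lv[i], ]
  lines(d_i$size, d_i$states, type = "b", col = sty[i], pch = sty[i])
}
# Legend labels come straight from the factor levels
legend("topleft", legend = lv, col = sty, pch = sty, lty = 1)
```

Adding a third type to the data would then add a third line and legend entry without any further changes.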
Re: [R] computationally singular
I think the problem might be caused by two variables being highly correlated. Should I check the cov matrix and try to delete some? But I am not quite sure I follow your reply. Could you detail it with some steps?

thanks,
weiwei

On 8/8/05, Christian Hennig [EMAIL PROTECTED] wrote:
> Once I had a situation where the reason was that the variables were scaled to extremely different magnitudes. 1e-25 is a *very* small number but still there is some probability that it may help to look up standard deviations and to multiply the variable with the smallest st.dev. with 1e20 or something.
> Best,
> Christian

--
Weiwei Shi, Ph.D
Did you always know? No, I did not. But I believed... ---Matrix III
Re: [R] (more) computationally singular
More ideas:

You can also perform an eigenvalue decomposition of the covariance matrix and see along which directions the singularity occurs and how strong it is. Consequences could be: rescaling (or omission) of variables that load strongly on these directions, taking principal components, or a linear transformation of the whole data in order to attain less extreme ratios between the eigenvalues of the covariance matrix.

Generally I would say that information reduction (principal components or leaving out variables) should only be done if small variance along a direction means that this direction is not important in terms of the subject matter problem. Otherwise transformation could help.

(Perhaps my guess in the first mail was wrong: you don't have to multiply something by 1e20 to repair a 1e-25 condition number, and a more moderate transformation suffices.)

Best,
Christian

On Mon, 8 Aug 2005, Weiwei Shi wrote:
> Hi,
> I have a dataset which has around 138 variables and 30,000 cases. I am trying to calculate a mahalanobis distance matrix for them and my procedure is like this: Suppose my data is stored in mymatrix
>     S <- cov(mymatrix)  # this is fine
>     D <- sapply(1:nrow(mymatrix), function(i) mahalanobis(mymatrix, mymatrix[i,], S))
>     Error in solve.default(cov, ...) : system is computationally singular: reciprocal condition number = 1.09501e-25
> I understand the error message but I don't know how to trace down which variables caused this so that I can sacrifice them if there are not a lot. Again, not sure if it is due to some variables and not sure if dropping variables is a good idea either.
> Thanks for help,
> weiwei

*** NEW ADDRESS! ***
Christian Hennig
University College London, Department of Statistical Science
Gower St., London WC1E 6BT, phone +44 207 679 1698
[EMAIL PROTECTED], www.homepages.ucl.ac.uk/~ucakche
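The eigenvalue check described in this message can be sketched as follows. It is an illustration under stated assumptions rather than a recipe for the poster's data: the matrix X is simulated, the column v5 is deliberately constructed as v1 + v2, and the cut-offs 1e-10 (relative eigenvalue) and 0.1 (loading size) are arbitrary choices.

```r
# Simulated data with one exact linear dependence (v5 = v1 + v2)
set.seed(1)
X <- matrix(rnorm(500 * 4), ncol = 4,
            dimnames = list(NULL, paste0("v", 1:4)))
X <- cbind(X, v5 = X[, "v1"] + X[, "v2"])

S <- cov(X)
e <- eigen(S, symmetric = TRUE)

# Eigenvalues relative to the largest one; tiny ratios mark the
# directions along which S is (numerically) singular
ratio <- e$values / max(e$values)
bad   <- which(ratio < 1e-10)

# Loadings of the near-null directions: large absolute entries point
# at the variables taking part in the dependence
null_dirs <- e$vectors[, bad, drop = FALSE]
rownames(null_dirs) <- colnames(X)
print(round(null_dirs, 3))

involved <- colnames(X)[apply(abs(null_dirs) > 0.1, 1, any)]
print(involved)
```

Here the single near-null direction loads on v1, v2 and v5 only, which is the kind of hint that can guide dropping or combining variables.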
Re: [R] (part III) computationally singular
Sorry, our emails crossed...

On Mon, 8 Aug 2005, Weiwei Shi wrote:
> I think the problem might be caused by two variables being highly correlated. Should I check the cov matrix and try to delete some?

In this case, taking principal components should do the job. Variable deletion may help as well - I am not extremely against it, it depends on your whole project and aim, but I would not start with that before I found out whether there are more proper possibilities.

> But I am not quite sure I follow your reply. Could you detail it with some steps?

Look up the std.devs of all variables. If the ratio between the largest one and the smallest one is more than, let's say, 1e5, consider that as not healthy. Multiply the variables with the smallest std.devs by constants so that the ratio between the largest and smallest std.dev is not more than 1e3, say (I am not sure about the exact size of these numbers... try something...). See whether the problem vanishes after such rescaling.

Don't ask me the same about the second email - I don't have the time to explain that in detail.

Sorry,
Christian

*** NEW ADDRESS! ***
Christian Hennig
University College London, Department of Statistical Science
Gower St., London WC1E 6BT, phone +44 207 679 1698
[EMAIL PROTECTED], www.homepages.ucl.ac.uk/~ucakche
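The rescaling steps above can be tried out along the following lines. This is a hedged sketch on simulated data: the three columns and their standard deviations are invented, and dividing every column by its own standard deviation is just one simple way of bringing the std.dev ratio below the suggested limits. In exact arithmetic Mahalanobis distances do not change when columns are rescaled, so this only addresses numerical (not genuine) singularity.

```r
# Simulated columns with wildly different scales (hypothetical values)
set.seed(2)
X <- cbind(a = rnorm(100, sd = 1e-9),
           b = rnorm(100, sd = 1),
           c = rnorm(100, sd = 1e6))

# Step 1: look at the standard deviations and their ratio
sds      <- apply(X, 2, sd)
sd_ratio <- max(sds) / min(sds)
print(sd_ratio)

# Step 2: rescale; here every column is divided by its own std.dev,
# which makes the ratio exactly 1
Xs <- scale(X, center = FALSE, scale = sds)

# Step 3: compare reciprocal condition numbers before and after
rc_before <- rcond(cov(X))
rc_after  <- rcond(cov(Xs))
print(c(before = rc_before, after = rc_after))

# After rescaling, solve() succeeds, so the distances can be computed
D1 <- mahalanobis(Xs, Xs[1, ], cov(Xs))
```

If the reciprocal condition number stays tiny after rescaling, the singularity is more likely due to (near) linear dependence among variables than to scaling.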
[R] modifying argument of a .C call (DUP=FALSE)
I have a huge matrix on which I need to do a simple (elementwise) transformation. Two of these matrices cannot fit in memory, so I cannot do this in R. I thought of writing some C code to do this and calling it using .C with DUP=FALSE. All I need is a simple for loop that replaces elements with their new value, something like

    void transform(double *a, int *lengtha)
    {
        int i;
        /* overwrite each element in place with its transformed value */
        for (i = 0; i < *lengtha; i++) {
            *(a+i) = calculatenewvaluesomehow(*(a+i));
        }
    }

    trans <- function(a) .C("transform", as.double(a), as.integer(length(a)))

Is it possible to do this? The manuals say that it is dangerous; is it possible to avoid the dangers somehow?

Tamas
[R] Square matrix plot
Hi,

Having a matrix with n rows and m columns, I calculated the Spearman correlation values of the matrix; this generated a square m x m matrix. Does anyone know how I can create a plot similar to this one, http://bio.ifom-firc.it/User/finoc/ask.png (produced with Hierarchical Clustering Explorer), using R? I would like to have a range of colors from green (-1) to red (+1) proportional to the correlation value calculated.

Thanks a lot.
Regards
Giacomo Finocchiaro
[R] GUIs (was Re: make error: X11/Intrinsic.h: No such,,,)
On Mon, 8 Aug 2005, Jake Michaelson wrote:
> I use Mac OS X at home and Linux at work, so the R Aqua GUI has spoiled me. I have not seen its equal so far (on Windows or Linux). The most important thing to me is how easily accessible the help and documentation is. I like how when I begin typing a function, the form and arguments to the function automatically appear at the bottom bar, refreshing my memory.

You could use the JGR gui.

-thomas
Re: [R] get the wald chi square in binary logistic regression
Is this a research question? If not, I'd like to know why you think the Wald test is better.

Are you familiar with Bates and Watts (1988) Nonlinear Regression Analysis and Its Applications (Wiley), and with the concepts of intrinsic and parameter effects nonlinearity? In brief, nonlinear regression, and maximum likelihood estimation more generally, involve projection onto a nonlinear manifold, which is subject to intrinsic nonlinearity as well as parameter effects nonlinearity. The Wald test suffers from both types of nonlinearity, while the 2*log(likelihood ratio) procedure suffers only from the intrinsic nonlinearity. Moreover, one of the later chapters in Bates and Watts includes a comparison of intrinsic and parameter effects nonlinearity in several published nonlinear regression examples. I don't remember the details now, but in all but a few cases, the parameter effects were at least an order of magnitude greater than the intrinsic nonlinearity.

If you are not familiar with Bates and Watts, I highly recommend it. If you are, I could see comparing Wald and 2*log(likelihood ratio) to decide whether I want to use Wald in certain applications where 2*log(likelihood ratio) may not be feasible. If you have evidence raising questions about the above, I'd like to know.

spencer graves

[EMAIL PROTECTED] wrote:
> Thanks for your help. I don't have this package in my R installation; do you know another package that has this test? Thanks.

--
Spencer Graves, PhD
Senior Development Engineer
PDF Solutions, Inc.
333 West San Carlos Street Suite 700
San Jose, CA 95110, USA
[EMAIL PROTECTED]
www.pdf.com http://www.pdf.com
Tel: 408-938-4420
Fax: 408-280-7915
Re: [R] vector vs array
General notes:
a) Please try to give a simple example.
b) Please avoid the rightwards assignment (i.e. ->). Even though it is perfectly legal to use it, it is confusing, especially when you are posting to a mailing list.

1) Here is a reproducible example:

    set.seed(1)  # for reproducibility
    v   <- abs( rnorm(1000) )
    thr <- c( 0.5, 1.0, 2.0, 3.0 )

2) If you simply want to count the number of points above a threshold:

    sapply( thr, function(x) sum(v > x) )
    [1] 620 326  60   3

3) Or you can cut the data by the threshold limits (be careful at the edges if you have discrete data) and then tabulate:

    table( cut( v, breaks=c( -Inf, thr, Inf ) ) )

    (-Inf,0.5]    (0.5,1]      (1,2]      (2,3]   (3,Inf]
           380        294        266         57         3

4) If you want to turn the problem on its head and ask below which threshold you would get 99%, 99.9% and 99.99% of the data, you can use quantiles:

    quantile( v, c(0.99, 0.999, 0.9999) )
         99%    99.9%   99.99%
    2.529139 3.056497 3.734899

Regards,
Adai

On Mon, 2005-08-08 at 08:34 -0700, alessandro carletti wrote:
> Hi!
> OK, I'm trying to select some useful outliers from my dataset: I defined 11 threshold values (1 for each level of a variable (sampling site)) as follows:
>     tresholds <- function(x) {
>         tapply(x, mm$NAME, FUN=mean, simplify = T, na.rm=T) -> med
>         tapply(x, mm$NAME, FUN=sd, simplify = T, na.rm=T) -> standev
>         standev + med
>     }
>     tresholds(mm$chl)
> Now I'd like to select those values from vector mm$chl that are higher than each threshold value, but how can I compare a vector with 1885 elements with the one with 11? Sorry for this (probably) stupid question... and thanks in advance.
> Alessandro
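For the original question (comparing 1885 values with 11 per-site thresholds), one option is to expand the thresholds to the length of the data, so that each value is compared with the threshold of its own site. The sketch below assumes simulated data: the data frame mm, its three sites and the exponential values are invented, and only the column names NAME and chl follow the question.

```r
# Simulated stand-in for the poster's data (3 sites instead of 11)
set.seed(4)
mm <- data.frame(NAME = factor(rep(paste0("site", 1:3), each = 20)),
                 chl  = rexp(60))

# ave() returns one threshold per observation: mean + sd of its own site
thr_obs <- ave(mm$chl, mm$NAME,
               FUN = function(v) mean(v, na.rm = TRUE) + sd(v, na.rm = TRUE))

# Keep the rows whose value exceeds the threshold of their site
outliers <- mm[!is.na(mm$chl) & mm$chl > thr_obs, ]
print(outliers)

# Equivalent route: index the per-site thresholds by site name
thr_site <- tapply(mm$chl, mm$NAME,
                   function(v) mean(v, na.rm = TRUE) + sd(v, na.rm = TRUE))
thr_obs2 <- as.vector(thr_site[as.character(mm$NAME)])
```

Both routes give a threshold vector as long as the data, so the comparison becomes an ordinary elementwise one.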
[R] tapply huge speed difference if X has names
Hi all,

Apologies if this has been raised before... R's tapply is very fast, but if X has names, as in this example, there seems to be a huge slowdown: under 1 second compared to 151 seconds. The following timings are repeatable and were taken on a single-user machine:

    X = 1:100000
    names(X) = X
    system.time(fast <- tapply(as.vector(X), rep(1:10000,each=10), mean))  # as.vector() to drop the names
    [1] 0.36 0.00 0.35 0.00 0.00
    system.time(slow <- tapply(X, rep(1:10000,each=10), mean))
    [1] 149.95   1.83 151.79   0.00   0.00
    head(fast)
       1    2    3    4    5    6
     5.5 15.5 25.5 35.5 45.5 55.5
    head(slow)
       1    2    3    4    5    6
     5.5 15.5 25.5 35.5 45.5 55.5
    identical(fast,slow)
    [1] TRUE

Looking inside tapply, which then calls split, it seems there is an is.null(names(x)) test which prevents R's internal fast version from being called. Why is that there? Could it be removed? I often do something like tapply(mat[,colname],...) where mat has rownames. The rownames of mat therefore become the names of the vector mat[,colname], and this seems to slow down tapply a lot. Perhaps other functions which call split also suffer from this problem?

    split.default
    function (x, f)
    {
        if (is.list(f))
            f <- interaction(f)
        f <- factor(f)
        if (is.null(attr(x, "class")) && is.null(names(x)))
            return(.Internal(split(x, f)))
        lf <- levels(f)
        y <- vector("list", length(lf))
        names(y) <- lf
        for (k in lf) y[[k]] <- x[f %in% k]
        y
    }
    <environment: namespace:base>

    version
             _
    platform x86_64-redhat-linux-gnu
    arch     x86_64
    os       linux-gnu
    system   x86_64, linux-gnu
    status
    major    2
    minor    0.1
    year     2004
    month    11
    day      15
    language R

Thanks and regards,
Matthew
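Until the split code changes, the workaround implied by the timings is to drop the names before calling tapply, including in the tapply(mat[,colname],...) case. The sketch below uses a small made-up matrix (mat, its dimnames and the grouping vector g are assumptions) and only checks that the results agree; it makes no claim about timings, which were reported for R 2.0.1 and may differ in other versions.

```r
# Small made-up matrix with rownames, so that mat[, "val"] carries names
mat <- matrix(1:1000, ncol = 1,
              dimnames = list(paste0("r", 1:1000), "val"))
g <- rep(1:100, each = 10)

# Names are inherited from the rownames of mat
x_named <- mat[, "val"]

# unname() (or as.vector()) drops them before the call
fast <- tapply(unname(x_named), g, mean)
slow <- tapply(x_named, g, mean)

print(head(fast))
```

Since the group labels come from g and not from the names of the data vector, dropping the names does not change the result.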
[R] Difference amongst spline smoothers
Dear all,

Does anybody know about the differences among the various spline smoothers in R, specifically 'bs' (by default a cubic spline), 'ns', and smooth.spline with a roughness penalty, along with the many other smoothers? I've consulted several books, such as the 'S-plus Guide to Statistics' by Mathsoft, Venables and Ripley's S-PLUS book and Eubank's Spline Smoothing, but I am still looking for a substantial discussion of the differences in using these smoothers. I have tried to put the question clearly; maybe somebody can provide some sources for reference. If possible, I'd also like to see some comments on practical use.

Regards,
LS
[R] parametric survival plot
Hi to all,

I am new to R, and I would like to ask how to plot the survival function and the associated baseline hazard in the case of parametric survival models (survival package). plot.survfit works only with Cox models.

Many thanks,
D. Lalountas
[R] Reading large files in R
Dear R-listers:

I am trying to work with a big (262 Mb) file but apparently reach a memory limit using R on Mac OS X as well as on a unix machine. This is the script:

    type=list(a=0,b=0,c=0)
    tmp <- scan(file="coastal_gebco_sandS_blend.txt", what=type, sep="\t", quote="\"", dec=".", skip=1, na.strings="-99", nmax=13669628)
    Read 13669627 records
    gebco <- data.frame(tmp)
    Error: cannot allocate vector of size 106793 Kb

Even tmp does not seem right:

    summary(tmp)
    Error: recursive default argument reference

Do you have any suggestion?

Thanks,
Jean-Pierre Gattuso
Re: [R] Square matrix plot
The URL that you sent is not working. Can you please check?

If you mean 2-dimensional hierarchical clustering as often used in microarrays, then see help(heatmap). There was a discussion last week about using red-green for heatmaps. See http://tolstoy.newcastle.edu.au/R/help/05/08/9714.html

Or, if you want to plot one column against another, then see help(pairs).

Regards,
Adai

On Mon, 2005-08-08 at 19:42 +0200, Finocchiaro Giacomo wrote:
> Hi,
> having a matrix where rows=n and cols=m, I calculated the spearman correlation values of the matrix, this generated a square matrix m x m. Does anyone know how I can create a plot similar to this http://bio.ifom-firc.it/User/finoc/ask.png (produced with hierarchical cluster explorer) using R? I would like to have a range of colors from green(-1) to red(+1) proportional to the correlation value calculated.
> Thanks a lot.
> Regards
> Giacomo Finocchiaro
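If no clustering is wanted and the aim is only a coloured square of correlations, image() with a custom palette is one way to get a green-to-red scale. This is a minimal sketch with assumptions: the data matrix M is simulated, the black midpoint of the palette is a choice borrowed from microarray heatmaps and not something the question asked for, and the columns are reversed so that the diagonal runs from top left to bottom right.

```r
# Simulated data: 50 rows (n) and 6 columns (m)
set.seed(3)
M <- matrix(rnorm(50 * 6), nrow = 50,
            dimnames = list(NULL, paste0("V", 1:6)))

# m x m Spearman correlation matrix
R <- cor(M, method = "spearman")
m <- ncol(R)

# 101 colours from green (-1) through black (0) to red (+1)
pal <- colorRampPalette(c("green", "black", "red"))(101)

# image() draws z[i, j] at (x[i], y[j]); reversing the columns puts
# the first variable at the top of the y-axis
image(1:m, 1:m, R[, m:1], zlim = c(-1, 1), col = pal,
      axes = FALSE, xlab = "", ylab = "")
axis(1, at = 1:m, labels = colnames(R))
axis(2, at = 1:m, labels = rev(colnames(R)), las = 1)
box()
```

Fixing zlim to c(-1, 1) keeps the colour scale comparable between plots, whatever the range of the observed correlations.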
[R] use different symbols for frequency in a plot
Suppose I have the following data:

    x <- c(rep(.1,5),rep(.2,6),rep(.4,10),rep(.5,20))
    y <- c(rep(.5,3),rep(.6,8),rep(1.2,8),rep(2.5,18),rep(3,4))

If I plot(x,y) in R, I will only get seven distinct points. What I want to do is to use different symbols to show the frequency at each point, e.g. if the frequency is between 1 and 5, then I plot the point as a circle; if the frequency is between 6 and 10, then I plot the point as a square; if the frequency is above 10, then I plot the point as a triangle. I am not sure how to do this in R. Can anybody help me?
Re: [R] Reading large files in R
From the Note section of help(read.delim):

    'read.table' is not the right tool for reading large matrices, especially those with many columns: it is designed to read _data frames_ which may have columns of very different classes. Use 'scan' instead.

So I am not sure why you used 'scan' and then converted the result to a data frame.

1) Can you provide a sample of the data that you are trying to read in?
2) How much memory does your machine have?
3) Try reading in the first few lines using the nmax argument in scan.

Regards, Adai

On Mon, 2005-08-08 at 12:50 -0600, Jean-Pierre Gattuso wrote:
> I am trying to work with a big (262 Mb) file but apparently reach a memory limit using R on a MacOSX as well as on a unix machine. [...]
Re: [R] use different symbols for frequency in a plot
You might consider one of these approaches instead:

    plot(jitter(x), jitter(y))

or

    pdf(file="c:/AlphaExample.pdf", version = "1.4")
    plot(x, y, col = rgb(1, 0, 0, .2), pch = 16)
    dev.off()

Kerry Bush wrote:
> What I want to do is to use different symbols to show the frequency at each point. [...]

--
Chuck Cleland, Ph.D.
NDRI, Inc., 71 West 23rd Street, 8th floor, New York, NY 10010
tel: (212) 845-4495 (Tu, Th) tel: (732) 452-1424 (M, W, F) fax: (917) 438-0894
Re: [R] modifying argument of a .C call (DUP=FALSE)
Tamas K Papp [EMAIL PROTECTED] writes:
> I have a huge matrix on which I need to do a simple (elementwise) transformation. Two of these matrices cannot fit in the memory, so I cannot do this in R. I thought of writing some C code to do this and calling it using .C with DUP=FALSE. All I need is a simple for loop that replaces elements with their new value, something like
>
>     void transform(double *a, int *lengtha) {
>         int i;
>         for (i = 0; i < *lengtha; i++) {
>             *(a+i) = calculatenewvaluesomehow(*(a+i));
>         }
>     }
>
>     trans <- function(a) .C("transform", as.double(a), as.integer(length(a)))
>
> Is it possible to do this? The manuals say that it is dangerous; is it possible to avoid the dangers somehow?

It's more a question of whether the dangers affect you. In general, the issue is that you risk modifying a second (virtual) copy of the data along with the one you intend to modify. If you're sure that you don't have any, the point is moot. It is fairly difficult to be sure of that in the general case, which is why we generally discourage DUP=FALSE, especially for package writers, but for personal use you might just get away with it.

--
Peter Dalgaard, Øster Farimagsgade 5, Entr.B
Dept. of Biostatistics, PO Box 2099, 1014 Cph. K
University of Copenhagen, Denmark
Ph: (+45) 35327918, FAX: (+45) 35327907 ([EMAIL PROTECTED])
Re: [R] Reading large files in R
... and it is likely that even if you did have enough memory (several times the size of the data is generally needed) it would take a very long time. If you do have enough memory and the data are all of one type -- numeric here -- you're better off treating it as a matrix rather than converting it to a data frame.

-- Bert Gunter
Genentech Non-Clinical Statistics, South San Francisco, CA
"The business of the statistician is to catalyze the scientific learning process." - George E. P. Box

-----Original Message-----
From: [EMAIL PROTECTED] On Behalf Of Adaikalavan Ramasamy
Sent: Monday, August 08, 2005 12:02 PM
To: Jean-Pierre Gattuso
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] Reading large files in R

> So I am not sure why you used 'scan', then converted it to a data frame. [...]
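The "treat it as a matrix" advice might look like the sketch below. It assumes a purely numeric, tab-separated file with one header line; the temporary file and its four lines are a hypothetical stand-in for the poster's data, and the names `f`, `vals` and `m` are illustrative only.

```r
# Small stand-in file: a header line plus three rows of three numeric columns.
f <- tempfile(fileext = ".txt")
writeLines(c("a\tb\tc", "1\t2\t3", "4\t-99\t6", "7\t8\t9"), f)

# Read every field into one numeric vector; "-99" is treated as missing.
vals <- scan(f, what = numeric(), sep = "\t", skip = 1,
             na.strings = "-99", quiet = TRUE)

# Shape the vector into a matrix, filling row by row as in the file.
m <- matrix(vals, ncol = 3, byrow = TRUE)
colnames(m) <- c("a", "b", "c")
```

A single numeric vector avoids the per-column list and the later data.frame() copy, which is where the original script ran out of memory.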
Re: [R] modifying argument of a .C call (DUP=FALSE)
On Mon, 8 Aug 2005, Peter Dalgaard wrote:
> It's more a question of whether the dangers affect you. In general, the issue is that you risk modifying a second (virtual) copy of the data along with the one you intend to modify. [...] for personal use you might just get away with it.

I did specifically suggest .Call in an earlier reply to the same person on the same problem, because there you can do this via a replacement function with standard semantics. See the discussion of SET_NAMED in `Writing R Extensions'.

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK, Fax: +44 1865 272595
Re: [R] use different symbols for frequency in a plot
Group by which variable? If you mean the joint distribution of 'x' and 'y', then something along the following lines:

    x <- rep( c(0.1, 0.2, 0.4, 0.5), c(5, 6, 10, 20) )
    y <- rep( c(0.5, 0.6, 1.2, 2.5, 3.0), c(3, 8, 8, 18, 4) )
    new <- factor( paste(x, y, sep="_") )
    tb <- table(new)
    pchcode <- cut(tb, c(-Inf, 1, 5, 6, 10, Inf), labels=F)
    tmp <- t( sapply( strsplit( names(tb), split="_"), c ) )
    # convert the split strings back to numbers so they can be plotted
    df <- data.frame( x=as.numeric(tmp[ ,1]), y=as.numeric(tmp[ ,2]), freq=as.vector(tb), pchcode = pchcode - 1 )

        x   y freq pchcode
    1 0.1 0.5    3       1
    2 0.1 0.6    2       1
    3 0.2 0.6    6       2
    4 0.4 1.2    8       3
    5 0.4 2.5    2       1
    6 0.5 2.5   16       4
    7 0.5   3    4       1

And now to plot it, we use points() repeatedly:

    plot( as.numeric(df$x), as.numeric(df$y), type="n" )
    for( i in unique( df$pchcode ) ){
      w <- which( df$pchcode == i )
      points( df$x[w], df$y[w], pch=as.numeric(i) )
    }

I am sure someone else will come up with a neater solution. Can I also suggest that you try the following:

    plot( jitter(x), jitter(y) )

or better still the following:

    library(hexbin)
    plot( hexbin(x, y) )

Regards, Adai

On Mon, 2005-08-08 at 11:57 -0700, Kerry Bush wrote:
> What I want to do is to use different symbols to show the frequency at each point. [...]
Re: [R] use different symbols for frequency in a plot
Thank you. But I only need three classes of frequencies (in other words, only three kinds of symbols) for 1-5, 6-10 and above 10, not a different symbol for each frequency. Otherwise, clearly R will run out of available symbols and the plot is also hard to view. Thank you anyway.

--- Adaikalavan Ramasamy [EMAIL PROTECTED] wrote:
> Group by which variable? If you mean the joint distribution of 'x' and 'y' then something along the following lines [...]
Re: [R] use different symbols for frequency in a plot
On Mon, 2005-08-08 at 11:57 -0700, Kerry Bush wrote:
> What I want to do is to use different symbols to show the frequency at each point. [...]

You might want to review this recent post by Deepayan Sarkar:

https://stat.ethz.ch/pipermail/r-help/2005-July/074042.html

With modest modification you can replace his example, which plots the frequencies, with:

    x <- c(rep(.1,5),rep(.2,6),rep(.4,10),rep(.5,20))
    y <- c(rep(.5,3),rep(.6,8),rep(1.2,8),rep(2.5,18),rep(3,4))
    temp <- data.frame(x, y)
    foo <- subset(as.data.frame(table(temp)), Freq > 0)

    foo
         x   y Freq
    1  0.1 0.5    3
    5  0.1 0.6    2
    6  0.2 0.6    6
    11 0.4 1.2    8
    15 0.4 2.5    2
    16 0.5 2.5   16
    20 0.5   3    4

    # Use cut() to create the bins and specify the plotting symbols
    # for each bin, which are the 'label' values
    foo$sym <- with(foo, cut(Freq, c(0, 5, 10, Inf), labels = c(21, 22, 24)))

    # convert 'foo' to all numeric from factors above for plotting
    foo <- apply(foo, 2, function(x) as.numeric(as.character(x)))

    foo
         x   y Freq sym
    1  0.1 0.5    3  21
    5  0.1 0.6    2  21
    6  0.2 0.6    6  22
    11 0.4 1.2    8  22
    15 0.4 2.5    2  21
    16 0.5 2.5   16  24
    20 0.5 3.0    4  21

    # Now do the plot. Keep in mind that 'foo' is now
    # a matrix, rather than a data frame
    plot(foo[, "x"], foo[, "y"], pch = foo[, "sym"])

HTH,
Marc Schwartz
Re: [R] tapply huge speed difference if X has names
Please use a current version of R! This was fixed long ago, and you will find it in the NEWS file:

    split() now handles vectors with names internally and so is almost as fast as on vectors without names (and maybe 100x faster than before).

On Mon, 8 Aug 2005, Matthew Dowle wrote:
> Hi all,
>
> Apologies if this has been raised before ... R's tapply is very fast, but if X has names in this example, there seems to be a huge slowdown: under 1 second compared to 151 seconds. The following timings are repeatable and are timed properly on a single-user machine:
>
>     X = 1:10
>     names(X) = X
>     system.time(fast <- tapply(as.vector(X), rep(1:1,each=10), mean))  # as.vector() to drop the names
>     [1] 0.36 0.00 0.35 0.00 0.00
>     system.time(slow <- tapply(X, rep(1:1,each=10), mean))
>     [1] 149.95 1.83 151.79 0.00 0.00
>     head(fast)
>        1    2    3    4    5    6
>      5.5 15.5 25.5 35.5 45.5 55.5
>     head(slow)
>        1    2    3    4    5    6
>      5.5 15.5 25.5 35.5 45.5 55.5
>     identical(fast,slow)
>     [1] TRUE
>
> Looking inside tapply, which then calls split, it seems there is an is.null(names(x)) which prevents R's internal fast version from being called. Why is that there? Could it be removed? I often do something like tapply(mat[,colname],...) where mat has rownames. Therefore the rownames of mat become the names of the vector mat[,colname], and this seems to slow down tapply a lot. Perhaps other functions which call split also suffer from this problem?
>
>     split.default
>     function (x, f)
>     {
>         if (is.list(f))
>             f <- interaction(f)
>         f <- factor(f)
>         if (is.null(attr(x, "class")) && is.null(names(x)))
>             return(.Internal(split(x, f)))
>         lf <- levels(f)
>         y <- vector("list", length(lf))
>         names(y) <- lf
>         for (k in lf) y[[k]] <- x[f %in% k]
>         y
>     }
>     <environment: namespace:base>
>
>     version
>              _
>     platform x86_64-redhat-linux-gnu
>     arch     x86_64
>     os       linux-gnu
>     system   x86_64, linux-gnu
>     status
>     major    2
>     minor    0.1
>     year     2004
>     month    11
>     day      15
>     language R
>
> Thanks and regards,
> Matthew

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK, Fax: +44 1865 272595
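For anyone stuck on a release that predates the fix, the workaround implied by the as.vector() call above is to drop the names before grouping. The sketch below uses a small made-up matrix (`mat`, `grp` and the result names are hypothetical) and only shows that the two calls agree; it makes no claim about timings on current versions of R.

```r
# A matrix with row names, so a column pulled out of it carries names.
mat <- matrix(1:40, nrow = 20,
              dimnames = list(paste0("r", 1:20), c("a", "b")))
grp <- rep(1:2, each = 10)

v <- mat[, "a"]                              # named vector (names are the row names)
res_named <- tapply(v, grp, mean)            # groups the named vector
res_plain <- tapply(unname(v), grp, mean)    # same computation with names dropped
```

Since mean() ignores names, dropping them changes nothing in the result and only affects which code path split() takes in the old version.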
[R] Help with non-integer #successes in a binomial glm
Hi,

I ran a logit regression, but don't really know how to handle the "Warning message: non-integer #successes in a binomial glm! in: eval(expr, envir, enclos)" problem. I ran the same logit regression without weights and it worked without the warning, but I figured it makes more sense to add the weights. The weights sum to one. Could anyone give me a hint? Thanks a lot! FYI, I have posted both regressions (with and without weights) below.

Ed

    setwd("P:/Work in Progress/Haibo/Hans")
    Lease=read.csv("lease.csv", header=TRUE)
    Lease$ET <- factor(Lease$EarlyTermination)
    SICCode=factor(Lease$SIC.Code)
    Lease$TO=factor(Lease$TenantHasOption)
    Lease$LO=factor(Lease$LandlordHasOption)
    Lease$TEO=factor(Lease$TenantExercisedOption)
    RegA=glm(ET~1+TO,
             family=binomial(link=logit), data=Lease)
    summary(RegA)

    Call:
    glm(formula = ET ~ 1 + TO, family = binomial(link = logit), data = Lease)

    Deviance Residuals:
        Min       1Q   Median       3Q      Max
    -0.5839  -0.5839  -0.5839  -0.3585   2.3565

    Coefficients:
                Estimate Std. Error z value Pr(>|z|)
    (Intercept) -1.68271    0.02363  -71.20   <2e-16 ***
    TO1         -1.02959    0.09012  -11.43   <2e-16 ***
    ---
    Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1

    (Dispersion parameter for binomial family taken to be 1)

    Null deviance: 12987 on 15809 degrees of freedom
    Residual deviance: 12819 on 15808 degrees of freedom
    AIC: 12823

    Number of Fisher Scoring iterations: 5

    setwd("P:/Work in Progress/Haibo/Hans")
    Lease=read.csv("lease.csv", header=TRUE)
    Lease$ET <- factor(Lease$EarlyTermination)
    SICCode=factor(Lease$SIC.Code)
    Lease$TO=factor(Lease$TenantHasOption)
    Lease$LO=factor(Lease$LandlordHasOption)
    Lease$TEO=factor(Lease$TenantExercisedOption)
    RegA=glm(ET~1+TO,
             family=binomial(link=logit), data=Lease, weights=PortionSF)
    Warning message:
    non-integer #successes in a binomial glm! in: eval(expr, envir, enclos)
    summary(RegA)

    Call:
    glm(formula = ET ~ 1 + TO, family = binomial(link = logit), data = Lease, weights = PortionSF)

    Deviance Residuals:
          Min        1Q    Median        3Q       Max
    -0.055002 -0.003434      0.00      0.00  0.120656

    Coefficients:
                Estimate Std. Error z value Pr(>|z|)
    (Intercept)   -1.120      2.618  -0.428    0.669
    TO1           -1.570      9.251  -0.170    0.865

    (Dispersion parameter for binomial family taken to be 1)

    Null deviance: 1.0201 on 9302 degrees of freedom
    Residual deviance: 0.9787 on 9301 degrees of freedom
    AIC: 4

    Number of Fisher Scoring iterations: 5
[R] reverse order of matrix rows
Quick question: how can I reverse the order of the rows in a matrix? i.e. make the last row first and the first row last, etc.?
[R] two term exponential model
Dear R users,

Does anybody know if there is an R function (or package) to fit a two-term exponential model like

    y = a*exp(b*x) + c*exp(d*x)

where y is the dependent variable and x is the independent variable? MATLAB has a Curve Fitting Toolbox to implement this fitting, but I don't know if there is an R package for it.

Thank you!
Deming Mi
[R] AIC model selection
Hello all,

I need to run a multiple regression analysis and use Akaike's Information Criterion for model selection. I understand that this command will give the AIC value for specified models:

    AIC(object, ..., k = 2)

with ... meaning any other optional models for which I would like AIC values. But how can I specify (in place of ...) that I want R to perform a model selection procedure based on Akaike's Information Criterion on a set of potential independent variables in a model such as:

    model.lm=lm(A~B+C+D+E+F+G)

Thanks a million,
Marty
Re: [R] reverse order of matrix rows
    sapply(nrow(matrix):1, function(x) matrix[x,])

On Mon, 8 Aug 2005, Jake wrote:
> Quick question: how can I reverse the order of the rows in a matrix? i.e. make the last row first and the first row last, etc.?
Re: [R] reverse order of matrix rows
Thanks to those who provided the one-liner answers! They worked quite well. I'm quite sure that 95% of the questions posted on this mailing list could be answered with a quick "read the manual, stupid", but I'm very grateful to those who take the time to write one-liners. I know that I have been greatly helped by searching through archived questions from other people, which I'm sure must have seemed like stupid questions to some R gurus. But in reality the mailing list archives are an invaluable source of documentation, one that wouldn't exist if every question were met with a "go read the manual" response.

--Jake

On Aug 8, 2005, at 5:35 PM, Berton Gunter wrote:
> Quick answer: Read "An Introduction to R" and learn about indexing.
>
> -- Bert Gunter, Genentech Non-Clinical Statistics, South San Francisco, CA
>
> -----Original Message-----
> From: [EMAIL PROTECTED] On Behalf Of Jake
> Sent: Monday, August 08, 2005 2:58 PM
> To: R-help@stat.math.ethz.ch
> Subject: [R] reverse order of matrix rows
>
> Quick question: how can I reverse the order of the rows in a matrix? i.e. make the last row first and the first row last, etc.?
Re: [R] use different symbols for frequency in a plot
Remove the '6' from the code that contains 'cut'. I am not sure how it crept into my code. Then you should have the following mapping:

    Freq    pch code
    1-5     1
    6-10    2
    11-     3

I am more concerned about viewers getting confused by many symbols than about running out of symbols in R. Looking at the last example in help(points), I would say that there are 20-30 usable symbols. Remember that you can also use text() to put multi-character text.

Regards, Adai

On Mon, 2005-08-08 at 13:42 -0700, Kerry Bush wrote:
> Thank you. But I only need three classes of frequencies (in other words, only three kinds of symbols) [...]
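Putting the thread's pieces together, a compact version of the three-symbol plot could look like this. It is a sketch only: the data frame `counts`, the `bin` vector and the choice of pch values 1, 0 and 2 (open circle, square, triangle) are assumptions made for illustration, not code from any of the posters.

```r
x <- rep(c(0.1, 0.2, 0.4, 0.5), c(5, 6, 10, 20))
y <- rep(c(0.5, 0.6, 1.2, 2.5, 3.0), c(3, 8, 8, 18, 4))

# Tabulate the (x, y) pairs and keep only those that actually occur.
counts <- as.data.frame(table(x = x, y = y), stringsAsFactors = FALSE)
counts <- counts[counts$Freq > 0, ]
counts$x <- as.numeric(counts$x)
counts$y <- as.numeric(counts$y)

# Bin the frequencies into 1-5, 6-10 and 11+, then map each bin to a symbol.
bin <- cut(counts$Freq, breaks = c(0, 5, 10, Inf), labels = FALSE)
counts$pch <- c(1, 0, 2)[bin]

plot(counts$x, counts$y, pch = counts$pch, xlab = "x", ylab = "y")
legend("topleft", legend = c("1-5", "6-10", "> 10"),
       pch = c(1, 0, 2), title = "Frequency")
```

Because pch is vectorised in plot(), a single call draws all seven points, so the loop over points() is not needed.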
Re: [R] reverse order of matrix rows
How about simply

    mat <- mat[ nrow(mat):1, ]

Regards, Adai

On Mon, 2005-08-08 at 19:44 -0400, Jean Eid wrote:
> sapply(nrow(matrix):1, function(x) matrix[x,])
>
> On Mon, 8 Aug 2005, Jake wrote:
> > Quick question: how can I reverse the order of the rows in a matrix? i.e. make the last row first and the first row last, etc.?
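A small worked example of the indexing one-liner, on a made-up 3 x 2 matrix (the object names are illustrative). It also shows, as far as I can tell, why the sapply() variant is not quite equivalent: sapply() binds each extracted row as a column, so its result comes out transposed.

```r
mat <- matrix(1:6, nrow = 3,
              dimnames = list(c("r1", "r2", "r3"), c("a", "b")))

# Index the rows in reverse order; drop = FALSE keeps a matrix even with one column.
rev_mat <- mat[nrow(mat):1, , drop = FALSE]

# The sapply() approach returns the reversed rows as columns (a 2 x 3 matrix here).
via_sapply <- sapply(nrow(mat):1, function(i) mat[i, ])
```

Transposing `via_sapply` recovers the same values as `rev_mat`, though without the row names.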
Re: [R] AIC model selection
Are you possibly looking for stepAIC from the package MASS?

Regards, Adai

On Mon, 2005-08-08 at 16:39 -0600, Martin Kardos wrote:
> But, how can I specify (in the place of ...) that I want R to perform a model selection procedure based on Akaike's Information Criterion on a set of potential independent variables in a model such as: model.lm=lm(A~B+C+D+E+F+G) ? [...]
Re: [R] AIC model selection
The last time I used it, the function step() used AIC as its default model selection criterion. It is in the stats package, which ships with R, so you don't have to turn to other fancy functions.

--- Adaikalavan Ramasamy [EMAIL PROTECTED] wrote:
> Are you looking for possibly stepAIC from the package MASS ? [...]
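A minimal sketch of AIC-based selection with step(), using the built-in mtcars data as a stand-in for the poster's variables A to G (the model and the names `full` and `sel` are assumptions for illustration):

```r
# Full model with all candidate predictors.
full <- lm(mpg ~ wt + hp + disp + qsec + drat, data = mtcars)

# Backward elimination; the default penalty k = 2 corresponds to AIC.
sel <- step(full, direction = "backward", trace = 0)

formula(sel)      # the terms that survived
AIC(full, sel)    # compare the two fits
```

MASS::stepAIC() takes very similar arguments; setting k = log(n) in either function would switch the criterion to BIC.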
Re: [R] use different symbols for frequency in a plot
On Mon, Aug 08, 2005 at 03:17:44PM -0400, Chuck Cleland wrote:
> You might consider one of these approaches instead:
>
>     plot(jitter(x), jitter(y))
>
> or
>
>     pdf(file="c:/AlphaExample.pdf", version = "1.4")
>     plot(x, y, col = rgb(1, 0, 0, .2), pch = 16)
>     dev.off()

sunflowerplot() is also useful for this (although it won't be as elegant as the pdf with alpha on screen, it looks better on paper).

Tamas
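For completeness, a short sketch of the sunflowerplot() suggestion on the data from the original question. The variable `sp` is just a name for the list that the function returns invisibly.

```r
x <- rep(c(0.1, 0.2, 0.4, 0.5), c(5, 6, 10, 20))
y <- rep(c(0.5, 0.6, 1.2, 2.5, 3.0), c(3, 8, 8, 18, 4))

# Each distinct point is drawn once, with one "petal" per replicate.
sp <- sunflowerplot(x, y)

# The returned list holds the distinct coordinates and their multiplicities.
sp$number
```

With counts as high as 16 the petals get crowded, which may be one reason the binned-symbol approach was requested in the first place.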