[R] F tests for random effect models
Dear R-users,

My question is how to get correct F tests for random effects in random effect models (I hope this question has not been answered too many times already; I didn't find an answer in the R-help archives).

My data are in mca2 (enclosed):

    > names(mca2)
    [1] "Lignee"        "Pollinisateur" "Rendement"
    > dim(mca2)
    [1] 100   3
    > replications(Rendement ~ Lignee * Pollinisateur, data = mca2)
                  Lignee        Pollinisateur Lignee:Pollinisateur
                      20                   10                    2

Of course,

    summary(aov(Rendement ~ Pollinisateur * Lignee, data = mca2))

gives the wrong tests for random effects. But

    summary(aov1 <- aov(Rendement ~ Error(Pollinisateur * Lignee), data = mca2))

gives no test at all, and I have to compute the tests by hand like this:

    tab1 <- matrix(unlist(summary(aov1)), nc=5, byrow=T)[,1:3]
    Femp <- c(tab1[1:3, 3]/tab1[c(3,3,4), 3])
    names(Femp) <- c("Pollinisateur", "Lignee", "Interaction")
    1 - pf(Femp, tab1[1:3,1], tab1[c(3,3,4),1])

With the lme4 package (I didn't succeed in writing a working formula with lme from the nlme package), I can see the standard deviations of the random effects (but don't know how to extract them) with:

    library(lme4)
    summary(lmer(Rendement ~ (1 | Pollinisateur) + (1 | Lignee) + (1 | Pollinisateur:Lignee), data=mca2))

but I can't get F tests.

Thanks in advance.

Best regards,
Jacques VESLOT

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] data.frame-question]
First, a general comment on posting style: could you please be more specific about where the error occurs? Without this it is very difficult to identify what the problem is.

Now, concerning your problem. When I tried the code I posted yesterday I thought it worked fine. I've tried it again now and found that the data.frame TAB3 actually has only one column, and the names A, B, etc. are interpreted as the row names. Since there is only one column, 'colnames(TAB3) <-' fails when you give it a vector with two components. I think that if you display it without renaming the columns, it still shows the correct results. I also tested it with NAs included and it worked fine.

I'm quite a newbie myself, so I can't tell you how to return the result as a two-column data.frame.

For completeness, here is the code I used to test.

    > Name <- c(rep("A", 3), rep("B", 5), "C")
    > Number <- rep(1, 8)
    > Number[9] <- NA
    > TAB1 <- data.frame(Name, Number)
    > TAB1
      Name Number
    1    A      1
    2    A      1
    3    A      1
    4    B      1
    5    B      1
    6    B      1
    7    B      1
    8    B      1
    9    C     NA
    > TAB3 <- with(TAB1, tapply(Number, Name, sum, na.rm=TRUE))
    > TAB3
    A B C
    3 5 0
    > TAB3 <- as.data.frame(TAB3)
    > TAB3
      TAB3
    A    3
    B    5
    C    0
    > colnames(TAB3) <- c("Name_singular", "Sum")
    Error in "dimnames<-.data.frame"(`*tmp*`, value = list(c("A", "B", "C" :
            invalid 'dimnames' given for data frame
    > TAB3
      TAB3
    A    3
    B    5
    C    0
    > str(TAB3)
    `data.frame': 3 obs. of 1 variable:
     $ TAB3: num [, 1:3] 3 5 0
      ..- attr(*, "dimnames")=List of 1
      .. ..$ : chr "A" "B" "C"
    > version
             _
    platform i386-pc-mingw32
    arch     i386
    os       mingw32
    system   i386, mingw32
    status
    major    2
    minor    2.0
    year     2005
    month    10
    day      06
    svn rev  35749
    language R

-----Original Message-----
From: Michael Graber [EMAIL PROTECTED]
Sent: 27 October 2005 12:43 AM
To: [EMAIL PROTECTED]
Subject: Re: Re: [R] data.frame-question]

This is what I am looking for, but I still get an error message saying that my arguments are not of the same length. How can I avoid this error message? Maybe I should add that there are also NAs in the second column, but I tried to ignore them with na.rm=TRUE.

Thanks in advance,
Michael Graber

Michael Graber wrote:

Subject: RE: [R] data.frame-question
From: Brandt, T. (Tobias) [EMAIL PROTECTED]
Date: Wed, 26 Oct 2005 09:20:23 +0200
To: 'r-help@stat.math.ethz.ch' r-help@stat.math.ethz.ch
CC: 'Michael Graber' [EMAIL PROTECTED]

Is

    TAB3 <- as.data.frame(with(TAB1, tapply(Number, Name, sum)))
    colnames(TAB3) <- c("Name_singular", "Sum")

what you are looking for?

-----Original Message-----
From: [EMAIL PROTECTED] On Behalf Of Michael Graber
Sent: 25 October 2005 09:45 PM
To: R-Mailingliste
Subject: [R] data.frame-question

Dear R-List,

I am very new to R and to programming itself, so my question may be easy for you to answer. I tried a lot and read through the manuals, but I still have the following problem.

I have 2 data frames:

    Number <- as.numeric(Number)
    Name <- as.character(Name)
    TAB1 <- data.frame(Name, Number)
    # it looks like this:
    Name Number
    A    2
    A    3
    A    6
    B    8
    B    12
    B    7
    C    8
    D    90
    E    12
    E    45
    ...

    Name_singular <- as.character(Name_singular)
    TAB2 <- data.frame(Name_singular)
    # it looks like this:
    Name_singular
    A
    B
    C
    D
    E

My result should be a data frame where the first column is Name_singular and the second column is the sum of the numbers where Name == Name_singular. For example:

    TAB3:
    Name_singular Sum
    A             11
    B             27
    ...

I tried it with for-loops, but I think there must be an easier way.

I would be very grateful for your help,
Michael Graber
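For what it's worth, here is a minimal sketch of one way to get a two-column data frame out of the tapply() result discussed above. tapply() returns a named one-dimensional array, so the group labels live in names() rather than in a column; the sketch builds both columns explicitly. It reuses the toy data from the message; the intermediate name `sums` is only illustrative.

```r
# Toy data from the message above: nine rows, one NA in Number
Name   <- c(rep("A", 3), rep("B", 5), "C")
Number <- c(rep(1, 8), NA)
TAB1   <- data.frame(Name, Number)

# tapply() returns a named 1-d array, not a data frame
sums <- with(TAB1, tapply(Number, Name, sum, na.rm = TRUE))

# Build the two columns explicitly: labels from names(), totals from the values
TAB3 <- data.frame(Name_singular = names(sums),
                   Sum           = as.vector(sums),
                   stringsAsFactors = FALSE)
print(TAB3)
```

Because the column names are given at construction time, no later call to 'colnames<-' is needed.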
Re: [R] data.frame-question]
Hi,

you were quite near. Using aggregate() it is possible to get what you want:

    TAB3 <- with(TAB1, aggregate(Number, list(Name_singular=Name), sum, na.rm=TRUE))

See

    > str(TAB3)
    `data.frame': 3 obs. of 2 variables:
     $ Name_singular: Factor w/ 3 levels "A","B","C": 1 2 3
     $ x            : num 3 5 0

HTH
Petr
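As a rough sketch of the aggregate() route applied to the data in the original question: the ten rows below are the ones actually listed there (the rows hidden behind "..." are not guessed at), and renaming the default result column `x` to `Sum` is an added step, not something taken from the thread.

```r
# The ten rows shown in the original question
TAB1 <- data.frame(
  Name   = c("A", "A", "A", "B", "B", "B", "C", "D", "E", "E"),
  Number = c(2, 3, 6, 8, 12, 7, 8, 90, 12, 45)
)

# One row per distinct Name; na.rm = TRUE is passed on to sum()
TAB3 <- aggregate(TAB1$Number,
                  by  = list(Name_singular = TAB1$Name),
                  FUN = sum, na.rm = TRUE)

# aggregate() calls the summarised column "x"; give it the requested name
names(TAB3)[names(TAB3) == "x"] <- "Sum"
print(TAB3)
```

The sums for A and B come out as 11 and 27, matching the example in the question.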
[R] how to predict with logistic model in package logistf ?
Dear community,

I am a beginner in R and can't predict with a logistic model from the package logistf. Could anyone help me? Thanks!

The following are my commands and the result:

    > library(logistf)
    > data(sex2)
    > fit <- logistf(case ~ age+oc+vic+vicl+vis+dia, data=sex2)
    > predict(fit, newdata=sex2)
    Error in predict(fit, newdata = sex2) : no applicable method for "predict"
Re: [R] F tests for random effect models
Sorry,

Actually I gave my data in an image file (.RData); I've just checked my sent emails. Should I give the data in another format, such as text? Here they are as text (.txt).

The output is:

    > summary(aov1 <- aov(Rendement ~ Error(Pollinisateur * Lignee), data = mca2))

    Error: Pollinisateur
              Df  Sum Sq Mean Sq F value Pr(>F)
    Residuals  9 11.9729  1.3303

    Error: Lignee
              Df  Sum Sq Mean Sq F value Pr(>F)
    Residuals  4 18.0294  4.5074

    Error: Pollinisateur:Lignee
              Df Sum Sq Mean Sq F value Pr(>F)
    Residuals 36 5.1726  0.1437

    Error: Within
              Df Sum Sq Mean Sq F value Pr(>F)
    Residuals 50 3.7950  0.0759

    # F tests :
    > Femp <- c(tab1[1:3, 3]/tab1[c(3,3,4), 3])
    > names(Femp) <- c("Pollinisateur", "Lignee", "Interaction")
    > Femp
    Pollinisateur        Lignee   Interaction
         9.258709     31.370027      1.893061
    > 1 - pf(Femp, tab1[1:3,1], tab1[c(3,3,4),1])
    Pollinisateur        Lignee   Interaction
     4.230265e-07  2.773448e-11  1.841028e-02

    # Variance components :
    > variances <- c(c(tab1[1:3, 3] - tab1[c(3,3,4), 3]) / c(2*5, 2*10, 2), tab1[4,3])
    > names(variances) <- c(names(Femp), "Residuelle")
    > variances
    Pollinisateur        Lignee   Interaction    Residuelle
       0.11866389    0.21818333    0.03389167        0.0759

    # Using lmer :
    > library(lme4)
    > summary(lmer(Rendement ~ (1 | Pollinisateur) + (1 | Lignee) + (1 | Pollinisateur:Lignee), data=mca2))
    Linear mixed-effects model fit by REML
    Formula: Rendement ~ (1 | Pollinisateur) + (1 | Lignee) + (1 | Pollinisateur:Lignee)
       Data: mca2
          AIC      BIC    logLik MLdeviance REMLdeviance
     105.3845 118.4104 -47.69227   94.35162     95.38453
    Random effects:
     Groups               Name        Variance Std.Dev.
     Pollinisateur:Lignee (Intercept) 0.033892 0.18410
     Pollinisateur        (Intercept) 0.118664 0.34448
     Lignee               (Intercept) 0.218183 0.46710
     Residual                         0.075900 0.27550
    # of obs: 100, groups: Pollinisateur:Lignee, 50; Pollinisateur, 10; Lignee, 5

    Fixed effects:
                Estimate Std. Error DF t value  Pr(>|t|)
    (Intercept) 12.60100    0.23862 99  52.808 < 2.2e-16 ***
    ---
    Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Thanks,
Jacques VESLOT

Prof Brian Ripley wrote:

> Nothing was enclosed, nor was the output from summary.aov, so we are left guessing.

Data (text attachment, truncated in the archive):

    Lignee Pollinisateur Rendement
    L1 P1 13.4
    L1 P1 13.3
    L2 P1 12.4
    L2 P1 12.6
    L3 P1 12.7
    L3 P1 13
    L4 P1 12.6
    L4 P1 12.6
    L5 P1 11.9
    L5 P1 11.6
    L1 P2 12.6
    L1 P2 12.7
    L2 P2 12.1
    L2 P2 11.3
    L3 P2 12.4
    L3 P2 11.9
    L4 P2 12.2
    L4 P2 12.1
    L5 P2 11.3
    L5 P2 11.5
    L1 P3 13.3
    L1 P3 13.2
    L2 P3 12.9
    L2 P3 12.3
    L3 P3 12.1
    L3 P3 12.9
    L4 P3 12.8
    L4 P3 13.4
    L5 P3 11.7
    L5 P3 11.8
    L1 P4 13.5
    L1 P4 14
    L2 P4 13.5
    L2 P4 12.7
    L3 P4 13.3
    L3 P4 13.5
    L4 P4 13.4
    L4 P4 13.4
    L5 P4 13
    L5 P4 13.1
    L1 P5 13.7
    L1 P5 13.8
    L2 P5 12.5
    L2 P5 13.1
    L3 P5 12.8
    L3 P5 12.5
    L4 P5 13.7
    L4 P5 13.8
    L5 P5 12.2
    L5 P5 12.1
    L1 P6 12.8
    L1 P6 13.1
    L2 P6 12
    L2 P6 11.8
    L3 P6 12.4
    L3 P6 12.2
    L4 P6 13.3
    L4 P6 13.5
    L5 P6 12.4
    L5 P6 11.5
    L1 P7
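The hand computation in this thread can be sketched directly from the mean squares printed in the aov() output, without unpacking summary(aov1) through a matrix. This is only an illustration under the poster's own choice of denominators (both main effects tested against the interaction, the interaction against the within-cell residual) and the balanced layout reported by replications() (10 pollinators, 5 lines, 2 replicates). The mean squares are the rounded printed values, so the results agree with the posted ones only to about three significant digits.

```r
# Mean squares and degrees of freedom as printed in the aov() output
ms <- c(Pollinisateur = 1.3303, Lignee = 4.5074,
        Interaction = 0.1437, Residual = 0.0759)
df <- c(Pollinisateur = 9, Lignee = 4, Interaction = 36, Residual = 50)

num <- c("Pollinisateur", "Lignee", "Interaction")   # tested terms
den <- c("Interaction", "Interaction", "Residual")   # their error terms

# F ratios and upper-tail p-values
Femp <- ms[num] / ms[den]
pval <- pf(Femp, df[num], df[den], lower.tail = FALSE)

# Moment estimates of the variance components for the balanced design:
# divide each difference of mean squares by the number of observations
# per level of the term (2 reps x 5 lines, 2 reps x 10 pollinators, 2 reps)
n_rep <- 2; n_lignee <- 5; n_poll <- 10
vc <- c((ms[num] - ms[den]) / c(n_rep * n_lignee, n_rep * n_poll, n_rep),
        ms["Residual"])

print(round(Femp, 3))
print(signif(pval, 3))
print(round(vc, 4))
```

The variance components obtained this way match the Variance column of the lmer() summary above, which is expected for balanced data when all moment estimates are positive.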
Re: [R] install.packages under SuSE 10 behind proxy, R 2.2.0 from source
Hi,

I figured it out: if I use

    install.packages(..., method="wget")

it works, but with the default method it doesn't.

Rainer

Rainer M. Krug wrote:

> Hi
>
> I installed R 2.2.0 from source and want to use install.packages, but it doesn't work. http_proxy is set to http://proxy.sun.ac.za:3128 but it still can't connect to the repository. The mirror is available; I can connect to it via the internet.
>
> Any help welcome,
> Rainer

--
Rainer M. Krug, Dipl. Phys. (Germany), MSc Conservation Biology (UCT)
Department of Conservation Ecology
University of Stellenbosch
Matieland 7602
South Africa
Tel: +27 - (0)72 808 2975 (w)
Fax: +27 - (0)21 808 3304
Cell: +27 - (0)83 9479 042
email: [EMAIL PROTECTED]
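A small sketch of how the workaround above might be made the session default, so that the method does not have to be repeated on every call. It assumes the wget program is installed and on the PATH; the proxy address is the one quoted in the message, and the package name in the commented line is purely a placeholder.

```r
# Proxy address as quoted in the message; replace with your own
Sys.setenv(http_proxy = "http://proxy.sun.ac.za:3128")

# Make wget the default download method for this session
options(download.file.method = "wget")

# Later calls then pick the method up without an explicit argument, e.g.
# install.packages("somePackage")                    # placeholder name, not run
# install.packages("somePackage", method = "wget")   # per-call alternative
print(getOption("download.file.method"))
```

Putting the two settings into a startup file such as ~/.Rprofile would make them persistent; whether that is desirable depends on the site.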
Re: [R] How to convert time to days
Thanks to everyone for your help.

Yup, the word "secs" is my own mistake; I put it there because my simulations usually run for only a few seconds on my machine. I have now recoded my timestamp, but I still don't know how to produce "x days, x hours, x minutes, x seconds".

Best wishes,
Muhammad Subianto

On 26/10/2005 10:00 PM, Don MacQueen wrote:

> The word "secs" appears in
>
>     Run time: 1.960625 secs
>
> because you put it there in your cat() statement. It has nothing to do with the number itself. Simply try typing
>
>     end.time - begin.time
>
> at the prompt, and see what you get. Then see ?difftime for more information. Example:
>
>     difftime(end.time, begin.time, units='hours')
>
> To get the interval formatted as "1 day, 23 hours, x minutes, x seconds" you will have to do more work.
>
> -Don
>
> At 4:18 PM +0200 10/26/05, Muhammad Subianto wrote:
>
>> Dear all,
>>
>> I have run a simulation in R. The simulation ran for at least two days. Below is the part of my code that reports the time, with its output. I don't understand this result:
>>
>>     Start time: Mon Oct 24, 2005 at 04:23:01 PM
>>     Finish time: Wed Oct 26, 2005 at 03:26:19 PM
>>     Run time: 1.960625 secs.
>>
>> Is this about two seconds, or one day and nine hours? And how could I convert it to 1 day, 23 hours, ? minutes, ? seconds?
>>
>> Thank you very much for any suggestions.
>> Best wishes,
>> Muhammad Subianto
>>
>>     # Begin of program and timestamp:
>>     > cat(format(begin.time <- Sys.time(), "%a %b %d %X %Y"), "\n")
>>     Mon Oct 24 04:23:01 PM 2005
>>     > cat("Start time:", secs <- format(begin.time, "%X"), "\n")
>>     Start time: 04:23:01 PM
>>     > cat("Sys.time:", begin.time <- Sys.time(), '\n')
>>     Sys.time: 1130163781
>>
>>     --- CODE SIMULATION ---
>>
>>     # End of program and timestamp:
>>     > cat("Sys.time:", end.time <- Sys.time(), '\n')
>>     Sys.time: 1130333179
>>     > cat("Run Time:", end.time-begin.time, 'secs.\n\n')
>>     Run Time: 1.960625 secs.
>>     > cat("Finish time:", secs <- format(end.time, "%X"), "\n")
>>     Finish time: 03:26:19 PM
>>     > cat(format(end.time <- Sys.time(), "%a %b %d %X %Y"), "\n")
>>     Wed Oct 26 03:26:19 PM 2005
>>     > cat("\n",
>>     + "Start time:", secs <- format(begin.time, "%a %b %d, %Y at %X"), "\n",
>>     + "Finish time:", secs <- format(end.time, "%a %b %d, %Y at %X"), "\n",
>>     + "Run time:", end.time-begin.time, 'secs.\n\n')
>>     Start time: Mon Oct 24, 2005 at 04:23:01 PM
>>     Finish time: Wed Oct 26, 2005 at 03:26:19 PM
>>     Run time: 1.960625 secs.
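The "more work" mentioned in the reply could look roughly like the sketch below: ask difftime() for seconds explicitly (so the unit no longer depends on the size of the interval) and split the total with integer division. The helper name `format_elapsed` is made up for this illustration; the two timestamps are the Sys.time values printed in the original post.

```r
# Hypothetical helper: split an elapsed interval into days/hours/minutes/seconds
format_elapsed <- function(begin, end) {
  # Force the unit to seconds; by default difftime() picks the unit itself,
  # which is why 1.960625 above was really a number of days
  secs <- as.integer(round(as.numeric(difftime(end, begin, units = "secs"))))
  days  <- secs %/% 86400L
  hours <- (secs %% 86400L) %/% 3600L
  mins  <- (secs %% 3600L) %/% 60L
  s     <- secs %% 60L
  sprintf("%dd %02dh %02dm %02ds", days, hours, mins, s)
}

# The two Sys.time values printed in the post (seconds since 1970-01-01)
begin.time <- as.POSIXct(1130163781, origin = "1970-01-01", tz = "UTC")
end.time   <- as.POSIXct(1130333179, origin = "1970-01-01", tz = "UTC")

run.time <- format_elapsed(begin.time, end.time)
cat("Run time:", run.time, "\n")
```

The difference is 169398 seconds, i.e. 1.960625 days, which confirms that the number printed in the original post was in days, not seconds.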
[R] aov() and lme()
Sorry for reposting, but even after an extensive search I still have not found any answers.

Using

    summary(aov(pointErrorAbs~noOfSegments*turnAngle+Error(subj/(noOfSegments+turnAngle)), data=anovaAllData))

with subj being a random factor and noOfSegments and turnAngle being fixed factors, I get the following results:

    Error: subj
              Df Sum Sq Mean Sq F value Pr(>F)
    Residuals 17 246606   14506

    Error: subj:noOfSegments
                 Df  Sum Sq Mean Sq F value   Pr(>F)
    noOfSegments  3  7806.6  2602.2  5.3257 0.002864 **
    Residuals    51 24919.4   488.6
    ---
    Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Error: subj:turnAngle
              Df Sum Sq Mean Sq F value  Pr(>F)
    turnAngle  5  14660    2932  3.1707 0.01131 *
    Residuals 85  78600     925
    ---
    Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Error: Within
                            Df Sum Sq Mean Sq F value    Pr(>F)
    noOfSegments:turnAngle  15  19637    1309  2.9135 0.0001711 ***
    Residuals              687 308687     449
    ---
    Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

All is fine, and I get exactly the same results as with unix anova.

Now I am trying to fit the same data with lme, using the following call:

    anova(lme(fixed=pointErrorAbs~noOfSegments*turnAngle, random=~1|subj, data=anovaAllData))

Unfortunately the results are really different from those of the aov() procedure (I guess I have the call wrong):

                           numDF denDF  F-value p-value
    (Intercept)                1   823 42.10888  <.0001
    noOfSegments               3   823  5.19549  0.0015
    turnAngle                  5   823  5.85379  <.0001
    noOfSegments:turnAngle    15   823  2.61373  0.0007

I do, however, need a comparable method for lme(), because in a different data set I have single empty cells and therefore cannot use aov(). Does anyone know how to fit with lme() so as to obtain the same results (for this balanced data set) as with aov()?

Thanks in advance,
Jan
[R] tcltk package problems (R 2.2.0, SuSE 10)
Hi,

I installed R 2.2.0 from source, and the packages for Tcl and Tk are installed on my system, but when I try to load the tcltk library it says:

    Tcl/Tk support is not available on this system.

Are there any settings or variables I have to set so that R recognises that Tcl/Tk support is installed on the system?

Rainer
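One quick check, sketched here as a hedged starting point rather than a fix: capabilities() reports whether the running R binary was built with Tcl/Tk support. When R is compiled from source this is decided at configure time, so a FALSE here usually means configure did not find the Tcl/Tk development files, not that a run-time setting is missing.

```r
# TRUE only if this R build was configured and compiled with Tcl/Tk support
has_tcltk <- capabilities("tcltk")

if (has_tcltk) {
  message("This R build includes Tcl/Tk support.")
} else {
  message("This R build has no Tcl/Tk support; ",
          "rebuilding R after installing the Tcl/Tk development files is the usual remedy.")
}
```

The output of the configure step (the summary of detected interfaces near its end) is where the detection result would be visible on a source build.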
[R] Box.test
Does the p-value from Box.test(data, lag=l) give the probability that H0: cor(1) = cor(2) = ... = cor(l) = 0 holds?

Thanks.
Re: [R] Box.test
Hi,

Have a look at the help page, ?Box.test:

    Compute the Box-Pierce or Ljung-Box test statistic for examining the
    null hypothesis of independence in a given time series.

See also: http://finzi.psych.upenn.edu/R/Rhelp02a/archive/27265.html

Regards,
Vito
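To make the meaning of the reported p-value concrete, here is a small sketch on simulated data (the series is made up for illustration). The p-value is the upper-tail probability of the test statistic under H0, taken from a chi-squared distribution with `lag` degrees of freedom; it is the probability of a statistic at least this large if all autocorrelations up to that lag are zero, not the probability that H0 itself holds.

```r
set.seed(123)
x   <- rnorm(200)   # simulated white noise, purely illustrative
lag <- 5

bt <- Box.test(x, lag = lag, type = "Ljung-Box")

# Recompute the p-value from the statistic: upper tail of chi-squared(lag)
p_manual <- pchisq(unname(bt$statistic), df = lag, lower.tail = FALSE)

print(c(reported = unname(bt$p.value), manual = p_manual))
```

A large p-value therefore only says that the data are compatible with zero autocorrelation at lags 1 to `lag`; it does not quantify how likely that hypothesis is.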
[R] Puzzled over curve() syntax.
It's probably toadally elementary (and, like, duh) but I can't figure out why the following doesn't work:

    > curve(function(x){qnorm(x,4,25)},from=0,to=1)

I get the error:

    Error in xy.coords(x, y, xlabel, ylabel, log) :
            'x' and 'y' lengths differ

But if I do

    > foo <- function(x){qnorm(x,4,25)}
    > curve(foo,from=0,to=1)

it goes like a train. Also

    > plot(function(x){qnorm(x,4,25)},from=0,to=1)

works just fine.

I'm using

    > version
             _
    platform sparc-sun-solaris2.9
    arch     sparc
    os       solaris2.9
    system   sparc, solaris2.9
    status
    major    2
    minor    2.0
    year     2005
    month    10
    day      06
    svn rev  35749
    language R

This is just idle curiosity I guess, but I would like to deepen my understanding. There's probably something about the ``expression'' concept that I'm not grokking here.

Thanks for any insight.

cheers,

Rolf Turner
[EMAIL PROTECTED]
[R] help:simple bin problem histogram
Hi,

I cannot seem to change the default binning settings for the x axis successfully using hist(). I have tried using axis() in conjunction with xaxt="n", but I keep getting the error message

    Warning message:
    parameter "vect" could not be set in high-level plot() function

Can anyone help, please?

Simon Pickett
Centre for Ecology and Conservation Biology
University of Exeter in Cornwall
Tremough Campus
Penryn
Cornwall TR10 9EZ
UK
Tel: 01326371852
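Without the actual call it is hard to say where the warning comes from, but a minimal sketch of the usual pattern may help: set the bins with the `breaks` argument of hist(), suppress the default x axis with xaxt = "n", and then draw the axis separately with axis(). The data and the break points below are invented for illustration, and the plot goes to a null PDF device so the sketch runs without a display.

```r
set.seed(42)
x    <- runif(200, min = 0, max = 100)   # made-up data
brks <- seq(0, 100, by = 10)             # bin edges: ten bins of width 10

pdf(NULL)                                # null device; drop this line to plot on screen
h <- hist(x, breaks = brks, xaxt = "n",
          main = "Custom bins", xlab = "x")
axis(side = 1, at = brks)                # tick marks exactly at the bin edges
invisible(dev.off())

print(h$counts)
```

Note that the break points go to hist() itself; axis() only controls where the tick marks are drawn and takes them through its `at` argument.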
Re: [R] Puzzled over curve() syntax.
It's the way curve is written (and documented, though perhaps a little obscurely). If you debug it, you'll see that eventually your function gets assigned to a variable called expr, and a nice list of values gets assigned to x. Then it tries to evaluate

    y <- eval(expr, envir = list(x = x), enclos = parent.frame())

But if you evaluate expr, you just get the function back; you don't call it.

The problem is that curve was written assuming you'd call it as

    curve(qnorm(x,4,25),from=0,to=1)

in which case the expression qnorm(x,4,25) gets evaluated at those x values and things are fine.

I don't think this is a bug, but it might be worth fixing so your code works too. It's a little tricky, because to know that you passed a function in, you probably want to evaluate it; but if you evaluate qnorm(x,4,25) before you've set up x, you'll get an error.

A fix is to add an additional else clause after the first test, namely

    else if (is.language(sexpr) && identical(sexpr[[1]], as.name("function"))) {
        expr <- substitute(do.call(expr, list(x)), list(expr=expr))
        if (is.null(ylab)) ylab <- deparse(sexpr)
    }

but this still doesn't handle the case where you've given a more general expression that returns a function, e.g. picking one out of a list. You'll probably need another argument to distinguish the case of an expression returning y values from an expression returning a function, and I'm not sure that level of elaboration would really be a good idea.

Duncan Murdoch
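The two calling conventions that curve() is designed for can be sketched side by side as follows. The range is narrowed to 0.01 to 0.99 here (an assumption made only to avoid the infinite values of qnorm at 0 and 1), and the plot is sent to a null PDF device so the code runs without a display. It relies on curve() invisibly returning the plotted points, which current versions of R do.

```r
pdf(NULL)   # null device; drop this line to see the plots

# Form 1: an expression in x, evaluated by curve() at a grid of x values
pts_expr <- curve(qnorm(x, 4, 25), from = 0.01, to = 0.99)

# Form 2: the *name* of a function of one argument
foo <- function(x) qnorm(x, 4, 25)
pts_name <- curve(foo, from = 0.01, to = 0.99)

invisible(dev.off())

# Both forms evaluate the same function on the same grid
print(all.equal(pts_expr$y, pts_name$y))
```

An anonymous function written inline is neither of these two forms, which is why it has to be named first or passed to plot() instead.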
[R] comment code
Hi All,

In R, can one comment out a block of code at once instead of using # one line at a time? Say, in SAS, one can use /**/ to comment out many lines.

Thanks,
Johnny
Re: [R] comment code
On Thu, 27 Oct 2005, Li,Qinghong,ST.LOUIS,Molecular Biology wrote:

> In R, can one comment out a block of code at once instead of using # one line at a time? Say, in SAS, one can use /**/ to comment out many lines.

Try RSiteSearch("comment multiple lines").

Note that R-aware editors can do this for you using #.

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
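R has no block-comment syntax, but one idiom that such a search commonly turns up (it is not spelled out in the reply above) is to wrap the block in if (FALSE) { ... }. A minimal sketch, with made-up variable names:

```r
total <- 0

if (FALSE) {
  # Everything inside these braces is parsed but never executed,
  # so it behaves much like a commented-out block
  total <- total + 100
  print("this line never runs")
}

total <- total + 1
print(total)
```

Unlike a real comment, the disabled block still has to be syntactically valid R, since the parser reads it even though it is never run.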
[R] adding sequence for each value in a vector
Hi,

I have a vector like:

    x <- c(1,15,30,45,60,90,115)

I know that consecutive values always differ by more than 10: min(diff(x)) >= 11.

For each value I want to add the sequence value:(value+9). The result should be:

    1 2 3 4 5 6 7 8 9 10 15 16 17 18 19 20 21 22 23 24 30 31 (...) 39 45 46 (...) 54 60 61 etc.

How can I do this without a loop? (I'm sure there is an elegant way, as always with R, but I can't find it this time!)

best,
yves
Re: [R] adding sequence for each value in a vector
Here's one way:

    > unlist(lapply(x, function(x) x:(x+9)))
     [1]   1   2   3   4   5   6   7   8   9  10  15  16  17  18  19  20  21  22  23  24
    [21]  30  31  32  33  34  35  36  37  38  39  45  46  47  48  49  50  51  52  53  54
    [41]  60  61  62  63  64  65  66  67  68  69  90  91  92  93  94  95  96  97  98  99
    [61] 115 116 117 118 119 120 121 122 123 124

Andy
Re: [R] how to predict with logistic model in package logistf ?
Did you try fit$predict?

Elizabeth Lawson
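If predictions are needed for new data rather than for the rows used in fitting, one generic fallback is to compute them from the coefficients by hand: build the model matrix for the new data and apply the inverse logit. The sketch below is not tied to logistf; the helper `predict_prob` is hypothetical, and an ordinary glm() fit on simulated data stands in so that the result can be checked against predict(). Applying it to a logistf fit would rest on the assumption that the fit stores its coefficients in `fit$coefficients` under names matching the model matrix columns.

```r
# Hypothetical helper: fitted probabilities from a coefficient vector
predict_prob <- function(beta, formula, newdata) {
  # Model matrix for the right-hand side only (the response is not needed)
  X <- model.matrix(delete.response(terms(formula)), data = newdata)
  # Linear predictor, then inverse logit
  drop(plogis(X %*% beta[colnames(X)]))
}

# Stand-in example with glm() on simulated data, so the result can be verified
set.seed(1)
d   <- data.frame(x1 = rnorm(50), x2 = rnorm(50))
d$y <- rbinom(50, 1, plogis(0.5 * d$x1 - 0.3 * d$x2))
fit <- glm(y ~ x1 + x2, family = binomial, data = d)

p_manual  <- predict_prob(coef(fit), y ~ x1 + x2, d)
p_builtin <- predict(fit, newdata = d, type = "response")
print(all.equal(unname(p_manual), unname(p_builtin)))
```

For the logistf model in the question, the analogous call would pass the fit's coefficient vector, the model formula and sex2 (or new data with the same columns), under the assumption stated above.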
Re: [R] adding sequence for each value in a vector
Try this:

    c(outer(0:9, x, "+"))
Re: [R] adding sequence for each value in a vector
Also:

    R> rep(x, each=10) + 0:9

--
visit the R Graph Gallery : http://addictedtor.free.fr/graphiques
Romain FRANCOIS - http://francoisromain.free.fr
Doctorant INRIA Futurs / EDF
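As a quick cross-check, the three suggestions from this thread can be run on the vector from the question and compared. The variable names `a`, `b` and `d` are only for this sketch.

```r
x <- c(1, 15, 30, 45, 60, 90, 115)

a <- unlist(lapply(x, function(v) v:(v + 9)))   # one sequence per element
b <- c(outer(0:9, x, "+"))                      # 10 x 7 table of sums, read column by column
d <- rep(x, each = 10) + 0:9                    # each value repeated 10 times, offsets recycled

# Same values in the same order (storage type may differ: integer vs double)
print(isTRUE(all.equal(as.numeric(a), b)) && isTRUE(all.equal(b, d)))
print(head(a, 12))
```

The outer() and rep() versions avoid even the implicit loop of lapply(), and both rely on the offsets 0:9 lining up with blocks of ten values.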
[R] its dates masked by chron
I built R 2.2.0 from source on my Debian machine yesterday and updated all packages. My problem is that the dates function from its, which my code uses heavily, is now masked by dates from chron. How can I specify that I want to use dates from its, or how can I prevent it from being masked?

library(its)
Loading required package: Hmisc
Hmisc library by Frank E Harrell Jr
Type library(help='Hmisc'), ?Overview, or ?Hmisc.Overview') to see overall documentation.
NOTE:Hmisc no longer redefines [.factor to drop unused levels when subsetting. To get the old behavior of Hmisc type dropUnusedLevels().
Attaching package: 'Hmisc'
The following object(s) are masked from package:stats : ecdf
Attaching package: 'chron'
The following object(s) are masked from package:its : dates
Re: [R] Repost: Examples of classwt, strata, and sampsize in randomForest?
I have read both the help files and that article. The article very nicely evaluates the value of dealing with unbalanced data, and the help files show that you can, but they offer no guidance on how the syntax should be specified. strata and classwt clearly can be specified, but it's not shown how to specify the values. The examples do not include those arguments, and every guess I've made has generated an error.

On 10/27/05, Gabor Grothendieck [EMAIL PROTECTED] wrote:

See http://finzi.psych.upenn.edu/R/Rhelp02a/archive/40898.html

On 10/27/05, David L. Van Brunt, Ph.D. [EMAIL PROTECTED] wrote:

Sorry for the repost, but I've really been looking, and can't find any syntax direction on this issue. Browsing the documentation and searching the list both came up short.

I have some unbalanced data and was wondering if, in a 0 v 1 classification forest, some combination of these options might yield better predictions when the proportion of one class is low (less than 10% in a sample of 2,000 observations). I'm not sure how to specify these terms. From the docs, we have:

classwt: Priors of the classes. Need not add up to one. Ignored for regression.

So is this something like classwt=c(.90,.10)? I didn't see the syntax demonstrated. The same goes for strata and sampsize, though there is a default for sampsize that makes sense. I'm not sure how you would make a vector whose length is the number of strata, however.

Pointers?

--
David L. Van Brunt, Ph.D. mailto:[EMAIL PROTECTED]
Re: [R] Extracting Variance Components
Mike,

use

VarCorr(lme.object)

or, for user-friendly output, use varcomp from the 'ape' package:

require(ape)
varcomp(lme.object)

varcomp also allows scaling of the components to unity (*100 gives %) and a cumulative sum of the components. Note: varcomp doesn't work for lmer objects.

HTH, John

Michel Friesenhahn wrote:

Dear List, Is there a way to extract variance components from lmeObjects or summary.lme objects without using intervals()? For my purposes I don't need the confidence intervals, which I'm obtaining using a parametric bootstrap. Thanks, Mike
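A minimal sketch of the VarCorr() route mentioned above, assuming the recommended nlme package is installed. The Orthodont data set ships with nlme; the model and the names fm, vc and v are illustrative only and are not Mike's actual fit.

```r
library(nlme)  # recommended package, normally installed with R

# Random-intercept model on a data set that ships with nlme.
fm <- lme(distance ~ age, data = Orthodont, random = ~ 1 | Subject)

# VarCorr() returns a character matrix with "Variance" and "StdDev" columns.
vc <- VarCorr(fm)
vc

# Convert the variance column to numbers and express each component
# as a percentage of the total.
v <- as.numeric(vc[, "Variance"])
names(v) <- rownames(vc)
round(100 * v / sum(v), 1)
```

Because the matrix is stored as character, the as.numeric() step is needed before doing any arithmetic on the components.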
[R] memory problem in handling large dataset
Dear Listers:

I have a question on handling a large dataset. I searched R-Search and I hope I can get more information on my specific case.

First, my dataset has 1.7 billion observations and 350 variables, of which 300 are float and 50 are integers. My system is a Linux box with 8 GB of memory and a 64-bit CPU. (Currently, we don't plan to buy more memory.)

R.version
platform i686-redhat-linux-gnu
arch     i686
os       linux-gnu
system   i686, linux-gnu
status
major    2
minor    1.1
year     2005
month    06
day      20
language R

If I want to run an analysis such as randomForest on the dataset, what is the maximum number of observations I can load and still have the machine run smoothly?

After figuring out that number, I want to do some sampling first, but I did not find that read.table or scan can do this. I guess I can load the data into MySQL and then use RMySQL to do the sampling, or use Python to subset the data first. My question is: is there a way I can subsample directly from the file using just R?

Thanks,

--
Weiwei Shi, Ph.D
Did you always know? No, I did not. But I believed... ---Matrix III
[R] outer-question
Dear all,

This is a rather lengthy message, but I don't know what I did wrong in my real example, since the simple code works.

I have two variables a, b and a function f which I would like to evaluate for all possible combinations of the values of a and b. If f is multiplication, I would simply do:

a <- 1:5
b <- 1:5
outer(a,b)

## A bit more complicated is this:
f <- function(a,b,d) {
  return(a*b+(sum(d)))
}
additional <- runif(100)
outer(X=a, Y=b, FUN=f, d=additional)

## So far so good. But now my real example. I would like to plot the
## log-likelihood surface for two parameters alpha and beta of
## a Gompertz distribution with given data

### I have a function to generate random numbers from a Gompertz distribution
### (using the 'inversion method')
random.gomp <- function(n, alpha, beta) {
  return( (log(1-(beta/alpha*log(1-runif(n)))))/beta )
}

## Now I generate some 'lifetimes'
no.people <- 1000
al <- 0.1
bet <- 0.1
lifetimes <- random.gomp(n=no.people, alpha=al, beta=bet)

### Since I have neither censoring nor truncation in this simple case,
### the log-likelihood should simply be the sum of the log of
### the densities (following the parametrization of Klein/Moeschberger,
### Survival Analysis, p. 38)
loggomp <- function(alphas, betas, timep) {
  return(sum(log(alphas) + betas*timep + (alphas/betas * (1-exp(betas*timep)))))
}

### Now I thought I could obtain a matrix of the log-likelihood surface
### by specifying possible values for alpha and beta with the given data.
### I was able to produce this matrix with two for-loops. But I thought
### I could also use 'outer' in this case.
### This is what I tried:
possible.alphas <- seq(from=0.05, to=0.15, length=30)
possible.betas <- seq(from=0.05, to=0.15, length=30)
outer(X=possible.alphas, Y=possible.betas, FUN=loggomp, timep=lifetimes)

### But the result is:
Error in outer(X = possible.alphas, Y = possible.betas, FUN = loggomp, :
  dim<- : dims [product 900] do not match the length of object [1]
In addition: Warning messages: ...

### Can somebody give me a hint where the problem is?
### I checked my definition of 'loggomp', and I think it looks fine:
loggomp(alphas=possible.alphas[1], betas=possible.betas[1], timep=lifetimes)
loggomp(alphas=possible.alphas[4], betas=possible.betas[10], timep=lifetimes)
loggomp(alphas=possible.alphas[3], betas=possible.betas[11], timep=lifetimes)

### I'd appreciate any kind of advice.
### Thanks a lot in advance.
### Roland
Re: [R] its dates masked by chron
To redescribe the problem: I need to use dates from its. its depends on Hmisc, Hmisc depends on chron, and dates in chron masks dates in its.
Re: [R] Repost: Examples of classwt, strata, and sampsize in randomForest?
classwt in the current version of the randomForest package doesn't work too well. (It's what was in version 3.x of the original Fortran code by Breiman and Cutler, not the one in the new Fortran code.) I'd advise against using it.

sampsize and strata can be used in conjunction. If strata is not specified, the class labels will be used. Take the iris data as an example:

randomForest(Species ~ ., iris, sampsize=c(10, 30, 10))

says to randomly draw 10, 30 and 10 from the three species (with replacement) to grow each tree. If you are unsure of the order of the labels, use a named vector, e.g.,

randomForest(Species ~ ., iris, sampsize=c(setosa=10, versicolor=30, virginica=10))

Now, suppose you want the stratified sampling to be done using a variable other than the class labels; e.g., for multi-centered clinical trial data, you want to draw the same number of patients per center to grow each tree (I'm just making things up, not that that necessarily makes any sense). You can do something like:

randomForest(..., strata=center, sampsize=rep(min(table(center)), nlevels(center)))

which draws the same number of patients (the minimum at any center) from each center to grow each tree.

Hope that's clear. Eventually all such things will be in the yet-to-be-written package vignette...

Andy
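A small sketch of how the sampsize vector in Andy's last example is built, using base R only. The factor center and its group sizes are made up for illustration; the result would then be passed to randomForest() together with strata=center, as described in the reply.

```r
# Hypothetical stratifying factor: three centers of unequal size.
center <- factor(rep(c("A", "B", "C"), times = c(40, 50, 60)))

# Same number of draws from every center: the size of the smallest
# center, repeated once per level of the factor.
sampsize <- rep(min(table(center)), nlevels(center))

# Naming the elements makes the intended order explicit.
names(sampsize) <- levels(center)
sampsize
```

Using the smallest stratum size means no stratum is asked for more cases than it contains.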
Re: [R] memory problem in handling large dataset
I think the general advice is that around 1/4 or 1/3 of your available memory is about the largest data set that R can handle, and often considerably less depending upon what you do and how you do it (because R's semantics require explicitly copying objects rather than passing pointers). Fancy tricks using environments might enable you to do better, but that requires advice from a true guru, which I ain't.

See ?connections, ?scan, ?seek for reading in a file a chunk at a time from a connection, thus enabling you to sample one line of data from each chunk, say. I suppose you could do this directly with repeated calls to scan() or read.table(), skipping more and more lines at the beginning at each call, but I assume that is horridly inefficient and would take forever.

HTH.

--
Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA

The business of the statistician is to catalyze the scientific learning process. - George E. P. Box
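A minimal sketch of the chunk-at-a-time idea, assuming a plain text file with one observation per line. It uses readLines() on an open connection rather than scan() or seek(); the function name sample_lines and the chunk size are illustrative choices, not something given in the thread.

```r
# Sample one line from each chunk of a text file without loading it whole.
sample_lines <- function(path, chunk_size = 1000L) {
  con <- file(path, open = "r")
  on.exit(close(con))
  picked <- character(0)
  repeat {
    # An open connection remembers its position, so each call
    # continues where the previous one stopped.
    chunk <- readLines(con, n = chunk_size)
    if (length(chunk) == 0L) break          # end of file reached
    picked <- c(picked, chunk[sample.int(length(chunk), 1L)])
  }
  picked
}

# Small demonstration on a temporary file of 1000 lines.
tmp <- tempfile(fileext = ".txt")
writeLines(sprintf("row %d", 1:1000), tmp)
set.seed(1)
s <- sample_lines(tmp, chunk_size = 100L)
length(s)
```

The sampled lines could then be parsed with read.table(text = s, ...). For a file with billions of rows this is still a full sequential pass, so it will be slow, but memory use stays bounded by the chunk size.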
[R] how to write and read an array ?
Hi,

Apologies if the question is too simple, but I didn't find the answer by myself. I'm able to create a 3-dimensional array A and to write it with write.table(), but after that I can't find how to read it back with read.table() and get the right 3 dimensions. I tried to use as.array(), to force the dim, etc., but it didn't work. (It's probably obvious ... ?)

Thanks for your info or pointer.

Vincent
Re: [R] its dates masked by chron
Omar Lakkis [EMAIL PROTECTED] writes:

To redescribe the problem: I need to use dates from its. its depends on Hmisc, Hmisc depends on chron, and dates in chron masks dates in its.

So use its::dates ...

--
Peter Dalgaard
Dept. of Biostatistics, University of Copenhagen
Øster Farimagsgade 5, Entr.B, PO Box 2099, 1014 Cph. K, Denmark
Ph: (+45) 35327918 FAX: (+45) 35327907
([EMAIL PROTECTED])
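A minimal sketch of the pkg::name syntax Peter refers to. Since its and chron may not be installed, the sketch masks a base function instead (an assumption made purely for illustration); the lookup principle is the same, and with its attached the call would take the form its::dates(...).

```r
# A definition in the global environment masks base::rev.
rev <- function(x) "masked"

masked   <- rev(1:3)         # finds the masking definition first
explicit <- base::rev(1:3)   # explicitly asks for the version in package base

masked
explicit

rm(rev)                      # remove the mask again
```

The :: operator bypasses the search path, so the result does not depend on the order in which packages were attached.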
[R] RSQLite problems
Hi,

I'm experimenting with using (R)SQLite to do data management. Here are two little problems that I've encountered:

1. The presence of ',' in string values causes trouble, since ',' is also the delimiter used in the SQL statement.
2. A newline '\n' is attached to the last string value of each row.

Some examples:

library (RSQLite)
Loading required package: DBI
sqlite <- dbDriver ("SQLite")
db <- dbConnect (sqlite, dbname = "test.dbms")
data (barley)
dbWriteTable (db, "barley", barley, overwrite = TRUE)
[1] TRUE
barley[1:3,]
     yield   variety year            site
1 27.0     Manchuria 1931 University Farm
2 48.86667 Manchuria 1931          Waseca
3 27.43334 Manchuria 1931          Morris
dbReadTable (db, "barley")[1:3,]
     yield   variety year__1              site
1 27.0     Manchuria    1931 University Farm\n
2 48.86667 Manchuria    1931          Waseca\n
3 27.43334 Manchuria    1931          Morris\n
barley$site <- as.character (barley$site)
barley$site[1] <- "University, Farm"
dbWriteTable (db, "barley", barley, overwrite = TRUE)
Error in sqliteWriteTable(conn, name, value, ...) :
  RS-DBI driver: (RS_sqlite_import: /tmp/RtmpgSNaLn/rsdbi6a5d128c line 1 expected 5 columns of data but found 6)

I'm using RSQLite 0.4.0 with R 2.1.1 on Mac OS X.

Cheers, Michael
Re: [R] F tests for random effect models
I think what you're looking for is in anova():

fm1 <- lmer(dv ~ IV ...)
anova(fm1)
Re: [R] outer-question
It looks like you didn't vectorize the function you gave outer in your longer example. Consider your short example with a diagnostic printout:

a <- 1:3
b <- 1:4
f <- function(a,b,d) {
  cat("In f:", length(a), length(b), "\n")
  return(a*b+(sum(d)))
}
additional <- runif(100)
outer(X=a, Y=b, FUN=f, d=additional)
In f: 12 12
         [,1]     [,2]     [,3]     [,4]
[1,] 53.61985 54.61985 55.61985 56.61985
[2,] 54.61985 56.61985 58.61985 60.61985
[3,] 55.61985 58.61985 61.61985 64.61985

Note that f is called only once, with vectors of length 12 for a and b, so it must return a vector of the same length. Your loggomp wraps everything in sum() and returns a single number, which is why outer cannot reshape the result into a 30 x 30 matrix.

-- Tony Plate
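One possible way to act on Tony's diagnosis is to wrap the scalar log-likelihood with Vectorize() so that outer() receives one value per (alpha, beta) pair. This is a sketch under assumptions rather than the fix given in the thread: the function bodies are reconstructed from Roland's post, the seed is arbitrary, and the name loggomp.vec is made up.

```r
# Gompertz log-likelihood for a single (alpha, beta) pair.
loggomp <- function(alphas, betas, timep) {
  sum(log(alphas) + betas * timep +
        (alphas / betas) * (1 - exp(betas * timep)))
}

# Inversion-method generator for Gompertz lifetimes.
random.gomp <- function(n, alpha, beta) {
  log(1 - (beta / alpha) * log(1 - runif(n))) / beta
}

set.seed(1)
lifetimes <- random.gomp(n = 1000, alpha = 0.1, beta = 0.1)

possible.alphas <- seq(from = 0.05, to = 0.15, length.out = 30)
possible.betas  <- seq(from = 0.05, to = 0.15, length.out = 30)

# outer() calls FUN once with two length-900 vectors. Vectorize() turns
# loggomp into a function that loops over the pairs and returns 900 values;
# timep is not vectorized and is passed whole to every call.
loggomp.vec <- Vectorize(loggomp, vectorize.args = c("alphas", "betas"))

surface <- outer(possible.alphas, possible.betas, loggomp.vec,
                 timep = lifetimes)
dim(surface)
```

The resulting matrix can be handed to contour() or persp() together with the two parameter grids. Vectorize() is a convenience wrapper around mapply(), so this is no faster than the two for-loops; it only makes the function usable with outer().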
Re: [R] memory problem in handling large dataset
If my calculation is correct (very doubtful, sometimes), that's

1.7e9 * (300 * 8 + 50 * 4) / 1024^3
[1] 4116.446

gigabytes, or over 4 terabytes, just to store the data in memory.

To sample rows and read them into R, Bert's suggestion of using connections, perhaps along with seek() for skipping ahead, would be what I'd try. I had tried to do such things in Python as a chance to learn that language, but I found that operationally it's easier to maintain the project by doing everything in one language, namely R, if possible.

Andy
Re: [R] how to predict with logistic model in package logistf ?
On 27 Oct 2005, at 09:18, jinlong li wrote:

Dear community, I am a beginner in R and can't predict with a logistic model from the logistf package,

Not exactly the answer to your question, but an alternative to the logistf package, which purports to do similar things, is brlr (which does provide a predict method). Does this work for you?

library(brlr)
data(sex2)
fit <- brlr(case ~ age + oc + vic + vicl + vis + dia, data = sex2)
predict(fit, newdata = sex2)

David

could anyone help me? Thanks! Here are my commands and the result:

library(logistf)
data(sex2)
fit <- logistf(case ~ age+oc+vic+vicl+vis+dia, data=sex2)
predict(fit, newdata=sex2)
Error in predict(fit, newdata = sex2) : no applicable method for "predict"
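When a fitted logistic model has no predict method, predicted probabilities can usually still be formed by hand from the coefficients and a design matrix. The sketch below uses glm() on the built-in mtcars data as a stand-in. This is an assumption made so the result can be checked against predict(); it is not logistf, and whether a logistf fit exposes its coefficients in the same way should be verified against that package's documentation.

```r
# Ordinary logistic regression as a stand-in for the penalized fit.
fit <- glm(am ~ wt, family = binomial, data = mtcars)

newdata <- mtcars[1:5, ]

# Design matrix for the new data, including the intercept column.
X <- model.matrix(~ wt, data = newdata)

# Linear predictor X %*% beta, mapped to probabilities by the inverse logit.
p_manual <- as.vector(plogis(X %*% coef(fit)))

# For glm the hand computation can be compared with the built-in method.
p_builtin <- unname(predict(fit, newdata = newdata, type = "response"))
all.equal(p_manual, p_builtin)
```

The same two steps (build the design matrix with the model's right-hand side, then apply plogis() to the linear predictor) carry over to any model that reports logistic-regression coefficients on the logit scale.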
Re: [R] memory problem in handling large dataset
Hi, Jim:

Thanks for the calculation. I think you won't mind if I cc the reply to r-help too, so that I can get more info.

I assume you use 4 bytes for an integer and 8 bytes for a float, so 300x8+50x4=2600 bytes for each observation, right?

I wish I could have 500x8 G of memory :) Just kidding. Definitely, sampling will be the first step, and some feature selection (filtering, mainly) will be applied. Following Berton's suggestion, I will probably use Python to do the sampling, since whenever I have a slow situation like this, Python never fails me. (I am not saying R is bad, though.)

I understand that I get what I pay for here. But more information or experience on R's handling of large datasets (like using RMySQL) will be appreciated.

regards, Weiwei

On 10/27/05, jim holtman [EMAIL PROTECTED] wrote:

Based on the numbers that you gave, if you wanted all the data in memory at once, you would need 4.4TB of memory, about 500X what you currently have. Each of your observations will require about 2,600 bytes of memory. You probably don't want to use more than 25% of memory for a single object, since many of the algorithms make copies. This would limit you to about 700,000 observations at a time for processing.

The real question is what you are trying to do with the data. Can you partition the data and do the analysis on the subsets?

--
Jim Holtman
Cincinnati, OH
+1 513 247 0281
What is the problem you are trying to solve?

--
Weiwei Shi, Ph.D
Did you always know? No, I did not. But I believed... ---Matrix III
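The back-of-the-envelope figures in this thread can be reproduced with a few lines of R. This is only a rough sketch: it assumes 8-byte doubles and 4-byte integers, ignores per-object overhead, and takes the 25% rule of thumb and the 8 GB of RAM from the messages above; the variable names are illustrative.

```r
n_obs     <- 1.7e9
bytes_row <- 300 * 8 + 50 * 4      # 300 doubles and 50 integers per row

# Total size of the full data set in units of 1024^3 bytes.
total_gib <- n_obs * bytes_row / 1024^3
total_gib

# Rows that fit if a single object may use 25% of 8 GB of RAM.
rows_fit <- floor(0.25 * 8 * 1024^3 / bytes_row)
rows_fit
```

The row count comes out in the same range as the "about 700,000" quoted above; the exact number depends on how much headroom is left for copies made during the analysis.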
Re: [R] memory problem in handling large dataset
Dear Andy:

I think our emails crossed. But thanks as before.

Weiwei

On 10/27/05, Liaw, Andy wrote:

If my calculation is correct (very doubtful, sometimes), that's

    > 1.7e9 * (300 * 8 + 50 * 4) / 1024^3
    [1] 4116.446

or over 4 terabytes, just to store the data in memory.

To sample rows and read them into R, Bert's suggestion of using connections, perhaps along with seek() for skipping ahead, would be what I'd try. I had tried to do such things in Python as a chance to learn that language, but I found that operationally it's easier to maintain the project by doing everything in one language, namely R, if possible.

Andy

From: Berton Gunter

I think the general advice is that around 1/4 or 1/3 of your available memory is about the largest data set that R can handle -- and often considerably less depending upon what you do and how you do it (because R's semantics require explicitly copying objects rather than passing pointers). Fancy tricks using environments might enable you to do better, but that requires advice from a true guru, which I ain't.

See ?connections, ?scan, ?seek for reading in a file a chunk at a time from a connection, thus enabling you to sample one line of data from each chunk, say.

I suppose you could do this directly with repeated calls to scan() or read.table() by skipping more and more lines at the beginning at each call, but I assume that is horridly inefficient and would take forever.

HTH.

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA

The business of the statistician is to catalyze the scientific learning process. - George E. P. Box

-Original Message-
From: [EMAIL PROTECTED] On Behalf Of Weiwei Shi
Sent: Thursday, October 27, 2005 9:28 AM
To: r-help
Subject: [R] memory problem in handling large dataset

Dear Listers:

I have a question on handling a large dataset. I searched R-Search and I hope I can get more information as to my specific case.

First, my dataset has 1.7 billion observations and 350 variables, among which 300 are float and 50 are integers. My system has 8 GB of memory, a 64-bit CPU, and runs Linux. (Currently, we don't plan to buy more memory.)

    > R.version
             _
    platform i686-redhat-linux-gnu
    arch     i686
    os       linux-gnu
    system   i686, linux-gnu
    status
    major    2
    minor    1.1
    year     2005
    month    06
    day      20
    language R

If I want to do some analysis, for example randomForest, on a dataset, what is the maximum number of observations I can load and still have the machine run smoothly?

After figuring out that number, I want to do some sampling first, but I did not find that read.table or scan can do this. I guess I can load the data into MySQL and then use RMySQL to do the sampling, or use Python to subset the data first.

My question is, is there a way I can subsample directly from the file using just R?

Thanks,

-- Weiwei Shi, Ph.D
Did you always know? No, I did not. But I believed... ---Matrix III
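The connection-based sampling that Bert and Andy describe could look roughly like the sketch below. It assumes a plain text file with one comma-separated record per line; the function name sample_lines, the chunk size and the sampling fraction are illustrative choices, not anything from the thread. The file is read a chunk at a time with readLines() on an open connection, and each line is kept with a fixed probability, so only the sample is held in memory.

```r
# Hypothetical helper: keep each line of a large text file with probability
# `frac`, reading `chunk` lines at a time so the file is never fully loaded.
sample_lines <- function(path, frac = 0.01, chunk = 10000L, header = TRUE) {
  con <- file(path, open = "r")
  on.exit(close(con))
  head_line <- if (header) readLines(con, n = 1L) else character(0)
  kept <- character(0)
  repeat {
    lines <- readLines(con, n = chunk)
    if (length(lines) == 0L) break          # end of file reached
    keep <- runif(length(lines)) < frac     # independent coin flip per line
    kept <- c(kept, lines[keep])
  }
  # Parse only the sampled lines into a data frame
  read.table(text = c(head_line, kept), header = header, sep = ",")
}

# Small demonstration on a temporary file (a stand-in for the real data)
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:5000, x = rnorm(5000)), tmp, row.names = FALSE)
set.seed(1)
smp <- sample_lines(tmp, frac = 0.1, chunk = 1000L)
nrow(smp)   # roughly a tenth of the 5000 rows
```

Sampling each line independently gives a sample whose size is only approximately frac times the number of rows; if an exact sample size is needed, one would have to know the row count in advance and draw row indices instead.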
Re: [R] how to write and read an array ?
check ?dput and ?dget

Cheers
Francisco

From: [EMAIL PROTECTED]
To: r-help@stat.math.ethz.ch
Subject: [R] how to write and read an array ?
Date: Thu, 27 Oct 2005 19:00:10 +0200

Hi,

Apologies if the question is too simple, but I didn't find the answer by myself. I'm able to create a 3-dimensional array A and to write it with write.table() ... but, after that, I can't find how to read it back with read.table() and get the right 3 dimensions. I tried to use as.array(), to force the dim, etc., but it didn't work. (It's probably obvious ... ?)

Thanks for your info or pointer.

Vincent
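A minimal sketch of the dput()/dget() round trip Francisco points to, using a small made-up array and a temporary file. dput() writes an R expression that recreates the object, including its dim attribute, which is what write.table() loses.

```r
# A 2 x 3 x 4 array written to a text file and read back with its dimensions
A <- array(1:24, dim = c(2, 3, 4))
f <- tempfile(fileext = ".R")

dput(A, file = f)     # writes an R expression that recreates A
B <- dget(f)          # evaluates that expression again

dim(B)                # the three dimensions survive the round trip
all(A == B)           # and so do the values
```

For large arrays a binary format via save() and load() may be more practical; dput() is convenient mainly because the file stays human-readable.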
[R] encrypted RData file?
Hi,

I wonder if there is interest/intention to allow for encrypted .RData files? One can certainly do that outside R manually, but that will leave a decrypted RData file somewhere which one has to remember to delete.

Cheers,
Michael
Re: [R] Repost: Examples of classwt, strata, and sampsize in randomForest?
Perfect! More useful than I was even hoping for. Great help, many thanks!

On 10/27/05, Liaw, Andy wrote:

classwt in the current version of the randomForest package doesn't work too well. (It's what was in version 3.x of the original Fortran code by Breiman and Cutler, not the one in the new Fortran code.) I'd advise against using it.

sampsize and strata can be used in conjunction. If strata is not specified, the class labels will be used. Take the iris data as an example:

    randomForest(Species ~ ., iris, sampsize=c(10, 30, 10))

says to randomly draw 10, 30 and 10 from the three species (with replacement) to grow each tree. If you are unsure of the labels, use a named vector, e.g.,

    randomForest(Species ~ ., iris, sampsize=c(setosa=10, versicolor=30, virginica=10))

Now, if you want the stratified sampling to be done using a different variable than the class labels; e.g., for multi-centered clinical trial data, you want to draw the same number of patients per center to grow each tree (I'm just making things up, not that that necessarily makes any sense), you can do something like:

    randomForest(..., strata=center, sampsize=rep(min(table(center)), nlevels(center)))

which draws the same number of patients (the minimum at any center) from each center to grow each tree.

Hope that's clear. Eventually all such things will be in the yet to be written package vignette...

Andy

From: David L. Van Brunt, Ph.D.

I have read both the help files and that article... the article very nicely evaluates the value of dealing with unbalanced data, and the help files show that you can, but offer no guidance in terms of how the syntax should be specified. The strata and classwt clearly can be specified, but it's not shown how to specify the values... The examples do not include specifications of those terms, and every guess I've made has generated an error.

On 10/27/05, Gabor Grothendieck wrote:

See http://finzi.psych.upenn.edu/R/Rhelp02a/archive/40898.html

On 10/27/05, David L. Van Brunt, Ph.D. wrote:

Sorry for the repost, but I've really been looking, and can't find any syntax direction on this issue... Just browsing the documentation and searching the list came up short...

I have some unbalanced data and was wondering if, in a 0 v 1 classification forest, some combo of these options might yield better predictions when the proportion of one class is low (less than 10% in a sample of 2,000 observations).

Not sure how to specify these terms... from the docs, we have:

    classwt: Priors of the classes. Need not add up to one. Ignored for regression.

So is this something like ... classwt=c(.90,.10) ? I didn't see the syntax demonstrated.

Similar for strata and sampsize, though there is a default for sampsize that makes sense... not sure how you would make a vector of the length the number of strata, however.

Pointers?

--
David L. Van Brunt, Ph.D.
Re: [R] adding error bars to lattice plots
On 10/20/05, Mario Aigner-Torres wrote:

[...] I have right now a dataset that looks like this:

    > tail(partition, 3)
        element run logfO2   TC buffer   xAn sdXan   Di Disigma
    416      Al  36  -0.68 1180    AIR 0.734 0.007 2.10    0.02
    417      Ca  36  -0.68 1180    AIR 0.734 0.007 1.29    0.02
    418      Na  36  -0.68 1180    AIR 0.734 0.007 1.16    0.06

Basically I would like to insert error bars into an xyplot like this [...]

Generally speaking, you need to pass some auxiliary variables to the panel function. This is easy to do, since all unrecognized arguments are passed to the panel function anyway. The trick is to figure out inside the panel function which elements of these variables correspond to the subset of data in that panel. This is done using the subscripts argument. So, for example, you can define

    prepanel.ci <- function(x, y, lx, ux, subscripts, ...)
    {
        x <- as.numeric(x)
        lx <- as.numeric(lx[subscripts])
        ux <- as.numeric(ux[subscripts])
        list(xlim = range(x, ux, lx, finite = TRUE))
    }

    panel.ci <- function(x, y, lx, ux, subscripts, pch = 16, ...)
    {
        x <- as.numeric(x)
        y <- as.numeric(y)
        lx <- as.numeric(lx[subscripts])
        ux <- as.numeric(ux[subscripts])
        panel.abline(h = unique(y), col = "grey")
        panel.arrows(lx, y, ux, y, col = 'black',
                     length = 0.25, unit = "native",
                     angle = 90, code = 3)
        panel.xyplot(x, y, pch = pch, ...)
    }

and then add these to your call, supplying suitable values for lx and ux (the vectors of lower and upper limits).

The one glitch with this approach is that unlike variables in the formula (and groups, which is treated specially because of its ubiquity but otherwise works on exactly the same principle), lx and ux will not be evaluated in 'data'. I like to use 'with' to get around this. Here's an example with the singer data; it should be easy to translate to your example.

    singer.split <-
        with(singer,
             split(height, voice.part))

    singer.ucl <-
        sapply(singer.split,
               function(x) {
                   st <- boxplot.stats(x)
                   c(st$stats[3], st$conf)
               })

    singer.ucl <- as.data.frame(t(singer.ucl))
    names(singer.ucl) <- c("median", "lower", "upper")
    singer.ucl$voice.part <-
        factor(rownames(singer.ucl),
               levels = rownames(singer.ucl))

    ## show the data frame
    singer.ucl

    with(singer.ucl,
         xyplot(voice.part ~ median,
                lx = lower, ux = upper,
                prepanel = prepanel.ci,
                panel = panel.ci))

-Deepayan
[R] Problems with source() function
Hello list members!

I'm trying to enter some data in an R session using the source() function with a URL as argument. The data source is a PHP script located on an Apache web server, and the data is a long list generated on the fly. These are the initial lines:

    groups <- list()
    groups[['ENSMUST001']]=c(52611,483683,147952,132170,297514,469248,291525,364037,469915,55472,280220,314688,415650,486875,440898,6781,497785)
    groups[['ENSMUST003']]=c(416911,327120,425495,72272,297529,101933,371418,139034,318872,367204,237702)
    groups[['ENSMUST028']]=c(199311,325400,184761,241988,376845,75052,67724,404240,439543,391057,393816)
    groups[['ENSMUST031']]=c(402587,352900,139030,186068,463553,328881,74942,277085,301431,256149,410846)
    groups[['ENSMUST033']]=c(12700,23908,11140,122358,389908,390084,383903,354007,457965,106395,131876)
    groups[['ENSMUST049']]=c(59336,203239,101077,382882,327374,281549,212042,275594,361523,490934,240275)
    groups[['ENSMUST056']]=c(409571,304584,394332,379699,13785,4260,29,42538,304075,47734,485512,52501,328509,504846,334607,82566,250088,150240,16422,446551,314484,91878,124752,341638,379512,379890,319764,8019,59221,156508,362524,74001,149400)
    groups[['ENSMUST058']]=c(26511,455190,466368,358528,268486,315461,149260,422804,137641,163718,352555)

The problem: When I execute the command it apparently finishes OK, without printed errors, but when I test the consistency of the data entered using length() I obtain a different figure every time.

More facts: When I source the data from a static file instead of a URL, the data is fully entered and the length is always the same (20346 list elements). It takes 30 seconds to load. When I source the data dynamically, from a URL, it takes 2 minutes and the data is always truncated.

Tried and miserably failed:

- Changed .Options$timeout from 60 to 300
- Using R --verbose is of no help, the data is silently truncated.
- Changed the expression in which data is entered:

    groups <- list(
    'ENSMUST001'=c(52611,483683,147952,132170,297514,469248,291525,364037,469915,55472,280220,314688,415650,486875,440898,6781,497785),
    'ENSMUST003'=c(416911,327120,425495,72272,297529,101933,371418,139034,318872,367204,237702)
    ...
    )

Kind list members, is there some timeout I am missing? Some way to debug the process? Some suggestion?

Sincerely, thank you!

Alberto de Luis
www.cicancer.org
Re: [R] horizontal violin plots?
On 10/26/05, Karin Lagesen wrote:

I am trying to make horizontal violin plots. I have tried both vioplot and simple.violinplot, but both of them seem to not be willing to take the horizontal option. Is this correct, or am I just bungling it somehow?

For instance, for vioplot (from the example shown, with the horizontal modification):

    > vioplot(bimodal,uniform,normal, horizontal=TRUE)
    Error in median(data) : need numeric data

One possibility is to use lattice instead, see e.g.

    library(lattice)
    example(panel.violin)

HTH,
Deepayan
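Along the lines of Deepayan's pointer, a horizontal violin plot in lattice might be sketched as follows. The data are simulated stand-ins for the bimodal, uniform and normal samples in the question, and passing panel.violin directly as the panel function is one simple way to use it; example(panel.violin) shows a more elaborate panel.

```r
library(lattice)

# Simulated stand-ins for the three samples named in the question
set.seed(42)
d <- data.frame(
  value = c(rnorm(100, -2), rnorm(100, 2),   # bimodal
            runif(200, -3, 3),               # uniform
            rnorm(200)),                     # normal
  group = factor(rep(c("bimodal", "uniform", "normal"), each = 200))
)

# Factor on the left of the formula, numeric on the right:
# the violins are drawn horizontally
p <- bwplot(group ~ value, data = d, panel = panel.violin, xlab = "value")
print(p)
```

Swapping the two sides of the formula (value ~ group) would give the usual vertical orientation.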
Re: [R] encrypted RData file?
Yes, it is of interest and was sitting on my todo list at some time. If you want to go ahead and provide code to do it, that would be terrific. There are other areas where encryption would be good to have, so a general mechanism would be nice.

D.

--
Duncan Temple Lang
Department of Statistics            work: (530) 752-4782
371 Kerr Hall                       fax:  (530) 752-7099
One Shields Ave.
University of California at Davis
Davis, CA 95616, USA
Re: [R] RSQLite problems
I encountered this too, and my limited investigation (both on the web and in R) was unable to find a workaround.

-roger

Na Li wrote:

Hi,

I'm experimenting with using (R)SQLite to do data management. Here are two little problems that I've encountered:

1. The presence of ',' in string values causes trouble since ',' is also the delimiter used in the SQL statement.
2. A newline '\n' is attached to the last string value of each row.

Some examples:

    > library (RSQLite)
    Loading required package: DBI
    > sqlite <- dbDriver ("SQLite")
    > db <- dbConnect (sqlite, dbname = "test.dbms")
    > data (barley)
    > dbWriteTable (db, "barley", barley, overwrite = TRUE)
    [1] TRUE
    > barley[1:3,]
         yield   variety year            site
    1 27.0     Manchuria 1931 University Farm
    2 48.86667 Manchuria 1931          Waseca
    3 27.43334 Manchuria 1931          Morris
    > dbReadTable (db, "barley")[1:3,]
         yield   variety year__1              site
    1 27.0     Manchuria    1931 University Farm\n
    2 48.86667 Manchuria    1931          Waseca\n
    3 27.43334 Manchuria    1931          Morris\n
    > barley$site <- as.character (barley$site)
    > barley$site[1] <- "University, Farm"
    > dbWriteTable (db, "barley", barley, overwrite = TRUE)
    Error in sqliteWriteTable(conn, name, value, ...) :
            RS-DBI driver: (RS_sqlite_import: /tmp/RtmpgSNaLn/rsdbi6a5d128c line 1 expected 5 columns of data but found 6)

I'm using RSQLite 0.4.0 with R 2.1.1 on Mac OS X.

Cheers,
Michael

--
Roger D. Peng | http://www.biostat.jhsph.edu/~rpeng/
Re: [R] encrypted RData file?
I would be interested in that, particularly with certain kinds of confidential data. What was the approach you had in mind (if you in fact had one in mind)?

-roger

--
Roger D. Peng | http://www.biostat.jhsph.edu/~rpeng/
[R] Dendrogram for many cases
David,

Sounds as if you're looking for cut.dendrogram(). My solution (with c. 250 cases) has been to color the terminals so patterns can be seen even when there are too many terminals to label. I don't think you can do that easily with plot.hclust() or plot.dendrogram(), so I posted a hacked version of plot.dendrogram() to R-devel last week. Subsequently I was also pointed to a package that does a better job of it than my hacked function. It's called A2R, and is not on CRAN but can be downloaded from: http://addictedtor.free.fr/packages.

Walton

Date: Wed, 26 Oct 2005 11:23:26 +0100
From: David Lucy
Subject: [R] Dendrogram for many cases
To: r-help@stat.math.ethz.ch

Dear All,

I have a cluster object based on a dissimilarity matrix from about 1,100 cases and wish to know whether anyone can think of any tips to display some form of graphical output which would give some sense of the similarity between the cases.

A standard form of dendrogram would be fine, but with so many cases the dendrogram on the standard devices (R-2.20 on NT4) is very compact in the x-dimension. I wonder whether there is any way that the dendrogram can be subdivided into discrete pieces?

Failing that, is there any other means of graphically representing the dissimilarity matrix? I am only interested in the low order dissimilarity rather than high order structure between these cases.

A further constraint is that the NT4 box is well bolted down in that it has no means by which data can be transferred to, or from, it.

Cheers,
David.

--
Walton A. Green
139 Caulkinstown Road, Sharon, Connecticut 06069, (860) 364-5100
P. O. Box 208109, Yale Station, New Haven, Connecticut 06520, (203) 640-8122
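A rough sketch of the cut.dendrogram() route Walton mentions, on simulated data rather than David's 1,100 cases. The cut height (the 95th percentile of the merge heights) and the average-linkage method are arbitrary illustrative choices. cut() on a dendrogram returns the tree above the cut as $upper and a list of the branches below it as $lower, and each branch can then be plotted separately.

```r
# Simulated stand-in: 200 cases with 4 variables
set.seed(1)
x  <- matrix(rnorm(200 * 4), ncol = 4)
hc <- hclust(dist(x), method = "average")
dend <- as.dendrogram(hc)

# Cut near the top of the tree; the height is an illustrative choice
h <- quantile(hc$height, 0.95)
parts <- cut(dend, h = h)        # list with components $upper and $lower

length(parts$lower)              # number of branches below the cut

# Size of each branch, then plot the overview and the largest branch
sizes <- sapply(parts$lower, function(d) attr(d, "members"))
plot(parts$upper, main = "Tree above the cut")
plot(parts$lower[[which.max(sizes)]], main = "Largest branch below the cut")
```

Since only the low-order structure is of interest, looping over parts$lower and writing each branch to its own page of a pdf() device is one way to get legible labels.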
Re: [R] Fitting of Non-Linear Diff Equations and Parameter Estimation
Raja Jayaraman <rajnmsu79 at gmail.com> writes:

Hello Everybody,

I am running R 2.2.0 on Windows XP. I am trying to fit a nonlinear differential equation to data sets which look like this: [SNIP]

and I need to fit these data to the following differential equations:

    dN/dt = a*N - b*N*C
    dC/dt = N^2

where a = birth rate, b = death rate, N = current count, and C = cumulative count. I need to fit the differential equations, solve them, and obtain the parameters a and b. Can someone help with this?

Thanks
Raj

Try looking at the package odesolve for solving the ODE system.

Woody Setzer
National Center for Computational Toxicology
US EPA
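The general recipe (solve the system numerically for trial values of a and b, compare with the data, and let an optimizer adjust the parameters) might be sketched as below. To stay self-contained, this sketch uses a hand-written fixed-step Runge-Kutta integrator in place of the odesolve solver Woody suggests, and it fits simulated data, since the real data were snipped. The starting state, the "true" parameter values, the noise level and the log-scale least-squares criterion are all assumptions made for illustration.

```r
# Right-hand side of the system: dN/dt = a*N - b*N*C, dC/dt = N^2
rhs <- function(state, p) {
  N <- state[[1]]
  C <- state[[2]]
  c(p[["a"]] * N - p[["b"]] * N * C, N^2)
}

# Classical 4th-order Runge-Kutta with fixed substeps between output times
solve_rk4 <- function(p, times, state0, substeps = 20L) {
  out <- matrix(NA_real_, nrow = length(times), ncol = 2,
                dimnames = list(NULL, c("N", "C")))
  state <- unname(state0)
  out[1, ] <- state
  for (i in seq_along(times)[-1]) {
    h <- (times[i] - times[i - 1]) / substeps
    for (s in seq_len(substeps)) {
      k1 <- rhs(state, p)
      k2 <- rhs(state + h / 2 * k1, p)
      k3 <- rhs(state + h / 2 * k2, p)
      k4 <- rhs(state + h * k3, p)
      state <- state + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    }
    out[i, ] <- state
  }
  out
}

# Simulated observations of N (hypothetical values, not the poster's data)
times  <- seq(0, 5, by = 0.25)
state0 <- c(N = 10, C = 0)
truth  <- c(a = 0.5, b = 0.001)
set.seed(123)
obs <- solve_rk4(truth, times, state0)[, "N"] *
  exp(rnorm(length(times), sd = 0.02))

# Sum of squared log-residuals; parameters are optimised on the log scale
# so that a and b stay positive
sse <- function(logp) {
  p <- exp(logp)
  names(p) <- c("a", "b")
  pred <- solve_rk4(p, times, state0)[, "N"]
  if (any(!is.finite(pred)) || any(pred <= 0)) return(1e10)  # guard bad solves
  sum((log(obs) - log(pred))^2)
}

start <- log(c(a = 1, b = 0.01))
fit <- optim(start, sse)      # Nelder-Mead by default
est <- exp(fit$par)
est                           # estimates of a and b
```

With real data one would replace obs, times and state0, and probably swap the hand-written integrator for an adaptive solver; a fixed-step scheme can become inaccurate if the system turns stiff for some parameter values.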
Re: [R] Problems with source() function
Does

    source(textConnection(readLines(url("http://..."))))

give the correct answer? If not, what is being dropped when you just use readLines() and look at the contents of the download? And how long is the longest line?

The RCurl package (http://www.omegahat.org/RCurl) gives you a lot of control in performing and processing HTTP requests, allowing you to control the request and read the body and the header of the response. It may be worth a try if things are getting frustrating.

D.

--
Duncan Temple Lang
Department of Statistics            work: (530) 752-4782
371 Kerr Hall                       fax:  (530) 752-7099
One Shields Ave.
University of California at Davis
Davis, CA 95616, USA
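The diagnostic Duncan proposes can be split into steps so that the downloaded text is inspected before it is evaluated. The sketch below uses a small local file as a stand-in for the PHP output, because the real address is elided in the thread; with the actual server, url("http://...") would take the place of file(src). The list names g1 and g2 are made up.

```r
# Local stand-in for the script output served by the web server
src <- tempfile(fileext = ".R")
writeLines(c("groups <- list()",
             "groups[['g1']] <- c(1, 2, 3)",
             "groups[['g2']] <- c(4, 5)"), src)

# Step 1: fetch the text without evaluating it
con <- file(src, open = "r")
lines <- readLines(con)
close(con)

# Step 2: inspect what actually arrived
length(lines)        # number of lines received
max(nchar(lines))    # length of the longest line
tail(lines, 1)       # is the last statement complete?

# Step 3: evaluate the text only once it looks complete
tc <- textConnection(lines)
source(tc)
close(tc)
length(groups)
```

Comparing length(lines) across repeated downloads would show whether the truncation happens in the transfer itself or later, when the text is parsed.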
[R] installing Rmpi
Hello,

I've installed R on my RHEL3 cluster and I am trying to get Rmpi to work properly. R was configured using the following:

    ./configure --prefix=/home/apps/R-2.2.0

I installed snow using

    R CMD INSTALL /home/apps/snow

and finally Rmpi:

    R CMD INSTALL /home/apps/Rmpi --configure-args=--with-mpi=/path/to/lam

There were no errors or warnings upon installation. However, when I perform the test below I get an error message:

    R> library(snow)
    R> cl <- makeCluster(2)
    Rmpi version: 0.4-9
    Rmpi is an interface (wrapper) to MPI APIs
    with interactive R slave functionalities.
    See `library (help=Rmpi)' for details.
    Error in dyn.load(x, as.logical(local), as.logical(now)) :
            unable to load shared library '/home/apps/R-2.2.0/lib/R/library/Rmpi/libs/Rmpi.so':
      libmpi.so.0: cannot open shared object file: No such file or directory
    Error in dyn.unload(x) : dynamic/shared library '/home/apps/R-2.2.0/lib/R/library/Rmpi/libs/Rmpi.so' was not loaded
    Error in makeMPIcluster(spec, ...) : the `Rmpi' package is needed for MPI clusters.

Rmpi.so exists and the permissions are fine. I also did a lamboot, and it is running fine in the background as well. Any suggestions?

Thanks.
Re: [R] its dates masked by chron
Peter Dalgaard wrote:

Omar Lakkis writes:

To redescribe the problem: I need to use dates from its.

    its depends on Hmisc
    Hmisc depends on chron
    dates in chron masks dates in its

So use its::dates ...

... or ask the package maintainer (which might be a hard task: the package currently appears to be more or less unmaintained) to fix this probably unintended behaviour.

Uwe Ligges

-- Forwarded message --
From: Omar Lakkis
Date: Oct 27, 2005 11:47 AM
Subject: its dates masked by chron
To: r-help@stat.math.ethz.ch

I built R 2.2.0 from source on my Debian machine yesterday and updated all packages. My problem is that the dates function from its, which my code heavily uses, is now masked by dates from chron. How can I specify that I want to use dates from its, or how can I prevent it from being masked?

    > library(its)
    Loading required package: Hmisc
    Hmisc library by Frank E Harrell Jr

    Type library(help='Hmisc'), ?Overview, or ?Hmisc.Overview')
    to see overall documentation.

    NOTE:Hmisc no longer redefines [.factor to drop unused levels when
    subsetting. To get the old behavior of Hmisc type dropUnusedLevels().

    Attaching package: 'Hmisc'

            The following object(s) are masked from package:stats :

             ecdf

    Attaching package: 'chron'

            The following object(s) are masked from package:its :

             dates
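The its::dates suggestion relies on the :: operator, which looks a name up among the exports of a package's namespace and therefore ignores the order of the search path. Since its and chron may not be installed, the sketch below illustrates the same mechanism with a deliberately masked sd() from the stats package; the masking function is made up for the demonstration.

```r
# Simulate masking: a later definition of `sd` hides the one from stats
sd <- function(x, ...) "masked version"

masked <- sd(c(1, 2, 3))          # finds the masking function first
real   <- stats::sd(c(1, 2, 3))   # goes straight to the stats export

masked
real

# List the places on the search path that define the name
find("sd")

rm(sd)                            # remove the masking definition again
```

This works for any package that has a namespace and exports the function in question, which is the situation assumed for its in the advice above.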
Re: [R] its dates masked by chron
On Thu, 27 Oct 2005, Uwe Ligges wrote:

Peter Dalgaard wrote:

Omar Lakkis writes:

To redescribe the problem: I need to use dates from its.

    its depends on Hmisc
    Hmisc depends on chron
    dates in chron masks dates in its

So use its::dates ...

... or ask the package maintainer (which might be a hard task: the package currently appears to be more or less unmaintained) to fix this probably unintended behaviour.

If he can reproduce it: I cannot.

Hmisc does not say it depends on chron according to its DESCRIPTION file, so I don't see where the idea comes from. A misreading of http://cran.r-project.org/src/contrib/Descriptions/Hmisc.html ? Certainly Hmisc does not load chron (or anything else) on any of my systems.

Further, dependencies are loaded *before* the package in question. After library(its) I get

    > search()
    [1] ".GlobalEnv"      "package:its"     "package:Hmisc"
    [4] "package:methods" "package:stats"   "package:graphics"
    ...

One way out would be to load chron, Hmisc and then its in that order, but are these really the current its (1.0.9) and Hmisc (3.0-7)?

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
Re: [R] its dates masked by chron
Uwe,

It was unclear whether you were referring to chron or its as being unmaintained. I still maintain its, and I'm actually releasing a new version tonight since Kurt has pointed out that the current version is failing package checking.

It seems that both its and chron use namespaces. I thought the intent of namespaces was to prevent problems like this.

If there are namespace experts out there who can suggest a fix to this problem, I'm happy to put it into the next release.

-Whit
Re: [R] AOV with repeated measures
You probably need to specify the repeated measures by using an Error term in aov:

    aov(trait ~ species + strain + Error(species/strain))

Take a look at Ripley's book. Treat the above with caution: I am no expert, but the answer is in that direction...

Michael Jerosch-Herold

I have a question on using R to analyze data with repeated measurements. I have 2 species with several strains (12) per species, each of which has been measured twice for a given trait. No particular covariance, just two measures. Now I want to analyze the data with an ANOVA (aov) considering these repeated measures to get the MSq and SSq for the species and strain level. I would like to know how to write the ANOVA model in R. I have done the following:

    aov(trait ~ species + strain/replicate)

Is it accurate?

Thanks a lot,
Christian
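To make the idea of an Error term concrete, here is a hedged sketch on simulated data with the same layout as in the question (2 species, 12 strains per species, 2 measurements per strain). It is not the formula from the reply above: it assumes strains are labelled uniquely across species and treats each strain as the unit that is measured twice, so that species is tested against the variation between strains. Whether that is the right model for the real data is a judgment call for the analyst.

```r
# Simulated layout: 2 species x 12 strains each x 2 measurements
set.seed(7)
d <- expand.grid(rep = 1:2, strain_in_species = 1:12,
                 species = c("sp1", "sp2"))
d$species <- factor(d$species)
d$strain  <- factor(paste(d$species, d$strain_in_species, sep = "_"))

# Made-up effects: a small species difference, strain-level variation,
# and measurement noise
strain_eff <- rnorm(nlevels(d$strain), sd = 1)
d$trait <- 10 + 0.5 * (d$species == "sp2") +
  strain_eff[as.integer(d$strain)] + rnorm(nrow(d), sd = 0.5)

# Strains define the error stratum; the two measurements per strain
# end up in the "Within" stratum
fit <- aov(trait ~ species + Error(strain), data = d)
summary(fit)
```

The summary prints one ANOVA table per stratum, with the sums of squares and mean squares for species in the strain stratum and the measurement-level residual in the Within stratum.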
[R] tree widget question
I'm trying to create an app using TclTk and R. Can someone please explain how I bind a click event to the tree widget (http://bioinf.wehi.edu.au/~wettenhall/RTclTkExamples/TreeWidget.html)? Ideally I'd like to bind to particular elements in the tree, but tkbind doesn't seem to work.

thanks
tom
[R] syntax of nlme with nesting
This may appear too elementary to some on this list, but not to me. My apologies if this is the case. I have mastered the lme function but the nlme function has me stumped. I am attempting to fit a nonlinear mixed model with 4 levels of nesting. I am getting a cryptic error message and do not know what is wrong with the syntax of the call. This is the call:

nlme(Photosynthese ~ NRhyperbola(Irr, theta, Am, alpha, Rd),
     fixed = theta + Am + alpha + Rd ~ 1,
     random = theta ~ 1 | Reference/Espece/Plante/Groupe,
     data = lit.data)

NRhyperbola is a self-starting function with one variable (Irr) and four parameters (theta, Am, alpha, Rd). The data set (lit.data) contains Photosynthese (dependent variable) and Irr, as well as the grouping structure, which is Reference, Espece nested in Reference, Plante nested in Espece and Groupe nested in Plante. I want to allow only the parameter theta to vary randomly.

I get the following error message:

Error: subscript out of bounds

What does this mean? There are some Plante for which there is only one Groupe, some Espece for which there is only one Plante, etc. Is this the source of the error? If so, how can one solve this?

Bill Shipley
Re: [R] memory problem in handling large dataset
An alternative could be to store the data in a MySQL database and then select a sample of the cases using the RODBC package.

Best
Søren

From: [EMAIL PROTECTED] on behalf of Liaw, Andy
Sent: Thu 27-10-2005 19:21
To: 'Berton Gunter'; 'Weiwei Shi'; 'r-help'
Subject: Re: [R] memory problem in handling large dataset

If my calculation is correct (very doubtful, sometimes), that's

1.7e9 * (300 * 8 + 50 * 4) / 1024^3
[1] 4116.446

or over 4 terabytes, just to store the data in memory.

To sample rows and read that into R, Bert's suggestion of using connections, perhaps along with seek() for skipping ahead, would be what I'd try. I had tried to do such things in Python as a chance to learn that language, but I found that operationally it's easier to maintain the project by doing everything in one language, namely R, if possible.

Andy

From: Berton Gunter

I think the general advice is that around 1/4 or 1/3 of your available memory is about the largest data set that R can handle -- and often considerably less depending upon what you do and how you do it (because R's semantics require explicitly copying objects rather than passing pointers). Fancy tricks using environments might enable you to do better, but that requires advice from a true guru, which I ain't.

See ?connections, ?scan, ?seek for reading in a file a chunk at a time from a connection, thus enabling you to sample one line of data from each chunk, say. I suppose you could do this directly with repeated calls to scan() or read.table() by skipping more and more lines at the beginning at each call, but I assume that is horridly inefficient and would take forever.

HTH.

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA

"The business of the statistician is to catalyze the scientific learning process." - George E. P. Box

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Weiwei Shi
Sent: Thursday, October 27, 2005 9:28 AM
To: r-help
Subject: [R] memory problem in handling large dataset

Dear Listers:

I have a question on handling a large dataset. I searched R-Search and I hope I can get more information as to my specific case.

First, my dataset has 1.7 billion observations and 350 variables, among which 300 are float and 50 are integers. My system is a Linux box with 8 GB of memory and a 64-bit CPU (currently, we don't plan to buy more memory).

R.version
platform i686-redhat-linux-gnu
arch     i686
os       linux-gnu
system   i686, linux-gnu
status
major    2
minor    1.1
year     2005
month    06
day      20
language R

If I want to do some analysis, for example randomForest, on a dataset, what is the maximum number of observations I can load and still have the machine run smoothly?

After figuring out that number, I want to do some sampling first, but I did not find that read.table or scan can do this. I guess I can load it into MySQL and then use RMySQL to do the sampling, or use Python to subset the data first.

My question is, is there a way I can subsample directly from the file just using R?

Thanks,

-- Weiwei Shi, Ph.D

"Did you always know?" "No, I did not. But I believed..." ---Matrix III
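The "one line from each chunk" idea mentioned in this thread can be sketched in base R roughly as follows. The demo file, the function name sample_rows and the chunk size are all made up for illustration; a real 1.7-billion-row file would need a much larger chunk size and probably explicit column types.

```r
# Write a small demo file (a stand-in for the real, much larger one).
path <- tempfile(fileext = ".txt")
writeLines(sprintf("%d,%f", 1:1000, (1:1000) / 10), path)

# Read the file chunk by chunk through an open connection and keep one
# randomly chosen line per chunk, so the whole file is never in memory.
sample_rows <- function(path, chunk_size, seed = 1) {
  set.seed(seed)
  con <- file(path, open = "r")
  on.exit(close(con))
  picked <- character(0)
  repeat {
    chunk <- readLines(con, n = chunk_size)
    if (length(chunk) == 0) break          # end of file reached
    picked <- c(picked, chunk[sample.int(length(chunk), 1)])
  }
  # Parse only the sampled lines into a data frame.
  read.csv(text = picked, header = FALSE)
}

smp <- sample_rows(path, chunk_size = 100)
nrow(smp)   # one row per chunk
```

Because the connection stays open between readLines() calls, each call continues where the previous one stopped, which avoids the repeated skipping that was described as inefficient above.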
Re: [R] encrypted RData file?
On 27 Oct 2005, Duncan Temple Lang wrote:

Yes, it is of interest and was sitting on my todo list at some time. If you want to go ahead and provide code to do it, that would be terrific. There are other areas where encryption would be good to have, so a general mechanism would be nice.

D.

Na Li wrote:

Hi, I wonder if there is interest/intention to allow for encrypted .RData files? One can certainly do that outside R manually but that will leave a decrypted RData file somewhere which one has to remember to delete.

I was hoping someone has already done it. ;-(

One possibility is to implement an interface package to the gpgme library, which itself is an interface to GnuPG. But I'm not sure how the input of the passphrase can be handled without using clear text.

Michael
[R] installing Rmpi
Not sure if this message was sent the first time, sorry :)

---------- Forwarded message ----------
From: Jon Savian [EMAIL PROTECTED]
Date: Oct 27, 2005 1:04 PM
Subject: installing Rmpi
To: r-help@stat.math.ethz.ch

Hello,

I've installed R on my RHEL3 cluster and I am trying to get Rmpi to work properly. R is installed using the following:

./configure --prefix=/home/apps/R-2.2.0

I installed snow using

R CMD INSTALL /home/apps/snow

and finally Rmpi:

R CMD INSTALL /home/apps/Rmpi --configure-args=--with-mpi=/path/to/lam

There were no errors or warnings upon installation. However, when I perform the test below I get an error message:

R> library(snow)
R> cl <- makeCluster(2)
Rmpi version: 0.4-9
Rmpi is an interface (wrapper) to MPI APIs with interactive R slave functionalities.
See `library (help=Rmpi)' for details.
Error in dyn.load(x, as.logical(local), as.logical(now)) : unable to load shared library '/home/apps/R-2.2.0/lib/R/library/Rmpi/libs/Rmpi.so': libmpi.so.0: cannot open shared object file: No such file or directory
Error in dyn.unload(x) : dynamic/shared library '/home/apps/R-2.2.0/lib/R/library/Rmpi/libs/Rmpi.so' was not loaded
Error in makeMPIcluster(spec, ...) : the `Rmpi' package is needed for MPI clusters.

The Rmpi.so exists and the permissions are fine. I also did a lamboot, and it's running in the background fine as well. Any suggestions?

Thanks.
Re: [R] outer-question
You want FAQ 7.17 "Why does outer() behave strangely with my function?"

-thomas

On Thu, 27 Oct 2005, Rau, Roland wrote:

Dear all,

This is a rather lengthy message, but I don't know what I did wrong in my real example since the simple code works. I have two variables a, b and a function f which I would like to evaluate for all possible combinations of the values of a and b. If f is multiplication, I would simply do:

a <- 1:5
b <- 1:5
outer(a, b)

## A bit more complicated is this:
f <- function(a, b, d) {
  return(a * b + (sum(d)))
}
additional <- runif(100)
outer(X = a, Y = b, FUN = f, d = additional)

## So far so good. But now my real example. I would like to plot the
## log-likelihood surface for two parameters alpha and beta of
## a Gompertz distribution with given data

### I have a function to generate random numbers from a Gompertz distribution
### (using the 'inversion method')
random.gomp <- function(n, alpha, beta) {
  return( (log(1 - (beta/alpha * log(1 - runif(n))))) / beta )
}

## Now I generate some 'lifetimes'
no.people <- 1000
al <- 0.1
bet <- 0.1
lifetimes <- random.gomp(n = no.people, alpha = al, beta = bet)

### Since I neither have censoring nor truncation in this simple case,
### the log-likelihood should be simply the sum of the log of
### the densities (following the parametrization of Klein/Moeschberger
### Survival Analysis, p. 38)
loggomp <- function(alphas, betas, timep) {
  return(sum(log(alphas) + betas * timep + (alphas/betas * (1 - exp(betas * timep)))))
}

### Now I thought I could obtain a matrix of the log-likelihood surface
### by specifying possible values for alpha and beta with the given data.
### I was able to produce this matrix with two for-loops. But I thought
### I could also use 'outer' in this case.
### This is what I tried:
possible.alphas <- seq(from = 0.05, to = 0.15, length = 30)
possible.betas <- seq(from = 0.05, to = 0.15, length = 30)
outer(X = possible.alphas, Y = possible.betas, FUN = loggomp, timep = lifetimes)

### But the result is:
outer(X = possible.alphas, Y = possible.betas, FUN = loggomp, timep = lifetimes)
Error in outer(X = possible.alphas, Y = possible.betas, FUN = loggomp, :
  dim<- : dims [product 900] do not match the length of object [1]
In addition: Warning messages: ...

### Can somebody give me a hint where the problem is?
### I checked my definition of 'loggomp' but I thought this looks fine:
loggomp(alphas = possible.alphas[1], betas = possible.betas[1], timep = lifetimes)
loggomp(alphas = possible.alphas[4], betas = possible.betas[10], timep = lifetimes)
loggomp(alphas = possible.alphas[3], betas = possible.betas[11], timep = lifetimes)

### I'd appreciate any kind of advice.
### Thanks a lot in advance.
### Roland

Thomas Lumley
Assoc. Professor, Biostatistics
[EMAIL PROTECTED]
University of Washington, Seattle
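To spell out what the FAQ entry is getting at: outer() calls FUN once with two long vectors and expects a result of the same length, whereas loggomp collapses everything to a single number via sum(). One possible fix, sketched below under the assumption that Vectorize() is available in your version of R, is to wrap the function so it is evaluated pair by pair while timep is passed through whole. The function bodies repeat the ones from the question; the name loggomp.vec is mine.

```r
set.seed(42)

# Gompertz random numbers by inversion, as in the question.
random.gomp <- function(n, alpha, beta) {
  log(1 - (beta / alpha) * log(1 - runif(n))) / beta
}

# Log-likelihood for one (alpha, beta) pair: returns a single number.
loggomp <- function(alphas, betas, timep) {
  sum(log(alphas) + betas * timep + (alphas / betas) * (1 - exp(betas * timep)))
}

lifetimes <- random.gomp(n = 1000, alpha = 0.1, beta = 0.1)
possible.alphas <- seq(from = 0.05, to = 0.15, length.out = 30)
possible.betas  <- seq(from = 0.05, to = 0.15, length.out = 30)

# Vectorize() loops over alphas and betas in parallel; timep is not
# vectorized, so every pair sees the full vector of lifetimes.
loggomp.vec <- Vectorize(loggomp, vectorize.args = c("alphas", "betas"))

surface <- outer(possible.alphas, possible.betas, loggomp.vec, timep = lifetimes)
dim(surface)
```

The resulting matrix can then be handed to contour() or persp() with possible.alphas and possible.betas as the axes.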
Re: [R] tree widget question
tom wright [EMAIL PROTECTED] writes:

I'm trying to create an app using TclTk and R. Can someone please explain how I bind a click event to the tree widget (http://bioinf.wehi.edu.au/~wettenhall/RTclTkExamples/TreeWidget.html)? Ideally I'd like to bind to particular elements in the tree, but tkbind doesn't seem to work.

You need to study the docs for the Tree widget in the BWidget package, e.g. via http://tcllib.sourceforge.net/BWman/Tree.html. I think that you can do something like

tcl(mytree, "bindText", "<Button-1>", function(node) print(node))

(As multiple hints in the wording should tell you, this is completely untested)

--
   O__      Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics    PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen  Denmark      Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])                            FAX: (+45) 35327907
[R] How to manipulate an abitrary dimensioned array.
If I have an n1 x n1 x 2 array X I can calculate, say, X[,,1]/X[,,2]. If it is a 4-dimensional array then I want to be able to calculate X[,,,1]/X[,,,2], and similarly for higher dimensions.

How can I write a function to do this in a general way without having to do a switch for each possible length(dim(X))? So I want a function g that will take an arbitrarily dimensioned array, X, and return X[,,,1]/X[,,,2], etc.

I know how to do this by turning X into a vector, then doing the division, then re-shaping as an array, but that doesn't seem very elegant. What I think I am missing is how to paste/substitute/eval a bunch of commas into an array selection.

Thanks, --Mike

-- Mike Meyer, Seattle WA
Re: [R] encrypted RData file?
On Thu, 2005-10-27 at 16:15 -0500, Na Li wrote:

On 27 Oct 2005, Duncan Temple Lang wrote:

Yes, it is of interest and was sitting on my todo list at some time. If you want to go ahead and provide code to do it, that would be terrific. There are other areas where encryption would be good to have, so a general mechanism would be nice.

D.

Na Li wrote:

Hi, I wonder if there is interest/intention to allow for encrypted .RData files? One can certainly do that outside R manually but that will leave a decrypted RData file somewhere which one has to remember to delete.

I was hoping someone has already done it. ;-(

One possibility is to implement an interface package to the gpgme library, which itself is an interface to GnuPG. But I'm not sure how the input of the passphrase can be handled without using clear text.

Michael

Seems to me that a better option would be to encrypt the full partition such that (unless you write the files to a non-encrypted partition) these issues are transparent. This would cover the use of save(), save.image() and write() type functions, which could otherwise save what was an encrypted dataset/object to an unencrypted file. Of course, you would also have to encrypt the swap and tmp partitions (as appropriate) for similar reasons.

On Linuxen/Unixen, full encryption of partitions is available via loopback devices and other mechanisms, and some distros have this available as a built-in option. I believe that the FC folks finally have this on their list of functional additions for FC5. Windows of course can do something similar.

The other consideration here is that if R Core builds in some form of encryption, there is the potential for import/export restrictions on such technology, since R is available via international CRAN mirrors. It may be best to provide for a plug-in encryption black box of sorts, so that folks can use a particular encryption schema that meets various legal/regulatory requirements.

Of course, simply encrypting the file or even a complete partition has to be considered within a larger security strategy (i.e. network security, physical access control, etc.) that meets a particular functional requirement (such as HIPAA here in the U.S.).

HTH,

Marc Schwartz
Re: [R] encrypted RData file?
On 27 Oct 2005, Marc Schwartz uttered the following:

Seems to me that a better option would be to encrypt the full partition such that (unless you write the files to a non-encrypted partition) these issues are transparent.

I actually do that on a Mac via an encrypted sparse disk image. But I may occasionally need to transfer some files to other people or put them on a machine without such support. Also, the encryption options are quite limited.

Michael
Re: [R] How to manipulate an abitrary dimensioned array.
Why doesn't apply() already do what you want?

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA

"The business of the statistician is to catalyze the scientific learning process." - George E. P. Box

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Mike Meyer
Sent: Thursday, October 27, 2005 2:50 PM
To: r-help@stat.math.ethz.ch
Subject: [R] How to manipulate an abitrary dimensioned array.

If I have an n1 x n1 x 2 array X I can calculate, say, X[,,1]/X[,,2]. If it is a 4 dimensional array then I want to be able to calculate X[,,,1]/X[,,,2], and similarly for higher dimensions. How can I write a function to do this in a general way without having to do a switch for each possible length(dim(X)). So I want a function g that will take an arbitrary dimensioned array, X, and return X[,,,1]/X[,,,2], etc. I know how to do this by turning X into a vector, then doing the division, then re-shaping as an array, but that doesn't seem very elegant. What I think I am missing is how to paste/substitute/eval a bunch of commas into an array selection.

Thanks, --Mike

-- Mike Meyer, Seattle WA
Re: [R] How to manipulate an abitrary dimensioned array.
Thanks for the suggestion. Perhaps I can see how to use apply to get the ratio, but say I also want to return X[1] in a general way. Maybe I am being dense but I just don't see it --- probably as a result of too much Perl/Python/Java recently that is clouding my mind. So can someone suggest a general function that will give me the last layer of an arbitrary dimensioned array?

Berton Gunter wrote:

Why doesn't apply() already do what you want?

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA

"The business of the statistician is to catalyze the scientific learning process." - George E. P. Box

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Mike Meyer
Sent: Thursday, October 27, 2005 2:50 PM
To: r-help@stat.math.ethz.ch
Subject: [R] How to manipulate an abitrary dimensioned array.

If I have an n1 x n1 x 2 array X I can calculate, say, X[,,1]/X[,,2]. If it is a 4 dimensional array then I want to be able to calculate X[,,,1]/X[,,,2], and similarly for higher dimensions. How can I write a function to do this in a general way without having to do a switch for each possible length(dim(X)). So I want a function g that will take an arbitrary dimensioned array, X, and return X[,,,1]/X[,,,2], etc. I know how to do this by turning X into a vector, then doing the division, then re-shaping as an array, but that doesn't seem very elegant. What I think I am missing is how to paste/substitute/eval a bunch of commas into an array selection.

Thanks, --Mike

-- Mike Meyer, Seattle WA
Re: [R] encrypted RData file?
Na Li wrote:

On 27 Oct 2005, Duncan Temple Lang wrote:

Yes, it is of interest and was sitting on my todo list at some time. If you want to go ahead and provide code to do it, that would be terrific. There are other areas where encryption would be good to have, so a general mechanism would be nice.

D.

Na Li wrote:

Hi, I wonder if there is interest/intention to allow for encrypted .RData files? One can certainly do that outside R manually but that will leave a decrypted RData file somewhere which one has to remember to delete.

I was hoping someone has already done it. ;-(

Me too.

One possibility is to implement an interface package to the gpgme library, which itself is an interface to GnuPG. But I'm not sure how the input of the passphrase can be handled without using clear text.

For the Unix-like operating systems, a simple thing that we can use is to call gpg as a system program. When we save a file, we can put it in R's temporary directory, which is readable only by the owner of the R process. Then we call gpg to encrypt it and put the resulting file in the appropriate directory. Similarly, when loading, we can decrypt into this secure area, load the file in the usual way and throw away the decrypted version.

This is definitely the poor man's version and one that I don't like, as it uses the file system. But it will get us around the import/export restrictions that Luke Tierney immediately raised, with no problems. It is also the mechanism the gpg package in emacs uses (except they just use the current directory, regardless of whether it is readable by anyone else). And I have just written a very simple prototype that does this for R.

Interfacing to a library is the way to go, and I might get to that soon, but it requires that we do it in a way that does not put any encryption code into the R source. I can see 2 options offhand, but some more thought is necessary.

And if anyone wants to volunteer to write this, that would be much better than me doing it, for a variety of different reasons.

Michael

-- Duncan Temple Lang [EMAIL PROTECTED]
Department of Statistics    work: (530) 752-4782
371 Kerr Hall               fax: (530) 752-7099
One Shields Ave.
University of California at Davis
Davis, CA 95616, USA
[R] How to make labels on my dendrogam look more clear and visible
Dear group,

I have a matrix with readings for ~180 variables observed in 240 conditions. I am doing hierarchical clustering (hclust) by calculating Euclidean distances among them. When I plot the dendrogram from hclust, all my variables at the end of the branches are cluttered. I cannot read them properly. I tried using:

x11(width = 100, height = 70, pointsize = 10)
plot(mydat.hcluster)

and also

x11(width = 1000, height = 300, pointsize = 10)
plot(mydat.hcluster)

I could not make the dendrogram branches go wide and make the variables at the end of the branches more legible. Can anyone please help me to make a good diagram so that I can see the labels at the end of the branches more clearly?

Thank you.

cheers
Sri
Re: [R] How to manipulate an abitrary dimensioned array.
Not sure what you're after, but the kth dimension of an array y can be obtained as apply(y, k, c). Each column of the resulting matrix can then be dimensioned, if you like, via dim(y)[-k].

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA

"The business of the statistician is to catalyze the scientific learning process." - George E. P. Box

-----Original Message-----
From: Mike Meyer [mailto:[EMAIL PROTECTED]
Sent: Thursday, October 27, 2005 3:43 PM
To: Berton Gunter
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] How to manipulate an abitrary dimensioned array.

Thanks for the suggestion. Perhaps I can see how to use apply to get the ratio, but say I also want to return X[1] in a general way. Maybe I am being dense but I just don't see it --- probably as a result of too much Perl/Python/Java recently that is clouding my mind. So can someone suggest a general function that will give me the last layer of an arbitrary dimensioned array?

Berton Gunter wrote:

Why doesn't apply() already do what you want?

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA

"The business of the statistician is to catalyze the scientific learning process." - George E. P. Box

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Mike Meyer
Sent: Thursday, October 27, 2005 2:50 PM
To: r-help@stat.math.ethz.ch
Subject: [R] How to manipulate an abitrary dimensioned array.

If I have an n1 x n1 x 2 array X I can calculate, say, X[,,1]/X[,,2]. If it is a 4 dimensional array then I want to be able to calculate X[,,,1]/X[,,,2], and similarly for higher dimensions. How can I write a function to do this in a general way without having to do a switch for each possible length(dim(X)). So I want a function g that will take an arbitrary dimensioned array, X, and return X[,,,1]/X[,,,2], etc. I know how to do this by turning X into a vector, then doing the division, then re-shaping as an array, but that doesn't seem very elegant. What I think I am missing is how to paste/substitute/eval a bunch of commas into an array selection.

Thanks, --Mike

-- Mike Meyer, Seattle WA
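For the "bunch of commas" part of the question, one way that avoids paste/eval altogether is to build the index list programmatically and hand it to `[` through do.call(). This is only a sketch of that idea, not something proposed in the thread; the names last_slice and g are made up here.

```r
# Return X[, , ..., i]: everything in the leading dimensions, layer i of
# the last one. TRUE as an index selects a whole dimension, which plays
# the role of an empty slot between commas.
last_slice <- function(X, i) {
  n <- length(dim(X))
  args <- c(list(X), rep(list(TRUE), n - 1), list(i))
  do.call(`[`, args)
}

# The ratio asked for in the original post, for any number of dimensions.
g <- function(X) last_slice(X, 1) / last_slice(X, 2)

X3 <- array(1:18, dim = c(3, 3, 2))
X4 <- array(1:24, dim = c(2, 3, 2, 2))

identical(g(X3), X3[, , 1] / X3[, , 2])
identical(g(X4), X4[, , , 1] / X4[, , , 2])
```

Calling last_slice(X, dim(X)[length(dim(X))]) gives the "last layer" that was asked about in the follow-up.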
Re: [R] How to make labels on my dendrogam look more clear and visible
On Thu, 27 Oct 2005 16:08:48 -0700 (PDT) Srinivas Iyyer [EMAIL PROTECTED] wrote:

Dear group,

I have a matrix with readings for ~180 variables observed in 240 conditions. I am doing hierarchical clustering (hclust) by calculating Euclidean distances among them. When I plot the dendrogram from hclust, all my variables at the end of the branches are cluttered. I cannot read them properly. I tried using:

x11(width = 100, height = 70, pointsize = 10)
plot(mydat.hcluster)

and also

x11(width = 1000, height = 300, pointsize = 10)
plot(mydat.hcluster)

I could not make the dendrogram branches go wide and make the variables at the end of the branches more legible. Can anyone please help me to make a good diagram so that I can see the labels at the end of the branches more clearly?

Thank you.

cheers
Sri

I don't know if it'll help, but I've grown fond of postscript graphs. For example...

postscript("mydendogram.ps", height=800, width=2000, pointsize=[])
plot(mydendogram)
dev.off()

gives me very clear print, and then the ps2pdf app can turn it into a pdf for e-mailing or import to a presentation.

jon b
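Along the same lines, here is a small self-contained sketch using the pdf() device instead of postscript(). The simulated matrix, the file name and the particular width, height and cex values are all assumptions chosen for illustration; they would need tuning for the real data.

```r
set.seed(1)

# Simulated stand-in for the real data: 180 variables x 240 conditions.
m <- matrix(rnorm(180 * 240), nrow = 180,
            dimnames = list(paste0("var", 1:180), NULL))
hc <- hclust(dist(m))            # dist() uses Euclidean distance by default

out <- tempfile(fileext = ".pdf")
pdf(out, width = 30, height = 8) # device size is given in inches
plot(hc, cex = 0.6, hang = -1)   # cex shrinks the labels, hang = -1 aligns them
dev.off()

file.exists(out)
```

A wide, vector-format page combined with a smaller cex usually leaves enough room per leaf for 180 labels, and the result can be zoomed in a PDF viewer without losing sharpness.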
Re: [R] how to predict with logistic model in package logistf ?
fit$predict does print the fitted values for the training data frame, but what I want is to apply the fitted model to new incoming data. Maybe I can form the equation manually.

Thank you!
jinlong

From: Elizabeth Lawson [EMAIL PROTECTED]
To: jinlong li [EMAIL PROTECTED]
CC: [EMAIL PROTECTED]
Subject: Re: [R] how to predict with logistic model in package logistf ?
Date: Thu, 27 Oct 2005 08:19:49 -0700 (PDT)

Did you try fit$predict?

Elizabeth Lawson

jinlong li [EMAIL PROTECTED] wrote:

Dear community,

I am a beginner in R, and can't predict with a logistic model in package logistf. Could anyone help me? Thanks! The following are my commands and the result:

library(logistf)
data(sex2)
fit <- logistf(case ~ age+oc+vic+vicl+vis+dia, data=sex2)
predict(fit, newdata=sex2)
Error in predict(fit, newdata = sex2) : no applicable method for predict
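Forming the equation manually, as suggested above, amounts to multiplying the model matrix of the new data by the coefficient vector and applying the inverse logit. The sketch below shows that step with a helper called predict_logistic, which is a made-up name. To keep it runnable without logistf installed, it takes its coefficients from an ordinary glm() fit on simulated data; I am assuming that a logistf fit exposes its estimates as fit$coefficients with matching names, in which case the same helper should apply.

```r
# Manual prediction for a logistic model: plogis(X %*% beta).
predict_logistic <- function(coefs, formula, newdata) {
  rhs <- delete.response(terms(formula))    # drop the response, keep predictors
  X <- model.matrix(rhs, data = newdata)    # includes the intercept column
  drop(plogis(X %*% coefs[colnames(X)]))    # match coefficients by name
}

set.seed(1)
train <- data.frame(x1 = rnorm(50), x2 = rbinom(50, 1, 0.5))
train$y <- rbinom(50, 1, plogis(0.5 * train$x1 - train$x2))

# Stand-in fit; with logistf one would pass fit$coefficients instead (assumption).
fit <- glm(y ~ x1 + x2, family = binomial, data = train)

newdata <- data.frame(x1 = c(-1, 0, 1), x2 = c(0, 1, 0))
p <- predict_logistic(coef(fit), y ~ x1 + x2, newdata)

# Agrees with the built-in method where one exists.
all.equal(unname(p), unname(predict(fit, newdata, type = "response")))
```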
[R] question about sm.density
How can I draw a 95% contour in sm.density? For example,

y <- cbind(rnorm(50), rnorm(50))
sm.density(y, display = "slice")

will give 25%, 50% and 75% contours automatically, but there is no reference on other values.
[R] MCMC in R
Dear R-helpers,

Hi all! I'm doing a project which needs MCMC simulation. I wonder whether there are related packages in R. The only one I know of is the MCMCpack package. What I want to do is implement Gibbs sampling and the Metropolis-Hastings algorithm to get the posterior of hierarchical Bayesian models.

Thanks in advance.

Jun
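For small problems the Metropolis-Hastings step can also be written directly in base R. The following is a bare-bones random-walk Metropolis sketch for a single parameter; the toy model (normal data with known variance and a vague normal prior on the mean), the function name metropolis and the tuning values are all assumptions for illustration, not a recommendation over a dedicated package.

```r
# Random-walk Metropolis for a scalar parameter.
# log_post: function returning the log posterior density (up to a constant).
metropolis <- function(log_post, init, n_iter, step) {
  draws <- numeric(n_iter)
  cur <- init
  cur_lp <- log_post(cur)
  for (i in seq_len(n_iter)) {
    prop <- cur + rnorm(1, sd = step)      # symmetric proposal
    prop_lp <- log_post(prop)
    # Accept with probability min(1, posterior ratio), on the log scale.
    if (log(runif(1)) < prop_lp - cur_lp) {
      cur <- prop
      cur_lp <- prop_lp
    }
    draws[i] <- cur
  }
  draws
}

set.seed(123)
y <- rnorm(30, mean = 2, sd = 1)           # toy data

# Log posterior for the mean: N(mu, 1) likelihood, N(0, 10^2) prior.
log_post <- function(mu) {
  sum(dnorm(y, mu, 1, log = TRUE)) + dnorm(mu, 0, 10, log = TRUE)
}

draws <- metropolis(log_post, init = 0, n_iter = 5000, step = 0.5)
mean(draws[-(1:1000)])                     # posterior mean after burn-in
```

For this conjugate toy model the exact posterior mean is available in closed form, so the sampler's output can be checked against it before moving on to a hierarchical model.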
Re: [R] RSQLite problems
Hi,

Thanks for reporting the two problems. I'm attaching a simple update to two functions that will allow you to specify a different separator, e.g., using your example:

    dbWriteTable(con, "barley", barley, overwrite = TRUE, sep = ";")

This workaround still relies on dumping the data.frame into a temporary file and then importing it into SQLite; using prepared statements (which SQLite 3 supports) will require some more work. I'll look into the problem with the trailing newline soon.

--
David

Na Li wrote:

Hi, I'm experimenting with using (R)SQLite to do data management. Here are two little problems that I've encountered:

1. The presence of ',' in string values causes trouble, since ',' is also the delimiter used in the SQL statement.
2. A newline '\n' is attached to the last string value of each row.

Some examples:

    > library (RSQLite)
    Loading required package: DBI
    > sqlite <- dbDriver ("SQLite")
    > db <- dbConnect (sqlite, dbname = "test.dbms")
    > data (barley)
    > dbWriteTable (db, "barley", barley, overwrite = TRUE)
    [1] TRUE
    > barley[1:3,]
         yield   variety year            site
    1 27.00000 Manchuria 1931 University Farm
    2 48.86667 Manchuria 1931          Waseca
    3 27.43334 Manchuria 1931          Morris
    > dbReadTable (db, "barley")[1:3,]
         yield   variety year__1              site
    1 27.00000 Manchuria    1931 University Farm\n
    2 48.86667 Manchuria    1931          Waseca\n
    3 27.43334 Manchuria    1931          Morris\n
    > barley$site <- as.character (barley$site)
    > barley$site[1] <- "University, Farm"
    > dbWriteTable (db, "barley", barley, overwrite = TRUE)
    Error in sqliteWriteTable(conn, name, value, ...) :
      RS-DBI driver: (RS_sqlite_import: /tmp/RtmpgSNaLn/rsdbi6a5d128c line 1 expected 5 columns of data but found 6)

I'm using RSQLite 0.4.0 with R 2.1.1 on Mac OS X.

Cheers,
Michael

    safe.write <- function (value, file, batch, ..., sep = ",", eol = "\n",
                            quote.string = FALSE)
    {
        N <- nrow(value)
        if (N < 1) {
            warning("no rows in data.frame")
            return(NULL)
        }
        if (missing(batch) || is.null(batch))
            batch <- 1
        else if (batch <= 0)
            batch <- N
        from <- 1
        to <- min(batch, N)
        while (from <= N) {
            if (usingR())
                write.table(value[from:to, , drop = FALSE], file = file,
                    append = TRUE, quote = quote.string, sep = sep,
                    na = .SQLite.NA.string, row.names = FALSE,
                    col.names = FALSE, eol = eol, ...)
            else
                write.table(value[from:to, , drop = FALSE], file = file,
                    append = TRUE, quote.string = quote.string, sep = ",",
                    na = .SQLite.NA.string, dimnames.write = FALSE,
                    end.of.row = "\n", ...)
            from <- to + 1
            to <- min(to + batch, N)
        }
        invisible(NULL)
    }

    sqliteWriteTable <- function (con, name, value, field.types, row.names = TRUE,
                                  overwrite = FALSE, append = FALSE, ..., sep = ",")
    {
        if (overwrite && append)
            stop("overwrite and append cannot both be TRUE")
        if (!is.data.frame(value))
            value <- as.data.frame(value)
        if (row.names) {
            value <- cbind(row.names(value), value)
            names(value)[1] <- "row.names"
        }
        if (missing(field.types) || is.null(field.types)) {
            field.types <- sapply(value, dbDataType, dbObj = con)
        }
        i <- match("row.names", names(field.types), nomatch = 0)
        if (i > 0)
            field.types[i] <- dbDataType(con, field.types$row.names)
        names(field.types) <- make.db.names(con, names(field.types),
            allow.keywords = F)
        if (length(dbListResults(con)) != 0) {
            new.con <- dbConnect(con)
            on.exit(dbDisconnect(new.con))
        }
        else {
            new.con <- con
        }
        if (dbExistsTable(con, name)) {
            if (overwrite) {
                if (!dbRemoveTable(con, name)) {
                    warning(paste("table", name, "couldn't be overwritten"))
                    return(FALSE)
                }
            }
            else if (!append) {
                warning(paste("table", name,
                    "exists in database: aborting dbWriteTable"))
                return(FALSE)
            }
        }
        if (!dbExistsTable(con, name)) {
            sql1 <- paste("create table ", name, "\n(\n\t", sep = "")
            sql2 <- paste(paste(names(field.types), field.types),
                collapse = ",\n\t", sep = "")
            sql3 <- "\n)\n"
            sql <- paste(sql1, sql2, sql3, sep = "")
            rs <- try(dbSendQuery(new.con, sql))
            if (inherits(rs, ErrorClass)) {
                warning("could not create table: aborting assignTable")
                return(FALSE)
            }
            else dbClearResult(rs)
        }
        fn <- tempfile("rsdbi")
        safe.write(value, file = fn, ..., sep = sep)
        on.exit(unlink(fn), add = TRUE)
        if (FALSE) {
            sql4 <- paste("COPY '", name, "' FROM '", fn, "' USING DELIMITERS ','",
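The separator problem discussed in this thread can be reproduced without a database. The following is a minimal sketch in base R only, not the RSQLite code itself: it dumps a small, made-up data frame to a temporary file the way the workaround does (unquoted strings, one separator) and counts the fields on each line. The helper name count_fields_with and the sample values are hypothetical.

```r
# Hypothetical two-row data frame; the first site value contains a comma.
df <- data.frame(yield = c(27.0, 48.9),
                 site  = c("University, Farm", "Waseca"),
                 stringsAsFactors = FALSE)

# Write df unquoted with the given separator, then count fields per line.
count_fields_with <- function(sep) {
  fn <- tempfile("rsdbi")
  on.exit(unlink(fn))
  write.table(df, file = fn, quote = FALSE, sep = sep,
              row.names = FALSE, col.names = FALSE)
  count.fields(fn, sep = sep, quote = "")
}

comma_counts     <- count_fields_with(",")  # first line splits into an extra field
semicolon_counts <- count_fields_with(";")  # both lines have two fields
```

With ',' the embedded comma adds a field to the first line, which mirrors the "expected 5 columns of data but found 6" error; a separator that does not occur in the data avoids it.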
Re: [R] its dates masked by chron
On 27 October 2005 at 11:47, Omar Lakkis wrote:
| I built R 2.2.0 from source on my debian machine yesterday and updated
| all packages.

FYI, Debian has had a 2.2.0 package for you to download for over a week.

| My problem is that the dates function from its, which my
| code heavily uses, is now masked by dates from chron.
| How can I specify that I want to use dates from its, or how can I
| prevent it from being masked?
|
| > library(its)
| Loading required package: Hmisc
| Hmisc library by Frank E Harrell Jr
|
| Type library(help='Hmisc'), ?Overview, or ?Hmisc.Overview')
| to see overall documentation.
|
| NOTE:Hmisc no longer redefines [.factor to drop unused levels when
| subsetting. To get the old behavior of Hmisc type dropUnusedLevels().
|
| Attaching package: 'Hmisc'
|
|
| The following object(s) are masked from package:stats :
|
| ecdf
|
|
| Attaching package: 'chron'
|
|
| The following object(s) are masked from package:its :
|
| dates

I can't replicate that. Using the Debian packages for R, Hmisc and its:

[EMAIL PROTECTED]:~ dpkg -l r-base-core r-cran-hmisc r-cran-its | grep ^ii | cut -c-78
ii  r-base-core   2.2.0.final-2  GNU R core of statistical computing language
ii  r-cran-hmisc  3.0.7-1        GNU R miscellaneous functions by Frank Harre
ii  r-cran-its    1.0.9-1        GNU R package for handling irregular time se

I get the following (using --quiet to truncate the output):

[EMAIL PROTECTED]:~ R --quiet
> library(its)
Loading required package: Hmisc
Hmisc library by Frank E Harrell Jr

Type library(help='Hmisc'), ?Overview, or ?Hmisc.Overview')
to see overall documentation.

NOTE:Hmisc no longer redefines [.factor to drop unused levels when
subsetting. To get the old behavior of Hmisc type dropUnusedLevels().

Attaching package: 'Hmisc'

The following object(s) are masked from package:stats :

ecdf

Hth, Dirk

--
Statistics: The (futile) attempt to offer certainty about uncertainty.
-- Roger Koenker, 'Dictionary of Received Ideas of Statistics'
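For readers who do run into the masking described in this thread, here is a minimal sketch of the usual ways to reach a masked function. It uses stats::sd as a stand-in, because its and chron may not be installed; applying the same pattern as its::dates assumes that the its package exports dates through a namespace, which this thread does not confirm.

```r
# A global definition that masks stats::sd, standing in for chron masking its.
sd <- function(x) "masked"

x <- c(1, 2, 3)
masked_result <- sd(x)                                # the global definition is found first
qualified     <- stats::sd(x)                         # package-qualified call
by_position   <- get("sd", pos = "package:stats")(x)  # lookup in one search-path entry

rm(sd)  # remove the masking definition again
```

The get(..., pos = "package:...") form only needs the package to be attached, so it may also work for packages without a namespace. The order of library() calls matters as well: the package attached last sits earlier on the search path.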
Re: [R] How to make labels on my dendrogam look more clear and visible
I feel a bit timid in asking this question: Why create the PS? Why not create the pdf directly?

    ?pdf

You have lots of control over the size and other characteristics, and the pdf can be used by MiKTeX to create a TeX -> pdf document containing your graphic.

I'm running R 2.2.0 on a DELL WinXP machine.

Charles Annis, P.E.
[EMAIL PROTECTED]
phone: 561-352-9699
eFax: 614-455-3265
http://www.StatisticalEngineering.com

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of jon butchar
Sent: Thursday, October 27, 2005 8:02 PM
To: Srinivas Iyyer
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] How to make labels on my dendrogam look more clear and visible

On Thu, 27 Oct 2005 16:08:48 -0700 (PDT) Srinivas Iyyer [EMAIL PROTECTED] wrote:

Dear group,

I have a matrix with readings for ~180 variables observed in 240 conditions. I am doing hierarchical clustering (hclust) on the Euclidean distances among them. When I plot the dendrogram from hclust, all my variables at the end of the branches are cluttered. I cannot read them properly. I tried using:

    x11(width = 100, height = 70, pointsize = 10)
    plot(mydat.hcluster)

and also

    x11(width = 1000, height = 300, pointsize = 10)
    plot(mydat.hcluster)

I could not make the dendrogram branches go wide and make the variables at the end of the branches more legible. Can anyone please help me to make a good diagram so that I can see the labels at the end of the branches more clearly?

Thank you.

cheers
Sri

I don't know if it'll help, but I've grown fond of postscript graphs. For example...

    postscript("mydendogram.ps", height=800, width=2000, pointsize=[])
    plot(mydendogram)
    dev.off()

gives me very clear print, and then the ps2pdf app can turn it into a pdf for e-mailing or import to a presentation.

jon b
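As a minimal sketch of the direct-to-pdf suggestion in this thread: the code below clusters simulated data of the size mentioned (180 variables, 240 conditions) and writes the dendrogram to a wide pdf page with small labels. The data, the object names, the output file name and the particular width, height, pointsize and cex values are assumptions chosen for illustration, not settings taken from the thread.

```r
set.seed(1)
# Simulated stand-in for the real matrix: 180 variables in rows, 240 conditions.
mydat <- matrix(rnorm(180 * 240), nrow = 180,
                dimnames = list(paste0("var", 1:180), NULL))
mydat.hcluster <- hclust(dist(mydat))  # dist() defaults to Euclidean distance

out_file <- file.path(tempdir(), "mydendrogram.pdf")
pdf(out_file, width = 30, height = 8, pointsize = 8)  # page size in inches
plot(mydat.hcluster, cex = 0.6)                        # cex shrinks the leaf labels
dev.off()
```

A wide page combined with a smaller cex gives each of the 180 leaf labels its own horizontal space; the resulting pdf can then be zoomed in a viewer.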
Re: [R] installing Rmpi
On 27 Oct 2005, [EMAIL PROTECTED] wrote:

Rmpi version: 0.4-9
Rmpi is an interface (wrapper) to MPI APIs with interactive R slave functionalities.
See `library (help=Rmpi)' for details.
Error in dyn.load(x, as.logical(local), as.logical(now)) :
  unable to load shared library '/home/apps/R-2.2.0/lib/R/library/Rmpi/libs/Rmpi.so':
  libmpi.so.0: cannot open shared object file: No such file or directory
Error in dyn.unload(x) : dynamic/shared library

To help diagnose the issue, you might try calling ldd on Rmpi.so. Perhaps the issue is that you need to add a path to LD_LIBRARY_PATH so that the linker can find the MPI libs.

HTH,

+ seth
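The ldd check suggested in this thread can also be run from inside R. This is a rough sketch that assumes a Linux system with ldd on the PATH; the Rmpi.so path is copied from the error message and will differ on other machines, and the variable names are made up for illustration.

```r
# Path taken from the error message in this thread; adjust for your machine.
so_file <- "/home/apps/R-2.2.0/lib/R/library/Rmpi/libs/Rmpi.so"

# Show what the dynamic linker search path currently is.
ld_path <- Sys.getenv("LD_LIBRARY_PATH")
cat("LD_LIBRARY_PATH:", if (nzchar(ld_path)) ld_path else "(unset)", "\n")

# List shared-library dependencies that the linker cannot resolve.
if (file.exists(so_file) && nzchar(Sys.which("ldd"))) {
  deps <- system2("ldd", shQuote(so_file), stdout = TRUE)
  print(grep("not found", deps, value = TRUE))
}
```

If libmpi.so.0 shows up as "not found", the directory that contains it probably needs to be added to LD_LIBRARY_PATH in the shell before R is started, since the linker typically reads that variable when the process starts.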
[R] inverse matrix
If solve(a, b) means to calculate the inverse matrix of a with b, I wonder why solve(a)%%b gives a different result?
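A small worked example may help with the question above. solve(a, b) returns the x that satisfies a %*% x = b, which equals the inverse of a times b up to rounding. The matrix product operator is %*%, whereas %% is the element-wise modulo operator and computes something else entirely. The matrix and vector below are made up for illustration.

```r
a <- matrix(c(2, 1, 1, 3), nrow = 2)  # a small invertible matrix (filled by column)
b <- c(1, 2)

x_direct  <- solve(a, b)       # solves a %*% x = b without forming the inverse
x_inverse <- solve(a) %*% b    # explicit inverse, then matrix product
x_modulo  <- solve(a) %% b     # element-wise modulo: a 2 x 2 matrix, not a solution

same_solution <- isTRUE(all.equal(x_direct, as.vector(x_inverse)))
```

The first two results agree to within floating-point tolerance, so all.equal() is a better comparison than ==. For ill-conditioned matrices the two routes can differ more visibly, and solve(a, b) is generally the more accurate one.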
Re: [R] F tests for random effect models
Thanks a lot, but:

    > anova(lmer(Rendement ~ (1 | Pollinisateur) + (1 | Lignee) + (1 | Pollinisateur:Lignee), data = mca2))
    Analysis of Variance Table
    Error in ok[, -nc] : incorrect number of dimensions

It seems to work when there is at least one fixed effect, but not with purely random effect models.

Jacques VESLOT

Doran, Harold wrote:

I think what you're looking for is in anova()

    fm1 <- lmer(dv ~ IV ...)
    anova(fm1)

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Jacques VESLOT
Sent: Thursday, October 27, 2005 2:22 AM
To: R-help@stat.math.ethz.ch
Subject: [R] F tests for random effect models

Dear R-users,

My question is how to get the right F tests for random effects in random effect models (I hope this question has not been answered too many times yet; I didn't find an answer in the R-help archives).

My data are in mca2 (enc.):

    > names(mca2)
    [1] "Lignee"        "Pollinisateur" "Rendement"
    > dim(mca2)
    [1] 100   3
    > replications(Rendement ~ Lignee * Pollinisateur, data = mca2)
                  Lignee        Pollinisateur Lignee:Pollinisateur
                      20                   10                    2

Of course, summary(aov(Rendement ~ Pollinisateur * Lignee, data = mca2)) gives wrong tests of random effects. But summary(aov1 <- aov(Rendement ~ Error(Pollinisateur * Lignee), data = mca2)) gives no test at all, and I have to do it like this:

    tab1 <- matrix(unlist(summary(aov1)), nc=5, byrow=T)[,1:3]
    Femp <- c(tab1[1:3, 3]/tab1[c(3,3,4), 3])
    names(Femp) <- c("Pollinisateur", "Lignee", "Interaction")
    1 - pf(Femp, tab1[1:3,1], tab1[c(3,3,4),1])

With the lme4 package (I didn't succeed in writing a working formula with lme from the nlme package), I can see the standard deviations of the random effects (but don't know how to find them) with:

    library(lme4)
    summary(lmer(Rendement ~ (1 | Pollinisateur) + (1 | Lignee) + (1 | Pollinisateur:Lignee), data = mca2))

but I can't get F tests.

Thanks in advance.

Best regards,

Jacques VESLOT
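The hand-computed F tests discussed in this thread can be sketched on simulated data. The example assumes a balanced two-way crossed design with both factors random, where each main effect is tested against the interaction mean square and the interaction against the residual mean square. The data are simulated with the same layout as described (5 x 10 levels, 2 replicates); they are not the mca2 data, and the object names are made up.

```r
set.seed(42)
# Balanced layout: 5 Lignee x 10 Pollinisateur x 2 replicates = 100 rows.
sim <- expand.grid(rep = 1:2, Lignee = factor(1:5), Pollinisateur = factor(1:10))
sim$Rendement <- rnorm(nrow(sim))

fit <- aov(Rendement ~ Pollinisateur * Lignee, data = sim)
tab <- summary(fit)[[1]]  # rows: Pollinisateur, Lignee, interaction, Residuals
ms  <- tab[["Mean Sq"]]
df  <- tab[["Df"]]

num <- 1:3          # effects to test
den <- c(3, 3, 4)   # main effects vs interaction, interaction vs residuals
Femp <- ms[num] / ms[den]
pval <- pf(Femp, df[num], df[den], lower.tail = FALSE)
names(Femp) <- names(pval) <- c("Pollinisateur", "Lignee", "Interaction")
```

In a balanced design the mean squares from the fixed-effects table are the same ones needed here, so only the denominators change. With unbalanced data these ratios are no longer exact F statistics.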
Re: [R] question about sm.density
On Thu, 27 Oct 2005, Cunningham Kerry wrote:

How can I draw a 95% contour in sm.density? For example,

    y <- cbind(rnorm(50), rnorm(50))
    sm.density(y, display = "slice")

will give 25%, 50% and 75% contours automatically, but no reference on other values.

See ?sm.options, the place to set such options.

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
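As a sketch of where ?sm.options leads: the props option sets the percentages of the data enclosed by each contour in a slice display, and sm.density passes extra arguments on to sm.options. The example assumes the sm package is installed and skips the plot otherwise; the output file name is made up, and the exact behaviour may vary between sm versions.

```r
set.seed(1)
y <- cbind(rnorm(50), rnorm(50))

if (requireNamespace("sm", quietly = TRUE)) {
  plot_file <- file.path(tempdir(), "sm_slice.pdf")
  pdf(plot_file)
  # props: percentages of the data to enclose within each contour
  sm::sm.density(y, display = "slice", props = c(95, 75, 50, 25))
  dev.off()
}
```

Using props = 95 alone should draw only the 95% contour.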