Re: [R] setting new working directories
Bill Shipley wrote: Hello, and Happy New Year. My default working directory is getting very cluttered. I know that I should be using a different working directory for each project (I work in Windows), but do not know how to go about creating different ones and moving back and forth between them. I have read Venables Ripley (Modern Applied Statistics with S-PLUS, 1994) but this seems out of date with respect to this topic and have searched through the documentation but cannot find a clear explanation for doing this. Can someone point me to the proper documentation for creating and using different working directories from within Windows (please, no comments about switching to UNIX...). Thanks. I'm not sure to which New Year you refer, but thanks and the same to you. I think you may want to automatically start up R in a directory for each project. There is a discussion of that in Kickstarting R at: http://cran.r-project.org/doc/contrib/Lemon-kickstart/kr_start.html Basically, you can create icons on the desktop that will start R in a number of directories. In Windows, don't worry about adding the line to your R startup file, just set the Start Program in to the desired directory in the Properties dialog of the shortcut. You can also set up an R file that gives you an interactive choice when the program starts. Create a file like this: cat(R programming\n) cat(Animals\n) cat(Vegetables\n) cat(Minerals\n) cat(Type in the first letter of the project -) answer-toupper(strsplit(readline())[1]) if(answer == R) setwd(c:/jim/R/programs) if(answer == A) setwd(c:/jim/things/animals) ... and say you name it c:\jim\R\SelectProject.R (notice that Windows uses backslashes but you use slashes in R) put the following line in your .First function: source(c:/jim/R/SelectProject.R) and you should be able to select your project and directory on startup. Jim __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] [R-pkgs] RMySQL 0.5-11 uploaded to CRAN
Hello, I've uploaded version 0.5-11 of RMySQL into CRAN, and it should be available soon. From the NEWS file: Version 0.5-11 * Fixed a bug that would crash R with a failed mysql_real_connect(). * dbApply() is now working again (but still experimentally). * Re-formatted the C code. [0.5-9 through 0.5-10 were maintanance releases that Seth Falcon kindly put out.] Regards, -- David ___ R-packages mailing list R-packages@stat.math.ethz.ch https://stat.ethz.ch/mailman/listinfo/r-packages __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Error in load
Dear All, can anyone explain this message Error in load(matched2.RData) : Value of SET_STRING_ELT() must be a 'CHARSXP' not a 'builtin'? Have I the possibility to retrieve the data? TIA Giovanni -- dr. Giovanni Parrinello Department of Biotecnologies Medical Statistics Unit University of Brescia Viale Europa, 11 25123 Brescia email: [EMAIL PROTECTED] Phone: +390303717528 Fax: +390303717488 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] color of opposite sign values in filled.contour
Claudia Tebaldi wrote: Dear R-helpers, I'm plotting geophysical data in the form of contours using filled.contour. The display would be much more effective if the areas with negative values could be color coded by -- say -- cold colors in the range of blue to green, and conversely the areas with positive values got plotted with warm colors, from yellow to red. Right now if I use a palette spanning the spectrum I need the entire range is associated with the actual range of the data, which can be positively or negatively skewed, and as a result the position of the zero is totally arbitrary. I'm wondering if someone out there has come up with a clever way to set the color scale accordingly, as a function of the actual range of the values in the matrix that is being plotted. Ideally, it would be neat to still use the entire spectrum, but sampling differently the cold and warm subsets accordingly to the extent of the negative and positive values in the data. Also, when I try to play around in an ad hoc fashion with the palette I often get funny results in the legend, with color-scale wrapping or blank cells at one of the extremes. I cannot hack effectively the code of the filled.contour function, obviously... Hi Claudia, Have a look at color.scale in the plotrix package. You can specify quite a variety of different color ranges into which your numeric values will be transformed. If you want different color ranges for positive and negative values, you can calculate them separately. Here's an example using blue to green for negative values and yellow to red for positive: testval-c(-3,6,-1,8,0,-2,4,10,12) testcol-rep(0,length(testval)) testcol[testval0]-color.scale(testval[testval0],0,c(0,1),c(1,0)) testcol[testval=0]-color.scale(testval[testval=0],1,c(1,0),0) plot(testval,col=testcol,pch=19) You might also be interested in color.scale.lines Jim __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Error in load
You appear to have a corrupted character vector in your saved image. Either R was internally corrupted when the image was saved or the file has been corrupted. In neither case would the the data be intact. On Fri, 5 Jan 2007, Giovanni Parrinello wrote: Dear All, can anyone explain this message Error in load(matched2.RData) : Value of SET_STRING_ELT() must be a 'CHARSXP' not a 'builtin'? Have I the possibility to retrieve the data? TIA Giovanni -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Fast Removing Duplicates from Every Column
Hi, I'm looking for some lines of code that does the following: I have a dataframe with 160 Columns and a number of rows (max 30): Col1 Col2 Col3 ... Col 159 Col 160 Row 1 0 0 LD ... 0 VD Row 2 HD 0 00 MD Row 3 0 HD HD 0 LD Row 4 LD HD HD 0 LD ... ... LastRow HDHDLD 0 MD Now I want a dataframe that looks like this. As you see all duplicates are removed. Can this dataframe be constructed in a fast way? Col1 Col2 Col3 ... Col 159 Col 160 Row 1 00LD 0VD Row 2 HD HD 00MD Row 3 LD 0HD 0LD Thx for helping me out. Bert __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Problem with plot() and POSIXt dates
Hy all, I'm plotting graphs using plot() function, they are on X axes POSIX dates: POSIXt oldClass POSIXct POSIXlt I can't figure out why sometimes it prints the month and days and sometimes it prints the unix timestamp. It appens usually when the xlim is short like only some days. xlim is settled as a POSIXt like this 2006-12-30 17:25:44 CET 2007-01-02 03:16:51 CET On the graph it prints : 116750 and 116770 instead of dates. And the result gives a x axes in unix timestamps as if the plot function didn't recognize that it is a timestamp but just an integer. What am i missing, since R sees itself that it is time stamps and not integer when xlim is enougth large, how do i tell the plot function to see these numbers as POSIXt? I put an attachement, inside there is an example that demonstrate what i experience (hope it won't be deleted, it is a tar.gz archive made using debian under R version 2.1.0, 2005-04-18, i386-pc-linux-gnu attached base packages: [1] methods stats graphics grDevices utils datasets [7] base). thks. COMTE Guillaume Ingénieur Projet Alliance Technologies Projet Philharmonie 24 rue Martre 92110 Clichy Tel : 01 40 87 48 06 Fax : 01 40 87 48 14 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] the Rscript is missing in my prévious mail .
Sorry i forgot to give the Rscript to execute with the datas in the archive attached : limite_x-structure(as.numeric(read.table(limite_x)), class=c(POSIXt,POSIXct)) limite_y-as.numeric(read.table(limite_y)) test-as.numeric(read.table(y_values)) abscisse_test-structure(as.numeric(read.table(x_values)), class=c(POSIXt,POSIXct)) plot((test/10)~abscisse_test,type=s,col=lightgreen,xlab=,ylab=,ylim=limite_y,xlim=limite_x) length_test-length(test) #if i cut the var test to be closer to xlim values, i know that xlim is at the end of test : test_short-test[(length_test-10):length_test] abscisse_test_short-abscisse_test[(length_test-10):length_test] plot((test_short/10)~abscisse_test_short,type=s,col=lightgreen,xlab=,ylab=,ylim=limite_y,xlim=limite_x) dev.off() COMTE Guillaume Ingénieur Projet Alliance Technologies Projet Philharmonie 24 rue Martre 92110 Clichy Tel : 01 40 87 48 06 Fax : 01 40 87 48 14 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] ifelse on data frames
Hi you could use also another approach in case of data frames A - as.data.frame(A) A0 - -A*log(A) A0[is.na(A0)] - 0 which changes NaN's to zeroes HTH Petr On 5 Jan 2007 at 16:38, talepanda wrote: Date sent: Fri, 5 Jan 2007 16:38:05 +0900 From: talepanda [EMAIL PROTECTED] To: [EMAIL PROTECTED] Copies to: r-help@stat.math.ethz.ch Subject:Re: [R] ifelse on data frames It can be explained. class(A) [1] data.frame length(A) [1] 5 class(A==0) [1] matrix length(A==0) [1] 10 class(-A*log(A)) [1] data.frame length(-A*log(A)) [1] 5 as you can see, the result of A==0 is matrix with length=10, while the result of -A*log(A) is still data.frame with length=5. then, when calling ifelse( [length=10], 0, [length=5] ), internally, the NO(3rd) argument was repeated by rep(-A*log(A),length.out=10) (try this). the result is list with length=10 and each element has 2 sub-elements. So, the return value of A[(A==0)==FALSE] has 2 sub-elements as you get. I think what confusing you is the behavior of A==0. However, when using 'ifelse', I think you should use matrix as the arguments because data.frame is not consistent with the purpose of 'ifelse'. On 1/5/07, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: [Using R 2.2.0 on Windows XP; OK, OK, I will update soon!] I have noticed some undesirable behaviour when applying ifelse to a data frame. Here is my code: A - scan() 1.00 0.00 0.00 0 0.0 0.027702 0.972045 0.000253 0 0.0 A - matrix(A,nrow=2,ncol=5,byrow=T) A == 0 ifelse(A==0,0,-A*log(A)) A - as.data.frame(A) ifelse(A==0,0,-A*log(A)) and this is the output:A - scan() 1: 1.00 0.00 0.00 0 0.0 6: 0.027702 0.972045 0.000253 0 0.0 11: Read 10 items A - matrix(A,nrow=2,ncol=5,byrow=T) A == 0[,1] [,2] [,3] [,4] [,5] [1,] FALSE TRUE TRUE TRUE TRUE [2,] FALSE FALSE FALSE TRUE TRUE ifelse(A==0,0,-A*log(A)) [,1] [,2][,3] [,4] [,5] [1,] 0. 0. 0.0 00 [2,] 0.09934632 0.02756057 0.00209537700 A - as.data.frame(A) ifelse(A==0,0,-A*log(A)) [[1]] [1] 0. 0.09934632 [[2]] [1]NaN 0.02756057 [[3]] [1] 0 [[4]] [1] NaN NaN [[5]] [1] 0 [[6]] [1] 0. 0.09934632 [[7]] [1] 0 [[8]] [1] 0 [[9]] [1] 0 [[10]] [1] 0 Is this a bug or a feature? Can the behaviour be explained? Regards, Murray Jorgensen -- Dr Murray Jorgensen http://www.stats.waikato.ac.nz/Staff/maj.html Department of Statistics, University of Waikato, Hamilton, New Zealand Email: [EMAIL PROTECTED]Fax 7 838 4155 Phone +64 7 838 4773 wkHome +64 7 825 0441Mobile 021 1395 862 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Petr Pikal [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Efficient multinom probs
Dear R-helpers, I need to compute probabilties of multinomial observations, eg by doing the following: y=sample(1:3,15,1) prob=matrix(runif(45),15) prob=prob/rowSums(prob) diag(prob[,y]) However, my question is whether this is the most efficient way to do this. In the call prob[,y] a whole matrix is computed which seems a bit of a waste. Is there maybe a vectorized version of dmultinom which does this? Best, Ingmar __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Fast Removing Duplicates from Every Column
Hi I am not sure if I understand how do you want to select unique items. with sapply(DF, function(x) !duplicated(x)) you can get data frame with TRUE when an item in particular column is unique and FALSE in opposite. However then you need to choose which rows to keep or discard e.g. DF[rowSums(sapply(comp, function(x) !duplicated(x)))1,] selects all rows in which are 2 or more unique values. HTH Petr On 5 Jan 2007 at 9:54, Bert Jacobs wrote: From: Bert Jacobs [EMAIL PROTECTED] To: 'R help list' r-help@stat.math.ethz.ch Date sent: Fri, 5 Jan 2007 09:54:17 +0100 Subject:Re: [R] Fast Removing Duplicates from Every Column Hi, I'm looking for some lines of code that does the following: I have a dataframe with 160 Columns and a number of rows (max 30): Col1 Col2 Col3 ... Col 159 Col 160 Row 1 0 0 LD ... 0 VD Row 2 HD 0 00 MD Row 3 0 HD HD 0 LD Row 4 LD HD HD 0 LD ... ... LastRow HDHDLD 0 MD Now I want a dataframe that looks like this. As you see all duplicates are removed. Can this dataframe be constructed in a fast way? Col1 Col2 Col3 ... Col 159 Col 160 Row 1 00LD 0 VD Row 2 HD HD 00MD Row 3 LD 0HD 0LD Thx for helping me out. Bert __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Petr Pikal [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Efficient multinom probs
maybe prob[cbind(1:length(y), y)] I hope it helps. Best, Dimitris Dimitris Rizopoulos Ph.D. Student Biostatistical Centre School of Public Health Catholic University of Leuven Address: Kapucijnenvoer 35, Leuven, Belgium Tel: +32/(0)16/336899 Fax: +32/(0)16/337015 Web: http://med.kuleuven.be/biostat/ http://www.student.kuleuven.be/~m0390867/dimitris.htm - Original Message - From: Ingmar Visser [EMAIL PROTECTED] To: R-help@stat.math.ethz.ch Sent: Friday, January 05, 2007 11:41 AM Subject: [R] Efficient multinom probs Dear R-helpers, I need to compute probabilties of multinomial observations, eg by doing the following: y=sample(1:3,15,1) prob=matrix(runif(45),15) prob=prob/rowSums(prob) diag(prob[,y]) However, my question is whether this is the most efficient way to do this. In the call prob[,y] a whole matrix is computed which seems a bit of a waste. Is there maybe a vectorized version of dmultinom which does this? Best, Ingmar __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] gstat package. singular attibute
Hello, I'm using the gstat package within R for an automated procedure that uses ordinary kriging. I can see that there is a logical (singular) atrtibute of some adjusted model semivariograms: .- attr(*, singular)= logi TRUE I cannot find documentation about the exact meaning and the implications of this attribute, and I dont know anything about the inner calculations of model semivariograms. I guess that the inverse of some matrix need to be calculated , and this matrix is singular, but I also see that the model semivariogram is calculated anyway. Could you briefly tell me something about the significance of this attribute and if I should not use these model semivariograms when the singular attibute is true? Thank you very much and best regards, Javier -- Javier García-Pintado Institute of Earth Sciences Jaume Almera (CSIC) Lluis Sole Sabaris s/n, 08028 Barcelona Phone: +34 934095410 Fax: +34 934110012 e-mail:[EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problem with plot() and POSIXt dates
No attachment arrived: see the posting guide for handling attachments. But in any case the posting guide says If you are using an old version of R and think it does not work properly, upgrade to the latest version and try that, before posting. and your version of R is ancient. Note, 'before posting' On Fri, 5 Jan 2007, COMTE Guillaume wrote: Hy all, I'm plotting graphs using plot() function, they are on X axes POSIX dates: POSIXt oldClass POSIXct POSIXlt I can't figure out why sometimes it prints the month and days and sometimes it prints the unix timestamp. It appens usually when the xlim is short like only some days. xlim is settled as a POSIXt like this 2006-12-30 17:25:44 CET 2007-01-02 03:16:51 CET On the graph it prints : 116750 and 116770 instead of dates. And the result gives a x axes in unix timestamps as if the plot function didn't recognize that it is a timestamp but just an integer. What am i missing, since R sees itself that it is time stamps and not integer when xlim is enougth large, how do i tell the plot function to see these numbers as POSIXt? I put an attachement, inside there is an example that demonstrate what i experience (hope it won't be deleted, it is a tar.gz archive made using debian under R version 2.1.0, 2005-04-18, i386-pc-linux-gnu attached base packages: [1] methods stats graphics grDevices utils datasets [7] base). thks. COMTE Guillaume Ingénieur Projet Alliance Technologies Projet Philharmonie 24 rue Martre 92110 Clichy Tel : 01 40 87 48 06 Fax : 01 40 87 48 14 -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] color of opposite sign values in filled.contour
On Thu, 4 Jan 2007, Richard M. Heiberger wrote: Get the RColorBrewer package from CRAN Description: The packages provides palettes for drawing nice maps shaded according to a variable. The package vcd also offers the function diverge_hcl() that constructs diverging palettes (based on HCL colors) particularly aimed at filled.contour() or image() plots. See http://epub.wu-wien.ac.at/dyn/openURL?id=oai:epub.wu-wien.ac.at:epub-wu-01_abd for some more background information. Z __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] maximum likelihood estimation of 5 parameters
Hi Guys, it would be great if you could help me with a MLE problem in R. I am trying to evaluate the maximum likelihood estimates of theta = (a1, b1, a2, b2, P) which defines a mixture of a Poisson distribution and two gamma prior distributions (where the Poisson means have a gamma distribution, actually 2 gammas and P is the mixing factor). The likelihood function for theta is L(theta) = Pi,j{P f(Nij; a1, b1, Eij) + (1 – P) f(Nij; a2, b2, Eij),} The maximum likelihood estimate of theta is the vector that maximizes the above equation (the values of N and E are given). The authors of the article I read say that the maximization involves an iterative search in the five dimensional parameter space, where each iteration involves computing log[L(theta)] and its first and second-order derivatives. In test runs it is suggested that the maximization typically takes between 5 and 15 iterations from the starting point theta = (a1 = 0.2, b1 = 0.1, a2 = 2, b2 = 4, P = 1/3). Now I have done maximization of a gamma-poisson mixture before (1 poisson, 1 gamma) successfully and I could determine correctly alpha (a) and beta(a). But this one above is giving me ridiculously large unusable values (for example P should not be above 1 and sometimes I get values of 500!) or even negative values! I know the values I should be obtaining with my samples shouldn't be far from the staring points. Is there a way to help me solve this issue? Thanks. -- View this message in context: http://www.nabble.com/maximum-likelihood-estimation-of-5-parameters-tf2925364.html#a8177473 Sent from the R help mailing list archive at Nabble.com. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Problem with plot() and POSIXt dates
So maybe i will finaly succeed asking the right way, Hy all, I'm plotting graphs using plot() function, they are on X axes POSIX dates: POSIXt oldClass POSIXct POSIXlt I can't figure out why sometimes it prints the month and days and sometimes it prints the unix timestamps. Here is an example that reproduce my problem the data's to use with it can be downloaded from http://www.alliance-ir.net/r-project/demo.tar.gz http://www.alliance-ir.net/r-project/demo.tar.gz : limite_x-structure(as.numeric(read.table(limite_x)), class=c(POSIXt,POSIXct)) limite_y-as.numeric(read.table(limite_y)) test-as.numeric(read.table(y_values)) abscisse_test-structure(as.numeric(read.table(x_values)), class=c(POSIXt,POSIXct)) plot((test/10)~abscisse_test,type=s,col=lightgreen,xlab=,ylab=,ylim=limite_y,xlim=limite_x) length_test-length(test) #if i cut the var test to be closer to xlim values, i know that xlim is at the end of test : test_short-test[(length_test-10):length_test] abscisse_test_short-abscisse_test[(length_test-10):length_test] plot((test_short/10)~abscisse_test_short,type=s,col=lightgreen,xlab=,ylab=,ylim=limite_y,xlim=limite_x) dev.off() sessionInfo() R version 2.4.0 (2006-10-03) i386-pc-linux-gnu locale: [EMAIL PROTECTED];LC_NUMERIC=C;[EMAIL PROTECTED];[EMAIL PROTECTED];[EMAIL PROTECTED];[EMAIL PROTECTED];[EMAIL PROTECTED];LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;[EMAIL PROTECTED];LC_IDENTIFICATION=C attached base packages: [1] methods stats graphics grDevices utils datasets [7] base COMTE Guillaume Ingénieur Projet Alliance Technologies Projet Philharmonie 24 rue Martre 92110 Clichy Tel : 01 40 87 48 06 Fax : 01 40 87 48 14 [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] maximum likelihood estimation of 5 parameters
Franco, You can provide lower and upper bounds on the parameters if you use optim with method=L-BFGS-B. Hth, Ingmar From: francogrex [EMAIL PROTECTED] Date: Fri, 5 Jan 2007 04:54:50 -0800 (PST) To: r-help@stat.math.ethz.ch Subject: [R] maximum likelihood estimation of 5 parameters Hi Guys, it would be great if you could help me with a MLE problem in R. I am trying to evaluate the maximum likelihood estimates of theta = (a1, b1, a2, b2, P) which defines a mixture of a Poisson distribution and two gamma prior distributions (where the Poisson means have a gamma distribution, actually 2 gammas and P is the mixing factor). The likelihood function for theta is L(theta) = Pi,j{P f(Nij; a1, b1, Eij) + (1 P) f(Nij; a2, b2, Eij),} The maximum likelihood estimate of theta is the vector that maximizes the above equation (the values of N and E are given). The authors of the article I read say that the maximization involves an iterative search in the five dimensional parameter space, where each iteration involves computing log[L(theta)] and its first and second-order derivatives. In test runs it is suggested that the maximization typically takes between 5 and 15 iterations from the starting point theta = (a1 = 0.2, b1 = 0.1, a2 = 2, b2 = 4, P = 1/3). Now I have done maximization of a gamma-poisson mixture before (1 poisson, 1 gamma) successfully and I could determine correctly alpha (a) and beta(a). But this one above is giving me ridiculously large unusable values (for example P should not be above 1 and sometimes I get values of 500!) or even negative values! I know the values I should be obtaining with my samples shouldn't be far from the staring points. Is there a way to help me solve this issue? Thanks. -- View this message in context: http://www.nabble.com/maximum-likelihood-estimation-of-5-parameters-tf2925364. html#a8177473 Sent from the R help mailing list archive at Nabble.com. __ [EMAIL PROTECTED] thz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] coefficients of each local polynomial from loess() or locfit()
Liu, Delong (NIH/CIT) [C] liud2 at mail.nih.gov writes: I want to extract estimated coeffiicents of each local polynomial at given x from loess(), locfit(), or KernSmooth(). Can some experts provide me with suggestions? Thanks. Try cars.lo - loess(dist ~ speed, cars) str(cars.lo) List of 17 $ n: int 50 $ fitted : num [1:50] 5.89 5.89 12.57 12.57 15.37 ... $ residuals: Named num [1:50] -3.894 4.106 -8.568 9.432 0.631 ... ... omitted ..$ cell : num 0.2 ..$ family : chr gaussian ..$ iterations : num 1 $ kd :List of 5 ..$ parameter: Named int [1:7] 1 50 2 19 11 1049 849 .. ..- attr(*, names)= chr [1:7] d n vc nc ... ..$ a: int [1:19] 1 1 1 1 1 1 1 0 0 0 ... ..$ xi : num [1:19] 15 12 19 9 13 17 20 0 0 0 ... ..$ vert : num [1:2] 3.90 25.11 ..$ vval : num [1:22] 5.71 1.72 96.46 10.88 41.21 ... $ call : language loess(formula = dist ~ speed, data = cars) Looks like kd holds information about the polynomials. Then, try getAnywhere(predict.loess) which will show you that the real work is done in function predLoess. Trying again getAnywhere(predLoess) you get an idea how the parameters are used for prediction. fit[inside] - .C(R_loess_ifit, as.integer(kd$parameter), as.integer(kd$a), as.double(kd$xi), as.double(kd$vert), as.double(kd$vval), as.integer(M1), as.double(x.evaluate[inside, ]), fit = double(M1))$fit Dieter __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] color of opposite sign values in filled.contour
Dear all, especially those of you that kindly provided suggestions yesterday, I was not asking for cool palettes -- even if I now appreciate the pointers -- but I was asking for a way to make the 0 level of a filled contour plot correspond to the neutral color in the color scale when the range of my values is not symmetrical around zero...without hacking into the filled.contour function (which I'm not able to do succesfully). I couldn't get the ggplot package suggested because I'm not running MAC OSX 10.4, unfortunately...all the other suggestions didn't provide a way out...at least that I could recognize. Thanks again -- last time I bother you I promise! claudia tebaldi -- Claudia Tebaldi ISSE/CGD/IMAGe http://www.image.ucar.edu/~tebaldi currently visiting Center for Environmental Science and Policy Stanford University tel: (650) 724-9261 skype: claudia.tebaldi -- Claudia Tebaldi ISSE/CGD/IMAGe http://www.image.ucar.edu/~tebaldi currently visiting Center for Environmental Science and Policy Stanford University tel: (650) 724-9261 skype: claudia.tebaldi __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] coefficients of each local polynomial from loess() or locfit()
Liu, Delong (NIH/CIT) [C] liud2 at mail.nih.gov writes: I want to extract estimated coeffiicents of each local polynomial at given x from loess(), locfit(), or KernSmooth(). Can some experts provide me with suggestions? Thanks. Before you start on your own, also note Brian Ripleys recent posting on http://article.gmane.org/gmane.comp.lang.r.general/76625 Dieter __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problem with plot() and POSIXt dates
We can force the X axis into the format of our choice by suppressing it in the plot and explicitly calling axis.POSIXct with our choice of format (see ?axis.POSIXct): x - seq(from = as.POSIXct(2005-10-14), to = as.POSIXct(2007-01-02), by = day y - as.numeric(x) plot(y ~ x, xaxt = n) at - x[seq(1, length(x), length = 5)] axis.POSIXct(1, at, at, %y-%b-%d) Actually your data appear to be dates rather than date times so I would use Date class rather than POSIXct. See the Help Desk article on dates and times in R News 4/1. [Try making your examples smaller and more self contained such as shown above.] On 1/5/07, COMTE Guillaume [EMAIL PROTECTED] wrote: So maybe i will finaly succeed asking the right way, Hy all, I'm plotting graphs using plot() function, they are on X axes POSIX dates: POSIXt oldClass POSIXct POSIXlt I can't figure out why sometimes it prints the month and days and sometimes it prints the unix timestamps. Here is an example that reproduce my problem the data's to use with it can be downloaded from http://www.alliance-ir.net/r-project/demo.tar.gz http://www.alliance-ir.net/r-project/demo.tar.gz : limite_x-structure(as.numeric(read.table(limite_x)), class=c(POSIXt,POSIXct)) limite_y-as.numeric(read.table(limite_y)) test-as.numeric(read.table(y_values)) abscisse_test-structure(as.numeric(read.table(x_values)), class=c(POSIXt,POSIXct)) plot((test/10)~abscisse_test,type=s,col=lightgreen,xlab=,ylab=,ylim=limite_y,xlim=limite_x) length_test-length(test) #if i cut the var test to be closer to xlim values, i know that xlim is at the end of test : test_short-test[(length_test-10):length_test] abscisse_test_short-abscisse_test[(length_test-10):length_test] plot((test_short/10)~abscisse_test_short,type=s,col=lightgreen,xlab=,ylab=,ylim=limite_y,xlim=limite_x) dev.off() sessionInfo() R version 2.4.0 (2006-10-03) i386-pc-linux-gnu locale: [EMAIL PROTECTED];LC_NUMERIC=C;[EMAIL PROTECTED];[EMAIL PROTECTED];[EMAIL PROTECTED];[EMAIL PROTECTED];[EMAIL PROTECTED];LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;[EMAIL PROTECTED];LC_IDENTIFICATION=C attached base packages: [1] methods stats graphics grDevices utils datasets [7] base COMTE Guillaume Ingénieur Projet Alliance Technologies Projet Philharmonie 24 rue Martre 92110 Clichy Tel : 01 40 87 48 06 Fax : 01 40 87 48 14 [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problem with plot() and POSIXt dates
Sorry, there was a missing parenthesis after the first statement. Here is the code again: x - seq(from = as.POSIXct(2005-10-14), to = as.POSIXct(2007-01-02), by = day) y - as.numeric(x) plot(y ~ x, xaxt = n) at - x[seq(1, length(x), length = 5)] axis.POSIXct(1, at, at, %y-%b-%d) On 1/5/07, Gabor Grothendieck [EMAIL PROTECTED] wrote: We can force the X axis into the format of our choice by suppressing it in the plot and explicitly calling axis.POSIXct with our choice of format (see ?axis.POSIXct): x - seq(from = as.POSIXct(2005-10-14), to = as.POSIXct(2007-01-02), by = day y - as.numeric(x) plot(y ~ x, xaxt = n) at - x[seq(1, length(x), length = 5)] axis.POSIXct(1, at, at, %y-%b-%d) Actually your data appear to be dates rather than date times so I would use Date class rather than POSIXct. See the Help Desk article on dates and times in R News 4/1. [Try making your examples smaller and more self contained such as shown above.] On 1/5/07, COMTE Guillaume [EMAIL PROTECTED] wrote: So maybe i will finaly succeed asking the right way, Hy all, I'm plotting graphs using plot() function, they are on X axes POSIX dates: POSIXt oldClass POSIXct POSIXlt I can't figure out why sometimes it prints the month and days and sometimes it prints the unix timestamps. Here is an example that reproduce my problem the data's to use with it can be downloaded from http://www.alliance-ir.net/r-project/demo.tar.gz http://www.alliance-ir.net/r-project/demo.tar.gz : limite_x-structure(as.numeric(read.table(limite_x)), class=c(POSIXt,POSIXct)) limite_y-as.numeric(read.table(limite_y)) test-as.numeric(read.table(y_values)) abscisse_test-structure(as.numeric(read.table(x_values)), class=c(POSIXt,POSIXct)) plot((test/10)~abscisse_test,type=s,col=lightgreen,xlab=,ylab=,ylim=limite_y,xlim=limite_x) length_test-length(test) #if i cut the var test to be closer to xlim values, i know that xlim is at the end of test : test_short-test[(length_test-10):length_test] abscisse_test_short-abscisse_test[(length_test-10):length_test] plot((test_short/10)~abscisse_test_short,type=s,col=lightgreen,xlab=,ylab=,ylim=limite_y,xlim=limite_x) dev.off() sessionInfo() R version 2.4.0 (2006-10-03) i386-pc-linux-gnu locale: [EMAIL PROTECTED];LC_NUMERIC=C;[EMAIL PROTECTED];[EMAIL PROTECTED];[EMAIL PROTECTED];[EMAIL PROTECTED];[EMAIL PROTECTED];LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;[EMAIL PROTECTED];LC_IDENTIFICATION=C attached base packages: [1] methods stats graphics grDevices utils datasets [7] base COMTE Guillaume Ingénieur Projet Alliance Technologies Projet Philharmonie 24 rue Martre 92110 Clichy Tel : 01 40 87 48 06 Fax : 01 40 87 48 14 [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] color of opposite sign values in filled.contour
On 1/5/2007 9:08 AM, Claudia Tebaldi wrote: Dear all, especially those of you that kindly provided suggestions yesterday, I was not asking for cool palettes -- even if I now appreciate the pointers -- but I was asking for a way to make the 0 level of a filled contour plot correspond to the neutral color in the color scale when the range of my values is not symmetrical around zero...without hacking into the filled.contour function (which I'm not able to do succesfully). I couldn't get the ggplot package suggested because I'm not running MAC OSX 10.4, unfortunately...all the other suggestions didn't provide a way out...at least that I could recognize. Thanks again -- last time I bother you I promise! The easiest way would be to use the levels parameter to create breakpoints which aren't linear in the original scale. For example, x - y - seq(-10,10,len=20) z - outer(x, y, function(x,y) x^2+y^2) filled.contour(x,y,z, levels=seq(0,15)^2) If you don't want to transform the spacing between the colors, you just want to change the sequence you get, then you should write your own palette function. For example, mypalette - function(n) { n1 - n %/% 3 n2 - n - n1 + 1 c( colorRampPalette(c(red, white))(n1), colorRampPalette(c(white, blue))(n2)[-1] ) } filled.contour(x,y,z, color=mypalette) Duncan Murdoch __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] aov - glm - lmer
I am sorry to ask a trivial question but I am not a statistician. When I need to compare more than two groups in a unbalanced design with SAS system I use PROC GLM (like the example in data5.csv from Cody R. Applied statistics and SAS programming language p.223). R glm gives different results. Thanks Massimo Cressoni __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] aov - glm - lmer
In R, -- lm fits ordinary least squares linear models; this is likely to be what you want if you were using PROC GLM. -- glm fits generalized linear models (e.g. logit, Poisson, Gaussian, etc.). -- lmer and lme fit mixed models, similar to PROC MIXED Cheers, Hank On Jan 5, 2007, at 10:05 AM, cressonim wrote: I am sorry to ask a trivial question but I am not a statistician. When I need to compare more than two groups in a unbalanced design with SAS system I use PROC GLM (like the example in data5.csv from Cody R. Applied statistics and SAS programming language p.223). R glm gives different results. Thanks Massimo Cressoni __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. Dr. Martin Henry H. Stevens, Assistant Professor 338 Pearson Hall Botany Department Miami University Oxford, OH 45056 Office: (513) 529-4206 Lab: (513) 529-4262 FAX: (513) 529-4243 http://www.cas.muohio.edu/~stevenmh/ http://www.muohio.edu/ecology/ http://www.muohio.edu/botany/ E Pluribus Unum If you send an attachment, please try to send it in a format anyone can read, such as PDF, text, Open Document Format, HTML, or RTF. Please try not to send me MS Word or PowerPoint attachments- Why? See: http://www.gnu.org/philosophy/no-word-attachments.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] maximum likelihood estimation of 5 parameters
Franco, You can provide lower and upper bounds on the parameters if you use optim with method=L-BFGS-B. Hth, Ingmar Thanks, but when I use L-BFGS-B it tells me that there is an error in optim(start, f, method = method, hessian = TRUE, ...) : L-BFGS-B needs finite values of 'fn' -- View this message in context: http://www.nabble.com/maximum-likelihood-estimation-of-5-parameters-tf2925364.html#a8180120 Sent from the R help mailing list archive at Nabble.com. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] maximum likelihood estimation of 5 parameters
On Fri, 5 Jan 2007, francogrex wrote: [quoting Ingmar Vissar without attribution, contrary to the posting guide.] Franco, You can provide lower and upper bounds on the parameters if you use optim with method=L-BFGS-B. Hth, Ingmar Thanks, but when I use L-BFGS-B it tells me that there is an error in optim(start, f, method = method, hessian = TRUE, ...) : L-BFGS-B needs finite values of 'fn' It sounds as if you have ignored the advice to scale the problem via control options 'fnscale' and 'parscale'. -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] maximum likelihood estimation of 5 parameters
Franco, Is it possible that you have failed to provide the negative of loglikelihood to optim, since optim, by default, minimizes a function? If you want to do this withput redefining the log-likelihood, you should set fnscale= -1 (as hinted by Prof. Ripley). This would turn the problem into a maximization problem. If this doesn't work, you should provide more details (a reproducible code with actual error message). Ravi. --- Ravi Varadhan, Ph.D. Assistant Professor, The Center on Aging and Health Division of Geriatric Medicine and Gerontology Johns Hopkins University Ph: (410) 502-2619 Fax: (410) 614-9625 Email: [EMAIL PROTECTED] Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of francogrex Sent: Friday, January 05, 2007 10:42 AM To: r-help@stat.math.ethz.ch Subject: Re: [R] maximum likelihood estimation of 5 parameters Franco, You can provide lower and upper bounds on the parameters if you use optim with method=L-BFGS-B. Hth, Ingmar Thanks, but when I use L-BFGS-B it tells me that there is an error in optim(start, f, method = method, hessian = TRUE, ...) : L-BFGS-B needs finite values of 'fn' -- View this message in context: http://www.nabble.com/maximum-likelihood-estimation-of-5-parameters-tf292536 4.html#a8180120 Sent from the R help mailing list archive at Nabble.com. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] ifelse on data frames
On Friday 05 January 2007 12:34, Petr Pikal wrote: Hi you could use also another approach in case of data frames A - as.data.frame(A) A0 - -A*log(A) A0[is.na(A0)] - 0 I think you meant A0[which(is.na(A0))] - 0 which changes NaN's to zeroes HTH Petr Regards __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] eval(parse(text vs. get when accessing a function
Dear All, I've read Thomas Lumley's fortune If the answer is parse() you should usually rethink the question.. But I am not sure it that also applies (and why) to other situations (Lumley's comment http://tolstoy.newcastle.edu.au/R/help/05/02/12204.html was in reply to accessing a list). Suppose I have similarly called functions, except for a postfix. E.g. f.1 - function(x) {x + 1} f.2 - function(x) {x + 2} And sometimes I want to call f.1 and some other times f.2 inside another function. I can either do: g - function(x, fpost) { calledf - eval(parse(text = paste(f., fpost, sep = ))) calledf(x) ## do more stuff } Or: h - function(x, fpost) { calledf - get(paste(f., fpost, sep = )) calledf(x) ## do more stuff } Two questions: 1) Why is the second better? 2) By changing g or h I could use do.call instead; why would that be better? Because I can handle differences in argument lists? Thanks, R. -- Ramón Díaz-Uriarte Centro Nacional de Investigaciones Oncológicas (CNIO) (Spanish National Cancer Center) Melchor Fernández Almagro, 3 28029 Madrid (Spain) Fax: +-34-91-224-6972 Phone: +-34-91-224-6900 http://ligarto.org/rdiaz PGP KeyID: 0xE89B3462 (http://ligarto.org/rdiaz/0xE89B3462.asc) **NOTA DE CONFIDENCIALIDAD** Este correo electrónico, y en s...{{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] ifelse on data frames
``MrJ Man'' wrote: On Friday 05 January 2007 12:34, Petr Pikal wrote: Hi you could use also another approach in case of data frames A - as.data.frame(A) A0 - -A*log(A) A0[is.na(A0)] - 0 I think you meant A0[which(is.na(A0))] - 0 He most certainly DOES NOT mean this! You should try things out before offering gratuitous advice. (a) A0[is.na(A0)] - 0 works perfectly. (b) A0[which(is.na(A0))] - 0 gets it wrong!!! I would have thought that A0[which(is.na(A0),arr.ind=TRUE)] - 0 would work and get it right, but it gives the error message Error in `[-.data.frame`(`*tmp*`, which(is.na(A0), arr.ind = TRUE), value = 0) : only logical matrix subscripts are allowed in replacement which changes NaN's to zeroes HTH Petr cheers, Rolf Turner [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] eval(parse(text vs. get when accessing a function
Ramon Diaz-Uriarte wrote: Dear All, I've read Thomas Lumley's fortune If the answer is parse() you should usually rethink the question.. But I am not sure it that also applies (and why) to other situations (Lumley's comment http://tolstoy.newcastle.edu.au/R/help/05/02/12204.html was in reply to accessing a list). Suppose I have similarly called functions, except for a postfix. E.g. f.1 - function(x) {x + 1} f.2 - function(x) {x + 2} And sometimes I want to call f.1 and some other times f.2 inside another function. I can either do: g - function(x, fpost) { calledf - eval(parse(text = paste(f., fpost, sep = ))) calledf(x) ## do more stuff } Or: h - function(x, fpost) { calledf - get(paste(f., fpost, sep = )) calledf(x) ## do more stuff } Two questions: 1) Why is the second better? 2) By changing g or h I could use do.call instead; why would that be better? Because I can handle differences in argument lists? Who says that they are better? If the question is how to call a function specified by half of its name, the answer could well be to use parse(), the point is that you should rethink whether that was really the right question. Why not instead, e.g. f - list(1=function(x) {x + 1} , 2=function(x) {x + 2}) h - function(x, fpost) f[[fpost]](x) h(2,2) [1] 4 h(2,1) [1] 3 Thanks, R. -- O__ Peter Dalgaard Øster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] eval(parse(text vs. get when accessing a function
?? Or to add to what Peter Dalgaard said... (perhaps for the case of many more functions) Why eval(parse())? What's wrong with if then? g - function(fpost,x){if(fpost==1)f.1 else f.2 }(x) or switch() if you have more than 2 possible arguments? I think your remarks reinforce the wisdom of Thomas's axiom . Bert Gunter Genentech Nonclinical Statistics South San Francisco, CA 94404 -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Ramon Diaz-Uriarte Sent: Friday, January 05, 2007 10:02 AM To: r-help; [EMAIL PROTECTED] Subject: [R] eval(parse(text vs. get when accessing a function Dear All, I've read Thomas Lumley's fortune If the answer is parse() you should usually rethink the question.. But I am not sure it that also applies (and why) to other situations (Lumley's comment http://tolstoy.newcastle.edu.au/R/help/05/02/12204.html was in reply to accessing a list). Suppose I have similarly called functions, except for a postfix. E.g. f.1 - function(x) {x + 1} f.2 - function(x) {x + 2} And sometimes I want to call f.1 and some other times f.2 inside another function. I can either do: g - function(x, fpost) { calledf - eval(parse(text = paste(f., fpost, sep = ))) calledf(x) ## do more stuff } Or: h - function(x, fpost) { calledf - get(paste(f., fpost, sep = )) calledf(x) ## do more stuff } Two questions: 1) Why is the second better? 2) By changing g or h I could use do.call instead; why would that be better? Because I can handle differences in argument lists? Thanks, R. -- Ramón Díaz-Uriarte Centro Nacional de Investigaciones Oncológicas (CNIO) (Spanish National Cancer Center) Melchor Fernández Almagro, 3 28029 Madrid (Spain) Fax: +-34-91-224-6972 Phone: +-34-91-224-6900 http://ligarto.org/rdiaz PGP KeyID: 0xE89B3462 (http://ligarto.org/rdiaz/0xE89B3462.asc) **NOTA DE CONFIDENCIALIDAD** Este correo electrónico, y en s...{{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] eval(parse(text vs. get when accessing a function
On Friday 05 January 2007 19:21, Peter Dalgaard wrote: Ramon Diaz-Uriarte wrote: Dear All, I've read Thomas Lumley's fortune If the answer is parse() you should usually rethink the question.. But I am not sure it that also applies (and why) to other situations (Lumley's comment http://tolstoy.newcastle.edu.au/R/help/05/02/12204.html was in reply to accessing a list). Suppose I have similarly called functions, except for a postfix. E.g. f.1 - function(x) {x + 1} f.2 - function(x) {x + 2} And sometimes I want to call f.1 and some other times f.2 inside another function. I can either do: g - function(x, fpost) { calledf - eval(parse(text = paste(f., fpost, sep = ))) calledf(x) ## do more stuff } Or: h - function(x, fpost) { calledf - get(paste(f., fpost, sep = )) calledf(x) ## do more stuff } Two questions: 1) Why is the second better? 2) By changing g or h I could use do.call instead; why would that be better? Because I can handle differences in argument lists? Dear Peter, Thanks for your answer. Who says that they are better? If the question is how to call a function specified by half of its name, the answer could well be to use parse(), the point is that you should rethink whether that was really the right question. Why not instead, e.g. f - list(1=function(x) {x + 1} , 2=function(x) {x + 2}) h - function(x, fpost) f[[fpost]](x) h(2,2) [1] 4 h(2,1) [1] 3 I see, this is direct way of dealing with the problem. However, you first need to build the f list, and you might not know about that ahead of time. For instance, if I build a function so that the only thing that you need to do to use my function g is to call your function f.something, and then pass the something. I am still under the impression that, given your answer, using eval(parse(text is not your preferred way. What are the possible problems (if there are any, that is). I guess I am puzzled by rethink whether that was really the right question. Thanks, R. Thanks, R. -- Ramón Díaz-Uriarte Centro Nacional de Investigaciones Oncológicas (CNIO) (Spanish National Cancer Center) Melchor Fernández Almagro, 3 28029 Madrid (Spain) Fax: +-34-91-224-6972 Phone: +-34-91-224-6900 http://ligarto.org/rdiaz PGP KeyID: 0xE89B3462 (http://ligarto.org/rdiaz/0xE89B3462.asc) **NOTA DE CONFIDENCIALIDAD** Este correo electrónico, y en s...{{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] eval(parse(text vs. get when accessing a function
On Friday 05 January 2007 19:35, Bert Gunter wrote: ?? Or to add to what Peter Dalgaard said... (perhaps for the case of many more functions) Why eval(parse())? What's wrong with if then? g - function(fpost,x){if(fpost==1)f.1 else f.2 }(x) or switch() if you have more than 2 possible arguments? I think your remarks reinforce the wisdom of Thomas's axiom . Thanks, Bert, but as with Peter's solution, your solution forces me to build g ahead of time. And again, I am not sure I see why the attempt to avoid eval(parse(text. Best, R. Bert Gunter Genentech Nonclinical Statistics South San Francisco, CA 94404 -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Ramon Diaz-Uriarte Sent: Friday, January 05, 2007 10:02 AM To: r-help; [EMAIL PROTECTED] Subject: [R] eval(parse(text vs. get when accessing a function Dear All, I've read Thomas Lumley's fortune If the answer is parse() you should usually rethink the question.. But I am not sure it that also applies (and why) to other situations (Lumley's comment http://tolstoy.newcastle.edu.au/R/help/05/02/12204.html was in reply to accessing a list). Suppose I have similarly called functions, except for a postfix. E.g. f.1 - function(x) {x + 1} f.2 - function(x) {x + 2} And sometimes I want to call f.1 and some other times f.2 inside another function. I can either do: g - function(x, fpost) { calledf - eval(parse(text = paste(f., fpost, sep = ))) calledf(x) ## do more stuff } Or: h - function(x, fpost) { calledf - get(paste(f., fpost, sep = )) calledf(x) ## do more stuff } Two questions: 1) Why is the second better? 2) By changing g or h I could use do.call instead; why would that be better? Because I can handle differences in argument lists? Thanks, R. -- Ramón Díaz-Uriarte Centro Nacional de Investigaciones Oncológicas (CNIO) (Spanish National Cancer Center) Melchor Fernández Almagro, 3 28029 Madrid (Spain) Fax: +-34-91-224-6972 Phone: +-34-91-224-6900 http://ligarto.org/rdiaz PGP KeyID: 0xE89B3462 (http://ligarto.org/rdiaz/0xE89B3462.asc) **NOTA DE CONFIDENCIALIDAD** Este correo electrónico, y en s...{{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] eval(parse(text vs. get when accessing a function
The other reason for considering which of the different approaches to use would be performance: f.1 - function(x) x+1 f.2 - function(x) x+2 system.time({ + for (i in 1:10){ + eval(parse(text=paste('f.', i%%2+1, sep='')))(i) + } + }) [1] 6.96 0.00 8.32 NA NA system.time({ + for (i in 1:10){ + {if(i %% 2 == 0) f.1 else f.2}(i) + } + }) [1] 0.52 0.00 0.61 NA NA eval(parse...) seems to be an order of magnitude slower. It would make a difference if you were calling it several thousand times; so it depends on your application. On 1/5/07, Ramon Diaz-Uriarte [EMAIL PROTECTED] wrote: On Friday 05 January 2007 19:35, Bert Gunter wrote: ?? Or to add to what Peter Dalgaard said... (perhaps for the case of many more functions) Why eval(parse())? What's wrong with if then? g - function(fpost,x){if(fpost==1)f.1 else f.2 }(x) or switch() if you have more than 2 possible arguments? I think your remarks reinforce the wisdom of Thomas's axiom . Thanks, Bert, but as with Peter's solution, your solution forces me to build g ahead of time. And again, I am not sure I see why the attempt to avoid eval(parse(text. Best, R. Bert Gunter Genentech Nonclinical Statistics South San Francisco, CA 94404 -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Ramon Diaz-Uriarte Sent: Friday, January 05, 2007 10:02 AM To: r-help; [EMAIL PROTECTED] Subject: [R] eval(parse(text vs. get when accessing a function Dear All, I've read Thomas Lumley's fortune If the answer is parse() you should usually rethink the question.. But I am not sure it that also applies (and why) to other situations (Lumley's comment http://tolstoy.newcastle.edu.au/R/help/05/02/12204.html was in reply to accessing a list). Suppose I have similarly called functions, except for a postfix. E.g. f.1 - function(x) {x + 1} f.2 - function(x) {x + 2} And sometimes I want to call f.1 and some other times f.2 inside another function. I can either do: g - function(x, fpost) { calledf - eval(parse(text = paste(f., fpost, sep = ))) calledf(x) ## do more stuff } Or: h - function(x, fpost) { calledf - get(paste(f., fpost, sep = )) calledf(x) ## do more stuff } Two questions: 1) Why is the second better? 2) By changing g or h I could use do.call instead; why would that be better? Because I can handle differences in argument lists? Thanks, R. -- Ramón Díaz-Uriarte Centro Nacional de Investigaciones Oncológicas (CNIO) (Spanish National Cancer Center) Melchor Fernández Almagro, 3 28029 Madrid (Spain) Fax: +-34-91-224-6972 Phone: +-34-91-224-6900 http://ligarto.org/rdiaz PGP KeyID: 0xE89B3462 (http://ligarto.org/rdiaz/0xE89B3462.asc) **NOTA DE CONFIDENCIALIDAD** Este correo electrónico, y en s...{{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve? [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] problem with logLik and offsets
The problem is with residuals in model3. Fixed in 2.4.1 patched and later. On Wed, 3 Jan 2007, Jarrod Hadfield wrote: Hi, I'm trying to compare models, one of which has all parameters fixed using offsets. The log-likelihoods seem reasonble in all cases except the model in which there are no free parameters (model3 in the toy example below). Any help would be appreciated. Cheers, Jarrod x-rnorm(100) y-rnorm(100, 1+x) model1-lm(y~x) logLik(model1) sum(dnorm(y, predict(model1), summary(model1)$sigma,log=TRUE)) # no offset - in agreement model2-lm(y~offset(rep(1,100))+x-1) logLik(model2) sum(dnorm(y, predict(model2),summary(model2)$sigma,log=TRUE)) # offset and free parameters - in agreement model3-lm(y~offset(rep(1,100))+offset(x)-1) logLik(model3) sum(dnorm(y, predict(model3),summary(model3)$sigma,log=TRUE)) # offset only - discrepancy sum(predict(model3)-c(1+x)) # yet predict is correct __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] split-plot multiple comparisons
Thank you for your example. I am now using it an example in the MMC Mean--mean Multiple Comparisons plot in the HH package on CRAN. The source for HH_1.17 is on CRAN in Austria as of this morning. The Windows and MacOS binaries will be on CRAN in a few days and will then propagate to the mirrors. Until then, you can get the R Windows binary directly from me at http://astro.ocis.temple.edu/~rmh/HH/HH_1.17.zip ## R Windows binary http://astro.ocis.temple.edu/~rmh/HH/HH_1.17.tar.gz ## source http://astro.ocis.temple.edu/~rmh/HH/HH_1.17_S_WIN386.zip ## S-Plus 8 Windows binary The S-Plus 8 package is also available at http://csan.insightful.com/ After you install and load HH with library(HH) ?MMC will give the complete example. Some comments on the example. The first interaction2wt figure shows parallel traces in the blocks, visually confirming that the blocks appear to be orthogonal to the treatments. The second interaction2wt figure shows a hint of the hibrido:nitrogeno interaction since the P3732 trace is monotone increasing in nitrogen and the others bend back. The MMC plot shows very clearly that the hybrids fall into two different groups, with LH74, P3747, and P3732 in one group and with Mol17 and A632 in the other group. There is a non-significant distinction between LH74 and the two P37** varieties. The MMC plot needs the tiebreaker in this example because observed means for several of the groups are almost identical. The MMC plot is described in Journal of Computational Graphical Statistics 2006, vol. 15, no. 4, pp. 937 - 955 Mean-Mean Multiple Comparison Displays for Families of Linear Contrasts Richard M. Heiberger; Burt Holland Abstract Traditional tabular and graphical displays of results of simultaneous confidence intervals or hypothesis tests are deficient in several respects. Expanding on earlier work, we present new mean-mean multiple comparison graphs that succinctly and compactly display the results of traditional procedures for multiple comparisons of population means or linear contrasts involving means. The MMC plot can be used with unbalanced, multifactor designs with covariates. After reviewing the construction of these displays in the S language (S-Plus and R), we demonstrate their application to four multiple comparison scenarios. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] memory limits in R loading a dataset and using the package tree
IMHO, R is not good at really large-scale data mining, esp. when the algorithm is complicated. The alternatives are 1. sampling your data; sometimes you really do not need that large number of records and the accuracy might already be good enough when you load less. 2. find an alternative (commercial software) to do this job if you really need to load all. 3. make a wrapper function, sampling your data and load it into R and build model and repeat this process until you get n models. Then you can do like meta-learning or simply majority-win if your problem is classification. HTH, On 1/4/07, domenico pestalozzi [EMAIL PROTECTED] wrote: I think the question is discussed in other thread, but I don't exactly find what I want . I'm working in Windows XP with 2GB of memory and a Pentium 4 - 3.00Ghx. I have the necessity of working with large dataset, generally from 300,000 records to 800,000 (according to the project), and about 300 variables (...but a dataset with 800,000 records could not be large in your opinion...). Because of we are deciding if R will be the official software in our company, I'd like to say if the possibility of using R with these datasets depends only by the characteristics of the engine (memory and processor). In this case we can improve the machine (for example, what memory you reccomend?). For example, I have a dataset of 200,000 records and 211 variables but I can't load the dataset because R doesn't work : I control the loading procedure (read.table in R) by using the windows task-manager and R is blocked when the file paging is 1.10 GB. After this I try with a sample of 100,000 records and I can correctly load tha dataset, but I'd like to use the package tree, but after some seconds ( I use this tree(variable1~., myDataset) ) I obtain the message Reached total allocation of 1014Mb. I'd like your opinion and suggestion, considering that I could improve (in memory) my computer. pestalozzi [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Weiwei Shi, Ph.D Research Scientist GeneGO, Inc. Did you always know? No, I did not. But I believed... ---Matrix III __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Error in save function
Hi r-user, I am trying to save same objects in workspace, but the next error message is show in line command, save(area,cent,larg,mran,eddy,dia,vort,slan,file=stat.Rdata) Erro en save.default(area, cent, larg, mran, eddy, dia, vort, slan, file = stat.Rdata) : 5 arguments passed to 'saveToConn' which requires 6 Is there a way to help me solve this problems? Thank Fernando __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] maximum likelihood estimation of 5 parameters
Using the inverse logistic transform to replace p by exp(xp)/(1+exp(xp)) allows unconstrained fitting of xp. There may still be problems where xp tends to + or - infinity depending on starting values. francogrex [EMAIL PROTECTED] 01/05/07 11:54 PM Hi Guys, it would be great if you could help me with a MLE problem in R. I am trying to evaluate the maximum likelihood estimates of theta = (a1, b1, a2, b2, P) which defines a mixture of a Poisson distribution and two gamma prior distributions (where the Poisson means have a gamma distribution, actually 2 gammas and P is the mixing factor). The likelihood function for theta is L(theta) = Pi,j{P f(Nij; a1, b1, Eij) + (1 * P) f(Nij; a2, b2, Eij),} The maximum likelihood estimate of theta is the vector that maximizes the above equation (the values of N and E are given). The authors of the article I read say that the maximization involves an iterative search in the five dimensional parameter space, where each iteration involves computing log[L(theta)] and its first and second-order derivatives. In test runs it is suggested that the maximization typically takes between 5 and 15 iterations from the starting point theta = (a1 = 0.2, b1 = 0.1, a2 = 2, b2 = 4, P = 1/3). Now I have done maximization of a gamma-poisson mixture before (1 poisson, 1 gamma) successfully and I could determine correctly alpha (a) and beta(a). But this one above is giving me ridiculously large unusable values (for example P should not be above 1 and sometimes I get values of 500!) or even negative values! I know the values I should be obtaining with my samples shouldn't be far from the staring points. Is there a way to help me solve this issue? Thanks. -- View this message in context: http://www.nabble.com/maximum-likelihood-estimation-of-5-parameters-tf2925364.html#a8177473 Sent from the R help mailing list archive at Nabble.com. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Error in save function
There is no 'save.default' function in R. You were asked in the posting guide to report the results of sessionInfo(). (It looks like an old version of R.oo might be the culprit, and you might need to update.packages().) On Fri, 5 Jan 2007, fernando espindola wrote: Hi r-user, I am trying to save same objects in workspace, but the next error message is show in line command, save(area,cent,larg,mran,eddy,dia,vort,slan,file=stat.Rdata) Erro en save.default(area, cent, larg, mran, eddy, dia, vort, slan, file = stat.Rdata) : 5 arguments passed to 'saveToConn' which requires 6 Is there a way to help me solve this problems? Thank Fernando __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] eval(parse(text vs. get when accessing a function
Ramon, I prefer to use the list method for this type of thing, here are a couple of reasons why (maybe you are more organized than me and would never do some of the stupid things that I have, so these don't apply to you, but you can see that the general suggestion applys to some of the rest of us). Using the list forces you to think about what functions may be called and thinking about things before doing them is usually a good idea. Personally I don't trust the user of my functions (usually my future self who has forgotten something that seemed obvious at the time) to not do something stupid with them. With list elements you can have names for the functions and access them either by the name or by a number, I find that a lot easier when I go back to edit/update than to remember which function f.1 or f.2 did what. With your function, what if the user runs: g(5,3) What should it do? (you have only shown definitions for f.1 and f.2). With my luck I would accidentily type that and just happen to have a f.3 function sitting around from a previous project that does something that I really don't want it to do now. If I use the list approach then I will get a subscript out of bounds error rather than running something unintended. 2nd, If I used the eval-parse approach then I would probably at some point redefine f.1 or f.2 to the output of a regression analysis or something, then go back and run the g function at a later time and wonder why I am getting an error, then once I have finally figured it out, now I need to remember what f.1 did and rewrite it again. I am much less likely to accidentally replace an element of a list, and if the list is well named I am unlikely to replace the whole list by accident. 3rd, If I ever want to use this code somewhere else (new version of R, on the laptop, give to coworker, ...), it is a lot easier to save and load a single list than to try to think of all the functions that need to be saved. Personally I have never regretted trying not to underestimate my own future stupidity. Hope this helps, -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare [EMAIL PROTECTED] (801) 408-8111 -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Ramon Diaz-Uriarte Sent: Friday, January 05, 2007 11:41 AM To: Peter Dalgaard Cc: r-help; [EMAIL PROTECTED] Subject: Re: [R] eval(parse(text vs. get when accessing a function On Friday 05 January 2007 19:21, Peter Dalgaard wrote: Ramon Diaz-Uriarte wrote: Dear All, I've read Thomas Lumley's fortune If the answer is parse() you should usually rethink the question.. But I am not sure it that also applies (and why) to other situations (Lumley's comment http://tolstoy.newcastle.edu.au/R/help/05/02/12204.html was in reply to accessing a list). Suppose I have similarly called functions, except for a postfix. E.g. f.1 - function(x) {x + 1} f.2 - function(x) {x + 2} And sometimes I want to call f.1 and some other times f.2 inside another function. I can either do: g - function(x, fpost) { calledf - eval(parse(text = paste(f., fpost, sep = ))) calledf(x) ## do more stuff } Or: h - function(x, fpost) { calledf - get(paste(f., fpost, sep = )) calledf(x) ## do more stuff } Two questions: 1) Why is the second better? 2) By changing g or h I could use do.call instead; why would that be better? Because I can handle differences in argument lists? Dear Peter, Thanks for your answer. Who says that they are better? If the question is how to call a function specified by half of its name, the answer could well be to use parse(), the point is that you should rethink whether that was really the right question. Why not instead, e.g. f - list(1=function(x) {x + 1} , 2=function(x) {x + 2}) h - function(x, fpost) f[[fpost]](x) h(2,2) [1] 4 h(2,1) [1] 3 I see, this is direct way of dealing with the problem. However, you first need to build the f list, and you might not know about that ahead of time. For instance, if I build a function so that the only thing that you need to do to use my function g is to call your function f.something, and then pass the something. I am still under the impression that, given your answer, using eval(parse(text is not your preferred way. What are the possible problems (if there are any, that is). I guess I am puzzled by rethink whether that was really the right question. Thanks, R. Thanks, R. -- Ramón Díaz-Uriarte Centro Nacional de Investigaciones Oncológicas (CNIO) (Spanish National Cancer Center) Melchor Fernández Almagro, 3 28029 Madrid (Spain) Fax: +-34-91-224-6972 Phone: +-34-91-224-6900 http://ligarto.org/rdiaz PGP KeyID: 0xE89B3462
Re: [R] eval(parse(text vs. get when accessing a function
Greg Snow wrote: Personally I have never regretted trying not to underestimate my own future stupidity. Fortune! __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] memory limits in R loading a dataset and using the packagetree
I agree about sampling, but.. You can go a little further with your hardware. The defaults in R is to play nice and limit your allocation to half the available RAM. Make sure you have a lot of disk swap space (at least 1G with 2G of RAM) and you can set your memory limit to 2G for R. See help(memory.size) and use the memory.limit function Hugues P.s. Someone let me use their 16Gig of RAM linux And I was able to run R-64 bits with top showing 6Gigs of RAM allocated (with suitable --max-mem-size command line parameters at startup for R). -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Weiwei Shi Sent: Friday, January 05, 2007 2:12 PM To: domenico pestalozzi Cc: r-help@stat.math.ethz.ch Subject: Re: [R] memory limits in R loading a dataset and using the packagetree IMHO, R is not good at really large-scale data mining, esp. when the algorithm is complicated. The alternatives are 1. sampling your data; sometimes you really do not need that large number of records and the accuracy might already be good enough when you load less. 2. find an alternative (commercial software) to do this job if you really need to load all. 3. make a wrapper function, sampling your data and load it into R and build model and repeat this process until you get n models. Then you can do like meta-learning or simply majority-win if your problem is classification. HTH, On 1/4/07, domenico pestalozzi [EMAIL PROTECTED] wrote: I think the question is discussed in other thread, but I don't exactly find what I want . I'm working in Windows XP with 2GB of memory and a Pentium 4 - 3.00Ghx. I have the necessity of working with large dataset, generally from 300,000 records to 800,000 (according to the project), and about 300 variables (...but a dataset with 800,000 records could not be large in your opinion...). Because of we are deciding if R will be the official software in our company, I'd like to say if the possibility of using R with these datasets depends only by the characteristics of the engine (memory and processor). In this case we can improve the machine (for example, what memory you reccomend?). For example, I have a dataset of 200,000 records and 211 variables but I can't load the dataset because R doesn't work : I control the loading procedure (read.table in R) by using the windows task-manager and R is blocked when the file paging is 1.10 GB. After this I try with a sample of 100,000 records and I can correctly load tha dataset, but I'd like to use the package tree, but after some seconds ( I use this tree(variable1~., myDataset) ) I obtain the message Reached total allocation of 1014Mb. I'd like your opinion and suggestion, considering that I could improve (in memory) my computer. pestalozzi [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Weiwei Shi, Ph.D Research Scientist GeneGO, Inc. Did you always know? No, I did not. But I believed... ---Matrix III __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] maximum likelihood estimation of 5 parameters
No I have not forgotten to use a negative fnscale to optimize, so as you suggest I will post some parts of the code I am running to show you the errors: n [1] 3 1 4 54 6 58 20 14 3 14 4 65 1 7 9 10 2 4 66 [20] 5 9 7 12 7 55 105 2 5 10 55 5 28 1 1 6 2 1 30 [39] 6 49 7 21 8 7 e [1] 21.763201 1.209070 4.836270 32.644798 19.546600 24.584400 30.226700 [8] 6.045340 14.010100 3.113350 21.015100 12.583100 15.826200 19.458401 [15] 3.891690 1.329970 0.241814 3.143580 13.057900 0.725441 18.136000 [22] 2.187660 6.319900 1.701510 29.654900 36.460999 7.292190 1.215370 [29] 3.209070 19.995001 11.972300 3.455920 0.138539 0.113350 1.360200 [36] 1.889170 1.518890 18.226700 4.050380 27.340099 1.181360 16.370300 [43] 20.589399 25.314899 fr-function(a1,b1,a2,b2,p){ + + w-((gamma(a1+n)))/((gamma(a1)*factorial(n))*(1+(e/b1)^a1)*(1+(b1/e)^n)) + z-((gamma(a2+n)))/((gamma(a2)*factorial(n))*(1+(e/b2)^a2)*(1+(b2/e)^n)) + + sum (log( (p*w)+ ((1-p)*z) )) + + } mle((fr), start=list(a1=0.2,b1=0.1,a2=2,b2=4,p=0.33),method=BFGS,control=list(fnscale=-1)) Error in optim(start, f, method = method, hessian = TRUE, ...) : non-finite finite-difference value [2] And with the L-BFGS-B: Error in optim(start, f, method = method, hessian = TRUE, ...) : L-BFGS-B needs finite values of 'fn' AND WITH NELDER-MEAD it doesn't work either (same error), but when I change intial parameters (though I shouldn't, it gives something very weird (negatives or sometimes huge values). Call: mle(minuslogl = (fr), start = list(a1 = 1, b1 = 1, a2 = 10, b2 = 10, p = 0.9), method = Nelder-Mead, control = list(fnscale = -1)) Coefficients: a1 b1 a2 b2 p -2.5035823 0.6236359 26.5562988 12.9604112 -0.1383767 Thanks Ravi Varadhan wrote: Franco, Is it possible that you have failed to provide the negative of loglikelihood to optim, since optim, by default, minimizes a function? If you want to do this withput redefining the log-likelihood, you should set fnscale= -1 (as hinted by Prof. Ripley). This would turn the problem into a maximization problem. If this doesn't work, you should provide more details (a reproducible code with actual error message). Ravi. --- Ravi Varadhan, Ph.D. Assistant Professor, The Center on Aging and Health Division of Geriatric Medicine and Gerontology Johns Hopkins University Ph: (410) 502-2619 Fax: (410) 614-9625 Email: [EMAIL PROTECTED] Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of francogrex Sent: Friday, January 05, 2007 10:42 AM To: r-help@stat.math.ethz.ch Subject: Re: [R] maximum likelihood estimation of 5 parameters Franco, You can provide lower and upper bounds on the parameters if you use optim with method=L-BFGS-B. Hth, Ingmar Thanks, but when I use L-BFGS-B it tells me that there is an error in optim(start, f, method = method, hessian = TRUE, ...) : L-BFGS-B needs finite values of 'fn' -- View this message in context: http://www.nabble.com/maximum-likelihood-estimation-of-5-parameters-tf292536 4.html#a8180120 Sent from the R help mailing list archive at Nabble.com. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- View this message in context: http://www.nabble.com/maximum-likelihood-estimation-of-5-parameters-tf2925364.html#a8186869 Sent from the R help mailing list archive at Nabble.com. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] eval(parse(text vs. get when accessing a function
Peter Dalgaard wrote: Greg Snow wrote: Personally I have never regretted trying not to underestimate my own future stupidity. Fortune! Seconded. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Calling confint.glm from within another function
On July 12, 2004 Spencer Graves wrote an email describing essentially the same issue that I would like help on: calling the confint function from within another homemade function. Because he provided many good examples of the problem, I will not reproduce them here but will instead refer readers to the original posting: http://www.mail-archive.com/r-help@stat.math.ethz.ch/msg23826.html It is not clear from the replies to his posting how he managed to solve the problem, so I am reposting a similar note again in hopes of guidance. In particular, I would like to know if and how I need to modify the confint function call so that I can access it from within another function that I have written. Thanks, Brant Inman __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] eval(parse(text vs. get when accessing a function
On Fri, 5 Jan 2007, Ramon Diaz-Uriarte wrote: I see, this is direct way of dealing with the problem. However, you first need to build the f list, and you might not know about that ahead of time. For instance, if I build a function so that the only thing that you need to do to use my function g is to call your function f.something, and then pass the something. I am still under the impression that, given your answer, using eval(parse(text is not your preferred way. What are the possible problems (if there are any, that is). I guess I am puzzled by rethink whether that was really the right question. There are definitely situations where parse() is necessary or convenient, or we wouldn't provide it. For example, there are some formula-manipulation problems where it really does seem to be the best solution. The point of my observation was that it is relatively common for people to ask about parse() solutions to problems, but relatively rare to see them in code by experienced R programmers. The 'rethink the question' point is that a narrowly-posed programming problem may suggest parse() as the answer, when thinking more broadly about what you are trying to do may allow a completely different approach [the example of lists is a common one]. The problem with eval(parse()) is not primarily one of speed. A problem with parse() is than manipulating text strings is easy to mess up, since text has so much less structure than code. A problem with eval() is that it is too powerful -- since it can do anything, it is harder to keep track of what it is doing. In one sense this is just a style issue, but I still think my comment is good advice. If you find yourself wanting to use parse() it is a good idea to stop and think about whether there is a better way to do it. Often, there is. Sometimes, there isn't. -thomas Thomas Lumley Assoc. Professor, Biostatistics [EMAIL PROTECTED] University of Washington, Seattle __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Calling confint.glm from within another function
On Fri, 2007-01-05 at 17:01 -0600, Inman, Brant A. M.D. wrote: On July 12, 2004 Spencer Graves wrote an email describing essentially the same issue that I would like help on: calling the confint function from within another homemade function. Because he provided many good examples of the problem, I will not reproduce them here but will instead refer readers to the original posting: http://www.mail-archive.com/r-help@stat.math.ethz.ch/msg23826.html It is not clear from the replies to his posting how he managed to solve the problem, so I am reposting a similar note again in hopes of guidance. In particular, I would like to know if and how I need to modify the confint function call so that I can access it from within another function that I have written. Thanks, Brant Inman I'll see your thread and raise you one:-) https://stat.ethz.ch/pipermail/r-devel/2006-December/043952.html Bottom line, it has been fixed in a recent update to the VR bundle. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] maximum likelihood estimation of 5 parameters
Franco, It is up to the user of mle() to define a function that is numerically well behaved - even near the boundaries - and/or to provide suitable boundary constraints. Using gamma() in situations in which the argument could be 500 (say) will cause problems. It appears that the maximum of fr is attained for values for which gamma(a2+n) cannot be evaluated using machine arithmetic (and for very small values of 'p', too). Does this function truly have a finite maximum for these data? Dividing by very small numbers is also a potential killer. Trying to exponentiate potentially negative numbers is also asking for trouble. So, you need to write a more robust version of 'fr'. Some suggestions: Use lgamma() or choose() or lchoose() rather than gamma() or factorial() Try to add and subtract logarithms in preference to multiplying and dividing as cancellations usually are more accurate and you are less likely to run into machine accuracy issues. Replace p with plogis(p.x) and consider bounding p.x to keep plogis(x) away from 0 and 1 Establish boundary constraints to keep things like (1+(e/b1)^a1)*(1+(b1/e)^n) from causing problems. Consider log1p(), etc or writing this out longhand and doing some cancellations. Worry about fnscale. It seems unlikely that all args are indeed on the same scale. Use stingy boundary constrants to get a better set of starting values. Progressively relax the constraints and watch how the parameter changes. If after all that, your new fr function still hands you non-finite values, then you need to investigate to find out where they are coming from. There are many ways to do this, but broadly speaking you are 'debugging' your code, so use the manuals and RSiteSearch to figure out what you need to do to become skilled in debugging. On Fri, 5 Jan 2007, francogrex wrote: No I have not forgotten to use a negative fnscale to optimize, so as you suggest I will post some parts of the code I am running to show you the errors: n [1] 3 1 4 54 6 58 20 14 3 14 4 65 1 7 9 10 2 4 66 [20] 5 9 7 12 7 55 105 2 5 10 55 5 28 1 1 6 2 1 30 [39] 6 49 7 21 8 7 e [1] 21.763201 1.209070 4.836270 32.644798 19.546600 24.584400 30.226700 [8] 6.045340 14.010100 3.113350 21.015100 12.583100 15.826200 19.458401 [15] 3.891690 1.329970 0.241814 3.143580 13.057900 0.725441 18.136000 [22] 2.187660 6.319900 1.701510 29.654900 36.460999 7.292190 1.215370 [29] 3.209070 19.995001 11.972300 3.455920 0.138539 0.113350 1.360200 [36] 1.889170 1.518890 18.226700 4.050380 27.340099 1.181360 16.370300 [43] 20.589399 25.314899 fr-function(a1,b1,a2,b2,p){ + + w-((gamma(a1+n)))/((gamma(a1)*factorial(n))*(1+(e/b1)^a1)*(1+(b1/e)^n)) + z-((gamma(a2+n)))/((gamma(a2)*factorial(n))*(1+(e/b2)^a2)*(1+(b2/e)^n)) + + sum (log( (p*w)+ ((1-p)*z) )) + + } mle((fr), start=list(a1=0.2,b1=0.1,a2=2,b2=4,p=0.33),method=BFGS,control=list(fnscale=-1)) Error in optim(start, f, method = method, hessian = TRUE, ...) : non-finite finite-difference value [2] And with the L-BFGS-B: Error in optim(start, f, method = method, hessian = TRUE, ...) : L-BFGS-B needs finite values of 'fn' AND WITH NELDER-MEAD it doesn't work either (same error), but when I change intial parameters (though I shouldn't, it gives something very weird (negatives or sometimes huge values). Call: mle(minuslogl = (fr), start = list(a1 = 1, b1 = 1, a2 = 10, b2 = 10, p = 0.9), method = Nelder-Mead, control = list(fnscale = -1)) Coefficients: a1 b1 a2 b2 p -2.5035823 0.6236359 26.5562988 12.9604112 -0.1383767 Thanks Ravi Varadhan wrote: Franco, Is it possible that you have failed to provide the negative of loglikelihood to optim, since optim, by default, minimizes a function? If you want to do this withput redefining the log-likelihood, you should set fnscale= -1 (as hinted by Prof. Ripley). This would turn the problem into a maximization problem. If this doesn't work, you should provide more details (a reproducible code with actual error message). Ravi. --- Ravi Varadhan, Ph.D. Assistant Professor, The Center on Aging and Health Division of Geriatric Medicine and Gerontology Johns Hopkins University Ph: (410) 502-2619 Fax: (410) 614-9625 Email: [EMAIL PROTECTED] Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of francogrex Sent: Friday, January 05, 2007 10:42 AM To: r-help@stat.math.ethz.ch Subject: Re: [R]
[R] Bootstrapping Confidence Intervals for Medians
I apologize for this post. I am new to R (two days) and I have tried and tried to calculated confidence intervals for medians. Can someone help me? Here is my data: institution1 0.21 0.16 0.32 0.69 1.15 0.9 0.87 0.87 0.73 The first four observations compose group 1 and observations 5 through 9 compose group 2. I would like to create a bootstrapped 90% confidence interval on the difference of the medians (n2-n1). I have successfully calculated a permutation test. This shouldn't be as difficult as I am making it, would someone please enlighten me? Please reply to [EMAIL PROTECTED] as I have not subscribed to this list yet. TIA, Greg Gilbert, Faculty Research Associate Department of Biostatistics, Bioinformatics, Epidemiology Medical University of South Carolina __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] help with gls
Hello R-users, I am using gls function in R to fit a model with certain correlation structure. The medol as: fit.a-gls(y~1,data=test.data,correlation=corAR1(form=~1|aa),method=ML) mu-summary(fit.a)$coefficient With the toy data I made to test, the estimate of mu is exactly equal to the overall mean of y which can not be true. But, if I make a toy data with y more than two replicates (for each level of aa we have more than 2 abservations, for example N=4), the estimates of mu will not be as the same as the overall mean of y. Would you please tell me why this happens? The following is my testing code. Thanks. James require(mvtnorm) rho=-0.5 N=2 sigma.a=2 Rho.a-matrix(c(1,rho^(1:(N-1)))[outer(X=1:N,Y=1:N,function(x,y) 1+abs(x-y))],ncol=N) Sigma.a-sigma.a^2 * Rho/(1-rho.a^2) y-rep(0,N*20) for (ii in 1:20) {y[((ii-1)*N+1):(ii*N)]-rmvnorm(1, mean=rep(0,N), sigma=Sigma.a) } test.data-data.frame(y=y,aa=rep(1:20,each=N,len=N*20)) fit.a-gls(y~1,data=test.data,correlation=corAR1(form=~1|aa),method=ML) mu.a-summary(fit.a)$coefficient rho.a-getVarCov(fit.a)[1,2]/getVarCov(fit.a)[1,1] print(c(mean(y),mu.a)) __ Zhenqiang (James) Lu Department of Statistics Purdue University West Lafayette, IN 47906 TEL: (765) 494-0027 FAX: (765) 494-0558 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.