Re: [R] ERROR: Object not found
Error originates in the customized function ode. When IN!=0, You did not assign a value to dIN which is required in list(c(dP1,dP2,dIN)). On 2010-9-21 2:30, Tianchan Niu wrote: Dear All, I am trying to use ode solver rk4 to solve an ODE system, however, it keeps saying: Error in eval(expr, envir, enclos) : object dIN not found. The sample codes are enclosed as follows, please help me. Thank you very much! rm(list=ls()) library(odesolve) # The ODE system ode- function(t,x,p){ with(as.list(c(x,p)),{ if(IN==0){ dIN- 1 switch- c } else { switch- 0 } dP1- a+b*P1-switch*P1 dP2- a-b*P1+switch*P2 list(c(dP1,dP2,dIN)) }) } # Parameters a- 0.1 b- 0.2 c- 0.5 parms- c(a=a,b=b,c=c) # Initial conditions P10- 100.0 P20- 0.0 IN0- 0.0 xstart- c(P1=P10,P2=P20,IN=IN0) # Time points times- seq(0,10,by=1) out- as.data.frame(rk4(xstart,times,ode,parms)) [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Error in eval(expr, envir, enclos)
I don't see a data frame name in the rlm call... Dennis On Mon, Sep 20, 2010 at 7:08 PM, uttara_n nilawar.utt...@gmail.com wrote: I am absolutely new to R and I am aware of only a few basic command lines. I was running a robust regression in R, using the following command line library (MASS) rfdmodel1 - rlm (TotalEmployment_2004 ~ MISSISSIPPI + LOUISIANA + TotalEmployment_2000 + PCWhitePop_2004 + UnemploymentRate_2004 + PCUrbanPop2000 + PCPeopleWithACollegeDegree_2000 + PCPopulation.of.or.over.65.years.of.age_2004) summary (rfdmodel1) But, I keep getting this error: Error in eval(expr, envir, enclos) : object 'TotalEmployment_2004' not found summary (rfdmodel1) Error in summary(rfdmodel1) : object 'rfdmodel1' not found I have tried using a different machine and that did not work either. Any suggestions? Uttara -- View this message in context: http://r.789695.n4.nabble.com/Error-in-eval-expr-envir-enclos-tp2547917p2547917.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] creating matrices from lists
On 09/21/2010 05:02 AM, Gregory Ryslik wrote: Hi, I think I've found away around that issue. The following works. If this method is inefficient and one has something faster, I'll appreciate it though! lapply(mylist, function(x) as.numeric(as.character(x))) You could avoid making them factors in the first place Otherwise, the common trick is as.numeric(levels(x))[x] or, if you are sure that the levels are always 0/1, c(0,1)[x]. Of course if that is true you could just accept the 1/2 solution and subtract 1 at the end. (But beware that you might have one element as a factor with a single level 1, and that would get treated exactly the same as a factor with only 0...) -- Peter Dalgaard Center for Statistics, Copenhagen Business School Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] for loop
Hi guys, Im new to R and am having a bit of trouble with what should be a simple loop. It sprobably something very fundamental that im doing wrong. for(i in c(1:520)) { tmp1- ((1-samp.pct[i])^2)*(log(1-theor.pct[i])-log(1-theor.pct[i+1])) tmp2- ((samp.pct[i])^2)*(log(theor.pct[i+1])-log(theor.pct[i])) } ADtest.stat- (-n*(max(theor.pct))) + (n*sum(tmp1)) + (n*sum(tmp2)) names(ADtest.stat) - c(Anderson-Darling test statistic) print(ADtest.stat) samp.pct is a vector containing 520 values between 0 and 1, so is theor.pct. n is 520. After running this code, tmp1 and tmp2 return only one value, where I need around 520 values. Any ideas would be appreciated. Regards, Andre [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Need help for EM algorithm ASAP !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
I created a EM algorithm for Generalized hyperbolic distribution. I want to estimate mutheldaplus, sigmatheldaplus, betasigmaplus in my code. After getting use these value , then my iteration have to be begin of this code. But I can not to do iteration part. Can you help me use my code and get iteration ? Do know any useful code for EM algorithm for Generalized Hyperbolic library(QRMlib) library(ghyp) simulation part simulation-function(n,lambda,mu,thelda,gamma,sigma,beta){ set.seed(235) chi-thelda^2 psi-gamma^2 W - rGIG(n, lambda, chi, psi); Z - rnorm(n,0,1); y-mu + beta * W + sqrt(W) * Z *gamma; for (i in 1:n){ theldastar-rep(0,n) zi-rep(0,n) ti-rep(0,n) muthelda-mu gammathelda-thelda*gamma sigmathelda-(thelda^2)*sigma betathelda-(thelda^2)*sigma*beta lambdastar-lambda-0.5 theldastar[i]-sqrt(1+((y[i]-muthelda)/sigmathelda)^2) gammastar-sqrt((gammathelda^2)+((betathelda/sigmathelda)^2)) klambda1-besselM3(lambdastar+1, x=2, logvalue=FALSE) klambda-besselM3(lambdastar,x=2,logvalue=FALSE) klambda2-besselM3(lambdastar-1,x=2,logvalue=FALSE) zi[i]-((theldastar[i]*klambda1*(theldastar[i]*gammastar))/(gammastar*klambda*theldastar[i]*gammastar)) ti[i]-((gammastar*klambda2*(theldastar[i]*gammastar))/(theldastar[i]*klambda*theldastar[i]*gammastar)) zimean-sum(zi)/n timean-sum(ti)/n mutheldaplus-(zimean*(1/n)* sum((ti[i]*y[i])-mean(y)))/((zimean*timean)-1) betatheldaplus- sum(y[i]- mutheldaplus)/(n*zimean) sigmatheldaplus-((1/n)*sum((ti[i]*((y[i]-mutheldaplus)^2))-(2*betatheldaplus*(y[i]-mutheldaplus))-((betatheldaplus^2)*zi[i]))) print(muthelda) print(mutheldaplus) print(betathelda) print(betatheldaplus) print(sigmathelda) print(sigmatheldaplus) return(ti) } } a-simulation(2,-0.5,0,1,1,1,0) [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Error in eval(expr, envir, enclos)
Oh, I am sorry I did not include the data.frame command line since it loaded the data correctly without any errors. The following is the command line I used to load the data: data - read.table (C:/MUP/Internship/Distance_Statistics/Excel_data/ForModelling_Rev_2.csv , sep= ,, header = TRUE) attach (data) summary (data) -- View this message in context: http://r.789695.n4.nabble.com/Error-in-eval-expr-envir-enclos-tp2547917p2548072.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Package for calculating bandwidths
Hi! Is there an R-package with which I can calculate bandwidths via cross validation in a data set?? Greetz, Valentin -- View this message in context: http://r.789695.n4.nabble.com/Package-for-calculating-bandwidths-tp2548091p2548091.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] for loop
What are you doing at iteration 520 with the objects corresponding to the i + 1 index? And there's a reason why you get one value for tmp1 and tmp2 - you replace their values every iteration. Maybe you want tmp1[i] and tmp2[i]? If this is not a recursive calculation, you could easily replace the loop with vectorized calculations, leaving you with cleaner code and, quite possibly, faster execution speed. HTH, Dennis On Mon, Sep 20, 2010 at 11:41 PM, andre bedon andresa...@hotmail.comwrote: Hi guys, Im new to R and am having a bit of trouble with what should be a simple loop. It sprobably something very fundamental that im doing wrong. for(i in c(1:520)) { tmp1- ((1-samp.pct[i])^2)*(log(1-theor.pct[i])-log(1-theor.pct[i+1])) tmp2- ((samp.pct[i])^2)*(log(theor.pct[i+1])-log(theor.pct[i])) } ADtest.stat- (-n*(max(theor.pct))) + (n*sum(tmp1)) + (n*sum(tmp2)) names(ADtest.stat) - c(Anderson-Darling test statistic) print(ADtest.stat) samp.pct is a vector containing 520 values between 0 and 1, so is theor.pct. n is 520. After running this code, tmp1 and tmp2 return only one value, where I need around 520 values. Any ideas would be appreciated. Regards, Andre [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Fwd: for loop
First up, you've got a problem if your vectors are 520 elements in length because you accessing element i+1 in your loop and when i is 520 you won't have a valid value. Next, you don't really need a for loop at all :) You can do those operations on the whole vectors... e.g. tmp1 - (1 - samp.pct[-n)^2 * diff(log(1 - theor.pct)) (just replaced your explicit subtraction of lagged values with diff) Does that help ? Michael Note, in the line above, the last element is omitted with [-n]. See help page on the diff function if you are not familiar with it. On 21 September 2010 16:41, andre bedon andresa...@hotmail.com wrote: Hi guys, Im new to R and am having a bit of trouble with what should be a simple loop. It sprobably something very fundamental that im doing wrong. for(i in c(1:520)) { tmp1- ((1-samp.pct[i])^2)*(log(1-theor.pct[i])-log(1-theor.pct[i+1])) tmp2- ((samp.pct[i])^2)*(log(theor.pct[i+1])-log(theor.pct[i])) } ADtest.stat- (-n*(max(theor.pct))) + (n*sum(tmp1)) + (n*sum(tmp2)) names(ADtest.stat) - c(Anderson-Darling test statistic) print(ADtest.stat) samp.pct is a vector containing 520 values between 0 and 1, so is theor.pct. n is 520. After running this code, tmp1 and tmp2 return only one value, where I need around 520 values. Any ideas would be appreciated. Regards, Andre [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] bivariate vector numerical integration with infinite range
Dear list, I'm seeking some advice regarding a particular numerical integration I wish to perform. The integrand f takes two real arguments x and y and returns a vector of constant length N. The range of integration is [0, infty) for x and [a,b] (finite) for y. Since the integrand has values in R^N I did not find a built-in function to perform numerical quadrature, so I wrote my own after some inspiration from a post in R-help, library(statmod) ## performs 2D numerical integration ## using Gauss-Legendre quadrature ## with N points for x and y vAverage - function(f, a1,b1, a2,b2, N=5, ...){ GL - gauss.quad(N) nodes - GL$nodes weights - GL$weights C2 - (b2 - a2) / 2 D2 - (b2 + a2) / 2 y - nodes*C2 + D2 C1 - (b1 - a1) / 2 D1 - (b1 + a1) / 2 x - nodes*C1 + D1 value - 0 for (ii in seq_along(x)){ tmp - 0 for (jj in seq_along(y)){ tmp - tmp + C1 * weights[jj] * f(x[jj], y[ii], ...) } value - value + C2 * weights[ii] * tmp } value } ## test function, the result is pi for y=1 f - function(x, y) { res - 1 / (sqrt(x)*(1+x)) c(res, res/2, 2*res) } ## Transformation rule from Numerical Recipes ## to deal with the [0, infty) range of x mixedrule - function(x, y, f, ...) { t - exp(pi*sinh(x)) dtdx - t*(pi*cosh(x)) f(t, y, ...)*dtdx } vAverage(mixedrule, -4, 4, 0.0, 1, 20, f) - c(pi, pi/2, 2*pi) ## -3.535056e-06 -1.767528e-06 -7.070112e-06 So it seems to work. I wonder though if I may have missed an easier (and more reliable) way to perform such integration using base functions or an add-on package that I may have overlooked. Best regards, baptiste __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] diagnosing download.file() problems
I'm accessing around 95 tar files on an FTP server ranging in size between 10 and 40MB a piece. while certainly can click on them and download them outside of R, I'd like to have my script do it. Retrieving the ftp directory with RCurl works fine (about 90% of the time) but downloading the files by looping through all the files is a random process. I may get 1-8 files download and then it throws an error cannot open URL sometimes I only can get 1 file before this error. with tryCatch() I've been able to do some clean up after the crash, but automating this whole download process has turned into a bit of a circus. The parameters (url, destfile, mode) are all correct in the download.file call as the second attempt at a url will often succeed. Is there anyway to get a deeper look at the cause of the problem? I've tried closing all connections in between each download. any pointers would be welcomed. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Combined plot: Scatter + density plot
Hi Ralf, I am sure there is a function to do exactly that but I am not sure what is the name and the package (Try plotrix or gregmisc...). To help you with your problem we would need a reproducible example with your data, not the example which obviously works. As a guess, when you run info - hist(x) it should save you all the information regarding bin sizes etc into info and you can start from there. That should also work with probabilities. Best Jannis --- Ralf B ralf.bie...@gmail.com schrieb am Di, 21.9.2010: Von: Ralf B ralf.bie...@gmail.com Betreff: [R] Combined plot: Scatter + density plot An: r-help Mailing List r-help@r-project.org Datum: Dienstag, 21. September, 2010 05:34 Uhr Hi, in order to save space for a publication, it would be nice to have a combined scatter and density plot similar to what is shows on http://addictedtor.free.fr/graphiques/RGraphGallery.php?graph=78 I wonder if anybody perhaps has already developed code for this and is willing to share. This is the reproducible code for the histogram version obtained from the site: def.par - par(no.readonly = TRUE) # save default, for resetting... x - pmin(3, pmax(-3, rnorm(50))) y - pmin(3, pmax(-3, rnorm(50))) xhist - hist(x, breaks=seq(-3,3,0.5), plot=FALSE) yhist - hist(y, breaks=seq(-3,3,0.5), plot=FALSE) top - max(c(xhist$counts, yhist$counts)) xrange - c(-3,3) yrange - c(-3,3) nf - layout(matrix(c(2,0,1,3),2,2,byrow=TRUE), c(3,1), c(1,3), TRUE) par(mar=c(3,3,1,1)) plot(x, y, xlim=xrange, ylim=yrange, xlab=, ylab=) par(mar=c(0,3,1,1)) barplot(xhist$counts, axes=FALSE, ylim=c(0, top), space=0) par(mar=c(3,0,1,1)) barplot(yhist$counts, axes=FALSE, xlim=c(0, top), space=0, horiz=TRUE) par(def.par) I am basically stuck from line 6 where the bin information from the histogram is used for determining plotting sizes. Density are different and don't have (equal) bins and their size would need to be determined differently. I wonder if somebody here has created such a diagram already and is willing to share ideas/code/pointers to similar examples. Your effort is highly appreciated. Thanks a lot, Ralf __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] NA problem
Dear R list I have a problem with NA, which should be a string, but R seems that it doesn't recognize it. What I do is first give the format command to my data frame: format.data.frame(mydata,big.mark= ) so I give a blank as thousand separator. All my records in my data frame become strings, so instead of having NA I have NA. I try to convert NA in .,but it seems that R doesn't recognize NA. Someone knows why and how to treats those NA?? Thanks for your attention __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] OT: Is randomization for targeted cancer therapies ethical?
On 09/21/2010 05:02 AM, David Winsemius wrote: ... Not all university libraries have access via that link and efforts at identifying the citation in Pubmed failed, so could I request a more complete citation, please? Oh never mind I got it with Google. If anyone else needs the ISSN of JASA in which both the cited article and Armitage's reply appeared for the purposes of JSTOR access, it's 01621459. For those of us not fortunate enough to have access to JStor directly (and how I miss it), try this link: http://ml.stat.purdue.edu/bayes/writings/anscombe.sequentialmedical.1963.pdf Jim __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] diagnosing download.file() problems
On Tue, Sep 21, 2010 at 9:39 AM, steven mosher mosherste...@gmail.com wrote: I'm accessing around 95 tar files on an FTP server ranging in size between 10 and 40MB a piece. while certainly can click on them and download them outside of R, I'd like to have my script do it. Retrieving the ftp directory with RCurl works fine (about 90% of the time) but downloading the files by looping through all the files is a random process. I may get 1-8 files download and then it throws an error cannot open URL sometimes I only can get 1 file before this error. with tryCatch() I've been able to do some clean up after the crash, but automating this whole download process has turned into a bit of a circus. The parameters (url, destfile, mode) are all correct in the download.file call as the second attempt at a url will often succeed. Is there anyway to get a deeper look at the cause of the problem? I've tried closing all connections in between each download. any pointers would be welcomed. Sounds to me like the FTP server is operating some kind of rate limiting. Do you have access to the server log files, or the server administrator, or perhaps the server's terms and conditions to see if its so :) Barry __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] NA problem
Short answer: don't do that. The format function is for preparing data for output. Do your data manipulations on a data frame you keep for such use, and only use format to prepare for output. n.via...@libero.it n.via...@libero.it wrote: Dear R list I have a problem with NA, which should be a string, but R seems that it doesn't recognize it. What I do is first give the format command to my data frame: format.data.frame(mydata,big.mark= ) so I give a blank as thousand separator. All my records in my data frame become strings, so instead of having NA I have NA. I try to convert NA in .,but it seems that R doesn't recognize NA. Someone knows why and how to treats those NA?? Thanks for your attention __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. --- Jeff NewmillerThe . . Go Live... DCN:jdnew...@dcn.davis.ca.usBasics: ##.#. ##.#. Live Go... Live: OO#.. Dead: OO#.. Playing Research Engineer (Solar/BatteriesO.O#. #.O#. with /Software/Embedded Controllers) .OO#. .OO#. rocks...1k --- Sent from my phone. Please excuse my brevity. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Sorting and subsetting
All the solutions in this thread so far use the lapply(split(...)) paradigm either directly or indirectly. That paradigm doesn't scale. That's the likely source of quite a few 'out of memory' errors and performance issues in R. data.table doesn't do that internally, and it's syntax is pretty easy. tmp - data.table(index = gl(2,20), foo = rnorm(40)) tmp[, .SD[head(order(-foo),5)], by=index] index index.1 foo [1,] 1 1 1.9677303 [2,] 1 1 1.2731872 [3,] 1 1 1.1100931 [4,] 1 1 0.8194719 [5,] 1 1 0.6674880 [6,] 2 2 1.2236383 [7,] 2 2 0.9606766 [8,] 2 2 0.8654497 [9,] 2 2 0.5404112 [10,] 2 2 0.3373457 As you can see it currently repeats the group column which is a shame (on the to do list to fix). Matthew http://datatable.r-forge.r-project.org/ -- View this message in context: http://r.789695.n4.nabble.com/Sorting-and-subsetting-tp2547360p2548319.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Combined plot: Scatter + density plot
Sorry Ralf, I was in a hurry when I replied to your plot. The function you are looking for is: dx=density(x) plot(dx) That works with the code you send in your mail. Just adjust the plot limits and change x and y for the density plot of y and it works. A good reference can be found here: http://onertipaday.blogspot.com/2007/05/rotating-distribution-plot-by-90.html Hope that helped more than my last post! Jannis --- Ralf B ralf.bie...@gmail.com schrieb am Di, 21.9.2010: Von: Ralf B ralf.bie...@gmail.com Betreff: [R] Combined plot: Scatter + density plot An: r-help Mailing List r-help@r-project.org Datum: Dienstag, 21. September, 2010 05:34 Uhr Hi, in order to save space for a publication, it would be nice to have a combined scatter and density plot similar to what is shows on http://addictedtor.free.fr/graphiques/RGraphGallery.php?graph=78 I wonder if anybody perhaps has already developed code for this and is willing to share. This is the reproducible code for the histogram version obtained from the site: def.par - par(no.readonly = TRUE) # save default, for resetting... x - pmin(3, pmax(-3, rnorm(50))) y - pmin(3, pmax(-3, rnorm(50))) xhist - hist(x, breaks=seq(-3,3,0.5), plot=FALSE) yhist - hist(y, breaks=seq(-3,3,0.5), plot=FALSE) top - max(c(xhist$counts, yhist$counts)) xrange - c(-3,3) yrange - c(-3,3) nf - layout(matrix(c(2,0,1,3),2,2,byrow=TRUE), c(3,1), c(1,3), TRUE) par(mar=c(3,3,1,1)) plot(x, y, xlim=xrange, ylim=yrange, xlab=, ylab=) par(mar=c(0,3,1,1)) barplot(xhist$counts, axes=FALSE, ylim=c(0, top), space=0) par(mar=c(3,0,1,1)) barplot(yhist$counts, axes=FALSE, xlim=c(0, top), space=0, horiz=TRUE) par(def.par) I am basically stuck from line 6 where the bin information from the histogram is used for determining plotting sizes. Density are different and don't have (equal) bins and their size would need to be determined differently. I wonder if somebody here has created such a diagram already and is willing to share ideas/code/pointers to similar examples. Your effort is highly appreciated. Thanks a lot, Ralf __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] predict.lrm ( Design package) poor performance?
Thanks Frank, I have one small question regarding this, understand you are very busy and if you cant answer i would greatly appreciate any thoughts from the list. Split-sample validation is not reliable unless you have say 10,000 samples to begin with I am a little confused. When i ran the model with a binary response i had a Brier score of 0.199 and a C-value of 0.73. When i looked at the data, 75% of the predictions were correct. However when i ran the model with a ordinal response, the Brier score was 0.201 and the c-value 0.677 which to me, albeit with limited knowledge in the area, suggests the model performance worse but not by a great deal. However when i compare the predicted mean values with the real data i only get 45% accuracy, it is this discrepancy i don't understand. I appreciate you need a large dataset to partition it, but surely if the c-value and brier score are similar then the number predicted correct should also be similar? Thanks Chris On 20 Sep 2010, at 14:46, Frank Harrell wrote: A few comments; sorry I don't have time for any more. - Combining categories is almost always a bad idea - It can be harder to discriminate more categories but that's only because the task is more difficult - Split-sample validation is not reliable unless you have say 10,000 samples to begin with; otherwise 100 fold repeats of 10-fold cross-validation, or (faster) 200 or so bootstraps is better. Be sure to re-do all modeling decisions from scratch for each re-sample. If you did no statistical variable selection this may not be an issue. - You don't need .lrm after predict - Not sure why your variable names are all upper case; harder to read this way Good luck Frank Frank E Harrell Jr Professor and ChairmanSchool of Medicine Department of Biostatistics Vanderbilt University On Mon, 20 Sep 2010, Chris Mcowen wrote: Dear Professor Harell I am familier with binary models, however i am now trying to get predictions from a ordinal model and have a question. I have a data set made up of 12 categorical predictors, the response variable is classed as 1,2,3,4,5,6, this relates to threat level of the species ( on the IUCN rating). Previously i have combined levels 1 and 2 to form = non threatened and then combined 3-6 to form threatened, and run a binary model. I have tested the performance of this based on the brier score (0.198) and the AUC or C value (0.75). I also partitioned the data set into training and test data and used the predict function to get a predicted probability for the newdata. When visualising the results with a cutoff value calculated with epi, roughly 75% of the time the prediction was correct. Now i am interested in predicting the threat level of a species not purely as threatened or not but to specific IUCN levels. I have used the predict.lrm function (predict.lrm(model1, type=fitted.ind)) to generate probabilities for each level, and also (predict(model1, traist, type=fitted)) see below. When i call the model the Brier score is 0.201 and C value 0.677. However when i inspect the output and relate it to the corresponding species ( for which i know the true IUCN rating) the model performs very badly, only getting 43% correct. Interestingly i have noticed the probabilities are always highest for levels 1 and 6, on no occasion do levels 2,3,4 or 5 have high probabilities? I am unsure if this is just because the model can not discriminate between the various levels due to insufficient data? Or if i am doing something wrong, if this is the case i would greatly appreciate any advice or suggestion of a different method. Thanks in advance, Chris model1 - lrm(EXTINCTION~BS*FR+WO+LIF+REG+ALT+BIO+HAB+PD+SEA, x=TRUE,y=TRUE) predict.lrm(model1, type=fitted.ind) EXTINCTION=1 EXTINCTION=2 EXTINCTION=3 EXTINCTION=4 EXTINCTION=5 1 0.19748393 0.05895033 0.12811186 0.086140778 0.068137104 2 0.27790178 0.07247496 0.14384976 0.087487677 0.064584865 3 0.24605628 0.06777939 0.13931242 0.087996215 0.066625303 4 0.24605628 0.06777939 0.13931242 0.087996215 0.066625303 5 0.24605628 0.06777939 0.13931242 0.087996215 0.066625303 6 0.24605628 0.06777939 0.13931242 0.087996215 0.066625303 7 0.13928899 0.04558050 0.10636220 0.00389 0.065500459 8 0.24605628 0.06777939 0.13931242 0.087996215 0.066625303 9 0.24605628 0.06777939 0.13931242 0.087996215 0.066625303 100.33865077 0.07915126 0.14744522 0.083923247 0.059387585 EXTINCTION=6 1 0.46117600 2 0.35370096 3 0.39223038 4 0.39223038 5 0.39223038 6 0.39223038 7 0.56549746 8 0.39223038 9 0.39223038 predict(model1, traist, type=fitted) y=2 y=3 y=4 y=5 y=6 1 0.80251607 0.74356575 0.61545388 0.52931311 0.46117600 2 0.72209822 0.64962327 0.50577351
Re: [R] Interpolate? a line
I would like to thank you very much for making this clear. It seems that the solution you suggested is right one as the second attempt does find all the cells that are touched. Now I ll try to find out how much the line gets into one of this cell as every cell affects acts like a weight. The more you are in a cell the higher the weight will be. Best Regards Alex From: David Winsemius dwinsem...@comcast.net Cc: Rhelp list r-help@r-project.org Sent: Fri, September 17, 2010 6:36:51 PM Subject: Re: [R] Interpolate? a line On Sep 17, 2010, at 7:22 AM, Alaios wrote: I would like to thank you again for your help. But it seems that the floor function (ceiling, round too) create more dots in the matrix that line really touches. You said cells not dots. Are you trying to change the problem now? My concern is rather that it can still miss cells. unique( floor( cbind( seq(2,62, by=0.1), linefn(seq(2,62, by=0.1)) ) ) ) You can see that in the picture below http://yfrog.com/5blineswj So, how to select the only the cells that the line touches? If you had taken my suggestion of overlaying a grid rather than plotting dots that fail to represent a cell (which I was taking to be a square of dimension 1 x 1) you would see that my solution was correct (at least to the point of not missing any cells so defined that were touched up to a tolerance of 0.01 cell units. If you want to define cells differently, then it's your turn to step up and get mathematically precise. Calculus still works if you define neighborhoods as hyperspheres s rather than epsilon by delta hyper-rectangles. # Here is the the illustrated sequence of getting to what I am calling my final answer, # even though it could still miss an occasional cell. interp - approx(c(2, 62), c(3, 34), method=linear, xout=2:62) m - matrix(c(interp$x, round(interp$y)), ncol=2) tie - m[,2] == c(-Inf, m[-nrow(m),2]) m - m[ !tie, ] plot(m) # plots points lines(c(2,62), c(3, 34)) # overlay line for comparison #you can add a grid with abline(v=2:62, h=3:34) ## First attempt at integer values of x linefn - function(x) 3+((34-3)/(62-2)) *(x-2) findInterval(linefn(2:62), 3:34) # Second attempt at 0.1 intervals # cellidxs - unique( floor( cbind( seq(2,62, by=0.1), # There will be many duplicates after rounding down linefn(seq(2,62, by=0.1)) ) ) ) # the same function that just gets a y value rect(cellidxs[,1], cellidxs[,2], cellidxs[,1]+1, cellidxs[,2]+1, col=red) #redraw line : lines(2:62, 3+(34-3)/(62-2)*(0:60)) # That is the first plot with coarse tolerances #Third attempt: # Now calculate a set of cell ids with tolerances that at ten-fold more numerous cellid2 -unique( floor(cbind(seq(2,62, by=0.01), linefn(seq(2,62, by=0.01) )) ) ) NROW(cellid2) # 91 cells rect(cellid2[,1], cellid2[,2], cellid2[,1]+1, cellid2[,2]+1, col=blue) rect(cellidxs[,1], cellidxs[,2], cellidxs[,1]+1, cellidxs[,2]+1, col=red) lines(2:62, 3+(34-3)/(62-2)*(0:60)) --Best David. I would like to thank you in advance for your help best Regards Alex From: David Winsemius dwinsem...@comcast.net Cc: Rhelp list r-help@r-project.org Sent: Wed, September 15, 2010 1:55:10 PM Subject: Re: [R] Interpolate? a line On Sep 15, 2010, at 7:24 AM, David Winsemius wrote: Replacing context: Hello everyone. I have created a 100*100 matrix in R. Let's now say that I have a line that starts from (2,3) point and ends to the (62,34) point. In other words this line starts at cell (2,3) and ends at cell (62,34). Is it possible to get by some R function all the matrix's cells that this line transverses? I would like to thank you for your feedback. Best Regards Alex On Sep 15, 2010, at 6:52 AM, Michael Bedward wrote: Hello Alex, Here is one way to do it. It works but it's not pretty :) If you want an alternative, consider that produces the Y cell indices (since the x cell indices are already 2:62): linefn - function(x) 3+((34-3)/(62-2)) *(x-2) findInterval(linefn(2:62), 3:34) [1] 1 1 2 2 3 3 4 4 5 5 6 6 7 7 8 8 9 9 10 10 11 11 12 12 13 13 14 [28] 14 15 15 16 17 17 18 18 19 19 20 20 21 21 22 22 23 23 24 24 25 25 26 26 27 27 28 [55] 28 29 29 30 30 31 32 # that seems off by two linefn(62) [1] 34 linefn(2) [1] 3 # but that checks out and I realized those were just indices for the 3:34 findInterval vector (3:34)[findInterval(linefn(2:62), 3:34)] [1] 3 3 4 4 5 5 6 6 7 7 8 8 9 9 10 10 11 11 12 12 13 13 14 14 15 15 16 [28] 16 17 17 18 19 19 20 20 21 21 22 22 23 23 24 24 25 25 26 26 27 27 28 28 29 29 30 [55] 30 31 31 32 32 33 34 ( no rounding and I think the logic is clearer.) But I also realized it didn't enumerate all the the cells were crossed either, only indicating which cell was associated with an integer value of x. Also would have even more serious problems
[R] Finding (Ordered Subvectors)
Dear All, Consider a simple example a-c(1,4,3,0,4,5,6,9,3,4) b-c(0,4,5) c-c(5,4,0) I would like to be able to tell whether a sequence is contained (the order of the elements does matter) in another one e.g. in the example above, b is a subsequence of a, whereas c is not. Since the order matters, I cannot treat the sequences above as sets (also, elements are repeated). Does anyone know a smart way of achieving that? Many thanks Lorenzo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Finding (Ordered Subvectors)
Convert to strings and use grep functions. using c for variable is a bad idea. a - paste(a, collapse=) b - paste(b, collapse=) d - paste(d, collapse=) grepl(b,a) grepl(d,a) Nikhil Kaza Asst. Professor, City and Regional Planning University of North Carolina nikhil.l...@gmail.com On Sep 21, 2010, at 6:31 AM, Lorenzo Isella wrote: a-c(1,4,3,0,4,5,6,9,3,4) b-c(0,4,5) c-c(5,4,0) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Can ucminf be installed in 64 bit R and one more question?
On 20/09/2010 8:36 PM, Hey Sky wrote: Hey, R Users my windows is 64 bit windows 7. I am trying to install the package ucminf into my 64 bit version R but cannot. the package I downloaded is from http://cran.r-project.org/web/packages/ucminf/index.html and I installed it with the install from local zip files, due to I did not connect my computer to internet. did anyone meet this problem and is there a version of ucminf for 64 bit R? Binary installs are specific to particular R versions. There are several versions of R with 64 bit Windows builds now; which one are you using? Duncan Murdoch the question is: why the ucminf (for 32 bit R) with option hessian=3 always give hessian matrix while most of other methods failed (includiing the option hessian=1 which using numDeriv)? thanks for any information Nan from Montreal __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] change y axis distance
Hi I got a barplot that has values between 0-150 and the y-axis shows the steps 0 50 100 and 150 but I would like to change it to 0 10 20 30... ...130 140 150 Dont really know the word in english so sry about it beeing abit confusing :) Thx for your help Joel -- View this message in context: http://r.789695.n4.nabble.com/change-y-axis-distance-tp2548363p2548363.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] change y axis distance
On Tue, 2010-09-21 at 03:55 -0700, Joel wrote: Hi I got a barplot that has values between 0-150 and the y-axis shows the steps 0 50 100 and 150 but I would like to change it to 0 10 20 30... ...130 140 150 Dont really know the word in english so sry about it beeing abit confusing :) Thx for your help Joel If you want to draw the ticks at specific locations rather than the ones R chooses for you, turn off the axes in your plot call and then cook your own axis using the axis() function. Here's an example using dummy data: set.seed(123) mdat - rpois(10, 10) names(mdat) - LETTERS[1:10] barplot(mdat, axes = FALSE) ## label at 1, 2, 3 etc not 2, 4, 6, etc axis(side = 2, at = seq(1, max(mdat), by = 2)) For you specific axis call, try: axis(side = 2, at = seq(0, 150, by = 10)) HTH G -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Need help for EM algorithm ASAP !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
The urgency and the vague description of your problem strongly suggest that this is homework. This list is not for homework---see the posting guide at the bottom of every message. Nonetheless since I know this problem reasonably well I will offer some comments. QRMlib is a package created to accompany a book. If you read that book you would see that it fits the generalized hyperbolic to data using the EM algorithm. If you have QRMlib you have an implementation of the EM algorithm. Also why write code to simulate from the generalized hyperbolic (y in your simulation function below) when you have QRMlib and ghyp, both of which have functions for simulating from the generalized hyperbolic? Your code is pretty difficult to follow, with random indenting and zero comments. The structure of the iteration is totally confused as well. Not too many marks if you handed something like this in to me to grade. David Scott On 21/09/2010 5:32 p.m., snes1...@hotmail.com wrote: I created a EM algorithm for Generalized hyperbolic distribution. I want to estimate mutheldaplus, sigmatheldaplus, betasigmaplus in my code. After getting use these value , then my iteration have to be begin of this code. But I can not to do iteration part. Can you help me use my code and get iteration ? Do know any useful code for EM algorithm for Generalized Hyperbolic library(QRMlib) library(ghyp) simulation part simulation-function(n,lambda,mu,thelda,gamma,sigma,beta){ set.seed(235) chi-thelda^2 psi-gamma^2 W- rGIG(n, lambda, chi, psi); Z- rnorm(n,0,1); y-mu + beta * W + sqrt(W) * Z *gamma; for (i in 1:n){ theldastar-rep(0,n) zi-rep(0,n) ti-rep(0,n) muthelda-mu gammathelda-thelda*gamma sigmathelda-(thelda^2)*sigma betathelda-(thelda^2)*sigma*beta lambdastar-lambda-0.5 theldastar[i]-sqrt(1+((y[i]-muthelda)/sigmathelda)^2) gammastar-sqrt((gammathelda^2)+((betathelda/sigmathelda)^2)) klambda1-besselM3(lambdastar+1, x=2, logvalue=FALSE) klambda-besselM3(lambdastar,x=2,logvalue=FALSE) klambda2-besselM3(lambdastar-1,x=2,logvalue=FALSE) zi[i]-((theldastar[i]*klambda1*(theldastar[i]*gammastar))/(gammastar*klambda*theldastar[i]*gammastar)) ti[i]-((gammastar*klambda2*(theldastar[i]*gammastar))/(theldastar[i]*klambda*theldastar[i]*gammastar)) zimean-sum(zi)/n timean-sum(ti)/n mutheldaplus-(zimean*(1/n)* sum((ti[i]*y[i])-mean(y)))/((zimean*timean)-1) betatheldaplus- sum(y[i]- mutheldaplus)/(n*zimean) sigmatheldaplus-((1/n)*sum((ti[i]*((y[i]-mutheldaplus)^2))-(2*betatheldaplus*(y[i]-mutheldaplus))-((betatheldaplus^2)*zi[i]))) print(muthelda) print(mutheldaplus) print(betathelda) print(betatheldaplus) print(sigmathelda) print(sigmatheldaplus) return(ti) } } a-simulation(2,-0.5,0,1,1,1,0) [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- _ David Scott Department of Statistics The University of Auckland, PB 92019 Auckland 1142,NEW ZEALAND Phone: +64 9 923 5055, or +64 9 373 7599 ext 85055 Email: d.sc...@auckland.ac.nz, Fax: +64 9 373 7018 Director of Consulting, Department of Statistics __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] change y axis distance
On 09/21/2010 08:55 PM, Joel wrote: Hi I got a barplot that has values between 0-150 and the y-axis shows the steps 0 50 100 and 150 but I would like to change it to 0 10 20 30... ...130 140 150 Hi Joel, This is a combination of the pretty calculation of axis tick intervals and the silent omission of labels that appear to be crowded. You can probably get the fifteen labels by adding: # set the axis labels to all horizontal text par(las=1) # add the yaxt argument to plot plot(...,yaxt=n,...) # display the axis axis(2,at=seq(0,150,by=10),labels=seq(0,150,by=10)) If some of the labels are missing, you might have to turn to staxlab in the plotrix package or a similar function. Jim __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Creating table from data frame
Hey, I have a dataset where two columns are factors and another column consists of values. Each combination of factors can only have a single value assigned to it. I'd like to represent this as a matrix or table where the rows are the first column factors and the columns the second column factors. So that each cell a_ij in the matrix represents the associated value for the factor combination ij. When no such value exists for the combination the value should be 0. I've tried playing around with tables to get this to work, but I can't seem to get it right. I've also had little luck when trying to find a solution to this. Any help would be much appreciated! Mike [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] removed data is still there!
I'm confused, hope someone can point out what is not obvious to me. I thought I was creating a new data frame by 'deleting' rows from an existing dataframe - I've tried 2 methods. But this new data frame seems to remember values from its parent - even though there are no occurences. Where does it get the values versicolor and virginica from and give then a count of 0? What am I missing? Thanks in advance. summary(iris$Species) setosa versicolor virginica 50 50 50 nrow(iris) [1] 150 iris1 - iris[iris$Species == 'setosa',] nrow(iris1) [1] 50 summary(iris1$Species) setosa versicolor virginica 50 0 0 boxplot(Petal.Width ~ Species, data = iris1, plot=1) iris2 - subset(iris, Species == 'setosa') nrow(iris2) [1] 50 summary(iris2$Species) setosa versicolor virginica 50 0 0 boxplot(Petal.Width ~ Species, data = iris2, plot=1) -- View this message in context: http://r.789695.n4.nabble.com/removed-data-is-still-there-tp2548440p2548440.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] removed data is still there!
example(factor) iris1$Species - factor(iris1$Species, drop=T) will get you what you need. Nikhil Kaza Asst. Professor, City and Regional Planning University of North Carolina nikhil.l...@gmail.com On Sep 21, 2010, at 7:41 AM, pdb wrote: I'm confused, hope someone can point out what is not obvious to me. I thought I was creating a new data frame by 'deleting' rows from an existing dataframe - I've tried 2 methods. But this new data frame seems to remember values from its parent - even though there are no occurences. Where does it get the values versicolor and virginica from and give then a count of 0? What am I missing? Thanks in advance. summary(iris$Species) setosa versicolor virginica 50 50 50 nrow(iris) [1] 150 iris1 - iris[iris$Species == 'setosa',] nrow(iris1) [1] 50 summary(iris1$Species) setosa versicolor virginica 50 0 0 boxplot(Petal.Width ~ Species, data = iris1, plot=1) iris2 - subset(iris, Species == 'setosa') nrow(iris2) [1] 50 summary(iris2$Species) setosa versicolor virginica 50 0 0 boxplot(Petal.Width ~ Species, data = iris2, plot=1) -- View this message in context: http://r.789695.n4.nabble.com/removed-data-is-still-there-tp2548440p2548440.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] OT: Is randomization for targeted cancer therapies ethical?
From: jlu...@ria.buffalo.edu Clearly inferior treatments are unethical. The Big Question is: What constitute clearly? Who or How to decide what is clearly? I'm sure there are plenty of people who don't understand much Statistics and are perfectly willing to say the results on the two cousins show the conventional treatment is clearly inferior. Sure, on these two cousins we can say so, but what about others? Donald Berry at MD Anderson in Houston TX and Jay Kadane at Carnegie Mellon have been working on more ethical designs within the Bayesian framework. In particular, response adaptive designs reduce the assignment to and continuation of patients on inferior treatments. I've heard LJ Wei talked about this kinds of designs (don't remember if they are Bayesian) more than a dozen years ago. Don't know how common they are in use. Andy Bert Gunter gunter.ber...@gene.com Sent by: r-help-boun...@r-project.org 09/20/2010 01:31 PM To r-help@r-project.org cc Subject [R] OT: Is randomization for targeted cancer therapies ethical? Hi Folks: **Off Topic** Those interested in clinical trials may find the following of interest: http://www.nytimes.com/2010/09/19/health/research/19trial.html It concerns the ethicality of randomizing those with life-threatening disease to relatively ineffective SOC when new biologically targeted therapies appear to be more effective. While the context may be new, the debate, itself, is not: Tukey wrote (or maybe it was talked -- I can't remember for sure) about this about 30 years ago. I'm sure many other also have done so. Cheers, Bert -- Bert Gunter Genentech Nonclinical Biostatistics __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Notice: This e-mail message, together with any attachme...{{dropped:11}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] change y axis distance
Thx for all your help! -- View this message in context: http://r.789695.n4.nabble.com/change-y-axis-distance-tp2548363p2548461.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] removed data is still there!
Removing elements from a factor does not change the levels of the factor. ir. Thierry Onkelinx Instituut voor natuur- en bosonderzoek team Biometrie Kwaliteitszorg Gaverstraat 4 9500 Geraardsbergen Belgium Research Institute for Nature and Forest team Biometrics Quality Assurance Gaverstraat 4 9500 Geraardsbergen Belgium tel. + 32 54/436 185 thierry.onkel...@inbo.be www.inbo.be To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey -Oorspronkelijk bericht- Van: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] Namens pdb Verzonden: dinsdag 21 september 2010 13:42 Aan: r-help@r-project.org Onderwerp: [R] removed data is still there! I'm confused, hope someone can point out what is not obvious to me. I thought I was creating a new data frame by 'deleting' rows from an existing dataframe - I've tried 2 methods. But this new data frame seems to remember values from its parent - even though there are no occurences. Where does it get the values versicolor and virginica from and give then a count of 0? What am I missing? Thanks in advance. summary(iris$Species) setosa versicolor virginica 50 50 50 nrow(iris) [1] 150 iris1 - iris[iris$Species == 'setosa',] nrow(iris1) [1] 50 summary(iris1$Species) setosa versicolor virginica 50 0 0 boxplot(Petal.Width ~ Species, data = iris1, plot=1) iris2 - subset(iris, Species == 'setosa') nrow(iris2) [1] 50 summary(iris2$Species) setosa versicolor virginica 50 0 0 boxplot(Petal.Width ~ Species, data = iris2, plot=1) -- View this message in context: http://r.789695.n4.nabble.com/removed-data-is-still-there-tp25 48440p2548440.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Druk dit bericht a.u.b. niet onnodig af. Please do not print this message unnecessarily. Dit bericht en eventuele bijlagen geven enkel de visie van de schrijver weer en binden het INBO onder geen enkel beding, zolang dit bericht niet bevestigd is door een geldig ondertekend document. The views expressed in this message and any annex are purely those of the writer and may not be regarded as stating an official position of INBO, as long as the message is not confirmed by a duly signed document. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Indexing sublists inside lists.
I would like to thank you very much for your reply. Actually I would like to ask you if there is a small list called fred: fred - list(happy = 1:10, name = squash) and a big list called bigfred that included fred list 5 times bigfred - rep(fred,5) Is it possible somehow to index all these sublists(fred) inside bigfred with a more direct way like bigfred[1] shows the first sublist fred bigfred[2][2] shows the second sublist fred, the second element of the fred list So far I found some way to do this by refering to the sublists by the following: bigfred[1+index*length(fred)] where index shows the beginning of a sublist. I would like to thank you in advance for your help Best Regards Alex From: David Winsemius dwinsem...@comcast.net Cc: Dennis Murphy djmu...@gmail.com; Rhelp r-help@r-project.org Sent: Tue, September 14, 2010 3:55:39 PM Subject: Re: [R] Object oriented programming in R. On Sep 14, 2010, at 9:29 AM, Alaios wrote: I would like to thank you very much all that you helped me so far. So I tried to check how the following works fred - list(happy = 1:10, name = squash) rep(fred, 5) This returns the following : fred[1] $happy [1] 1 2 3 4 5 6 7 8 9 10 fred[2] $name [1] squash Not on my machine: fred - list(happy = 1:10, name = squash) rep(fred, 5) $happy [1] 1 2 3 4 5 6 7 8 9 10 $name [1] squash $happy [1] 1 2 3 4 5 6 7 8 9 10 $name [1] squash $happy [1] 1 2 3 4 5 6 7 8 9 10 $name [1] squash $happy [1] 1 2 3 4 5 6 7 8 9 10 $name [1] squash $happy [1] 1 2 3 4 5 6 7 8 9 10 $name [1] squash What I am trying to do is to address the number 5 of the fred[1] $happy value. I tried something like fred[1][5] fred[1,5] but it didn't work Almost: fred[[1]][5] [1] 5 I would like to thank you in advance for your help Best Regards Alex From: Dennis Murphy djmu...@gmail.com Cc: Rhelp r-help@r-project.org Sent: Tue, September 14, 2010 3:13:37 PM Subject: Re: [R] Object oriented programming in R. Hi: You could create a list of lists, where the outer list would be between agents and the inner list within agents. The inner list could have the 'matrices and one list' as separate components for each agent. Of course, you would have to be able to keep all of this straight :) HTH, Dennis Here are some more information: I would like to create some agents that span over a specific area map.Every agent needs to have its own data structures like one or two matrices and one list. I think that the best way to do this is to create objects and every instance of an object will be used for a single agent. The number of agents is not predetermined and it varies for any execution. So I read this value from the command line interface and then I would like to initiate so many objects as the agents. I think that the best way to do that is to create using a for loop a list containing as many objects as the agents are. I would like to thank you in advance for your help Best Regards Alex From: jim holtman jholt...@gmail.com Cc: Tal Galili tal.gal...@gmail.com; Rhelp r-help@r-project.org Sent: Tue, September 14, 2010 1:40:37 PM Subject: Re: [R] Object oriented programming in R. It depends on what you mean by objects. If you are just looking at creating many named variables that are going to hold values (e.g., reading in data from several files that you want to correlate separately), then consider the use of 'lists'. Can you provide a little more detail on exactly the problem that you are trying to solve, and then maybe we can propose a solution. Thank you very much. I checked the tutorials that on that list but still I do not know how to create many objects of the same type. Can you please help me with that? Best Regards Alex From: Tal Galili tal.gal...@gmail.com Cc: Rhelp r-help@r-project.org Sent: Tue, September 14, 2010 10:11:36 AM Subject: Re: [R] Object oriented programming in R. Hello Alaios, I see a bunch of good materials here: http://www.google.co.il/search?sourceid=chromeie=UTF-8q=Object+oriented+programming+in+R R R Did you look into them ? Contact Hello everyone. I would like to create many objects with R. Does R support objects? The number of objects needed is not predetermined and it is a parameter specified by the user. If the user selects to create many objects like 100, would it be possible to handle each one by some index? I would like to thank you in advance for your help. Best Regards Alex David Winsemius, MD West Hartford, CT [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do
Re: [R] bivariate vector numerical integration with infinite range
Baptiste; You should see if this meets your requirements: help(adaptIntegrate, package=cubature) (I got errors when I ran the code and NaN's when I looked at the output of test function, f.) vAverage(mixedrule, -4, 4, 0.0, 1, 20, f) - c(pi, pi/2, 2*pi) Error: object 'mixedrule' not found f(-4,0) [1] NaN NaN NaN Warning message: In sqrt(x) : NaNs produced f(-3.9,0) [1] NaN NaN NaN Warning message: In sqrt(x) : NaNs produced f(-3.9,0.1) [1] NaN NaN NaN Warning message: In sqrt(x) : NaNs produced -- David On Sep 21, 2010, at 4:11 AM, baptiste auguie wrote: Dear list, I'm seeking some advice regarding a particular numerical integration I wish to perform. The integrand f takes two real arguments x and y and returns a vector of constant length N. The range of integration is [0, infty) for x and [a,b] (finite) for y. Since the integrand has values in R^N I did not find a built-in function to perform numerical quadrature, so I wrote my own after some inspiration from a post in R-help, library(statmod) ## performs 2D numerical integration ## using Gauss-Legendre quadrature ## with N points for x and y vAverage - function(f, a1,b1, a2,b2, N=5, ...){ GL - gauss.quad(N) nodes - GL$nodes weights - GL$weights C2 - (b2 - a2) / 2 D2 - (b2 + a2) / 2 y - nodes*C2 + D2 C1 - (b1 - a1) / 2 D1 - (b1 + a1) / 2 x - nodes*C1 + D1 value - 0 for (ii in seq_along(x)){ tmp - 0 for (jj in seq_along(y)){ tmp - tmp + C1 * weights[jj] * f(x[jj], y[ii], ...) } value - value + C2 * weights[ii] * tmp } value } ## test function, the result is pi for y=1 f - function(x, y) { res - 1 / (sqrt(x)*(1+x)) c(res, res/2, 2*res) } ## Transformation rule from Numerical Recipes ## to deal with the [0, infty) range of x mixedrule - function(x, y, f, ...) { t - exp(pi*sinh(x)) dtdx - t*(pi*cosh(x)) f(t, y, ...)*dtdx } vAverage(mixedrule, -4, 4, 0.0, 1, 20, f) - c(pi, pi/2, 2*pi) ## -3.535056e-06 -1.767528e-06 -7.070112e-06 So it seems to work. I wonder though if I may have missed an easier (and more reliable) way to perform such integration using base functions or an add-on package that I may have overlooked. Best regards, baptiste __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] bivariate vector numerical integration with infinite range
baptiste auguie baptiste.auguie at googlemail.com writes: Dear list, I'm seeking some advice regarding a particular numerical integration I wish to perform. The integrand f takes two real arguments x and y and returns a vector of constant length N. The range of integration is [0, infty) for x and [a,b] (finite) for y. Since the integrand has values in R^N I did not find a built-in function to perform numerical quadrature, so I wrote my own after some inspiration from a post in R-help, The function adaptIntegral() in the 'cubature' package integrates multi-valued functions over n-dimensional finite hypercubes, as do the functions in 'R2Cuba'. If the hypercube is (partly) infinite, a transformation such as x -- 1/x per infinite axis (as in NR) has to be applied. For two dimensions, another approach could be to apply the integrate() function twice, because this 1-dimensional integration function can handle infinite intervals. Hint: First inegrate over the finite dimension(s). Example: Integrate sin(x)/exp(y) for 0 = x = pi, 0 = y = Inf (value: 2) f1 - function(y) 1/exp(y) f2 - function(x) sin(x) * integrate(f1, 0, Inf)$value integrate(f2, 0, pi) # 2 with absolute error 2.2e-14 Note that the absolute error is not correct, as the first integration has its own error term. You have to do your own error estimation. Hans Werner [...] So it seems to work. I wonder though if I may have missed an easier (and more reliable) way to perform such integration using base functions or an add-on package that I may have overlooked. Best regards, baptiste __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Finding (Ordered Subvectors)
This function might be helpful: bleh - function(a, b) { where - list() matches - 0 first - which(a == b[1]) for (i in first) { seq.to.match - seq(i, length = length(b)) if (identical(a[seq.to.match], b)) { matches - matches + 1 where[[matches]] - seq.to.match } } return(where) } a-c(3,4,3,0,4,5,6,9,3,4) b-c(0,4,5) c-c(5,4,0) d-c(3,4) bleh(a, b) bleh(a, c) bleh(a, d) Cheers, Gustavo. On Tue, Sep 21, 2010 at 11:31 AM, Lorenzo Isella lorenzo.ise...@gmail.com wrote: Dear All, Consider a simple example a-c(1,4,3,0,4,5,6,9,3,4) b-c(0,4,5) c-c(5,4,0) I would like to be able to tell whether a sequence is contained (the order of the elements does matter) in another one e.g. in the example above, b is a subsequence of a, whereas c is not. Since the order matters, I cannot treat the sequences above as sets (also, elements are repeated). Does anyone know a smart way of achieving that? Many thanks Lorenzo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] puzzle with integrate over infinite range
Dear list, I'm calculating the integral of a Gaussian function from 0 to infinity. I understand from ?integrate that it's usually better to specify Inf explicitly as a limit rather than an arbitrary large number, as in this case integrate() performs a trick to do the integration better. However, I do not understand the following, if I shift the Gauss function by some amount the integral should not be affected, shiftedGauss - function(x0=500){ integrate(function(x) exp(-(x-x0)^2/100^2), 0, Inf)$value } shift - seq(500, 800, by=10) plot(shift, sapply(shift, shiftedGauss)) Suddenly, just after 700, the value of the integral drops to nearly 0 when it should be constant all the way. Any clue as to what's going on here? I guess it's suddenly missing the important part of the range where the integrand is non-zero, but how could this be overcome? Regards, baptiste sessionInfo() R version 2.11.1 (2010-05-31) x86_64-apple-darwin9.8.0 locale: [1] en_GB.UTF-8/en_GB.UTF-8/C/C/en_GB.UTF-8/en_GB.UTF-8 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] inline_0.3.5RcppArmadillo_0.2.6 Rcpp_0.8.6 statmod_1.4.6 loaded via a namespace (and not attached): [1] tools_2.11.1 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] removed data is still there!
Thanks, but that was what I just discovered myself the hard way. What I really wanted to know was how to solve this issue. -- View this message in context: http://r.789695.n4.nabble.com/removed-data-is-still-there-tp2548440p2548527.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Combined plot: Scatter + density plot
On Sep 21, 2010, at 1:34 AM, Ralf B wrote: Hi, in order to save space for a publication, it would be nice to have a combined scatter and density plot similar to what is shows on http://addictedtor.free.fr/graphiques/RGraphGallery.php?graph=78 I wonder if anybody perhaps has already developed code for this and is willing to share. The code for that graphic (and all graphics at that site) is linked from the page on which it appears. All you need to do is click on the R icon that is just lateral to the words Download or view: http://addictedtor.free.fr/graphiques/sources/source_78.R -- David. This is the reproducible code for the histogram version obtained from the site: def.par - par(no.readonly = TRUE) # save default, for resetting... x - pmin(3, pmax(-3, rnorm(50))) y - pmin(3, pmax(-3, rnorm(50))) xhist - hist(x, breaks=seq(-3,3,0.5), plot=FALSE) yhist - hist(y, breaks=seq(-3,3,0.5), plot=FALSE) top - max(c(xhist$counts, yhist$counts)) xrange - c(-3,3) yrange - c(-3,3) nf - layout(matrix(c(2,0,1,3),2,2,byrow=TRUE), c(3,1), c(1,3), TRUE) par(mar=c(3,3,1,1)) plot(x, y, xlim=xrange, ylim=yrange, xlab=, ylab=) par(mar=c(0,3,1,1)) barplot(xhist$counts, axes=FALSE, ylim=c(0, top), space=0) par(mar=c(3,0,1,1)) barplot(yhist$counts, axes=FALSE, xlim=c(0, top), space=0, horiz=TRUE) par(def.par) I am basically stuck from line 6 where the bin information from the histogram is used for determining plotting sizes. Density are different and don't have (equal) bins and their size would need to be determined differently. I wonder if somebody here has created such a diagram already and is willing to share ideas/code/pointers to similar examples. Your effort is highly appreciated. Thanks a lot, Ralf __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] bivariate vector numerical integration with infinite range
Hi, thanks for the pointer to cubature (which i had probably dismissed too quickly). Your tests with f should not work: the domain of f(x,.) is restricted to positive reals, but this domain of integration is then transformed in mixedrule() to map the semi-infinite range to a more reasonable domain (for example [-4,4]). Thanks, baptiste On 21 September 2010 14:16, David Winsemius dwinsem...@comcast.net wrote: Baptiste; You should see if this meets your requirements: help(adaptIntegrate, package=cubature) (I got errors when I ran the code and NaN's when I looked at the output of test function, f.) vAverage(mixedrule, -4, 4, 0.0, 1, 20, f) - c(pi, pi/2, 2*pi) Error: object 'mixedrule' not found f(-4,0) [1] NaN NaN NaN Warning message: In sqrt(x) : NaNs produced f(-3.9,0) [1] NaN NaN NaN Warning message: In sqrt(x) : NaNs produced f(-3.9,0.1) [1] NaN NaN NaN Warning message: In sqrt(x) : NaNs produced -- David On Sep 21, 2010, at 4:11 AM, baptiste auguie wrote: Dear list, I'm seeking some advice regarding a particular numerical integration I wish to perform. The integrand f takes two real arguments x and y and returns a vector of constant length N. The range of integration is [0, infty) for x and [a,b] (finite) for y. Since the integrand has values in R^N I did not find a built-in function to perform numerical quadrature, so I wrote my own after some inspiration from a post in R-help, library(statmod) ## performs 2D numerical integration ## using Gauss-Legendre quadrature ## with N points for x and y vAverage - function(f, a1,b1, a2,b2, N=5, ...){ GL - gauss.quad(N) nodes - GL$nodes weights - GL$weights C2 - (b2 - a2) / 2 D2 - (b2 + a2) / 2 y - nodes*C2 + D2 C1 - (b1 - a1) / 2 D1 - (b1 + a1) / 2 x - nodes*C1 + D1 value - 0 for (ii in seq_along(x)){ tmp - 0 for (jj in seq_along(y)){ tmp - tmp + C1 * weights[jj] * f(x[jj], y[ii], ...) } value - value + C2 * weights[ii] * tmp } value } ## test function, the result is pi for y=1 f - function(x, y) { res - 1 / (sqrt(x)*(1+x)) c(res, res/2, 2*res) } ## Transformation rule from Numerical Recipes ## to deal with the [0, infty) range of x mixedrule - function(x, y, f, ...) { t - exp(pi*sinh(x)) dtdx - t*(pi*cosh(x)) f(t, y, ...)*dtdx } vAverage(mixedrule, -4, 4, 0.0, 1, 20, f) - c(pi, pi/2, 2*pi) ## -3.535056e-06 -1.767528e-06 -7.070112e-06 So it seems to work. I wonder though if I may have missed an easier (and more reliable) way to perform such integration using base functions or an add-on package that I may have overlooked. Best regards, baptiste __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] OT: Is randomization for targeted cancer therapies ethical?
On Sep 21, 2010, at 6:52 AM, Liaw, Andy wrote: From: jlu...@ria.buffalo.edu Clearly inferior treatments are unethical. The Big Question is: What constitute clearly? Who or How to decide what is clearly? I'm sure there are plenty of people who don't understand much Statistics and are perfectly willing to say the results on the two cousins show the conventional treatment is clearly inferior. Sure, on these two cousins we can say so, but what about others? Donald Berry at MD Anderson in Houston TX and Jay Kadane at Carnegie Mellon have been working on more ethical designs within the Bayesian framework. In particular, response adaptive designs reduce the assignment to and continuation of patients on inferior treatments. I've heard LJ Wei talked about this kinds of designs (don't remember if they are Bayesian) more than a dozen years ago. Don't know how common they are in use. Andy They are becoming more popular as the FDA itself becomes more comfortable with both Bayesian approaches and adaptive designs. The FDA released draft guidance earlier this year on adaptive designs for drugs and biologics: http://www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/UCM201790.pdf and final guidance for Bayesian approaches in device trials: http://www.fda.gov/downloads/MedicalDevices/DeviceRegulationandGuidance/GuidanceDocuments/ucm071121.pdf There are also several presentations at this week's 2010 FDA Workshop in DC: http://www.amstat.org/meetings/fdaworkshop/index.cfm?fuseaction=onlineprogram Beyond the ethical issues raised, I believe that industry sees these techniques as opportunities to streamline phased trial design, resulting in a compression of the overall timeline to make decisions on the viability of products in the development pipeline. HTH, Marc Schwartz Bert Gunter gunter.ber...@gene.com Sent by: r-help-boun...@r-project.org 09/20/2010 01:31 PM To r-help@r-project.org cc Subject [R] OT: Is randomization for targeted cancer therapies ethical? Hi Folks: **Off Topic** Those interested in clinical trials may find the following of interest: http://www.nytimes.com/2010/09/19/health/research/19trial.html It concerns the ethicality of randomizing those with life-threatening disease to relatively ineffective SOC when new biologically targeted therapies appear to be more effective. While the context may be new, the debate, itself, is not: Tukey wrote (or maybe it was talked -- I can't remember for sure) about this about 30 years ago. I'm sure many other also have done so. Cheers, Bert __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Finding (Ordered Subvectors)
On Sep 21, 2010, at 6:31 AM, Lorenzo Isella wrote: Dear All, Consider a simple example a-c(1,4,3,0,4,5,6,9,3,4) b-c(0,4,5) c-c(5,4,0) I would like to be able to tell whether a sequence is contained (the order of the elements does matter) in another one e.g. in the example above, b is a subsequence of a, whereas c is not. Since the order matters, I cannot treat the sequences above as sets (also, elements are repeated). Does anyone know a smart way of achieving that? grep(paste(c, collapse=#), paste(a, collapse=#)) integer(0) grep(paste(b, collapse=#), paste(a, collapse=#)) [1] 1 Looking at that output I am wondering if you might need to also put markers at the ends of the arguments. grep(paste(#,b,#, collapse=#), paste(#,a,#, collapse=#)) [1] 1 # To prevent a match like c(1,2,3) with c(101,2,303). There is also an istrings package in the BioConductor repository that provides more extensive string matching facilities. -- David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Asking Favor
Hi Chuan, I'm forwarding your question to the list because I haven't used the rimage package... It's best if you post questions to the list anyway because you are more likely to get a fast and useful answer. On 20 September 2010 23:03, chuan zun liang wrote: Dear Michael: I am so sorry,disturb again.I can plot jpeg image in R under package rimage But How I can it into pixel image?I want exact some part of image.I found this function in package spatstat...Is it possible I do it?Thank a lot Chuan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] predict.lrm ( Design package) poor performance?
Thanks Frank, I have one small question regarding this, understand you are very busy and if you cant answer i would greatly appreciate any thoughts from the list. Split-sample validation is not reliable unless you have say 10,000 samples to begin with I am a little confused. When i ran the model with a binary response i had a Brier score of 0.199 and a C-value of 0.73. When i looked at the data, 75% of the predictions were correct. However when i ran the model with a ordinal response, the Brier score was 0.201 and the c-value 0.677 which to me, albeit with limited knowledge in the area, suggests the model performance worse but not by a great deal. However when i compare the predicted mean values with the real data i only get 45% accuracy, it is this discrepancy i don't understand. I appreciate you need a large dataset to partition it, but surely if the c-value and brier score are similar then the number predicted correct should also be similar? Thanks Chris On 20 Sep 2010, at 14:46, Frank Harrell wrote: A few comments; sorry I don't have time for any more. - Combining categories is almost always a bad idea - It can be harder to discriminate more categories but that's only because the task is more difficult - Split-sample validation is not reliable unless you have say 10,000 samples to begin with; otherwise 100 fold repeats of 10-fold cross-validation, or (faster) 200 or so bootstraps is better. Be sure to re-do all modeling decisions from scratch for each re-sample. If you did no statistical variable selection this may not be an issue. - You don't need .lrm after predict - Not sure why your variable names are all upper case; harder to read this way Good luck Frank Frank E Harrell Jr Professor and ChairmanSchool of Medicine Department of Biostatistics Vanderbilt University On Mon, 20 Sep 2010, Chris Mcowen wrote: Dear Professor Harell I am familier with binary models, however i am now trying to get predictions from a ordinal model and have a question. I have a data set made up of 12 categorical predictors, the response variable is classed as 1,2,3,4,5,6, this relates to threat level of the species ( on the IUCN rating). Previously i have combined levels 1 and 2 to form = non threatened and then combined 3-6 to form threatened, and run a binary model. I have tested the performance of this based on the brier score (0.198) and the AUC or C value (0.75). I also partitioned the data set into training and test data and used the predict function to get a predicted probability for the newdata. When visualising the results with a cutoff value calculated with epi, roughly 75% of the time the prediction was correct. Now i am interested in predicting the threat level of a species not purely as threatened or not but to specific IUCN levels. I have used the predict.lrm function (predict.lrm(model1, type=fitted.ind)) to generate probabilities for each level, and also (predict(model1, traist, type=fitted)) see below. When i call the model the Brier score is 0.201 and C value 0.677. However when i inspect the output and relate it to the corresponding species ( for which i know the true IUCN rating) the model performs very badly, only getting 43% correct. Interestingly i have noticed the probabilities are always highest for levels 1 and 6, on no occasion do levels 2,3,4 or 5 have high probabilities? I am unsure if this is just because the model can not discriminate between the various levels due to insufficient data? Or if i am doing something wrong, if this is the case i would greatly appreciate any advice or suggestion of a different method. Thanks in advance, Chris model1 - lrm(EXTINCTION~BS*FR+WO+LIF+REG+ALT+BIO+HAB+PD+SEA, x=TRUE,y=TRUE) predict.lrm(model1, type=fitted.ind) EXTINCTION=1 EXTINCTION=2 EXTINCTION=3 EXTINCTION=4 EXTINCTION=5 1 0.19748393 0.05895033 0.12811186 0.086140778 0.068137104 2 0.27790178 0.07247496 0.14384976 0.087487677 0.064584865 3 0.24605628 0.06777939 0.13931242 0.087996215 0.066625303 4 0.24605628 0.06777939 0.13931242 0.087996215 0.066625303 5 0.24605628 0.06777939 0.13931242 0.087996215 0.066625303 6 0.24605628 0.06777939 0.13931242 0.087996215 0.066625303 7 0.13928899 0.04558050 0.10636220 0.00389 0.065500459 8 0.24605628 0.06777939 0.13931242 0.087996215 0.066625303 9 0.24605628 0.06777939 0.13931242 0.087996215 0.066625303 100.33865077 0.07915126 0.14744522 0.083923247 0.059387585 EXTINCTION=6 1 0.46117600 2 0.35370096 3 0.39223038 4 0.39223038 5 0.39223038 6 0.39223038 7 0.56549746 8 0.39223038 9 0.39223038 predict(model1, traist, type=fitted) y=2 y=3 y=4 y=5 y=6 1 0.80251607 0.74356575 0.61545388 0.52931311 0.46117600 2 0.72209822 0.64962327 0.50577351
[R] rcom and safearray type of data
Hello there, I started to use rcom package and there were no problems until I tried to call some external function (method) which returns safearray type of data where either me or rcom or something else fails. Specifically I connected to a database and called a method doing something like this require(rcom) V8-comCreateObject(V81.COMConnector) con-comInvoke(V8,Connect,File=K:/) result-comGetProperty(con,ExternalFunctions) result$Test() which returned me just NULL. I tried to call the same method from excel vba by doing Set V8 = CreateObject(V81.COMConnector) Set con = V8.Connect(File=K:\) result = con.ExternalFunctions.Test() ActiveSheet.Range(A1, Cells(UBound(result, 2) + 1, UBound(result, 1) + 1)) = WorksheetFunction.Transpose(result) which returned me what I expected to be returned - some kind of data.frame with symbols and numbers. Does anyone have any idea how one can retrieve such kind of data in R? I use the latest rcom 2.2-3.1 (tried both precompiled one and compiled from tar.gz). Many thanks in advance! Kind regards, Alex -- View this message in context: http://r.789695.n4.nabble.com/rcom-and-safearray-type-of-data-tp2548552p2548552.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Combined plot: Scatter + density plot
You can get exactly the plot you want (except with relative frequency polygons instead of the histograms on the marginals, but they are equivalent) with the package chplot (which is available from CRAN)! Use just one group (class, sample, category -- i.e., omit the conditioning variable as we call it in the package manual) and use the options to show the points and to switch off the convex hull and the display of centre and variability (if you really want to, although multiple groups and the summarised display are the main features of the augmented convex hull plots). Still, if you have many points, do consider the chplots proper, or their curved variant (i.e., bivariate density contour instead of the points cloud and confidence ellipse instead of the cross with descriptive statistics and smoothed marginal densities instead of the frequency poligons). Regards, Gaj --- Assist. Prof. Gaj Vidmar, PhD Univ. of Ljubljana, Fac. of Medicine, Inst. for Biostatistics and Medical Informatics Ralf B ralf.bie...@gmail.com wrote in message news:aanlktikjv_nupqt2e2btajkji3q9oabgsv+uokpqp...@mail.gmail.com... Hi, in order to save space for a publication, it would be nice to have a combined scatter and density plot similar to what is shows on http://addictedtor.free.fr/graphiques/RGraphGallery.php?graph=78 I wonder if anybody perhaps has already developed code for this and is willing to share. This is the reproducible code for the histogram version obtained from the site: def.par - par(no.readonly = TRUE) # save default, for resetting... x - pmin(3, pmax(-3, rnorm(50))) y - pmin(3, pmax(-3, rnorm(50))) xhist - hist(x, breaks=seq(-3,3,0.5), plot=FALSE) yhist - hist(y, breaks=seq(-3,3,0.5), plot=FALSE) top - max(c(xhist$counts, yhist$counts)) xrange - c(-3,3) yrange - c(-3,3) nf - layout(matrix(c(2,0,1,3),2,2,byrow=TRUE), c(3,1), c(1,3), TRUE) par(mar=c(3,3,1,1)) plot(x, y, xlim=xrange, ylim=yrange, xlab=, ylab=) par(mar=c(0,3,1,1)) barplot(xhist$counts, axes=FALSE, ylim=c(0, top), space=0) par(mar=c(3,0,1,1)) barplot(yhist$counts, axes=FALSE, xlim=c(0, top), space=0, horiz=TRUE) par(def.par) I am basically stuck from line 6 where the bin information from the histogram is used for determining plotting sizes. Density are different and don't have (equal) bins and their size would need to be determined differently. I wonder if somebody here has created such a diagram already and is willing to share ideas/code/pointers to similar examples. Your effort is highly appreciated. Thanks a lot, Ralf __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Creating table from data frame
Try this: d - data.frame(A = letters[1:10], B = sample(letters[11:20]), C = sample(10)) xtabs(C ~ A + B, d) On Tue, Sep 21, 2010 at 8:39 AM, ZeMajik zema...@gmail.com wrote: Hey, I have a dataset where two columns are factors and another column consists of values. Each combination of factors can only have a single value assigned to it. I'd like to represent this as a matrix or table where the rows are the first column factors and the columns the second column factors. So that each cell a_ij in the matrix represents the associated value for the factor combination ij. When no such value exists for the combination the value should be 0. I've tried playing around with tables to get this to work, but I can't seem to get it right. I've also had little luck when trying to find a solution to this. Any help would be much appreciated! Mike [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40 S 49° 16' 22 O [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] NA problem
On Sep 21, 2010, at 5:28 AM, Jeff Newmiller wrote: Short answer: don't do that. The format function is for preparing data for output. Do your data manipulations on a data frame you keep for such use, and only use format to prepare for output. That is excellent advice. But to answer the question for the situation where the problem is less global, or in which the person were determined to press on To substitute . for NA then the is.na function is needed: cvec -c(1, 2 400.3, NA) cvec[is.na(cvec)] - . Note(s): NA is not NA, and nothing ever, ever =='s NA is.na(NA) [1] TRUE is.na(NA) [1] FALSE NA == NA [1] NA NA == NA [1] NA -- David. n.via...@libero.it n.via...@libero.it wrote: Dear R list I have a problem with NA, which should be a string, but R seems that it doesn't recognize it. What I do is first give the format command to my data frame: format.data.frame(mydata,big.mark= ) so I give a blank as thousand separator. All my records in my data frame become strings, so instead of having NA I have NA. I try to convert NA in .,but it seems that R doesn't recognize NA. Someone knows why and how to treats those NA?? Thanks for your attention __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. --- Jeff NewmillerThe . . Go Live... DCN:jdnew...@dcn.davis.ca.usBasics: ##.#. ##.#. Live Go... Live: OO#.. Dead: OO#.. Playing Research Engineer (Solar/BatteriesO.O#. #.O#. with /Software/Embedded Controllers) .OO#. .OO#. rocks...1k --- Sent from my phone. Please excuse my brevity. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] bivariate vector numerical integration with infinite range
Thanks, adaptIntegrate() seems perfectly suited, I'll just need to figure a transformation rule for the infinite limit. The suggestion of x-1/x does not seem to work here because it also transforms 0 into -infinity. I think exp(pi* sinh(x)) could be a better choice, according to Numerical Recipes. Thanks, baptiste On 21 September 2010 14:26, Hans W Borchers hwborch...@googlemail.com wrote: baptiste auguie baptiste.auguie at googlemail.com writes: Dear list, I'm seeking some advice regarding a particular numerical integration I wish to perform. The integrand f takes two real arguments x and y and returns a vector of constant length N. The range of integration is [0, infty) for x and [a,b] (finite) for y. Since the integrand has values in R^N I did not find a built-in function to perform numerical quadrature, so I wrote my own after some inspiration from a post in R-help, The function adaptIntegral() in the 'cubature' package integrates multi-valued functions over n-dimensional finite hypercubes, as do the functions in 'R2Cuba'. If the hypercube is (partly) infinite, a transformation such as x -- 1/x per infinite axis (as in NR) has to be applied. For two dimensions, another approach could be to apply the integrate() function twice, because this 1-dimensional integration function can handle infinite intervals. Hint: First inegrate over the finite dimension(s). Example: Integrate sin(x)/exp(y) for 0 = x = pi, 0 = y = Inf (value: 2) f1 - function(y) 1/exp(y) f2 - function(x) sin(x) * integrate(f1, 0, Inf)$value integrate(f2, 0, pi) # 2 with absolute error 2.2e-14 Note that the absolute error is not correct, as the first integration has its own error term. You have to do your own error estimation. Hans Werner [...] So it seems to work. I wonder though if I may have missed an easier (and more reliable) way to perform such integration using base functions or an add-on package that I may have overlooked. Best regards, baptiste __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] HOW to create image like this?
http://r.789695.n4.nabble.com/file/n2548152/25jfmyx.jpg HOW to create image like this? **tp://i55.tinypic.com/25jfmyx.jpg[/IMG] I don't known how to create the image above or which function can create this image? -- View this message in context: http://r.789695.n4.nabble.com/HOW-to-create-image-like-this-tp2548152p2548152.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] NA problem
On Sep 21, 2010, at 11:28 , Jeff Newmiller wrote: Short answer: don't do that. The format function is for preparing data for output. Do your data manipulations on a data frame you keep for such use, and only use format to prepare for output. But isn't that what the OP is doing? It is actually a bit odd that the formatting functions do not allow the NA code to be set. Probably, the easiest workaround goes like this aqf - format(airquality) aqf[is.na(airquality)] - - head(aqf) Ozone Solar.R Wind Temp Month Day 141 190 7.4 67 5 1 236 118 8.0 72 5 2 312 149 12.6 74 5 3 418 313 11.5 62 5 4 5 - - 14.3 56 5 5 628 - 14.9 66 5 6 n.via...@libero.it n.via...@libero.it wrote: Dear R list I have a problem with NA, which should be a string, but R seems that it doesn't recognize it. What I do is first give the format command to my data frame: format.data.frame(mydata,big.mark= ) so I give a blank as thousand separator. All my records in my data frame become strings, so instead of having NA I have NA. I try to convert NA in .,but it seems that R doesn't recognize NA. Someone knows why and how to treats those NA?? Thanks for your attention __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. --- Jeff NewmillerThe . . Go Live... DCN:jdnew...@dcn.davis.ca.usBasics: ##.#. ##.#. Live Go... Live: OO#.. Dead: OO#.. Playing Research Engineer (Solar/BatteriesO.O#. #.O#. with /Software/Embedded Controllers) .OO#. .OO#. rocks...1k --- Sent from my phone. Please excuse my brevity. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Peter Dalgaard Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Package for calculating bandwidths
Would nobody like to answer me? -- View this message in context: http://r.789695.n4.nabble.com/Package-for-calculating-bandwidths-tp2548091p2548509.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Survival curve mean adjusted for covariate: NEED TO DO IN NEXT 2 HOURS, PLEASE HELP
Hi I am trying to determine the mean of a Weibull function that has been fit to a data set, adjusted for a categorical covariate , gender (0=male,1=female). Here is my code: library(survival) survdata-read.csv(data.csv) ##Fit Weibull model to data WeiModel-survreg(Surv(survdata$Time,survdata$Status)~survdata$gender) summary(WeiModel) P-pweibull(n, scale=exp(WeiModel$coef[1]), shape=1/WeiModel$scale) ##Return mean scale-exp(WeiModel$coef[1]) shape-1/WeiModel$scale mean - scale*gamma(1+1/shape) mean The problem is that the mean is based on the baseline coefficient which assumes the population is male (= 0). I want an adjusted mean which isnt assuming the whole population is male, or female - so the baseline coefficient is completely adjusted for gender. Help ASAP would be much appreciated! Thanks! Mark -- View this message in context: http://r.789695.n4.nabble.com/Survival-curve-mean-adjusted-for-covariate-NEED-TO-DO-IN-NEXT-2-HOURS-PLEASE-HELP-tp2548484p2548484.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] removed data is still there!
On Sep 21, 2010, at 8:39 AM, pdb wrote: Thanks, but that was what I just discovered myself the hard way. What I really wanted to know was how to solve this issue. Although that was _not_ what you requested in your first post. 2 options: ?table ?factor iris1$Species -factor(iris$Species) # removes extraneous levels -- View this message in context: http://r.789695.n4.nabble.com/removed-data-is-still-there-tp2548440p2548527.html Sent from the R help mailing list archive at Nabble.com. -- David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] HOW to create image like this?
On Tue, Sep 21, 2010 at 8:58 AM, zcrself zcrs...@gmail.com wrote: http://r.789695.n4.nabble.com/file/n2548152/25jfmyx.jpg HOW to create image like this? **tp://i55.tinypic.com/25jfmyx.jpg[/IMG] My first response is On an empty stomach, with a handy supply of anti-migraine tablets. I don't known how to create the image above or which function can create this image? Is this a standard representation of some genetic thing? If so, then it might be worth someone writing a function to do it, but that would mean first defining a precise specification for the graphic. There's nothing you can't do in R with only the lines and points functions Barry __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] removed data is still there!
On Sep 21, 2010, at 9:04 AM, David Winsemius wrote: On Sep 21, 2010, at 8:39 AM, pdb wrote: Thanks, but that was what I just discovered myself the hard way. What I really wanted to know was how to solve this issue. Although that was _not_ what you requested in your first post. 2 options: ?table ?factor iris1$Species -factor(iris$Species) # removes extraneous levels And that was not what I meant to type. Meant for factor to be applied to second dataframe.: iris1$Species -factor(iris1$Species) # removes extraneous levels -- David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Package for calculating bandwidths
Bandwidth of what? Internet traffic/some sort of smoother? Your question is not clear to me. Contact Details:--- Contact me: tal.gal...@gmail.com | 972-52-7275845 Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English) -- On Tue, Sep 21, 2010 at 8:59 AM, Brocker84 v.broc...@web.de wrote: Hi! Is there an R-package with which I can calculate bandwidths via cross validation in a data set?? Greetz, Valentin -- View this message in context: http://r.789695.n4.nabble.com/Package-for-calculating-bandwidths-tp2548091p2548091.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] removed data is still there!
Hi, I knew about that way already, with factor(). Isn't there another possibility, directly at the subsetting step? That would be of great help iris1 - iris[iris$Species == 'setosa',] ## I mean here Ivan Le 9/21/2010 15:14, David Winsemius a écrit : On Sep 21, 2010, at 9:04 AM, David Winsemius wrote: On Sep 21, 2010, at 8:39 AM, pdb wrote: Thanks, but that was what I just discovered myself the hard way. What I really wanted to know was how to solve this issue. Although that was _not_ what you requested in your first post. 2 options: ?table ?factor iris1$Species -factor(iris$Species) # removes extraneous levels And that was not what I meant to type. Meant for factor to be applied to second dataframe.: iris1$Species -factor(iris1$Species) # removes extraneous levels -- David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Ivan CALANDRA PhD Student University of Hamburg Biozentrum Grindel und Zoologisches Museum Abt. Säugetiere Martin-Luther-King-Platz 3 D-20146 Hamburg, GERMANY +49(0)40 42838 6231 ivan.calan...@uni-hamburg.de ** http://www.for771.uni-bonn.de http://webapp5.rrz.uni-hamburg.de/mammals/eng/mitarbeiter.php [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] HOW to create image like this?
This looks like it has been created in Circos: http://mkweb.bcgsc.ca/circos/ HTH Peter On Tue, Sep 21, 2010 at 9:58 AM, zcrself zcrs...@gmail.com wrote: http://r.789695.n4.nabble.com/file/n2548152/25jfmyx.jpg HOW to create image like this? **tp://i55.tinypic.com/25jfmyx.jpg[/IMG] I don't known how to create the image above or which function can create this image? -- View this message in context: http://r.789695.n4.nabble.com/HOW-to-create-image-like-this-tp2548152p2548152.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] puzzle with integrate over infinite range
There is nothing mysterious. You need to increase the accuracy of quadrature by decreasing the error tolerance: # I scaled your function to a proper Gaussian density shiftedGauss - function(x0=500){ integrate(function(x) 1/sqrt(2*pi * 100^2) * exp(-(x-x0)^2/(2*100^2)), 0, Inf, rel.tol=1.e-07)$value } shift - seq(500, 800, by=10) plot(shift, sapply(shift, shiftedGauss)) Hope this helps, Ravi. -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of baptiste auguie Sent: Tuesday, September 21, 2010 8:38 AM To: r-help Subject: [R] puzzle with integrate over infinite range Dear list, I'm calculating the integral of a Gaussian function from 0 to infinity. I understand from ?integrate that it's usually better to specify Inf explicitly as a limit rather than an arbitrary large number, as in this case integrate() performs a trick to do the integration better. However, I do not understand the following, if I shift the Gauss function by some amount the integral should not be affected, shiftedGauss - function(x0=500){ integrate(function(x) exp(-(x-x0)^2/100^2), 0, Inf)$value } shift - seq(500, 800, by=10) plot(shift, sapply(shift, shiftedGauss)) Suddenly, just after 700, the value of the integral drops to nearly 0 when it should be constant all the way. Any clue as to what's going on here? I guess it's suddenly missing the important part of the range where the integrand is non-zero, but how could this be overcome? Regards, baptiste sessionInfo() R version 2.11.1 (2010-05-31) x86_64-apple-darwin9.8.0 locale: [1] en_GB.UTF-8/en_GB.UTF-8/C/C/en_GB.UTF-8/en_GB.UTF-8 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] inline_0.3.5RcppArmadillo_0.2.6 Rcpp_0.8.6 statmod_1.4.6 loaded via a namespace (and not attached): [1] tools_2.11.1 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] HOW to create image like this?
On Sep 21, 2010, at 3:58 AM, zcrself wrote: http://r.789695.n4.nabble.com/file/n2548152/25jfmyx.jpg HOW to create image like this? **tp://i55.tinypic.com/25jfmyx.jpg[/IMG] I don't known how to create the image above or which function can create this image? THe corresponding author in the article in which that illustration appeared, Science 20 NOVEMBER 2009 VOL 326, p 1113, is listed as: rwil...@wustl.edu -- David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Survival curve mean adjusted for covariate: NEED TO DO IN NEXT 2 HOURS, PLEASE HELP
mark.fisher123 marko.fisher2008 at gmail.com writes: [snip] library(survival) survdata-read.csv(data.csv) ##Fit Weibull model to data WeiModel-survreg(Surv(survdata$Time,survdata$Status)~survdata$gender) summary(WeiModel) P-pweibull(n, scale=exp(WeiModel$coef[1]), shape=1/WeiModel$scale) ##Return mean scale-exp(WeiModel$coef[1]) shape-1/WeiModel$scale mean - scale*gamma(1+1/shape) The problem is that the mean is based on the baseline coefficient which assumes the population is male (= 0). I want an adjusted mean which isnt assuming the whole population is male, or female - so the baseline coefficient is completely adjusted for gender. Well, you will get different answers depending on the sex ratio of the population. If you want to predict based on the existing population, then just go ahead and fit the model without including gender as a covariate (~1 rather than ~gender). The other possibility would be to use coef(WeiModel)[1]+coef(WeiModel)[2]/2 as the (log-scale) scale parameter, but I'm not sure that would be correct. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] mode across lists of matrices
Hi Everyone, I am interested in taking the mode over several thousand matrices. I show an example below. For the [1,1] entry of my mode matrix that I want to create I would like to have a 2. For the [1,2] entry I would want a 2. For the [2,2] entry it would be 4 and so forth. Earlier, I was working with continuous cases and thus each (n,m) element was simply an average. I was able to then do element-wise addition and counting using the Reduce function and then the average would be totalsum/totalcount where the NA terms were discounted. Here, it's not exactly a binary case so Reduce doesn't quite work as well. I'm open to suggestions but would as always like to avoid long loops as that will significantly bump up running time over several thousand trees. Similarly, I would not like to do a lot of sorts to find the mode either... Thanks for your help! mymats [[1]] [,1] [,2] [,3] [1,]021 [2,]233 [3,]212 [[2]] [,1] [,2] [,3] [1,]124 [2,]244 [3,]345 [[3]] [,1] [,2] [,3] [1,]231 [2,]342 [3,]513 [[4]] [,1] [,2] [,3] [1,]242 [2,]1 NA2 [3,]231 [[5]] [,1] [,2] [,3] [1,] NA21 [2,]241 [3,]132 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] removed data is still there!
Ivan Calandra ivan.calandra at uni-hamburg.de writes: Hi, I knew about that way already, with factor(). Isn't there another possibility, directly at the subsetting step? That would be of great help iris1 - iris[iris$Species == 'setosa',] ## I mean here Ivan Not as far as I know. See gdata::drop.levels , or droplevels() in R 2.12.0. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] help interpreting a model summary
David Winsemius wrote: On Sep 19, 2010, at 5:59 PM, zozio32 wrote: Thanks for you're long answer. I have to say, I am not fully sure of what you're meaning everywhere. As I said, I am merely following a recipe book, and when things depart from it I am a bit lost. I'll try to answer to each of your paragraphs: 3: I was not wanting to include 3-way interactions, but that's the only way I found to include a 2 way interaction in my piecewise linear model. Yes. That makes sense to me. I could obviously include only angleNoise*reflection, but I thought that was not very consistent with the fact that reflection variable was split in 2. I could may be define the point of separation, and then create 2 separate models in the form lm(weightedDiff ~ angleNoise*reflection). Yes, I see that you did, and that may be helpful in understanding the estimates. The question I would ask you is why you are putting a breakpoint in a physical model. Unless there is a phase change or some discontinuity in effects at that point, I think the breakpoint looks artificial. Is refelction somehow physically connected with the breakpoint? Some sort of acoustic timing phenomenom? Or reflected wave effect in a hydraulic model? I am generating waves in a wave tank and I have some of it bouncing back from the tank wall to mess up my measures, and i want to quantify this effect. weightedDiff is a measure of the difference between the target spectra and what I am measuring. For that I have generated virtual wave elevation with different level of reflection and I am analysing the results of my waves measurement method. Basically, I can observe a strong curvature in my data weightedDiff as a function of reflection. Now, I can try to model this curvature by a square term, a linear piece wise linear model or may be a log model in the form y = a*log(x)+b.I don't like the square model as it will not go to +inf with reflection - +inf. So I've tried the second option with what I thought was not too bad results ... I think I'll investigate the log option now. i have to say that speaking to someone definitely clears up my mind on this. And R is not the cup of tea of people in my department. I am not really thinking that there is a breaking point, but that the influence of the angleNoise perturbations (level of uncertainty in the direction of propagation of my waves) is negligible when the reflection get too high. If I could identify this point, or a king of limit between 2 zones ( one with reflection low enough that angleNoise as to be taken into account, one with reflection so high that angleNoise do not matter any more) that will be helpful. I merely thought that my formulation was a way to combine them together. Basically, i am expecting both parameters to degrade my signal, but I'll not be surprised if passed a certain level of reflection, having noise or not in my angles is not really relevant, hence the interaction parameter. The piecewise linear model is a way to take into account the curvature in the data that I can observe on a straight scatter plot. 1: Thanks for the first part, i think I can make sens of it. ;) I guess I can ignore this parameter in that case. By the way, which type of Anova you refering to: creating a factor with high and low level of interation, and fitting the interation between angleNoise and this new factor? If you took a model with all of the data and fit first a model without the break point and one with the breakpoint and then looked at the output of anova(model1) and annova(model2) the difference in deviance across the two models is distributed (asymptotically anyway) as a chi- square statistic with the difference in number of degrees of freedom. That's a much better basis for deciding whether the addition of the term is statistically significant. yeah, I did that and there is definitely no match for the linear model without the break or square term, or anything. I just need to model this curvature somehow to get a decent model here. As I am using virtual wave elevation, I have over 600 observations point, so the anova test results are not ambiguous at all. 2: first, i was mislead by the meaning of this factor. i only encounter the version were it's TRUE, not FALSE which is the difference. I think I also use important in a wrong way. I should have used significant instead. You had specified a model that that terms with both (reflection = Break[xMin]) and (reflection Break[xMin]) and the lm program threw away all the levels with reflection = Break[xMin]. If you had only specified the the model with only reflection = Break[xMin] you would have gotten an identical model as far as predictions were concerned, but the signs would have been reversed for any of the levels with the inequality term in them. After, i have to admit that I am lost when
Re: [R] puzzle with integrate over infinite range
Hi, Thanks for the tip, but it's still mysterious to me. Reading ?integrate did not give me much hint as to what relative accuracy means in this context. I looked at the source of integrate.c but it's still not clear to me how the default value of rel.tol (10^-4 for me) is not enough to prevent a completely wrong answer (the error is much larger than this). Obviously, I'm worried now that I may not always choose a good value of ref.tol, if picked arbitrarily without my understanding of what it means. Thanks, baptiste On Sep 21, 2010, at 3:38 PM, Ravi Varadhan wrote: There is nothing mysterious. You need to increase the accuracy of quadrature by decreasing the error tolerance: # I scaled your function to a proper Gaussian density shiftedGauss - function(x0=500){ integrate(function(x) 1/sqrt(2*pi * 100^2) * exp(-(x-x0)^2/(2*100^2)), 0, Inf, rel.tol=1.e-07)$value } shift - seq(500, 800, by=10) plot(shift, sapply(shift, shiftedGauss)) Hope this helps, Ravi. -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of baptiste auguie Sent: Tuesday, September 21, 2010 8:38 AM To: r-help Subject: [R] puzzle with integrate over infinite range Dear list, I'm calculating the integral of a Gaussian function from 0 to infinity. I understand from ?integrate that it's usually better to specify Inf explicitly as a limit rather than an arbitrary large number, as in this case integrate() performs a trick to do the integration better. However, I do not understand the following, if I shift the Gauss function by some amount the integral should not be affected, shiftedGauss - function(x0=500){ integrate(function(x) exp(-(x-x0)^2/100^2), 0, Inf)$value } shift - seq(500, 800, by=10) plot(shift, sapply(shift, shiftedGauss)) Suddenly, just after 700, the value of the integral drops to nearly 0 when it should be constant all the way. Any clue as to what's going on here? I guess it's suddenly missing the important part of the range where the integrand is non-zero, but how could this be overcome? Regards, baptiste sessionInfo() R version 2.11.1 (2010-05-31) x86_64-apple-darwin9.8.0 locale: [1] en_GB.UTF-8/en_GB.UTF-8/C/C/en_GB.UTF-8/en_GB.UTF-8 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] inline_0.3.5RcppArmadillo_0.2.6 Rcpp_0.8.6 statmod_1.4.6 loaded via a namespace (and not attached): [1] tools_2.11.1 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Survival curve mean adjusted for covariate: NEED TO DO IN NEXT 2 HOURS, PLEASE HELP
On Sep 21, 2010, at 15:43 , Ben Bolker wrote: mark.fisher123 marko.fisher2008 at gmail.com writes: [snip] library(survival) survdata-read.csv(data.csv) ##Fit Weibull model to data WeiModel-survreg(Surv(survdata$Time,survdata$Status)~survdata$gender) summary(WeiModel) P-pweibull(n, scale=exp(WeiModel$coef[1]), shape=1/WeiModel$scale) ##Return mean scale-exp(WeiModel$coef[1]) shape-1/WeiModel$scale mean - scale*gamma(1+1/shape) The problem is that the mean is based on the baseline coefficient which assumes the population is male (= 0). I want an adjusted mean which isnt assuming the whole population is male, or female - so the baseline coefficient is completely adjusted for gender. Well, you will get different answers depending on the sex ratio of the population. If you want to predict based on the existing population, then just go ahead and fit the model without including gender as a covariate (~1 rather than ~gender). The other possibility would be to use coef(WeiModel)[1]+coef(WeiModel)[2]/2 as the (log-scale) scale parameter, but I'm not sure that would be correct. I'd do the per-gender mean first, then the mixture mean by weighting according to gender proportions (possibly equal to those of the data). That would be (a) answering the question as posed and (b) based on a correct model. -pd (BTW, it is *very* tempting to cite Simon Blomberg's tagline with a Subject header like that...) -- Peter Dalgaard Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] puzzle with integrate over infinite range
You could try pnorm also: shiftedGaussR - function(x0 = 500) { sd - 100/sqrt(2) int - pnorm(0, x0, sd, lower.tail=FALSE, log.p=TRUE) exp(int + log(sd) + 0.5 * log(2*pi)) } shiftedGaussR(500) [1] 177.2454 shiftedGauss(500) [1] 177.2454 -Matt On Tue, 2010-09-21 at 09:38 -0400, Ravi Varadhan wrote: There is nothing mysterious. You need to increase the accuracy of quadrature by decreasing the error tolerance: # I scaled your function to a proper Gaussian density shiftedGauss - function(x0=500){ integrate(function(x) 1/sqrt(2*pi * 100^2) * exp(-(x-x0)^2/(2*100^2)), 0, Inf, rel.tol=1.e-07)$value } shift - seq(500, 800, by=10) plot(shift, sapply(shift, shiftedGauss)) Hope this helps, Ravi. -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of baptiste auguie Sent: Tuesday, September 21, 2010 8:38 AM To: r-help Subject: [R] puzzle with integrate over infinite range Dear list, I'm calculating the integral of a Gaussian function from 0 to infinity. I understand from ?integrate that it's usually better to specify Inf explicitly as a limit rather than an arbitrary large number, as in this case integrate() performs a trick to do the integration better. However, I do not understand the following, if I shift the Gauss function by some amount the integral should not be affected, shiftedGauss - function(x0=500){ integrate(function(x) exp(-(x-x0)^2/100^2), 0, Inf)$value } shift - seq(500, 800, by=10) plot(shift, sapply(shift, shiftedGauss)) Suddenly, just after 700, the value of the integral drops to nearly 0 when it should be constant all the way. Any clue as to what's going on here? I guess it's suddenly missing the important part of the range where the integrand is non-zero, but how could this be overcome? Regards, baptiste sessionInfo() R version 2.11.1 (2010-05-31) x86_64-apple-darwin9.8.0 locale: [1] en_GB.UTF-8/en_GB.UTF-8/C/C/en_GB.UTF-8/en_GB.UTF-8 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] inline_0.3.5RcppArmadillo_0.2.6 Rcpp_0.8.6 statmod_1.4.6 loaded via a namespace (and not attached): [1] tools_2.11.1 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Matthew S. Shotwell Graduate Student Division of Biostatistics and Epidemiology Medical University of South Carolina __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] puzzle with integrate over infinite range
You are dealing with functions that are non-zero over a very small interval, so you have to be very careful. There is no method that is going to be totally foolproof. Having said that, I have always felt that the default tolerance in integrate is too liberal (i.e. too large). I always use rel.tol of 1.e-08 (roughly, sqrt(machine epsilon)) in my computations, and I also increase subdivisions to 500. Ravi. From: baptiste Auguié [mailto:baptiste.aug...@googlemail.com] Sent: Tuesday, September 21, 2010 9:58 AM To: Ravi Varadhan Cc: 'baptiste auguie'; 'r-help' Subject: Re: [R] puzzle with integrate over infinite range Hi, Thanks for the tip, but it's still mysterious to me. Reading ?integrate did not give me much hint as to what relative accuracy means in this context. I looked at the source of integrate.c but it's still not clear to me how the default value of rel.tol (10^-4 for me) is not enough to prevent a completely wrong answer (the error is much larger than this). Obviously, I'm worried now that I may not always choose a good value of ref.tol, if picked arbitrarily without my understanding of what it means. Thanks, baptiste On Sep 21, 2010, at 3:38 PM, Ravi Varadhan wrote: There is nothing mysterious. You need to increase the accuracy of quadrature by decreasing the error tolerance: # I scaled your function to a proper Gaussian density shiftedGauss - function(x0=500){ integrate(function(x) 1/sqrt(2*pi * 100^2) * exp(-(x-x0)^2/(2*100^2)), 0, Inf, rel.tol=1.e-07)$value } shift - seq(500, 800, by=10) plot(shift, sapply(shift, shiftedGauss)) Hope this helps, Ravi. -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of baptiste auguie Sent: Tuesday, September 21, 2010 8:38 AM To: r-help Subject: [R] puzzle with integrate over infinite range Dear list, I'm calculating the integral of a Gaussian function from 0 to infinity. I understand from ?integrate that it's usually better to specify Inf explicitly as a limit rather than an arbitrary large number, as in this case integrate() performs a trick to do the integration better. However, I do not understand the following, if I shift the Gauss function by some amount the integral should not be affected, shiftedGauss - function(x0=500){ integrate(function(x) exp(-(x-x0)^2/100^2), 0, Inf)$value } shift - seq(500, 800, by=10) plot(shift, sapply(shift, shiftedGauss)) Suddenly, just after 700, the value of the integral drops to nearly 0 when it should be constant all the way. Any clue as to what's going on here? I guess it's suddenly missing the important part of the range where the integrand is non-zero, but how could this be overcome? Regards, baptiste sessionInfo() R version 2.11.1 (2010-05-31) x86_64-apple-darwin9.8.0 locale: [1] en_GB.UTF-8/en_GB.UTF-8/C/C/en_GB.UTF-8/en_GB.UTF-8 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] inline_0.3.5RcppArmadillo_0.2.6 Rcpp_0.8.6 statmod_1.4.6 loaded via a namespace (and not attached): [1] tools_2.11.1 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] mode across lists of matrices
Try this: mode - function(x, ...) as.numeric(names(which.max(table(x apply(array(unlist(mymats), dim = c(length(mymats), dim(mymats[[1]]))), 1:2, mode) On Tue, Sep 21, 2010 at 10:47 AM, Gregory Ryslik rsa...@comcast.net wrote: Hi Everyone, I am interested in taking the mode over several thousand matrices. I show an example below. For the [1,1] entry of my mode matrix that I want to create I would like to have a 2. For the [1,2] entry I would want a 2. For the [2,2] entry it would be 4 and so forth. Earlier, I was working with continuous cases and thus each (n,m) element was simply an average. I was able to then do element-wise addition and counting using the Reduce function and then the average would be totalsum/totalcount where the NA terms were discounted. Here, it's not exactly a binary case so Reduce doesn't quite work as well. I'm open to suggestions but would as always like to avoid long loops as that will significantly bump up running time over several thousand trees. Similarly, I would not like to do a lot of sorts to find the mode either... Thanks for your help! mymats [[1]] [,1] [,2] [,3] [1,]021 [2,]233 [3,]212 [[2]] [,1] [,2] [,3] [1,]124 [2,]244 [3,]345 [[3]] [,1] [,2] [,3] [1,]231 [2,]342 [3,]513 [[4]] [,1] [,2] [,3] [1,]242 [2,]1 NA2 [3,]231 [[5]] [,1] [,2] [,3] [1,] NA21 [2,]241 [3,]132 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40 S 49° 16' 22 O [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Combined plot: Scatter + density plot
On Tue, Sep 21, 2010 at 8:34 AM, Ralf B ralf.bie...@gmail.com wrote: in order to save space for a publication, it would be nice to have a combined scatter and density plot similar to what is shows on Not quite the same thing, but I like the scatterplots in Rcmdr, which feature boxplots instead of density graphs: require(car) data(iris) scatterplot(Petal.Width~Sepal.Length, reg.line=lm, smooth=TRUE, spread=TRUE, boxplots='xy', span=0.5, data=iris) Regards Liviu __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] puzzle with integrate over infinite range
Thanks, I'll do that too from now on. It strikes me that in a case such as this one it may be safer to use a truncated, finite interval around the region where the integrand is non-zero, rather than following the advice of ?integrate to use Inf as integration limit. At least one wouldn't risk to get an entirely wrong result depending on a choice of rel.tol. Regarding this parameter, is there a simple interpretation of how it affected the result in the context of my example? Thanks again, baptiste On Sep 21, 2010, at 4:04 PM, Ravi Varadhan wrote: You are dealing with functions that are non-zero over a very small interval, so you have to be very careful. There is no method that is going to be totally foolproof. Having said that, I have always felt that the default tolerance in integrate is too liberal (i.e. too large). I always use rel.tol of 1.e-08 (roughly, sqrt(machine epsilon)) in my computations, and I also increase subdivisions to 500. Ravi. From: baptiste Auguié [mailto:baptiste.aug...@googlemail.com] Sent: Tuesday, September 21, 2010 9:58 AM To: Ravi Varadhan Cc: 'baptiste auguie'; 'r-help' Subject: Re: [R] puzzle with integrate over infinite range Hi, Thanks for the tip, but it's still mysterious to me. Reading ?integrate did not give me much hint as to what relative accuracy means in this context. I looked at the source of integrate.c but it's still not clear to me how the default value of rel.tol (10^-4 for me) is not enough to prevent a completely wrong answer (the error is much larger than this). Obviously, I'm worried now that I may not always choose a good value of ref.tol, if picked arbitrarily without my understanding of what it means. Thanks, baptiste On Sep 21, 2010, at 3:38 PM, Ravi Varadhan wrote: There is nothing mysterious. You need to increase the accuracy of quadrature by decreasing the error tolerance: # I scaled your function to a proper Gaussian density shiftedGauss - function(x0=500){ integrate(function(x) 1/sqrt(2*pi * 100^2) * exp(-(x-x0)^2/(2*100^2)), 0, Inf, rel.tol=1.e-07)$value } shift - seq(500, 800, by=10) plot(shift, sapply(shift, shiftedGauss)) Hope this helps, Ravi. -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of baptiste auguie Sent: Tuesday, September 21, 2010 8:38 AM To: r-help Subject: [R] puzzle with integrate over infinite range Dear list, I'm calculating the integral of a Gaussian function from 0 to infinity. I understand from ?integrate that it's usually better to specify Inf explicitly as a limit rather than an arbitrary large number, as in this case integrate() performs a trick to do the integration better. However, I do not understand the following, if I shift the Gauss function by some amount the integral should not be affected, shiftedGauss - function(x0=500){ integrate(function(x) exp(-(x-x0)^2/100^2), 0, Inf)$value } shift - seq(500, 800, by=10) plot(shift, sapply(shift, shiftedGauss)) Suddenly, just after 700, the value of the integral drops to nearly 0 when it should be constant all the way. Any clue as to what's going on here? I guess it's suddenly missing the important part of the range where the integrand is non-zero, but how could this be overcome? Regards, baptiste sessionInfo() R version 2.11.1 (2010-05-31) x86_64-apple-darwin9.8.0 locale: [1] en_GB.UTF-8/en_GB.UTF-8/C/C/en_GB.UTF-8/en_GB.UTF-8 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] inline_0.3.5RcppArmadillo_0.2.6 Rcpp_0.8.6 statmod_1.4.6 loaded via a namespace (and not attached): [1] tools_2.11.1 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] labels in (box)plot
Dear users, I would like all the ticks on a boxplot (x and y) to be labeled I have checked all the par() arguments but couldn't find what I'm looking for Here is an example to show it: df - structure(list(SPECSHOR = structure(c(1L, 1L, 1L, 3L, 3L, 3L, 3L, 3L, 4L, 4L), .Label = c(cotau, dibic, eqgre, gicam), class = factor), Sq122.median = c(2.335835, 1.76091, 1.64717, 1.285505, 1.572405, 1.86761, 1.82541, 1.62458, 0.157813, 0.864523)), .Names = c(SPECSHOR, Sq122.median), class = data.frame, row.names = c(9L, 16L, 23L, 74L, 83L, 90L, 98L, 109L, 121L, 139L)) par(mfrow=c(2,2)) ## not necessary in that example, but I do need it with my real data boxplot(Sq122.median~SPECSHOR, data=df, horizontal=TRUE) ## the upper label on the y-axis (gicam) is not shown I know I can decrease the size of the labels by setting cex.axis=0.8, but I would prefer that all labels are always there, independent of their size, without me setting their size explicitly, and even if they then overlap. Am I clear? Is it possible, and how? Thanks in advance Ivan -- Ivan CALANDRA PhD Student University of Hamburg [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Sorting and subsetting
On Tue, Sep 21, 2010 at 3:09 AM, Matthew Dowle mdo...@mdowle.plus.com wrote: All the solutions in this thread so far use the lapply(split(...)) paradigm either directly or indirectly. That paradigm doesn't scale. That's the likely source of quite a few 'out of memory' errors and performance issues in R. This is a good point. It is not nearly as straightforward as the syntax for data.table (which seems to order and select in one step...very nice!), but this should be less memory intensive: tmp - data.frame(index = gl(2,20), foo = rnorm(40)) tmp - tmp[order(tmp$index, tmp$foo) , ] # find location of first instance of each level and add 0:4 to it x - sapply(match(levels(tmp$index), tmp$index), `+`, 0:4) tmp[x, ] data.table doesn't do that internally, and it's syntax is pretty easy. tmp - data.table(index = gl(2,20), foo = rnorm(40)) tmp[, .SD[head(order(-foo),5)], by=index] index index.1 foo [1,] 1 1 1.9677303 [2,] 1 1 1.2731872 [3,] 1 1 1.1100931 [4,] 1 1 0.8194719 [5,] 1 1 0.6674880 [6,] 2 2 1.2236383 [7,] 2 2 0.9606766 [8,] 2 2 0.8654497 [9,] 2 2 0.5404112 [10,] 2 2 0.3373457 As you can see it currently repeats the group column which is a shame (on the to do list to fix). Matthew http://datatable.r-forge.r-project.org/ -- View this message in context: http://r.789695.n4.nabble.com/Sorting-and-subsetting-tp2547360p2548319.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] HOW to create image like this?
Should anyone feel like reinventing that coloured wheel in R, the arcTextGrob() function in gridExtra answered a more basic query on R-help a few months ago: draw text labels on a circle and connect them with arcs. It might be a starting point. baptiste On Sep 21, 2010, at 3:14 PM, Barry Rowlingson wrote: On Tue, Sep 21, 2010 at 8:58 AM, zcrself zcrs...@gmail.com wrote: http://r.789695.n4.nabble.com/file/n2548152/25jfmyx.jpg HOW to create image like this? **tp://i55.tinypic.com/25jfmyx.jpg[/IMG] My first response is On an empty stomach, with a handy supply of anti-migraine tablets. I don't known how to create the image above or which function can create this image? Is this a standard representation of some genetic thing? If so, then it might be worth someone writing a function to do it, but that would mean first defining a precise specification for the graphic. There's nothing you can't do in R with only the lines and points functions Barry __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Package for calculating bandwidths
Sorry, bandwidth via cross validation for a kernel smoothing. -- View this message in context: http://r.789695.n4.nabble.com/Package-for-calculating-bandwidths-tp2548091p2548630.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] plot xyz data in 2D
Hello, I want to make an 2D plot of .xyz data. So plot on x-axis and y-axis and use a color scale for the z. For every x-location along a cross section of the soil at several depths a resistivity must be displayed. This must result in a picture/graph which shows the resistivity for that cross section. The depths differ for each x-location. How can I do that? In fact it is an cross section with the following format (simple example) X Depth Resistivity -3 0 1 -2 -0.25 2 -2 0 1 -1 -0.5 3 -1 -0.25 2 -1 0 1 0 -0.75 2 0 -0.5 3 0 -0.25 2 0 0 1 1 -0.5 3 1 -0.25 2 1 0 1 2 -0.25 2 2 0 2 3 0 2 Acacia Water Jan van Beaumontstraat 1 2805 RN Gouda The Netherlands Tel. office: +31(0) 182686424 Tel. mobile: +31(0) 6 12143599 Fax: +31(0)182686239 Email: jacob.oosterw...@acaciawater.com mailto:ja...@acaciawater.com Website: www.acaciawater.com http://www.acaciawater.com/ Acacia Water Jan van Beaumontstraat 1 2805 RN Gouda The Netherlands Tel. office: +31(0) 182686424 Tel. mobile: +31(0) 6 12143599 Fax: +31(0)182686239 Email: jacob.oosterw...@acaciawater.com mailto:ja...@acaciawater.com Website: www.acaciawater.com http://www.acaciawater.com/ [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] how to assemble data of the same variable?
hi i'm francesco, i'am a new r user. I have a dataset with these variables [1] timestamp categoria latlong ammontare_euro [6] provincia risoluzione and i want to make an analjsy on the data of variable 'categoria'. now the variable 'categoria' has 98 different variable c=levels(X[caregoria]) length(c) [1] 98 how can i create a plot (eg. an histogram) for each the categoria's variable for the variable 'ammontare_euro'? yhank you fra -- View this message in context: http://r.789695.n4.nabble.com/how-to-assemble-data-of-the-same-variable-tp2548650p2548650.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Hadley Wickham - Training in London, Data Visualisation in R
Hadley Wickham London: 1st - 2nd November 2010 Data Visualisation in R: Harnessing the power of ggplot2 to produce elegant data graphics Mango Solutions is delighted to offer a one-off 2 day training course with Dr. Hadley Wickham, R Project Data Visualisation Guru and creator of ggplot 2. The course is a must for any R user looking to improve their data visualisation skills. Suitable for any level of R experience, it will especially benefit those who know how to get data in and out of R and who can do some basic modelling. Places will be limited on what will be a very popular course. Course Content: Day One Basic graphics - How to create scatterplots, and how to add extra variables with aesthetics (like colour, shape and size) or facetting. Jittering, histograms and reordering. Coding strategy best practices. - Data: fuel economy of US cars. Graphics for large data - Histograms and bar charts for displaying distributional summaries. More boxplots. Other techniques for overcoming overplotting when drawing scatterplots of large datasets. - Data: prices and characteristics of 50,000 diamonds. Data manipulation and transformation - Group-wise summaries and transformations to add extra information to your plots. How to visualise time series. - Data: trends in US baby names over the last 120 years. Data analysis case study - Iteration between graphics, transformation and modelling. More practice using ddply, and combining it with other R functions. - Data: US baby names Day Two Visualising space - What is map data and how can you draw it? Working with shape files. Basic ggplot2 theory. How to combine multiple layers. Combining maps with data. Choropleths/themeatic maps, proportional symbol maps. - Data: Texas mortality and worldwide TB infections ggplot2 theory and graphical critique - How to critique a graphic (how is a graphic like pumpkin pie?) What is the layered grammar of graphics and why is it important? Using the grammar to create richer plots. Polishing your plots for presentation - Tweaking your plots for maximum presentation impact. Introduction to colour theory. Labels, legends and axes. Tweaking the plot themes. Where to next - General development advice. Hadley Wickham is the Dobelman Family Junior Chair in Statistics at Rice University. He is an active member of the R community, has written and contributed to over 20 R packages and won the John Chambers Award for Statistical Computing for his work developing tools for data reshaping and visualisation. His research focuses on how to make data analysis better, faster and easier with a particular emphasis on the use of visualisation to better understand data and models. For more information please contact: train...@mango-solutions.com 01249 767700 or visit www.mango-solutions.com Venue to be confirmed but will be a central London location. Price for 2 day course £1,000 ex. VAT Sarah Lewis T: +44 (0)1249 767700 Ext: 200 F: +44 (0)1249 767707 M: +44 (0)7746 224226 www.mango-solutions.com Unit 2 Greenways Business Park Bellinger Close Chippenham Wilts SN15 1BN UK LEGAL NOTICE This message is intended for the use o...{{dropped:9}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] HOW to create image like this?
On Tue, Sep 21, 2010 at 6:14 AM, Barry Rowlingson b.rowling...@lancaster.ac.uk wrote: On Tue, Sep 21, 2010 at 8:58 AM, zcrself zcrs...@gmail.com wrote: http://r.789695.n4.nabble.com/file/n2548152/25jfmyx.jpg HOW to create image like this? **tp://i55.tinypic.com/25jfmyx.jpg[/IMG] My first response is On an empty stomach, with a handy supply of anti-migraine tablets. This seems worthy of the fortunes package. I don't known how to create the image above or which function can create this image? Is this a standard representation of some genetic thing? If so, then it might be worth someone writing a function to do it, but that would mean first defining a precise specification for the graphic. There's nothing you can't do in R with only the lines and points functions Barry __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] partial dbRDA or CCA with two distance objects in Vegan.
I am trying to use the cca/rda/capscale functions in vegan to analyse genetic distance data ( provided as a dist object calculated using dist.genpop in package adegenet) with geographic distance partialled out ( provided as a distance object using dist function in veganthis method is attempting to follow the method used by Geffen et al 2004 as suggested by Legendre and . FORTIN (2010). I cannot see how to introduce the Conditioning ( partialled) second dist matrix. as you can see from the code snippet below, the two dist objects are of the same dimensions. - I get an error using capscale: Error in qr.fitted(Q, Xbar) : 'qr' and 'y' must have the same number of rows or cca Error in weighted.mean.default(newX[, i], ...) : 'x' and 'w' must have the same length when using a conditioning distance object instead of a variable (Clade) of the same length as the constraints ( Latitude and Longitude) I would be grateful, for any pointers on this, ie which test is the appropriate one to use ( I believe capscale since it is similar to distance-based redundancy analysis (Legendre Anderson 1999)) and whether this test is indeed equivalent to the approach suggested by Legendre Fortin, (Geffen et al used DISTLM). many thanks Nevil Amos ACB Monash University references (Geffen, E., M. J. Anderson, et al. (2004). Climate and habitat barriers to dispersal in the highly mobile grey wolf. Molecular Ecology 13(8): 2481-2490.) LEGENDRE, P. and M.-J. FORTIN (2010). Comparison of the Mantel test and alternative approaches for detecting complex multivariate relationships in the spatial analysis of genetic data. Molecular ecology resources early copy online Snippet from analysis script: Gen_Dist-dist.genpop(mygenpop,method=2,diag=F,upper=F) str(Gen_Dist) Class 'dist' atomic [1:666] 0.866 0.757 0.813 0.872 0.887 ... ..- attr(*, Labels)= Named chr [1:37] 4879 4883 4884 4885 ... .. ..- attr(*, names)= chr [1:37] 01 02 03 04 ... ..- attr(*, Size)= int 37 ..- attr(*, call)= language dist.genpop(x = mygenpop, method = 2, diag = F, upper = F) ..- attr(*, Diag)= logi FALSE ..- attr(*, Upper)= logi FALSE ..- attr(*, method)= chr Edwards str(geog) Class 'dist' atomic [1:666] 6.61 4.19 14.6 16.71 16.68 ... ..- attr(*, Size)= int 37 ..- attr(*, Labels)= chr [1:37] 2 5 6 7 ... ..- attr(*, Diag)= logi FALSE ..- attr(*, Upper)= logi FALSE ..- attr(*, method)= chr euclidean ..- attr(*, call)= language dist(x = XY) myDbRDA-cca(Gen_Dist ~ Latitude+Longitude+Condition(Clade),data = mydata) myDbRDA-cca(Gen_Dist ~ Latitude+Longitude+Condition(geog),data = mydata) Error in weighted.mean.default(newX[, i], ...) : 'x' and 'w' must have the same length __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] definition of colorpalette
Dear group, I have recognized a strange behaviour of palette(). I tried to find any explanation but failed so far (or even didnt understood the idea behind - what is most probable). My original plan was to define a palette, save it in a variable and use it later for an image-plot. One reason why I tried to store the palette in a variable was, because I wanted to change individual values (e.g. the first value to gray). Interestingly, the palette is not defined correctly in the first run, but in the second run. Simple example: rm(list=ls()) a - palette(rainbow(6)) a [1] red #FF4C00 #FF9900 #FFE500 #CCFF00 #80FF00 #33FF00 #00FF19 #00FF66 #00FFB2 cyan#00B3FF #0066FF #0019FF #3300FF #8000FF #CC00FF [18] #FF00E6 #FF0099 #FF004D a - palette(rainbow(6)) a [1] red yellow green cyanbluemagenta ### Interestingly, this works at the first time palette(rainbow(20)) # six color rainbow plot(rnorm(20),col=1:20) as well as palette(rainbow(6)) a - palette() a [1] red yellow green cyanbluemagenta So, it seems to be that first a palette has to be defined (or set as to be used) and then the vector can be assigned to a variable. I dont understand why. Thank you in advance for your help and explanation. Daniel __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plot xyz data in 2D
Dear Jacob, Do you want something like this? Using dummy data instead of yours. Copy-paste the output of dput() if you want to pass your data in a format that is easy to use. library(ggplot2) dataset - expand.grid(X = -3:3, Depth = seq(0, -1, by = -0.25)) dataset$Resistivity - rnorm(nrow(dataset), mean = 1) ggplot(dataset, aes(x = X, y = Depth, fill = Resistivity)) + geom_tile() HTH, Thierry ir. Thierry Onkelinx Instituut voor natuur- en bosonderzoek team Biometrie Kwaliteitszorg Gaverstraat 4 9500 Geraardsbergen Belgium Research Institute for Nature and Forest team Biometrics Quality Assurance Gaverstraat 4 9500 Geraardsbergen Belgium tel. + 32 54/436 185 thierry.onkel...@inbo.be www.inbo.be To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey -Oorspronkelijk bericht- Van: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] Namens Jacob Oosterwijk Verzonden: dinsdag 21 september 2010 15:50 Aan: r-help@r-project.org Onderwerp: [R] plot xyz data in 2D Hello, I want to make an 2D plot of .xyz data. So plot on x-axis and y-axis and use a color scale for the z. For every x-location along a cross section of the soil at several depths a resistivity must be displayed. This must result in a picture/graph which shows the resistivity for that cross section. The depths differ for each x-location. How can I do that? In fact it is an cross section with the following format (simple example) X Depth Resistivity -3 0 1 -2 -0.25 2 -2 0 1 -1 -0.5 3 -1 -0.25 2 -1 0 1 0 -0.75 2 0 -0.5 3 0 -0.25 2 0 0 1 1 -0.5 3 1 -0.25 2 1 0 1 2 -0.25 2 2 0 2 3 0 2 Acacia Water Jan van Beaumontstraat 1 2805 RN Gouda The Netherlands Tel. office: +31(0) 182686424 Tel. mobile: +31(0) 6 12143599 Fax: +31(0)182686239 Email: jacob.oosterw...@acaciawater.com mailto:ja...@acaciawater.com Website: www.acaciawater.com http://www.acaciawater.com/ Acacia Water Jan van Beaumontstraat 1 2805 RN Gouda The Netherlands Tel. office: +31(0) 182686424 Tel. mobile: +31(0) 6 12143599 Fax: +31(0)182686239 Email: jacob.oosterw...@acaciawater.com mailto:ja...@acaciawater.com Website: www.acaciawater.com http://www.acaciawater.com/ [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Druk dit bericht a.u.b. niet onnodig af. Please do not print this message unnecessarily. Dit bericht en eventuele bijlagen geven enkel de visie van de schrijver weer en binden het INBO onder geen enkel beding, zolang dit bericht niet bevestigd is door een geldig ondertekend document. The views expressed in this message and any annex are purely those of the writer and may not be regarded as stating an official position of INBO, as long as the message is not confirmed by a duly signed document. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plot xyz data in 2D
On Sep 21, 2010, at 9:49 AM, Jacob Oosterwijk wrote: Hello, I want to make an 2D plot of .xyz data. So plot on x-axis and y-axis and use a color scale for the z. For every x-location along a cross section of the soil at several depths a resistivity must be displayed. This must result in a picture/graph which shows the resistivity for that cross section. The depths differ for each x-location. How can I do that? ?image In fact it is an cross section with the following format (simple example) Put your data in an R object and then present dput(object), or offer a web link to a text file Don't expect us to fix up mangled data the you ran through HTML. -- David. X Depth Resistivity -3 0 1 -2 -0.25 2 -2 0 1 -1 -0.5 3 -1 -0.25 2 -1 0 1 0 -0.75 2 0 -0.5 3 0 -0.25 2 0 0 1 1 -0.5 3 1 -0.25 2 1 0 1 2 -0.25 2 2 0 2 3 0 2 Acacia Water Jan van Beaumontstraat 1 2805 RN Gouda The Netherlands Tel. office: +31(0) 182686424 Tel. mobile: +31(0) 6 12143599 Fax: +31(0)182686239 Email: jacob.oosterw...@acaciawater.com mailto:ja...@acaciawater.com Website: www.acaciawater.com http://www.acaciawater.com/ Acacia Water Jan van Beaumontstraat 1 2805 RN Gouda The Netherlands Tel. office: +31(0) 182686424 Tel. mobile: +31(0) 6 12143599 Fax: +31(0)182686239 Email: jacob.oosterw...@acaciawater.com mailto:ja...@acaciawater.com Website: www.acaciawater.com http://www.acaciawater.com/ [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Trouble with Optimization in Alabama Package
Hello, This is my first post to the help request list, so I'm going to err on the side of giving too much information. I'm working on writing a simulation in which agents will make repeated production and exchange decisions with randomly chosen partners. The idea is, all agents can produce two goods which they want to consume, they choose a value t in [0,10] which sets their production time allocated to each good. Then they are matched with another individual with whom they will trade according to various bargaining algorithms. I want to write a general purpose optimization that takes their initial allocations (what they produced) and computes, for example, the solution which maximizes the product of their utilities from the final allocation, subject to the constraint that their utilities must be at least as high as if they didn't trade at all. Since constrOptim.nl finds the maximizing set of parameters based on the initial par argument fed to it, I had to include an additional argument to the functions so I could bring the parameters of the utility functions and the initial production into the optimization. I've written the following code but I keep getting strange errors: library(alabama) # y is a vector of initial allocations and CobbDouglas preference parameters #y=c(r1,r2,b1,b2,alpha1,alpha2) y=c(5,50,60,5,.3,.7) # p1=c(r1bar,r1bar,b1bar,b1bar) a feasible allocation p1=c(15,40,50,15) fn=function(x,...){ return(-1*(((x[1]^y[5])*(x[3]^(1-y[5])))*(((x[2]^y[6])*(x[4]^(1-y[6])) } hin=function(x,...){ h=rep(NA,2) h[1]=((x[1]^y[5])*(x[3]^(1-y[5])))-((y[1]^y[5])*(y[3]^(1-y[5]))) h[2]=((x[2]^y[6])*(x[4]^(1-y[6])))-((y[2]^y[6])*(y[4]^(1-y[6]))) return(h) } heq=function(x,...){ h=rep(NA,2) h[1]=x[1]+x[2]-y[1]-y[2] h[2]=x[3]+x[4]-y[3]-y[4] return(h) } ans2=constrOptim.nl(par=p1,fn=fn,hin=hin,heq=heq,control.outer=list(itmax= 1000,mu0=.1),list(y,y,y)) Any advice or explanation of my errors would be greatly appreciated! -- Erik O. Kimbrough Department of Economics (AE1) School of Business and Economics Maastricht University [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] definition of colorpalette
On Sep 21, 2010, at 10:50 AM, Daniel Stepputtis R wrote: Dear group, I have recognized a strange behaviour of palette(). I tried to find any explanation but failed so far (or even didnt understood the idea behind - what is most probable). My original plan was to define a palette, save it in a variable and use it later for an image-plot. One reason why I tried to store the palette in a variable was, because I wanted to change individual values (e.g. the first value to gray). Interestingly, the palette is not defined correctly in the first run, but in the second run. No, you have misunderstood what was happening. Read the Value section of hte help page. The returned value of a call to palette is the _old_ settings, (just like the par function). It allows a programming strategy like: oldvals -func_with_side_effect(new_settings) Do stuff with new_settings in place func(oldvals) Simple example: rm(list=ls()) a - palette(rainbow(6)) a [1] red #FF4C00 #FF9900 #FFE500 #CCFF00 #80FF00 #33FF00 #00FF19 #00FF66 #00FFB2 cyan#00B3FF #0066FF #0019FF #3300FF #8000FF #CC00FF [18] #FF00E6 #FF0099 #FF004D a - palette(rainbow(6)) a [1] red yellow green cyanbluemagenta ### Interestingly, this works at the first time palette(rainbow(20)) # six color rainbow plot(rnorm(20),col=1:20) as well as palette(rainbow(6)) a - palette() a [1] red yellow green cyanbluemagenta So, it seems to be that first a palette has to be defined (or set as to be used) and then the vector can be assigned to a variable. I dont understand why. Thank you in advance for your help and explanation. David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Error in eval(expr, envir, enclos)
I used the following command line: Regression - read.table (C:/MUP/Internship/Distance_Statistics/Excel_data/Robust_Regression/ForModelling_Rev_Rob.csv , sep= ,, header = TRUE) attach (Regression) summary (Regression) library (MASS) rlm(TotalEmployment_2004 ~ MISSISSIPPI + LOUISIANA + TotalEmployment_2000 + PCWhitePop_2004 + UnemploymentRate_2004 + PCUrbanPop2000 + PCPeopleWithACollegeDegree_2000 + PCPopulation.of.or.over.65.years.of.age_2004, data = rfdmodel1) But, it is still giving me the same error. Uttara -- View this message in context: http://r.789695.n4.nabble.com/Error-in-eval-expr-envir-enclos-tp2547917p2548804.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Fleiss Control Size Formula
I am trying to compute the smallest control group size using Fleiss formula. Assuming that my population is 5,000, the smallest expected lift (smallest detectable change) is 5%, and the highest likely rate in the control group is 50%, what is the minimum control group size assuming a 95% confidence interval and power of 80%? I am more interested in the adjusted Fleiss formula to compute the control group size. Thanks, __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] bivariate vector numerical integration with infinite range
baptiste auguie baptiste.auguie at googlemail.com writes: Thanks, adaptIntegrate() seems perfectly suited, I'll just need to figure a transformation rule for the infinite limit. The suggestion of x-1/x does not seem to work here because it also transforms 0 into -infinity. I think exp(pi* sinh(x)) could be a better choice, according to Numerical Recipes. Yes, that's one way. But you can also split the integral in two parts, one from 0 to 1 and then from 1 to Inf. The first one is a finite hypercube and the second can be transformed with x -- 1/x into [0, 1]. I usually prefer the second approach for higher-dimensional applications as the Jacobian appears to be simpler. In the literature you will find discussions on how far out the finite hypercube should reach for lowering the absolute error. Hans Werner Thanks, baptiste __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Error in eval(expr, envir, enclos)
Hello Uttara, What happens if you try something like this? Regression - read.table(C:/MUP/Internship/Distance_Statistics/Excel_data/Robust_Regression/ForModelling_Rev_Rob.csv, sep= ,, header = TRUE) library (MASS) rfdmodel1 - rlm(TotalEmployment_2004 ~ MISSISSIPPI + LOUISIANA + TotalEmployment_2000 + PCWhitePop_2004 + UnemploymentRate_2004 + PCUrbanPop2000 + PCPeopleWithACollegeDegree_2000 + PCPopulation.of.or.over.65.years.of.age_2004, data = Regression) summary(rfdmodel1) Rather than using attach() on your data, you can just use the variable name you assigned it to (here Regression), and pass that to the 'data' argument of rlm(). So the data that rlm() is fitting the model to is Regression, and the output of rlm() (that is the actual model) is assigned to rfdmodel1. Hope that helps, Josh On Tue, Sep 21, 2010 at 8:02 AM, uttara_n nilawar.utt...@gmail.com wrote: I used the following command line: Regression - read.table (C:/MUP/Internship/Distance_Statistics/Excel_data/Robust_Regression/ForModelling_Rev_Rob.csv , sep= ,, header = TRUE) attach (Regression) summary (Regression) library (MASS) rlm(TotalEmployment_2004 ~ MISSISSIPPI + LOUISIANA + TotalEmployment_2000 + PCWhitePop_2004 + UnemploymentRate_2004 + PCUrbanPop2000 + PCPeopleWithACollegeDegree_2000 + PCPopulation.of.or.over.65.years.of.age_2004, data = rfdmodel1) But, it is still giving me the same error. Uttara -- View this message in context: http://r.789695.n4.nabble.com/Error-in-eval-expr-envir-enclos-tp2547917p2548804.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Sorting and subsetting
Probably true, thats cunning, but look at base::match. The first thing it does is coerce factor to character (an allocate and copy needed internally). data.table doesn't do that either, see data.table:::sortedmatch. I made first basic steps towards a proper reproducible test suite (timings.Rnw). Perhaps this example could be added there; PDF is on the homepage. One test is 340 times faster and the other is 13 times faster. More examples would be good. Matthew http://datatable.r-forge.r-project.org/ Joshua Wiley jwiley.ps...@gmail.com wrote in message news:aanlktimyuvl9suj65ktzqvpnyn+ep8ubu3mxxhhrd...@mail.gmail.com... On Tue, Sep 21, 2010 at 3:09 AM, Matthew Dowle mdo...@mdowle.plus.com wrote: All the solutions in this thread so far use the lapply(split(...)) paradigm either directly or indirectly. That paradigm doesn't scale. That's the likely source of quite a few 'out of memory' errors and performance issues in R. This is a good point. It is not nearly as straightforward as the syntax for data.table (which seems to order and select in one step...very nice!), but this should be less memory intensive: tmp - data.frame(index = gl(2,20), foo = rnorm(40)) tmp - tmp[order(tmp$index, tmp$foo) , ] # find location of first instance of each level and add 0:4 to it x - sapply(match(levels(tmp$index), tmp$index), `+`, 0:4) tmp[x, ] data.table doesn't do that internally, and it's syntax is pretty easy. tmp - data.table(index = gl(2,20), foo = rnorm(40)) tmp[, .SD[head(order(-foo),5)], by=index] index index.1 foo [1,] 1 1 1.9677303 [2,] 1 1 1.2731872 [3,] 1 1 1.1100931 [4,] 1 1 0.8194719 [5,] 1 1 0.6674880 [6,] 2 2 1.2236383 [7,] 2 2 0.9606766 [8,] 2 2 0.8654497 [9,] 2 2 0.5404112 [10,] 2 2 0.3373457 As you can see it currently repeats the group column which is a shame (on the to do list to fix). Matthew http://datatable.r-forge.r-project.org/ -- View this message in context: http://r.789695.n4.nabble.com/Sorting-and-subsetting-tp2547360p2548319.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Web forum - should I make one?
Hello, this might be little off-topic, but still... I have not found any official web forum for R users. Did I look good? If not, I'm sorry and shame on me! :-) Such forums are very probably the most common way to share information about all sorts of problems, to ask and get answer. See http://forums.opensuse.org/ as an example. Admins, developers, users, do You think to have such forum would be useful? If yes, I can create and manage it. If You have any idea about its possible functionality and content, let me know. If it would help R community, I will invest my time and energy to it. :-) Best regards, -- Vojtěch Zeisek Department of Botany, Faculty of Science, Charles Uni., Prague, CZ Institute of Botany, Academy of Science, Czech Republic Community of the openSUSE GNU/Linux https://www.natur.cuni.cz/faculty-en?set_language=en http://www.ibot.cas.cz/?p=indexamp;site=en http://www.opensuse.org/ http://web.natur.cuni.cz/~zeisek/ signature.asc Description: This is a digitally signed message part. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Trouble with Optimization in Alabama Package
I will take a look and get back to you. Ravi. -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Erik Kimbrough Sent: Tuesday, September 21, 2010 11:01 AM To: R-help@r-project.org Subject: [R] Trouble with Optimization in Alabama Package Hello, This is my first post to the help request list, so I'm going to err on the side of giving too much information. I'm working on writing a simulation in which agents will make repeated production and exchange decisions with randomly chosen partners. The idea is, all agents can produce two goods which they want to consume, they choose a value t in [0,10] which sets their production time allocated to each good. Then they are matched with another individual with whom they will trade according to various bargaining algorithms. I want to write a general purpose optimization that takes their initial allocations (what they produced) and computes, for example, the solution which maximizes the product of their utilities from the final allocation, subject to the constraint that their utilities must be at least as high as if they didn't trade at all. Since constrOptim.nl finds the maximizing set of parameters based on the initial par argument fed to it, I had to include an additional argument to the functions so I could bring the parameters of the utility functions and the initial production into the optimization. I've written the following code but I keep getting strange errors: library(alabama) # y is a vector of initial allocations and CobbDouglas preference parameters #y=c(r1,r2,b1,b2,alpha1,alpha2) y=c(5,50,60,5,.3,.7) # p1=c(r1bar,r1bar,b1bar,b1bar) a feasible allocation p1=c(15,40,50,15) fn=function(x,...){ return(-1*(((x[1]^y[5])*(x[3]^(1-y[5])))*(((x[2]^y[6])*(x[4]^(1-y[6])) } hin=function(x,...){ h=rep(NA,2) h[1]=((x[1]^y[5])*(x[3]^(1-y[5])))-((y[1]^y[5])*(y[3]^(1-y[5]))) h[2]=((x[2]^y[6])*(x[4]^(1-y[6])))-((y[2]^y[6])*(y[4]^(1-y[6]))) return(h) } heq=function(x,...){ h=rep(NA,2) h[1]=x[1]+x[2]-y[1]-y[2] h[2]=x[3]+x[4]-y[3]-y[4] return(h) } ans2=constrOptim.nl(par=p1,fn=fn,hin=hin,heq=heq,control.outer=list(itmax= 1000,mu0=.1),list(y,y,y)) Any advice or explanation of my errors would be greatly appreciated! -- Erik O. Kimbrough Department of Economics (AE1) School of Business and Economics Maastricht University [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Problem building package
Hello, I'm running into the following error: C:\Program Files\R\Rtools\bin\sh.exe: *** fatal error - couldn't allocate heap, Win32 error 487, base 0x7A, top 0x7b, reserve_size 61440, allocsize 65536, page_const 4096 I've re-installed Rtools and R (2.10.1) and followed all the prescriptions. I've checked that I only have a version of the gygwin dlls in my machine. Has anyone else found this problem? Any known memory conflicts with common software? Leo [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] partial dbRDA or CCA with two distance objects in Vegan.
On 21/09/10 17:40 PM, Nevil Amos nevil.a...@gmail.com wrote: I am trying to use the cca/rda/capscale functions in vegan to analyse genetic distance data ( provided as a dist object calculated using dist.genpop in package adegenet) with geographic distance partialled out ( provided as a distance object using dist function in veganthis method is attempting to follow the method used by Geffen et al 2004 as suggested by Legendre and . FORTIN (2010). I cannot see how to introduce the Conditioning ( partialled) second dist matrix. as you can see from the code snippet below, the two dist objects are of the same dimensions. - I get an error using capscale: Error in qr.fitted(Q, Xbar) : 'qr' and 'y' must have the same number of rows or cca Error in weighted.mean.default(newX[, i], ...) : 'x' and 'w' must have the same length when using a conditioning distance object instead of a variable (Clade) of the same length as the constraints ( Latitude and Longitude) I would be grateful, for any pointers on this, ie which test is the appropriate one to use ( I believe capscale since it is similar to distance-based redundancy analysis (Legendre Anderson 1999)) and whether this test is indeed equivalent to the approach suggested by Legendre Fortin, (Geffen et al used DISTLM). Nevil, You cannot use cca() for dissimilarity data. If you have dissimilarity data, you must use capscale() which runs db-RDA. Even there, your constraints (variables on the right hand side of the formula) must be rectangular data and not dissimilarities. AFAIK, people have changed their dissimilarities into a PCNM structure when they want to partial out the distance effect. That is one of the few original possibilities since data must be rectangular (rows and columns). Cheers, jari oksanen __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Colorramp in Maptools, how to choose min and max values for the fg= argument
Hello, I am using maptools for plooting geographical data. The colour of the region indicates some region dependent value (population for example). I pass the colours of the regions to the plot.Map function by defining the foreground colour: jet.colors = colorRampPalette(c(#7F, blue, #007FFF, cyan, #7FFF7F, yellow, #FF7F00, red, #7F)) n=100 cols - jet.colors(n) Intervalls = cut(as.numeric(Vector),n, na.rm=TRUE) fgs -cols[Intervalls] ## is there some option/possibility here to choose colours differently ?? plot.Map(Shape.map, fg=fgs, ol=NA, bty=n, xlab=, ylab=, xaxt=n, yaxt=n) My problem is the following: I would like to choose minimal and maximal value of the colour palette, different from the minimal and maximal values of the Vector. For example: Vector=c(2,3,4,5,6), but the maximal colour should correspond to 10 and the minimal to 0. I know, that maptools is not the most actual solution to plot maps, but I am building on an old code and therefore do not want to change it. I would very much appreciate any help or hints!! Katrin __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Can ucminf be installed in 64 bit R and one more question?
Hey, Duncan thanks for your reply. I am not sure which version i have installed but I downloaded it from http://cran.skazkaforyou.com/. when I check the R installed, it says 2.11.1. I do not know I answered your question or not. if not, where I can find them? (in fact, I did not notice/find there are many versions of 64 bit R.) Nan - Original Message From: Duncan Murdoch murdoch.dun...@gmail.com To: Hey Sky heyskywal...@yahoo.com Cc: R r-help@r-project.org Sent: Tue, September 21, 2010 6:51:57 AM Subject: Re: [R] Can ucminf be installed in 64 bit R and one more question? On 20/09/2010 8:36 PM, Hey Sky wrote: Hey, R Users my windows is 64 bit windows 7. I am trying to install the package ucminf into my 64 bit version R but cannot. the package I downloaded is from http://cran.r-project.org/web/packages/ucminf/index.html and I installed it with the install from local zip files, due to I did not connect my computer to internet. did anyone meet this problem and is there a version of ucminf for 64 bit R? Binary installs are specific to particular R versions. There are several versions of R with 64 bit Windows builds now; which one are you using? Duncan Murdoch the question is: why the ucminf (for 32 bit R) with option hessian=3 always give hessian matrix while most of other methods failed (includiing the option hessian=1 which using numDeriv)? thanks for any information Nan from Montreal __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Doing operations by grouping variable
Thanks, Bill and Michael, you have answered the question I asked, but not the one I wished to ask I want to obtain the maximum in each group of variables, so I could scale each variable by the maximum for its group. If I use tapply, as in the example below, there's a mismatch in dimensions of the output of tapply [5] and the data frame with the variables[25]. group = rep(1:5, each=5) # define grouping variable variable = rnorm(25)# generate data d - data.frame(group,variable) # bundle together in a data frame d$scaled - d$variable/(with(d,tapply(variable,group,max))) # crash and burn Dr. Seth W. Bigelow Biologist, USDA-FS Pacific Southwest Research Station 1731 Research Park Drive, Davis California bill.venab...@csiro.au 09/20/2010 06:24 PM To michael.bedw...@gmail.com, sbige...@fs.fed.us, r-help@r-project.org cc Subject RE: [R] Doing operations by grouping variable That's if the variables are visible. If they are only in the data frame it's not much more difficult d - data.frame(group = rep(1:5, each=5), variable = rnorm(25)) with(d, tapply(variable, group, max)) (Tip: avoid using attach().) Bill Venables. -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Michael Bedward Sent: Tuesday, 21 September 2010 11:15 AM To: Seth W Bigelow; Rhelp Subject: Re: [R] Doing operations by grouping variable Not sure why you think tapply is awkward. Your example would be... group - rep(1:5, each=5) variable - rnorm(25) tapply(variable, group, max) ...which looks quite elegant to me :) Meanwhile, the reason your expression doesn't work is that you are asking mistakenly for elements 1:5 repeatedly from the variable col. If you just type d$variable[ d$group ] and compare the values to your variable vector this should be clear. Michael On 21 September 2010 10:59, Seth W Bigelow sbige...@fs.fed.us wrote: I'm writing an expression that requires searching a vector according to group. As an example, I want to find the maximum value in each of 5 groups. group=rep(1:5, each=5) # create grouping variable variable=rnorm(25) # generate data d - data.frame(group,variable) # make data frame max(d$variable[d$group])# try expression that doesn't work I'm expecting a vector containing the maximum variable value, per group. What am I doing wrong? I know I can use aggregate, tapply, etc. but that seems awkward and bulky, is there a simpler way? Dr. Seth W. Bigelow Biologist, USDA-FS Pacific Southwest Research Station 1731 Research Park Drive, Davis California [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] when i create data.frame is time variable and Censor variable should be equal?
Error in data.frame(times = NonCensored.data, censor = Censored.data) : arguments imply differing number of rows: 14, 6 Anan Halabi Reliability Eng, RD HP Scitex Tel: 972-9-8924648 mobil: 972-52-6624231 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.