[R] [optim/bbmle] function returns NA at ... distance from x
Dear R helpers,

I am trying to estimate model parameters using mle2 (bbmle package). When I try to optimize the likelihood function, the following error message occurs:

Error in grad.default(objectivefunction, coef) :
  function returns NA at 1e-040.001013016911639890.0003166929388711890.000935163594829395 distance from x.
In addition: Warning message:
In optimx(par = c(0.5, 10, 0.7, 10), fn = function (p) :
  Gradient not computable after method Nelder-Mead

I can't figure out what that means exactly or how to fix it. I understand that mle2 uses optim (or in my case optimx) to optimize the likelihood function. Since I use the Nelder-Mead method, it should not be a problem if the function returns NA at some iteration (as long as the initial values don't return NA). Can anyone help me with that?

Here is a small example of my code that reproduces the problem:

library(plyr)
library(optimx)

### Sample data ###
x <- c(1,1,4,2,3,0,1,6,0,0)
tx <- c(30.14, 5.14, 24.43, 10.57, 25.71, 0.00, 14.14, 32.86, 0.00, 0.00)
T <- c(32.57, 29.14, 33.57, 34.71, 27.71, 38.14, 36.57, 37.71, 35.86, 30.57)
data <- data.frame(x=x, tx=tx, T=T)

### Likelihood function ###
Likelihood <- function(data, r, alpha, s, beta) {
  with(data, {
    if (r<=0 | alpha<=0 | s<=0 | beta<=0) return (NaN)
    f <- function(x, tx, T) {
      g <- function(y) (y + alpha)^(-(r + x))*(y + beta)^(-(s + 1))
      integrate(g, tx, T)$value
    }
    integral <- mdply(data, f)
    L <- exp(lgamma(r+x)-lgamma(r)+r*(log(alpha)-log(alpha+T))-x*log(alpha+T)+s*(log(beta)-log(beta+T))) +
         exp(lgamma(r+x)-lgamma(r)+r*log(alpha)+log(s)+s*log(beta)+log(integral$V1))
    f <- sum(log(L))
    return (f)
  })
}

### ML estimation function ###
Estimate_parameters_MLE <- function(data, initValues) {
  llhd <- function(r, alpha, s, beta) {
    return (Likelihood(data, r, alpha, s, beta))
  }
  library(bbmle)
  fit <- mle2(llhd, initValues, skip.hessian=TRUE, optimizer="optimx",
              method="Nelder-Mead", control=list(maxit=1e8))
  return (fit)
}

### Parameter estimation ###
Likelihood(data=data, r=0.5, alpha=10, s=0.7, beta=10)
### check initial parameters --> -72.75183 --> initial parameters do return value
MLE_estimation <- Estimate_parameters_MLE(data=data, list(r=0.5, alpha=10, s=0.7, beta=10))

'Error in grad.default(objectivefunction, coef) :
  function returns NA at 1e-040.001013016911639890.0003166929388711890.000935163594829395 distance from x.
In addition: Warning message:
In optimx(par = c(0.5, 10, 0.7, 10), fn = function (p) :
  Gradient not computable after method Nelder-Mead'

Best regards,
Carlos

-
Carlos Nasher
Buchenstr. 12
22299 Hamburg
tel: +49 (0)40 67952962
mobil: +49 (0)175 9386725
mail: carlos.nas...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
[R] How-to add to LDA ggplot axes the Percentage of variance explained
Hi,

How can I add the percentage of variance explained to the axes of an LDA ggplot?

Script:

require(MASS)
require(ggplot2)
iris.lda <- lda(Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width, data = iris)
datPred <- data.frame(Species=predict(iris.lda)$class, predict(iris.lda)$x)
ggplot(datPred, aes(x=LD1, y=LD2, col=Species)) +
  geom_point(size = 4, aes(color = Species))

Thanks
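One possible way to get the percentages asked about above, offered as a sketch rather than a definitive answer: an lda fit from MASS stores the singular values in its svd component, and their squares, normalised to sum to one, give the "Proportion of trace" that print(iris.lda) reports. The name pct_labels is made up for this example, and the plotting step assumes ggplot2 is installed.

```r
library(MASS)

iris.lda <- lda(Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width,
                data = iris)

# Share of between-group variance carried by each discriminant
# (what print(iris.lda) calls "Proportion of trace").
prop <- iris.lda$svd^2 / sum(iris.lda$svd^2)

# Axis labels of the form "LD1 (xx.x%)"; pct_labels is a name invented here.
pct_labels <- sprintf("LD%d (%.1f%%)", seq_along(prop), 100 * prop)

datPred <- data.frame(Species = predict(iris.lda)$class, predict(iris.lda)$x)

# Attach the labels to the plot only if ggplot2 is available.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(datPred, aes(x = LD1, y = LD2, colour = Species)) +
    geom_point(size = 4) +
    labs(x = pct_labels[1], y = pct_labels[2])
}
```

Printing p in an interactive session draws the plot with the percentages in the axis titles.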
[R] Rmpi installs before R-3.0.0 but not since
With R-3.0.1:

Loading required package: Rmpi
Failed with error: ‘package ‘Rmpi’ was built before R 3.0.0: please re-install it’

And when I try to reinstall Rmpi, I get this after a whole bunch of 'yes's:

checking mpi.h usability... no
checking mpi.h presence... no
checking for mpi.h... no
configure: error: Cannot find mpi.h header file
ERROR: configuration failed for package ‘Rmpi’

And attempting to go back, with R-2.15.3, I get these warnings:

Warning messages:
1: package ‘gbm’ was built under R version 3.0.1
2: package ‘survival’ was built under R version 3.0.1
[1] "Error in socketConnection(\"localhost\", port = port, server = TRUE, blocking = TRUE, : \n cannot open the connection\n"

So there's no going back.

Where do I look for reasons why Rmpi can't find the mpi.h header even though it was findable before 3.0.1? I know not to take that message too literally, since there is a file mpi.h that bash can find. Something else is being hinted at, but what?

sessionInfo()
R version 3.0.1 (2013-05-16)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_NZ.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_NZ.UTF-8        LC_COLLATE=en_NZ.UTF-8
 [5] LC_MONETARY=en_NZ.UTF-8    LC_MESSAGES=en_NZ.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_NZ.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  splines   grDevices utils     stats     graphics  methods
[8] base

other attached packages:
[1] gbm_2.1          survival_2.37-4  cairoDevice_2.19 lattice_0.20-15

loaded via a namespace (and not attached):
[1] grid_3.0.1      multicore_0.1-7

TIA

--
Patrick Connolly
Re: [R] Rmpi installs before R-3.0.0 but not since
Hello,

Maybe this link might help: http://www.stats.uwo.ca/faculty/yu/Rmpi/install.htm

Regards,
Pascal
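As a hedged illustration of the kind of fix installation notes for Rmpi usually describe: when configure cannot find mpi.h, the header and library locations can be passed explicitly through configure arguments. The two paths below are assumptions for a typical Open MPI layout, not values taken from the thread, and install_rmpi is a helper name invented for this sketch; it is defined but deliberately not called.

```r
# Hypothetical locations; replace with wherever mpi.h and the MPI libraries
# actually live on your system (e.g. as reported by `locate mpi.h`).
mpi_include <- "/usr/include/openmpi"
mpi_libpath <- "/usr/lib/openmpi/lib"

# Flags understood by Rmpi's configure script.
rmpi_args <- c(
  paste0("--with-Rmpi-include=", mpi_include),
  paste0("--with-Rmpi-libpath=", mpi_libpath),
  "--with-Rmpi-type=OPENMPI"
)

# Helper invented for this sketch: reinstall Rmpi from source with those flags.
install_rmpi <- function(args = rmpi_args) {
  install.packages("Rmpi", type = "source", configure.args = args)
}

# Usage (not run here): install_rmpi()
```

Reinstalling from source under the new R version also addresses the "built before R 3.0.0" message, since the package is rebuilt against the running R.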
[R] Lme4 and syntax of random factors
Dear R-users,

I've been looking at the lmer function (lme4 package) in order to set up a linear mixed model, and something about the syntax of the random effects eludes me. I'd like a hand with understanding a specific point, if someone has mastered this function...

Let's say that I have 2 random effects, A (e.g. species, k=2) and B (e.g. individuals, n=100). I did some research about model syntax, and my understanding is that everything on the left side of the random "parameter" is about SLOPE and everything on the right side about intercept:

... + (1 |B) would give me an intercept per individual.
... + (1 |A) would give me an intercept per species.
... + (1 |A:B) would give me an intercept per individual with a nested effect (individual inside species).

I would like to have random slopes per species. So I thought I could do something like this:

... + (A |B)

so as to have an intercept per individual and a slope value per species. Graphically, I would therefore obtain 100 lines with 100 different intercepts and 2 possible slopes (1 per species).

However, when I extract the random parameter values (ranef()), I have:

- First column is the intercept: varying values per line (individuals), so OK.
- 2nd and 3rd columns are Species 1 and 2: I have different values across individuals (without an obvious pattern: I do not have similar values for individuals of the same species), which is not what I was expecting (1 value per species: the slope parameter).

Is the mistake I'm making (or the flaw in my understanding of lme4) obvious to somebody?

With regards
Re: [R] Hungarian R User's Group (Gergely Daróczi)
Hi Gergely!

This sounds so exciting - so R is turning 20 years old? How did you set up your R users' group? What are the best practices in going about to set one up? I would be keen on establishing one here in Windhoek, Namibia.

Pancho Mulongeni
Research Assistant
PharmAccess Foundation
1 Fouché Street
Windhoek West
Windhoek
Namibia
Tel: +264 61 419 000
Fax: +264 61 419 001/2
Mob: +264 81 4456 286
[R] F-test question
G'day,

I am trying to compute some F-statistics for a singular spectrum analysis of a time series sv. I run:

require(Rssa)
s <- ssa(sv)
summary(sv)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
 -4.238   2.761   6.594   6.324  10.410  15.180
r1 <- reconstruct(s, groups = list(1:5))
r2 <- reconstruct(s, groups = list(1:6))
SSE_M1 <- sum(residuals(r1)^2)
SSE_M2 <- sum(residuals(r2)^2)
df.num <- r1$df - r2$df
df.den <- r2$df
F <- ((SSE_M2 - SSE_M1) / df.num) / (SSE_M1 / df.den)

and eventually

p.value <- 1 - pf(F, df.num, df.den)
Error in pf(F, df.num, df.den) :
  Non-numeric argument to mathematical function

summary(df.num)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
summary(df.den)
Length  Class   Mode
     0   NULL   NULL
summary(F)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.

I need to compute the p-value, but something is going wrong, and I can't see what. Any help would be very much appreciated.

ingo
Re: [R] Hungarian R User's Group (Gergely Daróczi)
Hi Pancho,

there are already a bunch of R User Groups around the world: http://rwiki.sciviews.org/doku.php?id=rugs:r_user_groups

The Revolution Analytics guys have posted some tips on how to found one (http://www.revolutionanalytics.com/news-events/r-user-group/how-to-start-r-user-group.php) and they also offer some sponsorship.

What I did here was fairly simple: I wrote a few mails to some of my contacts who speak R, and also posted messages on mailing lists and forums suggesting that we should come together. I have already got some great feedback, so hopefully there will be an active Hungarian RUG soon, where we can have talks, workshops and tutorials on R-related topics, or simply get to know some other useRs and their fields of interest.

So it does sound exciting indeed, and we will see the results soon. And yes: AFAIK R was announced exactly 20 years ago, so that's a great time to found RUG(s) :)

Best,
Gergely
Re: [R] F-test question
Hello,

r1$df and r2$df don't exist.

Regards,
Pascal
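To spell out the consequence of the reply above: extracting a missing list element with $ gives NULL, so df.num and df.den have length zero and pf() fails. A minimal sketch of a nested-model F-test that refuses such inputs might look as follows. The helper nested_f_test and the numbers passed to it are made up for illustration; it uses the usual orientation (the larger model's SSE in the denominator), and which degrees of freedom are appropriate for an SSA reconstruction is left to the user, since the thread does not settle that.

```r
# Hypothetical helper (not part of Rssa): F-test for two nested fits, given
# residual sums of squares and degrees of freedom supplied by the caller.
nested_f_test <- function(sse_small, sse_big, df_num, df_den) {
  # Fail early with a clear message if any input is NULL, NA or not a number,
  # e.g. because it came from a list element that does not exist.
  args <- list(sse_small = sse_small, sse_big = sse_big,
               df_num = df_num, df_den = df_den)
  for (nm in names(args)) {
    v <- args[[nm]]
    if (!is.numeric(v) || length(v) != 1L || is.na(v)) {
      stop(nm, " must be a single non-missing number")
    }
  }
  # sse_small: SSE of the smaller model; sse_big: SSE of the larger model.
  f_stat  <- ((sse_small - sse_big) / df_num) / (sse_big / df_den)
  p_value <- pf(f_stat, df_num, df_den, lower.tail = FALSE)
  c(F = f_stat, p.value = p_value)
}

# Made-up numbers, purely to show the call.
res <- nested_f_test(sse_small = 10, sse_big = 5, df_num = 1, df_den = 10)
res
```

Using lower.tail = FALSE avoids the loss of precision that 1 - pf(...) can suffer for very small p-values.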
Re: [R] [optim/bbmle] function returns NA at
1) Why use Nelder-Mead through optimx when it is an optim() method? You are going from New York to Philadelphia via Beijing because of the extra overhead. The NM method is there (in optimx) for convenience in comparisons.

2) NM cannot work with NA when it wants to compute the centroid of points and the search direction. So you've got to find a way to make sure your likelihood is properly defined. This seems to be the issue for about 90% of failures with optim(x) or other ML methods in my recent experience.

Note that returning a large value (make it a good deal smaller than .Machine$double.xmax, say that number * 1e-6, to avoid computational trouble) often works, but it is a quick and dirty fix.

JN
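The "large value" workaround described in the reply can be sketched with base optim() alone. This is an illustration under stated assumptions, not the poster's model: the objective is the negative log-likelihood of a simulated normal sample, and the names big, y and nll are invented for the example.

```r
# Large but finite penalty, well below .Machine$double.xmax as suggested above.
big <- .Machine$double.xmax * 1e-6

# Hypothetical data: a simulated normal sample (not the data from the thread).
set.seed(1)
y <- rnorm(50, mean = 3, sd = 2)

# Negative log-likelihood that never returns NA/NaN/Inf to the optimizer.
nll <- function(p) {
  mu <- p[1]
  sigma <- p[2]
  if (sigma <= 0) return(big)            # outside the parameter space
  val <- -sum(dnorm(y, mean = mu, sd = sigma, log = TRUE))
  if (!is.finite(val)) return(big)       # guard against numerical failures
  val
}

# Nelder-Mead called directly through optim(), without the optimx detour.
fit <- optim(c(0, 1), nll, method = "Nelder-Mead")
fit$par
fit$convergence
```

As the reply says, this is a quick fix; reparameterising (for example optimizing over log(sigma)) removes the invalid region altogether and is usually the cleaner route.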
[R] Understanding S4 method dispatch
Hi all,

Any insight into the code below would be appreciated - I don't understand why two methods which I think should have equal distance from the call don't.

Thanks!

Hadley

# Create simple class hierarchy
setClass("A", "NULL")
setClass("B", "A")
a <- new("A")
b <- new("B")

setGeneric("f", function(x, y) standardGeneric("f"))
setMethod("f", signature("A", "A"), function(x, y) "A-A")
setMethod("f", signature("B", "B"), function(x, y) "B-B")

# These work as I expect
f(a, a)
f(b, b)

setClass("AB", contains = c("A", "B"))
ab <- new("AB")

# Why does this return "B-B"? Shouldn't both methods be an equal distance?
f(ab, ab)

# These both return distance 1, as I expected
extends("AB", "A", fullInfo=TRUE)@distance
extends("AB", "B", fullInfo=TRUE)@distance

# So why is signature("B", "B") closer than signature("A", "A")

--
Chief Scientist, RStudio
http://had.co.nz/
Re: [R] coxph diagnostics
That's the primary reason for the plot: so that you can look and think.

The test statistic is based on whether a least-squares line fit to the plot has zero slope. For larger data sets you can sometimes have a significant p-value but good agreement with proportional hazards. It's much like an example from Lincoln Moses' beginning statistics book (now out of print, so rephrasing from memory). Suppose that you flip a coin 10,000 times and get 5101 heads. What can you say?

a. The coin is not perfectly fair (p < .05).
b. But it is darn close to perfect!

As a referee I would be comfortable using that coin to start a football game.

The Cox model gives an average hazard ratio, averaged over time. When proportional hazards holds, that value is a complete summary; nothing else is needed. When it does not hold, the average may still be useful, or not, depending on the degree of change over time.

Terry Therneau

On 08/13/2013 05:00 AM, r-help-requ...@r-project.org wrote:
> Thanks to Bert and Göran for your responses. To answer Göran's comment, yes I did plot the Schoenfeld residuals using plot.cox.zph and the lines look horizontal (slope = 0) to me, which makes me think that it contradicts the results of cox.zph. What alternatives do I have if I assume proportional assumption of coxph does not hold? Thanks!
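The coin example above can be checked directly in base R. This is only a sketch of the arithmetic behind points a and b, using an exact binomial test; bt is a name chosen for the example.

```r
# 5101 heads in 10,000 flips, tested against a perfectly fair coin.
bt <- binom.test(5101, 10000, p = 0.5)

bt$p.value    # statistical significance: is the coin exactly fair?
bt$estimate   # effect size: the estimated probability of heads
bt$conf.int   # how far from 0.5 the true probability could plausibly be
```

The p-value speaks to point a and the estimate with its interval to point b; with a very large sample, such as the one discussed in this thread, the same gap between "significant" and "large enough to matter" can appear in cox.zph output.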
[R] Faster R algorithms than AlgDesign?
Hello!

I have a very large experimental design space (all possible combinations of all possible levels of several factors). For example, 'allcand' below has 1,875,000 possible combinations of 9 factors.

allcand <- expand.grid(a1=as.factor(1:5), a2=as.factor(1:5), a3=as.factor(1:5), a4=as.factor(1:5),
                       a5=as.factor(1:5), a6=as.factor(1:3), a7=as.factor(1:5), a8=as.factor(1:5), a9=as.factor(1:8))
dim(allcand)

My ultimate goal is to grab a subset of 10,000 out of those 1.875 million candidates, such that the resulting 10,000 are as orthogonal as possible.

Usually, I use package AlgDesign for such tasks. However, my design space is so large that it takes too long even to grab 100 out of 1,875,000, like this:

library(AlgDesign)
system.time(mydes <- optFederov(~., data=allcand, nTrials=100))
# It took 38 min on my machine.

Is there a package that could do something like this faster?

Thank you very much!

--
Dimitri Liakhovitski
Re: [R] Understanding S4 method dispatch
Hadley,

The class AB inherits from A and from B, but B already inherits from class A. So effectively you only have an object of class B inside your object of class AB. When you call the function f, R looks for a method f for AB objects. It does not find such a method, so it looks for a method for the class inherited from, B. Such a method is present and is then executed. The inheritance structure has to be changed. This behavior is actually desirable: without it, diamond-shaped class inheritance would be fatal.

Best,
Simon
Re: [R] Lme4 and syntax of random factors
Robert U tacsunday at yahoo.fr writes:

> Dear R-users,

[snip]

This question probably belongs on r-sig-mixed-mod...@r-project.org. Followups there, please.

> Let's say that I have 2 random effects, A (e.g. species, k=2) and B (e.g. individuals, n=100). I made some research about model syntax, and I have the understanding that everything at the left side of the random parameter is about SLOPE and everything at the right side about intercept:

You really can't practically fit a random effect to 2 species (see http://glmm.wikidot.com/faq#fixed_vs_random).

> + (1 |B) would give me an intercept per individual.
> + (1 |A) would give me an intercept per species.

yes

> + (1 |A:B) would give me an intercept per individuals with nested effect (individual inside species)

This would be the same as (1|B) if the individuals are uniquely identified. Otherwise you probably want (1|A/B) [except that you can't really fit a random effect for k=2, as discussed above].

> I would like to have random slopes per species. So I thought I could do something like that:

Probably not feasible.

> + (A |B) so to have an intercept per individual and a slope value per species. Graphically, I would therefore obtain 100 lines with 100 different intercepts and 2 possible slopes (1 per species). However, when I extract random parameter values (ranef()), I have:

What variable is your slope with respect to? Suppose it's time. Then I would recommend

~ A*time + (1|A:B)

which will fit a (FIXED effect) interaction between species and time (different slopes and intercepts for each species), and a random intercept per individual.
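The recommended formula can be tried on simulated data. The sketch below assumes the lme4 package is installed; the data frame d, the response y, the variable time and all simulation settings are invented for illustration and are not from the thread.

```r
library(lme4)

# Simulated data: 100 individuals (B), 2 species (A), 5 time points each.
set.seed(42)
n_ind <- 100
n_obs <- 5
d <- data.frame(
  B    = factor(rep(seq_len(n_ind), each = n_obs)),
  time = rep(0:(n_obs - 1), times = n_ind)
)
d$A <- factor(ifelse(as.integer(d$B) <= n_ind / 2, "sp1", "sp2"))

b0    <- rnorm(n_ind, sd = 1)                 # per-individual intercept shift
slope <- ifelse(d$A == "sp1", 0.5, 1.5)       # one slope per species
d$y   <- 2 + b0[as.integer(d$B)] + slope * d$time + rnorm(nrow(d), sd = 0.3)

# Fixed species-by-time interaction, random intercept per individual.
fit <- lmer(y ~ A * time + (1 | A:B), data = d)

fixef(fit)                  # intercept/slope for sp1 plus the sp2 differences
nrow(ranef(fit)[["A:B"]])   # one random intercept per individual
```

Here the two species slopes appear among the fixed effects (time, and time plus the A:time term), which matches the "100 intercepts, 2 slopes" picture from the original question.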
Re: [R] Making Sure your matrices are even
Try this:

S2 <- data.frame(Group=rep("S", length(S)), Cat=factor(S))
B2 <- data.frame(Group=rep("B", length(B)), Cat=factor(B))
V2 <- data.frame(Group=rep("V", length(V)), Cat=factor(V))
table(rbind(S2, B2, V2))

-
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77840-4352

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Docbanks84
Sent: Monday, August 12, 2013 5:31 PM
To: r-help@r-project.org
Subject: [R] Making Sure your matrices are even

Hi,

I am trying to do a chi-square test on a set of values, and my different groups are not the same size. Is there a way to add arbitrary symbols or numbers to make the matrices even? Or do I need to do a different type of p-value analysis?

S <- 1:86
B <- 1:15
V <- 1:45
table(S)
S
 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21
 1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42
 1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63
 1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84
 1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
85 86
 1  1

chisq.test(table(S,B,V))
Error in table(S, B, V) : all arguments must have the same length
Re: [R] Understanding S4 method dispatch
> The class AB inherits from A and from B, but B already inherits from class A. So actually you only have an object of class B in your object of class AB. When you call the function f R looks for a method f for AB objects. It does not find such a method and looks for a method of the object inherited from, B. Such a method is present and is then executed. The inheritance structure has to be changed. The behavior is actually desired, as if this behavior weren't given a diamond class inheritance would be fatal.

Are you sure? That behaviour doesn't agree with the description of method dispatch given in ?Methods, nor with getClass("AB"), which shows that AB inherits from both A and B.

(I totally agree that this is a bad idea, and unlikely to be useful in real life, but I'm trying to understand the details of S4 dispatch.)

getClass("AB")
Class "AB" [in ".GlobalEnv"]

Slots:

Name:  .xData
Class:   NULL

Extends:
Class "B", directly
Class "A", directly
Class ".NULL", by class "A", distance 2
Class "NULL", by class "A", distance 3, with explicit coerce
Class "OptionalFunction", by class "A", distance 4, with explicit coerce
Class "optionalMethod", by class "A", distance 4, with explicit coerce

--
Chief Scientist, RStudio
http://had.co.nz/
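For readers who want to poke at the dispatch themselves, the methods package offers a few inspection tools. The sketch below rebuilds the hierarchy from the thread (class definitions reconstructed with their quotes restored, which is an assumption about the original code) and then asks R which method it selects; it does not claim to explain why that method wins.

```r
library(methods)

# Hierarchy as posted in the thread (quotes restored).
setClass("A", "NULL")
setClass("B", "A")
setGeneric("f", function(x, y) standardGeneric("f"))
setMethod("f", signature("A", "A"), function(x, y) "A-A")
setMethod("f", signature("B", "B"), function(x, y) "B-B")
setClass("AB", contains = c("A", "B"))
ab <- new("AB")

# Superclasses of AB in the order R has linearised them.
is(ab)

# The method R picks for (AB, AB), and the signature it was defined for.
m <- selectMethod("f", signature("AB", "AB"))
m@defined

# Distances to each direct superclass, as in the original post.
extends("AB", "A", fullInfo = TRUE)@distance
extends("AB", "B", fullInfo = TRUE)@distance

# Methods currently defined (and cached inherited ones) for f.
showMethods("f")
```

Comparing the order printed by is(ab) with m@defined is one way to see which superclass ends up taking precedence when two candidate methods sit at the same distance.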
Re: [R] coxph diagnostics
Thank you for your response, Terry.

To put the discussion into perspective: my data set is quite large, with over 160,000 samples and 38 variables. The event is true for all samples in this data set. The distribution is zero-inflated (i.e. most events occur at time = 0). The result of cox.zph looks like this:

cox.zph(coxph1)
                  rho    chisq        p
agency1     -1.05e-02 9.06e+00 2.62e-03
agency2     -5.48e-03 2.47e+00 1.16e-01
agency3     -6.47e-03 3.45e+00 6.34e-02
agency4     -6.86e-03 3.87e+00 4.90e-02
agency5     -5.56e-03 2.54e+00 1.11e-01
agency6     -6.79e-03 3.79e+00 5.16e-02
agency7     -4.78e-03 1.88e+00 1.71e-01
agency8     -1.34e-02 1.48e+01 1.22e-04
agency9     -2.78e-03 6.34e-01 4.26e-01
agency10    -6.15e-03 3.11e+00 7.78e-02
agency11     4.82e-04 1.91e-02 8.90e-01
agency12    -4.38e-03 1.58e+00 2.09e-01
agency13    -1.02e-03 8.54e-02 7.70e-01
agency14    -5.44e-03 2.43e+00 1.19e-01
agency15     1.01e-02 8.41e+00 3.73e-03
agency16    -1.81e-03 2.70e-01 6.04e-01
agency17    -3.14e-03 8.12e-01 3.67e-01
agency18    -6.59e-03 3.57e+00 5.88e-02
agency19     1.60e-03 2.12e-01 6.46e-01
agency20    -1.24e-02 1.27e+01 3.74e-04
agency21    -9.02e-03 6.69e+00 9.68e-03
agency22    -5.84e-03 2.81e+00 9.38e-02
agency23     3.99e-03 1.31e+00 2.52e-01
agency24    -9.18e-03 6.93e+00 8.50e-03
agency25    -4.75e-03 1.86e+00 1.73e-01
category1   -1.31e-02 1.43e+01 1.60e-04
category2    1.34e-04 1.47e-03 9.69e-01
category3    7.61e-03 4.75e+00 2.92e-02
category4   -6.65e-03 3.69e+00 5.48e-02
category5   -7.78e-03 4.97e+00 2.58e-02
category6   -8.64e-03 6.12e+00 1.34e-02
fav_count    1.32e-02 1.46e+01 1.32e-04
fow_count   -1.83e-02 2.50e+01 5.70e-07
fri_count    9.20e-03 6.89e+00 8.67e-03
stat_count   1.01e-02 9.08e+00 2.58e-03
ht           1.37e-02 1.53e+01 9.08e-05
ul           1.36e-02 1.52e+01 9.67e-05
um          -1.12e-02 1.04e+01 1.24e-03
pos         -5.92e-04 2.90e-02 8.65e-01
neg          6.44e-03 3.39e+00 6.56e-02
acti         2.24e-03 4.12e-01 5.21e-01
anat         3.48e-03 9.96e-01 3.18e-01
chemi       -7.82e-03 5.04e+00 2.47e-02
conc         7.04e-05 4.08e-04 9.84e-01
devi        -1.34e-03 1.48e-01 7.01e-01
diso        -3.60e-03 1.06e+00 3.04e-01
gene         1.31e-03 1.41e-01 7.07e-01
geog         4.64e-03 1.78e+00 1.82e-01
livb        -1.19e-02 1.17e+01 6.24e-04
objc         3.87e-03 1.23e+00 2.67e-01
occu         6.06e-04 3.04e-02 8.62e-01
orga        -8.24e-04 5.63e-02 8.12e-01
phen         3.87e-03 1.23e+00 2.68e-01
phys        -1.94e-03 3.12e-01 5.77e-01
proc         2.23e-03 4.11e-01 5.22e-01
GLOBAL             NA 4.20e+02 0.00e+00

The slope in plot.cox.zph is perfectly 0 for all variables, with narrow confidence bands. I probably should have put these details in the first post, but it would have been too long. Sorry about that.

Based on the plot of Schoenfeld residuals and Terry's explanation, is it safe to say that the proportional hazards assumption holds despite the significant global p-values?

Thanks!

On Tue, Aug 13, 2013 at 9:16 AM, Terry Therneau thern...@mayo.edu wrote:

That's the primary reason for the plot: so that you can look and think. The test statistic is based on whether a LS line fit to the plot has zero slope. For larger data sets you can sometimes have a significant p-value but good agreement with proportional hazards. It's much like an example from Lincoln Moses' beginning statistics book (now out of print, so rephrasing from memory). Suppose that you flip a coin 10,000 times and get 5101 heads. What can you say?
a. The coin is not perfectly fair (p < .05).
b. But it is darn close to perfect! As a referee I would be comfortable using that coin to start a football game.

The Cox model gives an average hazard ratio, averaged over time. When proportional hazards holds, that value is a complete summary: nothing else is needed. When it does not hold, the average may still be useful, or not, depending on the degree of change over time.

Terry Therneau

On 08/13/2013 05:00 AM, r-help-requ...@r-project.org wrote:

Thanks to Bert and Göran for your responses. To answer Göran's comment: yes, I did plot the Schoenfeld residuals using plot.cox.zph, and the lines look horizontal (slope = 0) to me, which makes me think that it contradicts the results of cox.zph. What alternatives do I have if I assume that the proportional hazards assumption of coxph does not hold? Thanks!

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read
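Terry's coin example can be checked in a few lines of base R. This is only a sketch of that one illustration (an exact binomial test of 5101 heads in 10,000 flips); the variable names are made up here, and it says nothing about the coxph fit itself.

```r
# Coin example from the reply above: 10,000 flips, 5101 heads.
flips <- 10000
heads <- 5101

# Exact two-sided test of H0: P(heads) = 0.5
test <- binom.test(heads, flips, p = 0.5)

p_value  <- test$p.value            # statistically "significant" ...
estimate <- unname(test$estimate)   # ... yet the estimate is very close to 0.5

cat("p-value:", format(p_value, digits = 3), "\n")
cat("estimated P(heads):", estimate, "\n")
cat("95% CI:", format(test$conf.int, digits = 4), "\n")
```

With a sample this large, a tiny departure from 0.5 is enough for a small p-value, which is the same pattern as a significant cox.zph test next to flat residual plots.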
Re: [R] pulling out pairs from data frame
Oops! OK, so I have this file:

SampleName Individual Age Gender
1  4 80 M
2 15 56 F
3  1 75 F
4 15 56 F
5  2 58 F
6  4 80 M

And I want to pull out paired samples, so the resulting file would look something like this:

SampleName Individual Age Gender
1  4 80 M
2 15 56 F
4 15 56 F
6  4 80 M

.kripa

Date: Mon, 12 Aug 2013 18:36:08 -0700
From: smartpink...@yahoo.com
Subject: Re: [R] pulling out pairs from data frame
To: kripa...@hotmail.com
CC: r-help@r-project.org

Hi,

The question is not clear, so I am not sure this is what you wanted.

dat1 <- read.table(text = "
SameName Individual Age Gender
1 4 80 M
2 15 56 F
3 1 75 F
4 15 56 F
5 2 58 F
6 4 80 M
", sep = "", header = TRUE, stringsAsFactors = FALSE)
reps <- c(4, 15)
dat1$Newcol <- as.numeric(dat1$Individual %in% reps)
dat1
#  SameName Individual Age Gender Newcol
#1        1          4  80      M      1
#2        2         15  56      F      1
#3        3          1  75      F      0
#4        4         15  56      F      1
#5        5          2  58      F      0
#6        6          4  80      M      1

A.K.

----- Original Message -----
From: Kripa R kripa...@hotmail.com
To: r-help@r-project.org r-help@r-project.org
Cc:
Sent: Monday, August 12, 2013 6:59 PM
Subject: [R] pulling out pairs from data frame

Hello everyone,

I'm having trouble pulling out paired samples from a data set... I have the following:

reps <- c(4, 15)  # the variable reps is a list of all paired samples
data
SameName Individual Age Gender
1  4 80 M
2 15 56 F
3  1 75 F
4 15 56 F
5  2 58 F
6  4 80 M

I'd like to make a new variable with only the samples that have pairs. Any suggestions would be greatly appreciated.

Thanks!

.kripa

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
[R] Create rows for columns in dataframe
Hi experts,

I have a data frame with 100k+ records. It has a key/id column and 25 code columns. I would like to restructure it so that there is a row for each code column. I have a structure like this (used dput):

structure(list(
DSYSRTKY = structure(c(1L, 2L, 3L, 3L, 4L, 4L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = c("10005", "10203", "10315", "10327"), class = "factor"),
C1 = structure(c(6L, 3L, 2L, 5L, 1L, 4L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = c("41401", "42831", "45341", "486", "5990", "71535"), class = "factor"),
C2 = structure(c(5L, 1L, 3L, 6L, 4L, 2L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = c("4019", "51881", "5990", "6826", "78900", "V4986"), class = "factor"),
C3 = structure(c(6L, 3L, 5L, 2L, 4L, 1L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = c("5119", "5939", "72400", "7850", "8052", "V1251"), class = "factor"),
C4 = structure(c(6L, 5L, 3L, 1L, 2L, 4L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = c("3109", "4019", "4241", "42789", "V1011", "V454"), class = "factor"),
C5 = structure(c(1L, 1L, 3L, 1L, 2L, 4L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = c("", "2720", "4019", "7823"), class = "factor"),
C6 = structure(c(1L, 1L, 2L, 1L, 4L, 3L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = c("", "311", "41400", "49390"), class = "factor"),
C7 = structure(c(1L, 1L, 2L, 1L, 3L, 4L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = c("", "2724", "2859", "V4581"), class = "factor"),
C8 = structure(c(1L, 1L, 3L, 1L, 4L, 2L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = c("", "40390", "71680", "79029"), class = "factor"),
C9 = structure(c(1L, 1L, 2L, 1L, 4L, 3L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = c("", "4168", "5859", "V1582"), class = "factor"),
C10 = structure(c(1L, 1L, 3L, 1L, 1L, 2L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = c("", "49390", "7804"), class = "factor"),
C11 = structure(c(1L, 1L, 3L, 1L, 1L, 2L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = c("", "2724", "V066"), class = "factor"),
C12 = structure(c(1L, 1L, 2L, 1L, 1L, 1L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = c("", "6930"), class = "factor"),
C13 = structure(c(1L, 1L, 2L, 1L, 1L, 1L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = c("", "41400"), class = "factor"),
C14 = structure(c(1L, 1L, 2L, 1L, 1L, 1L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = c("", "V4581"), class = "factor"),
C15 = structure(c(1L, 1L, 2L, 1L, 1L, 1L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = c("", "40291"), class = "factor"),
C16 = structure(c(1L, 1L, 2L, 1L, 1L, 1L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = c("", "4280"), class = "factor"),
C17 = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = "", class = "factor"),
C18 = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = "", class = "factor"),
C19 = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = "", class = "factor"),
C20 = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = "", class = "factor"),
C21 = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = "", class = "factor"),
C22 = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = "", class = "factor"),
C23 = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = "", class = "factor"),
C24 = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = "", class = "factor"),
C25 = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = "", class = "factor")),
.Names = c("DSYSRTKY", "C1", "C2", "C3", "C4", "C5", "C6", "C7", "C8", "C9", "C10", "C11", "C12", "C13", "C14", "C15", "C16", "C17", "C18", "C19", "C20", "C21", "C22", "C23", "C24", "C25"),
row.names = c("1", "2", "3", "4", "5", "6"), class = "data.frame")

Now I want to restructure this data frame so that it no longer has 25 code fields but one row per code, and only if the code has a value! The new structure should look something like:

NewDataFrame <- data.frame(ID=integer(), DSYSRTKY=integer(), CODE=character(), PRIMAIRY=logical())

The ID column should just be an increment. PRIMAIRY is a boolean which should be TRUE if the code was originally the first code (C1).

It has to be efficient, since my real data has many more rows than my example structure of only 6 rows. I tried a looping mechanism and it worked, but it performed very poorly.

Hopefully I provided enough information using dput.

Regards,
Derk

-- View this message in context: http://r.789695.n4.nabble.com/Create-rows-for-columns-in-dataframe-tp4673607.html
Sent from the R help mailing list archive at Nabble.com.
[R] Lattice: bwplot - changing box colors in legend and plot when using panel.groups = function... and panel = panel.superpose
Hi,

Yes, I have searched Stack Overflow. My issue is simply to change the coloring of the boxes and the legend in my bwplot. I have done this many times in lattice, but now I have been tweaking the plot somewhat and I can no longer apply the color changes. I would really appreciate some help.

A. Zakrisson

Here is some dummy data and my script:

library(lattice)

mydata <- data.frame(factor1 = factor(rep(LETTERS[1:6], each = 80)),
                     factor2 = factor(rep(c(1:2), each = 16)),
                     var1 = rnorm(120, mean = rep(c(0, 3, 5), each = 40),
                                  sd = rep(c(1, 2, 3), each = 20)))

font.settings <- list(font = 1, cex = 1, fontfamily = "serif")

my.theme <- list(
  box.umbrella = list(col = "black"),
  box.rectangle = list(fill = rep(c("black", "black"), 2)),
  box.dot = list(col = "black", pch = 3, cex = 2),
  plot.symbol = list(cex = 1, col = 1, pch = 0),  # outlier size and color
  par.xlab.text = font.settings,
  par.ylab.text = font.settings,
  axis.text = font.settings,
  par.sub = font.settings)

bwplot(var1 ~ factor1, data = mydata,
       groups = factor2,
       box.width = 1/3,  # width of the boxes
       auto.key = list(points = FALSE, rectangles = TRUE, space = "right",
                       title = "Year", cex.title = 1),
       panel = panel.superpose,
       ylab = "var1", xlab = "factor1",
       par.settings = my.theme,
       panel.groups = function(x, y, ..., group.number) {
         panel.bwplot(x + (group.number - 1.8) / 3, y, ...)
       })

Anna Zakrisson Braeunlich
PhD student
Department of Ecology, Environment and Plant Sciences
Stockholm University
Svante Arrheniusv. 21A
SE-106 91 Stockholm
Sweden/Sverige

Lives in Berlin. For paper mail:
Katzbachstr. 21
D-10965, Berlin - Kreuzberg
Germany/Deutschland

E-mail: anna.zakris...@su.se
Tel work: +49-(0)3091541281
Mobile: +49-(0)15777374888
LinkedIn: http://se.linkedin.com/pub/anna-zakrisson-braeunlich/33/5a2/51b
[R] Basic problem in R
I am teaching a summer course this week and next, where the students are using R. We have downloaded and are using the new version, R 3.0. It worked perfectly until today, when some of the basic functions stopped working. Examples are sd() and lm(). The messages we get are

Error: could not find function "lm"

or

Error: could not find function "sd"

Have you ever encountered that? If so, what can I do about it? Is it a basic error in the new R version? I have used R for teaching for 5 years now and have never encountered a problem like that.

Thanks,
Sinne
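Both sd() and lm() live in the stats package, so one thing that may be worth checking on an affected machine is whether stats is still attached. The lines below are only a diagnostic sketch under that assumption; the thread does not establish the actual cause, and the variable names are made up.

```r
# Which packages are on the search path? "package:stats" should be listed.
attached <- search()
stats_attached <- "package:stats" %in% attached
print(attached)

# Calling through the namespace works even if stats is not attached.
sd_value <- stats::sd(c(1, 2, 3, 4))

# Re-attach stats for the current session if it turned out to be missing.
if (!stats_attached) library(stats)
```

If stats is missing right after startup, a startup file or a restored workspace on that machine would be a natural place to look next.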
Re: [R] Making Sure your matrices are even
Thank you David. I had to sort the data afterwards for it to work:

S <- 1:86
B <- 1:15
V <- 1:45
S2 <- data.frame(Group = rep("S", length(S)), Cat = factor(S))
B2 <- data.frame(Group = rep("B", length(B)), Cat = factor(B))
V2 <- data.frame(Group = rep("V", length(V)), Cat = factor(V))
table(rbind(S2, B2, V2))
y <- table(rbind(S2, B2, V2))
sort.list(y)
chisq.test(table(y))

This resulted in a chi-square test of S2 v. B2 v. V2.

-- View this message in context: http://r.789695.n4.nabble.com/Making-Sure-your-matrices-are-even-tp4673598p4673626.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] Understanding S4 method dispatch
If you take an example which works with slots,

setClass("A", representation(a = "numeric"))
setClass("B", contains = c("A"), representation(b = "numeric"))
a <- new("A", a = 2)
b <- new("B", a = 3, b = 2)
setClass("AB", contains = c("A", "B"))
new("AB", a = 2, b = 3)

you see that there is only one @a slot: the one inherited from B, which B inherits from A. If this were not the case, which slot should be taken when we call @a? To avoid this kind of ambiguity, only one A class is inherited by AB: the one B already inherits from A. You could create a class that contains another A object in a slot:

setClass("AandB", contains = c("B"), representation(A = "A"))
new("AandB", a = 2, b = 3, A = new("A", a = 3))

Now back to your example: as there is only one A object inside the B object which is contained by the AB object, method dispatch works the way it should. It looks for a method f for an AB object. It does not find one. Then it looks for a method f for the contained B object (as this is the only one contained in AB) and it finds a method. Then it calls this method on the B part of the object AB, and the result is B-B.

Best
Simon

On Aug 13, 2013, at 4:24 PM, Hadley Wickham h.wick...@gmail.com wrote:

The class AB inherits from A and from B, but B already inherits from class A. So actually you only have an object of class B in your object of class AB. When you call the function f, R looks for a method f for AB objects. It does not find such a method and looks for a method of the object inherited from, B. Such a method is present and is then executed. The inheritance structure has to be changed. The behavior is actually desired, as without it a diamond class inheritance would be fatal.

Are you sure? That behaviour doesn't agree with the description of method dispatch given in ?Methods, nor with getClass("AB"), which shows that AB inherits from both A and B. (I totally agree that this is a bad idea, and unlikely to be useful in real life, but I'm trying to understand the details of S4 dispatch.)

getClass("AB")
Class "AB" [in ".GlobalEnv"]

Slots:
Name:  .xData
Class:   NULL

Extends:
Class "B", directly
Class "A", directly
Class ".NULL", by class "A", distance 2
Class "NULL", by class "A", distance 3, with explicit coerce
Class "OptionalFunction", by class "A", distance 4, with explicit coerce
Class "optionalMethod", by class "A", distance 4, with explicit coerce

--
Chief Scientist, RStudio
http://had.co.nz/
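The slot-based example in this message can be run end to end as a small sketch. The generic f and its single method for "B" are assumptions made here for illustration (the generic from the original question is not shown in this message), so this only demonstrates the uncontested part: AB has one a slot, and a call on an AB object falls back to the method defined for B when AB has none of its own.

```r
library(methods)

# Classes as in the message above: B extends A, AB lists both A and B.
setClass("A", representation(a = "numeric"))
setClass("B", contains = "A", representation(b = "numeric"))
setClass("AB", contains = c("A", "B"))

ab <- new("AB", a = 2, b = 3)

# Hypothetical one-argument generic with a method for "B" only.
setGeneric("f", function(x) standardGeneric("f"))
setMethod("f", "B", function(x) "method for B")

slot_names <- slotNames("AB")   # a appears once, although A is reachable twice
result <- f(ab)                 # no method for AB, so the inherited one is used

print(slot_names)
print(result)
print(is(ab, "A"))
```

This does not settle the question of which method wins when both A and B have one; for that case the linearization of superclasses described in ?Methods is what matters.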
Re: [R] pulling out pairs from data frame
Hi,

The conditions are still not clear.

dat2 <- dat1[dat1$Individual %in% reps, ]
dat2
#  SameName Individual Age Gender
#1        1          4  80      M
#2        2         15  56      F
#4        4         15  56      F
#6        6          4  80      M

A.K.
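The subsetting step in this reply can be tried on its own with the six-row example from the thread. This is a self-contained sketch; the data frame is retyped here from the posted table rather than read from a file.

```r
# The six samples posted in the thread.
dat1 <- data.frame(
  SampleName = 1:6,
  Individual = c(4, 15, 1, 15, 2, 4),
  Age        = c(80, 56, 75, 56, 58, 80),
  Gender     = c("M", "F", "F", "F", "F", "M"),
  stringsAsFactors = FALSE
)

# Individuals known to have paired samples.
reps <- c(4, 15)

# %in% returns one TRUE/FALSE per row, which is then used as a row index.
keep <- dat1$Individual %in% reps
dat2 <- dat1[keep, ]
print(dat2)
```

The original row names (1, 2, 4, 6) are kept, which makes it easy to see which samples were dropped.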
Re: [R] pulling out pairs from data frame
I manipulated the code you sent and it works perfectly, thanks!

.kripa
[R] Outliers and overdispersion
Hi again,

I have a question about some outliers that I have in my response variable (which are bird counts). At the beginning I did not drop them because they are part of the normal counts and I considered them ecologically correct. However, I tried some of the same models without the outliers, and the AICs are better. I also get nice significances this way...

So would you say that, even though the outliers are correct observations, and taking into consideration that the negative binomial distribution I am using already accounts for some of the overdispersion due to the outliers, it is better to drop them because the models fit better this way?

Thanks for your patience! :)
Re: [R] Create rows for columns in dataframe
Hi,

Your desired output is not clear. Maybe this helps:

# dat1 is the dataset
dat1$ID <- 1:nrow(dat1)
library(reshape2)
res1 <- melt(dat1, id.vars = c("ID", "DSYSRTKY"))
res1$value <- res1$value != ""
res1[, 2] <- as.integer(as.character(res1[, 2]))
res1[, 3] <- as.character(res1[, 3])
colnames(res1)[3:4] <- c("CODE", "PRIMARY")
head(res1)
#  ID DSYSRTKY CODE PRIMARY
#1  1    10005   C1    TRUE
#2  2    10203   C1    TRUE
#3  3    10315   C1    TRUE
#4  4    10315   C1    TRUE
#5  5    10327   C1    TRUE
#6  6    10327   C1    TRUE

A.K.
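For comparison, the layout described in the original question (one row per non-empty code, an incrementing ID, and PRIMAIRY marking codes that came from C1) can also be sketched in base R without a loop. The small wide data frame below is a made-up stand-in with three code columns, not the posted data, and the ordering by key is a choice made here.

```r
# Hypothetical miniature of the wide data: a key plus three code columns.
wide <- data.frame(
  DSYSRTKY = c(10005L, 10203L, 10315L),
  C1 = c("71535", "45341", "42831"),
  C2 = c("78900", "4019", ""),
  C3 = c("", "72400", ""),
  stringsAsFactors = FALSE
)

code_cols <- setdiff(names(wide), "DSYSRTKY")

# Stack the code columns; as.character() also copes with factor columns.
codes <- unlist(lapply(wide[code_cols], as.character), use.names = FALSE)

long <- data.frame(
  DSYSRTKY = rep(wide$DSYSRTKY, times = length(code_cols)),
  CODE     = codes,
  PRIMAIRY = rep(code_cols == "C1", each = nrow(wide)),
  stringsAsFactors = FALSE
)

# Keep only rows that actually carry a code, then number them.
long <- long[!is.na(long$CODE) & long$CODE != "", ]
long <- long[order(long$DSYSRTKY), ]
long <- cbind(ID = seq_len(nrow(long)), long)
rownames(long) <- NULL
print(long)
```

Everything here is vectorised, so it should scale to 100k+ rows far better than a row-by-row loop, though that has not been timed on the real data.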
Re: [R] Outliers and overdispersion
Thanks for your interest and prompt answer!

What I am trying to estimate is the correlation of the counts of one bird species with a set of environmental parameters. The count data are zero-inflated and overdispersed. I am modeling them with a hurdle negative binomial mixed-effects model. The results are very difficult to interpret, and they get easier when I drop 3 outliers. But I do not know if I should do this...

Thanks!
Marta
Re: [R] Outliers and overdispersion
I do not know exactly what you are estimating, but if it is about count models and the model fit gets better when you drop the outliers, that does not mean the model is now more correct. It just says that if the data had no outliers, this model would fit well. Overdispersion in count data is sometimes a cue that you have a mixture distribution as the generating process: for example, instead of one, K different (sub)species of birds which were aggregated in the count data. In this case a mixture (negative binomial) distribution with K components could fit the data better.

Best
Simon
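The point about mixtures can be illustrated with simulated counts. This is a toy sketch with made-up rates (two Poisson components with means 2 and 20), not a model of the bird data: pooling the two components gives a variance far above the mean, even though each component on its own is not overdispersed.

```r
set.seed(1)

# One homogeneous population versus two pooled sub-populations
# with the same overall mean (11).
single  <- rpois(1000, lambda = 11)
mixture <- c(rpois(500, lambda = 2), rpois(500, lambda = 20))

# Variance-to-mean ratio: about 1 for Poisson counts, larger when overdispersed.
dispersion <- function(x) var(x) / mean(x)

d_single  <- dispersion(single)
d_mixture <- dispersion(mixture)

cat("single Poisson:  ", round(d_single, 2), "\n")
cat("two-part mixture:", round(d_mixture, 2), "\n")
```

In a situation like this, removing the largest counts would hide the second component rather than correct the model.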
Re: [R] pulling out pairs from data frame
?duplicated

yourframe[!duplicated(yourframe)$Individual,]

-- Bert

On Tue, Aug 13, 2013 at 8:12 AM, Kripa R kripa...@hotmail.com wrote:

I manipulated the code you sent and it works perfectly, thanks!
.kripa

Date: Tue, 13 Aug 2013 08:10:53 -0700
From: smartpink...@yahoo.com
Subject: Re: [R] pulling out pairs from data frame
To: kripa...@hotmail.com
CC: r-help@r-project.org

Hi,
The conditions are still not clear.

dat2 <- dat1[dat1$Individual %in% reps,]
dat2
#  SameName Individual Age Gender
#1        1          4  80      M
#2        2         15  56      F
#4        4         15  56      F
#6        6          4  80      M

A.K.

From: Kripa R kripa...@hotmail.com
To: arun smartpink...@yahoo.com
Cc: R help r-help@r-project.org
Sent: Tuesday, August 13, 2013 10:56 AM
Subject: RE: [R] pulling out pairs from data frame

Oops! Ok. So I have this file:

SampleName Individual Age Gender
1           4          80  M
2           15         56  F
3           1          75  F
4           15         56  F
5           2          58  F
6           4          80  M

And I want to pull out paired samples, so the resulting file would look something like this:

SampleName Individual Age Gender
1           4          80  M
2           15         56  F
4           15         56  F
6           4          80  M

.kripa

Date: Mon, 12 Aug 2013 18:36:08 -0700
From: smartpink...@yahoo.com
Subject: Re: [R] pulling out pairs from data frame
To: kripa...@hotmail.com
CC: r-help@r-project.org

Hi,
The question is not clear, so I am not sure this is what you wanted.

dat1 <- read.table(text="
SameName Individual Age Gender
1 4 80 M
2 15 56 F
3 1 75 F
4 15 56 F
5 2 58 F
6 4 80 M
",sep="",header=TRUE,stringsAsFactors=FALSE)
reps <- c(4,15)
dat1$Newcol <- as.numeric(dat1$Individual %in% reps)
dat1
#  SameName Individual Age Gender Newcol
#1        1          4  80      M      1
#2        2         15  56      F      1
#3        3          1  75      F      0
#4        4         15  56      F      1
#5        5          2  58      F      0
#6        6          4  80      M      1

A.K.

- Original Message -
From: Kripa R kripa...@hotmail.com
To: r-help@r-project.org
Sent: Monday, August 12, 2013 6:59 PM
Subject: [R] pulling out pairs from data frame

Hello everyone, I'm having trouble pulling out paired samples from a data set... I have the following:

reps <- c(4,15) # the variable reps is a list of all paired samples

data

SameName Individual Age Gender
1        4          80  M
2        15         56  F
3        1          75  F
4        15         56  F
5        2          58  F
6        4          80  M

I'd like to make a new variable with only the samples that have pairs. Any suggestions would be greatly appreciated.

Thanks!
.kripa

--
Bert Gunter
Genentech Nonclinical Biostatistics
Internal Contact Info: Phone: 467-7374
Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
Re: [R] Outliers and overdispersion
The central question is: What caused the 3 unusual values? What is their scientific relevance? Only you can answer that, not us.

-- Bert

On Tue, Aug 13, 2013 at 8:51 AM, Marta Lomas lomasv...@hotmail.com wrote:

Thanks for your interest and prompt answer!

What I am trying to estimate is the correlation of the counts of one bird species with a set of environmental parameters. The count data are zero-inflated and overdispersed. I am modeling with a hurdle negative binomial mixed-effects model. The results are very difficult to interpret, and interpretation gets easier after dropping 3 outliers. But I do not know if I should do this.

Thanks!
Marta
Re: [R] pulling out pairs from data frame
Sorry. Typo. Corrected version is:

yourframe[!duplicated(yourframe$Individual),]

-- Bert
[R] getting rid of .Rhistory and .RData
Dear R users, occasionally I find .Rhistory and/or .RData files cluttered around in my file structure. Is there a way to tell R not to save such files? Or to use one central location where to save them (if they are of any use)? I have looked through options() to no avail.

Cheers
Jannis
Re: [R] Lattice: bwplot - changing box colors in legend and plot when using panel.groups = function... and panel = panel.superpose
I don't see a question in what you wrote. Your graph has some similarities to some of my examples. Please look at the demo in the HH package:

## install.packages("HH") ## if necessary
library(HH)
demo("bwplot.examples", package="HH")

Rich

On Tue, Aug 13, 2013 at 10:00 AM, Anna Zakrisson Braeunlich anna.zakris...@su.se wrote:

Hi,

Yes, I have searched Stack Overflow. My issue is simply to change the coloring of the boxes and the legend in my bwplot. I have done this many times in lattice, but now I have been tweaking the plot somewhat and I can no longer apply the color changes. I would really appreciate some help.

A. Zakrisson

Here is some dummy data and my script:

mydata <- data.frame(factor1 = factor(rep(LETTERS[1:6], each = 80)),
                     factor2 = factor(rep(c(1:2), each = 16)),
                     var1 = rnorm(120, mean = rep(c(0, 3, 5), each = 40),
                                  sd = rep(c(1, 2, 3), each = 20)))

font.settings <- list(font = 1, cex = 1, fontfamily = "serif")

my.theme <- list(
  box.umbrella = list(col = "black"),
  box.rectangle = list(fill = rep(c("black", "black"), 2)),
  box.dot = list(col = "black", pch = 3, cex = 2),
  plot.symbol = list(cex = 1, col = 1, pch = 0), # outlier size and color
  par.xlab.text = font.settings,
  par.ylab.text = font.settings,
  axis.text = font.settings,
  par.sub = font.settings)

bwplot(var1 ~ factor1, data = mydata,
       groups = factor2,
       box.width = 1/3, # width of the boxes
       auto.key = list(points = FALSE, rectangles = TRUE, space = "right",
                       title = "Year", cex.title = 1),
       panel = panel.superpose,
       ylab = "var1", xlab = "factor1",
       par.settings = my.theme,
       panel.groups = function(x, y, ..., group.number) {
         panel.bwplot(x + (group.number-1.8)/3, y, ...)
       })

Anna Zakrisson Braeunlich
PhD student
Department of Ecology, Environment and Plant Sciences
Stockholm University
Svante Arrheniusv. 21A
SE-106 91 Stockholm
Sweden/Sverige

Lives in Berlin. For paper mail:
Katzbachstr. 21
D-10965, Berlin - Kreuzberg
Germany/Deutschland

E-mail: anna.zakris...@su.se
Tel work: +49-(0)3091541281
Mobile: +49-(0)15777374888
LinkedIn: http://se.linkedin.com/pub/anna-zakrisson-braeunlich/33/5a2/51b
[R] How to store and manipulate survey data like this?
I have to process a set of survey data with questions that are formatted like this:

1) Pick your top three breeds (pick 3)
   1 Rottweiler
   2 Pit Bull
   3 German Shepherd
   4 Poodle
   5 Border Collie
   6 Dalmatian
   7 Mixed Breed

and the answers are formatted like this:

Respondent, Question1
1, 1,4,7
2, 2,7,5
3, 6,3,5
4, ...

Any suggestions on how to preprocess the file to be able to do things like frequency analysis for breeds?
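One possible way to preprocess answers in this format is sketched below in base R. It assumes every data line holds the respondent id followed by the picked breed codes, all separated by commas. The object names (raw, breeds, fields, long, freq) are made up for this illustration, and the three answer lines are the ones shown in the question; in practice raw would come from readLines() on the survey file.

```r
# Breed labels in the order of their codes (1..7), as listed in the question.
breeds <- c("Rottweiler", "Pit Bull", "German Shepherd", "Poodle",
            "Border Collie", "Dalmatian", "Mixed Breed")

# Assumption: these lines stand in for readLines("survey.txt").
raw <- c("Respondent, Question1", "1, 1,4,7", "2, 2,7,5", "3, 6,3,5")

# Split every data line on commas: the first field is the respondent id,
# the remaining fields are the picked breed codes.
fields <- strsplit(raw[-1], ",", fixed = TRUE)

# Build a "long" table with one row per (respondent, pick).
long <- do.call(rbind, lapply(fields, function(f) {
  vals <- as.integer(trimws(f))
  data.frame(Respondent = vals[1],
             Breed = factor(breeds[vals[-1]], levels = breeds))
}))

# With one row per pick, table() gives the breed frequencies directly.
freq <- table(long$Breed)
print(freq)
```

The long layout is a design choice: it keeps the respondent id next to each pick, so the same table can also be cross-tabulated against other respondent-level variables later.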
Re: [R] Lattice: bwplot - changing box colors in legend and plot when using panel.groups = function... and panel = panel.superpose
I think I understand your question. You need to make sure that you are setting the right parameters in your theme. Use trellis.par.get() to have a look at the MANY possible settings. For example, in your case, to have the boxplots and the legend rectangles use the same colors:

my.theme <- list(
  box.umbrella = list(col = "black"),
  box.rectangle = list(fill = rep(c("black", "black"), 2)),
  box.dot = list(col = "black", pch = 3, cex = 2),
  plot.symbol = list(cex = 1, col = 1, pch = 0), # outlier size and color
  par.xlab.text = font.settings,
  par.ylab.text = font.settings,
  axis.text = font.settings,
  #strip.shingle=list(col=c("red","blue")),
  superpose.symbol=list(fill=c("red","blue")), # boxplots
  #superpose.fill=list(col=c("red","blue")),
  superpose.polygon=list(col=c("red","blue")), # legend
  par.sub=font.settings)

Kevin Wright
Re: [R] Outliers and overdispersion
Thanks Bert!

I think they are relatively important. What I am doing is comparing the 2003 and 2013 distribution and use of this species in a specific sampled area. The numbers are currently far lower than in 2003; however, in both years the data are zero-inflated. Most of the outliers are in 2003, when there were considerably more birds. On the other hand, the species is very social (ruffs), so where there are 5 birds there could be 300 in the next 10 minutes. So the outliers that reflect this may not be that important to take into account, and I should perhaps focus more on the binomial part of the glmmadmb model that I chose (where just zeros vs. non-zeros are modeled).

Thanks for your reflections, they are very helpful to me!
Re: [R] Basic problem in R
R-3.0.1 (use all digits when describing an R version) is not the problem. Most likely you are masking a function or something like that.

When you started the R session, did you get a message about restoring a previous session? If so, close R, find the directory in which you were working, delete (or rename) the file .RData, and start R again.

Or perhaps you inadvertently detached the stats package. Type search() to check on that. The repair is the same as above.

Rich

On Tue, Aug 13, 2013 at 10:08 AM, Sinne Smed s...@ifro.ku.dk wrote:

I am teaching a summer course this week and next in which the students are using R. We have downloaded and are using the new version, R.3.0. It worked perfectly until today, when some of the basic functions stopped working. Examples are sd() and lm(). The message we get is

Error: could not find function "lm"

or

Error: could not find function "sd"

Have you ever encountered that? If yes, what can I do about it? Is it a basic error in the new R version? I have used R for teaching for 5 years now and have never encountered a problem like that.

Thanks,
Sinne
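The checks suggested in this reply can be sketched in a few lines of base R. This is only an illustration of the diagnosis, not a fix; the variable names are made up here, and the last line relies on the fact that a function can be called through its namespace even when the package is not attached.

```r
# Is the stats package still on the search path?
stats_attached <- "package:stats" %in% search()

# Where, if anywhere, are lm() and sd() found from the global environment?
lm_location <- utils::find("lm", mode = "function")
sd_location <- utils::find("sd", mode = "function")

# Calling through the namespace works even if stats has been detached.
s <- stats::sd(c(1, 2, 3))

print(list(stats_attached = stats_attached,
           lm = lm_location, sd = sd_location, sd_value = s))
```

If stats_attached is FALSE in an affected session, library(stats) restores the functions for that session; removing the stale .RData, as described above, addresses the case where the state comes back on every start.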
[R] latin1 encoding in WriteXLS
Dear R users,

I've just updated the WriteXLS package (on R 3.0.1) and I now get an error when exporting a data.frame with the argument Encoding="latin1". For example, these two lines work:

library(WriteXLS)
WriteXLS("iris", "iris.xls")

whereas these do not:

library(WriteXLS)
WriteXLS("iris", "irislatin1.xls", Encoding="latin1")

I get this message:

Argument Sepal.Length isn't numeric in subroutine entry at C:/Perl64/lib/Encode.pm line 217, CSVFILE line 1.
Modification of a read-only value attempted at C:/Perl64/lib/Encode.pm line 218, CSVFILE line 1.
The Perl script 'WriteXLS.pl' failed to run successfully.
Warning message:
running command 'perl -IC:/Users/varet/Documents/R/win-library/3.0/WriteXLS/Perl C:/Users/varet/Documents/R/win-library/3.0/WriteXLS/Perl/WriteXLS.pl --CSVPath C:\Users\varet\AppData\Local\Temp\RtmpEzqFNz/WriteXLS --verbose FALSE --AdjWidth FALSE --AutoFilter FALSE --BoldHeaderRow FALSE --FreezeRow 0 --FreezeCol 0 --Encoding latin1 C:\Users\varet\Desktop\irislatin1.xls' had status 255

Does anyone know why it failed? Could it be a problem with Perl?

Thanks for your help,
Hugo Varet
Re: [R] pulling out pairs from data frame
Bert,

dat1 <- structure(list(SameName = 1:6, Individual = c(4L, 15L, 1L, 15L, 2L, 4L), Age = c(80L, 56L, 75L, 56L, 58L, 80L), Gender = c("M", "F", "F", "F", "F", "M")), .Names = c("SameName", "Individual", "Age", "Gender"), class = "data.frame", row.names = c(NA, -6L))

Your solution gives:

dat1[!duplicated(dat1$Individual),]
#  SameName Individual Age Gender
#1        1          4  80      M
#2        2         15  56      F
#3        3          1  75      F
#5        5          2  58      F

The OP asked for: "And I want to pull out paired samples, so the resulting file would look something like this:"

SampleName Individual Age Gender
1          4          80  M
2          15         56  F
4          15         56  F
6          4          80  M

Anyway, the question was not clear, as I mentioned in the earlier mail.

Regards,
A.K.
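The difference between the two readings of the question can be made concrete with a short base R sketch. It assumes that a "paired" sample means a row whose Individual value occurs at least twice, so the pairs are derived from the data rather than from the hand-written reps vector; the names is_paired and paired are made up for this illustration.

```r
dat1 <- data.frame(
  SampleName = 1:6,
  Individual = c(4L, 15L, 1L, 15L, 2L, 4L),
  Age        = c(80L, 56L, 75L, 56L, 58L, 80L),
  Gender     = c("M", "F", "F", "F", "F", "M"),
  stringsAsFactors = FALSE
)

# duplicated() flags only the second and later occurrences of a value.
# Scanning again from the end flags the first occurrences as well, so the
# union marks every row that belongs to a repeated Individual.
is_paired <- duplicated(dat1$Individual) |
  duplicated(dat1$Individual, fromLast = TRUE)

paired <- dat1[is_paired, ]
print(paired)
```

On the example data this keeps samples 1, 2, 4 and 6, which is the result the original poster described, whereas !duplicated() keeps one row per individual.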
[R] internal error -3 in R_decompress1
Dear R users,

what could cause such an error:

internal error -3 in R_decompress1

Unfortunately the error kills all my usual error-catching mechanisms and appears on a remote cluster, so I cannot really tell you which command etc. is causing it.

Thanks for any hints on where to dig for the solution (or even just the cause).

Jannis

R version 3.0.0 (2013-04-03)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats     graphics  grDevices datasets  utils     methods
[8] base

other attached packages:
 [1] snowfall_1.84-4         snow_0.3-12
 [3] DistributionUtils_0.5-1 RUnit_0.4.26
 [5] RColorBrewer_1.0-5      plotrix_3.5
 [7] doMC_1.3.0              iterators_1.0.6
 [9] multicore_0.1-7         plyr_1.8
[11] raster_2.1-49           sp_1.0-11
[13] abind_1.4-0             foreach_1.4.1
[15] RNetCDF_1.6.1-2         Rssa_0.9.10
[17] forecast_4.06           svd_0.3.2-1

loaded via a namespace (and not attached):
 [1] codetools_0.2-8  colorspace_1.2-2 compiler_3.0.0   fracdiff_1.4-2
 [5] grid_3.0.0       lattice_0.20-15  nnet_7.3-7       quadprog_1.5-5
 [9] tools_3.0.0      tseries_0.10-32  zoo_1.7-10
[R] ave function
I've written the following function

CoursePrep <- function (Source, SaveName) {
  Clean$TERM <- as.factor(Clean$TERM)
  Clean$INST_NUM <- as.factor(Clean$INST_NUM)
  Clean$zGrade <- with(Clean, ave(GRADE., list(TERM, INST_NUM), FUN = scale))
  write.csv(Clean, paste(SaveName, "csv", sep = "."), row.names = FALSE)
  return(Clean)
}

which is all well and good, but I want to throw a shapiro.test in before I normalize. I don't really understand quite how I did what I wanted (I got help) in the line

Clean$zGrade <- with(Clean, ave(GRADE., list(TERM, INST_NUM), FUN = scale))

For the whole of Clean, that code finds all sets of GRADE. values that have the same INST_NUM and TERM, computes the mean, subtracts off the mean, and divides by the standard deviation. For each of those sets of grades I would like to call shapiro.test() on the set, to see whether it is normal *before* I assume it is.

I know the naive

with(Clean, shapiro.test(list(TERM, INST_NUM)))

doesn't work. I also tried

with(Clean, ave(GRADE., list(TERM, INST_NUM), FUN = function(x) shapiro.test(x)))

which returns

Error in shapiro.test(x) : sample size must be between 3 and 5000

and I have checked that the sets selected are all of length between 3 and 5000, using the following on my full data:

ClassSize <- with(Clean, ave(GRADE., list(TERM, INST_NUM), FUN = function(x) length(x)))
summary(ClassSize)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   22.0   198.0   241.0   244.4   279.0   466.0

Here is some sample data:

GRADE  TERM  INST_NUM
1,     9,    1
2,     9,    1
3,     9,    1
1.5,   8,    2
1.75,  8,    2
2,     8,    2
0.5,   9,    2
2,     9,    2
3.5,   9,    2
3.5,   8,    1
3.75,  8,    1
4,     8,    1

and hopefully the code would test the following sets of grades: (1,2,3) (1.5,1.75,2) (0.5,2,3.5) (3.5,3.75,4)

Thanks
Robert
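A hedged note on why the ave() attempt may fail, with a sketch of an alternative. As far as I know, ave() of that era groups by interaction(TERM, INST_NUM), which also creates groups for TERM/INST_NUM combinations that never occur together; FUN is then called on an empty vector, which would explain the sample-size error even though every real class is large enough. In addition, ave() expects FUN to return values it can write back into the data, which a test object is not. The sketch below uses the sample data from the post (with the column called GRADE, as in the sample, rather than GRADE.); the names groups, ok, tests and p_values are made up for this illustration.

```r
Clean <- data.frame(
  GRADE    = c(1, 2, 3, 1.5, 1.75, 2, 0.5, 2, 3.5, 3.5, 3.75, 4),
  TERM     = factor(c(9, 9, 9, 8, 8, 8, 9, 9, 9, 8, 8, 8)),
  INST_NUM = factor(c(1, 1, 1, 2, 2, 2, 2, 2, 2, 1, 1, 1))
)

# drop = TRUE discards TERM/INST_NUM combinations that never occur, so
# shapiro.test() is never handed an empty vector.
groups <- split(Clean$GRADE, list(Clean$TERM, Clean$INST_NUM), drop = TRUE)

# Only test groups whose size is within shapiro.test()'s documented limits.
ok <- lengths(groups) >= 3 & lengths(groups) <= 5000

# One test object per group, then just the p-values as a named vector.
tests <- lapply(groups[ok], shapiro.test)
p_values <- sapply(tests, function(t) t$p.value)
print(p_values)
```

The names of p_values have the form TERM.INST_NUM (for example 9.1), so they can be matched back to the classes before deciding which groups to standardise.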
Re: [R] internal error -3 in R_decompress1
On 13/08/2013 18:47, Jannis wrote:

Dear r users, what could cause such an error: internal error -3 in R_decompress1 unfortunately the error kills all my usual error catching mechanisms an appears on a remote cluster so I can not really tell you which command etc is causing it.

It is a corrupt package installation, so re-install.

--
Brian D. Ripley, rip...@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
Re: [R] ave function
Hi,

You could try:

lapply(split(Clean, list(Clean$TERM, Clean$INST_NUM)), function(x) shapiro.test(x$GRADE))

A.K.

----- Original Message -----
From: Robert Lynch robert.b.ly...@gmail.com
To: r-help@r-project.org
Sent: Tuesday, August 13, 2013 1:46 PM
Subject: [R] ave function

I would like to for each one of those sets of grades to call shapiro.test() on the set, to see if it is normal *before* I assume it is.
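The split() approach above can be tried on the sample data from the question. The following is only a sketch under stated assumptions: the column is called GRADE here (the original function uses GRADE.), and drop = TRUE is my addition, not part of the posted answer. One likely cause of the "sample size must be between 3 and 5000" error is that ave() and split() build groups for every TERM x INST_NUM combination, including combinations that never occur in the data; those groups are empty and shapiro.test() rejects them. Such empty groups would not show up in the ClassSize summary, because nothing is assigned back for them.

```r
# Sample data from the question (column named GRADE for this sketch)
Clean <- data.frame(
  GRADE    = c(1, 2, 3, 1.5, 1.75, 2, 0.5, 2, 3.5, 3.5, 3.75, 4),
  TERM     = factor(c(9, 9, 9, 8, 8, 8, 9, 9, 9, 8, 8, 8)),
  INST_NUM = factor(c(1, 1, 1, 2, 2, 2, 2, 2, 2, 1, 1, 1))
)

# One numeric vector per TERM/INST_NUM combination that actually occurs.
# drop = TRUE (an assumption added here) discards empty combinations.
groups <- split(Clean$GRADE, list(Clean$TERM, Clean$INST_NUM), drop = TRUE)

# Run the Shapiro-Wilk test on each group; each result is an "htest" object
tests <- lapply(groups, shapiro.test)

# Collect the p-values into a named vector, one entry per group
p_values <- sapply(tests, function(res) res$p.value)
print(p_values)
```

The names of the result ("8.1", "9.1", ...) are TERM and INST_NUM joined by a dot, so each test can be traced back to its class.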
Re: [R] pulling out pairs from data frame
Yes, you're right. So I guess you should match on duplicated values, something like (untested)

with(dat1, dat1[Individual %in% Individual[duplicated(Individual)], ])

which is presumably essentially what you gave.

-- Bert

On Tue, Aug 13, 2013 at 10:41 AM, arun smartpink...@yahoo.com wrote:

Bert,

dat1 <- structure(list(SameName = 1:6, Individual = c(4L, 15L, 1L, 15L, 2L, 4L), Age = c(80L, 56L, 75L, 56L, 58L, 80L), Gender = c("M", "F", "F", "F", "F", "M")), .Names = c("SameName", "Individual", "Age", "Gender"), class = "data.frame", row.names = c(NA, -6L))

Your solution gives:

dat1[!duplicated(dat1$Individual),]
#  SameName Individual Age Gender
#1        1          4  80      M
#2        2         15  56      F
#3        3          1  75      F
#5        5          2  58      F

The OP asked for:

And I want to pull out paired samples, so the resulting file would look something like this:

SampleName Individual Age Gender
1 4 80 M
2 15 56 F
4 15 56 F
6 4 80 M

Anyway, the question was not clear as I mentioned in the earlier mail.

Regards,
A.K.

----- Original Message -----
From: Bert Gunter gunter.ber...@gene.com
To: Kripa R kripa...@hotmail.com
Cc: arun smartpink...@yahoo.com; R help r-help@r-project.org
Sent: Tuesday, August 13, 2013 12:09 PM
Subject: Re: [R] pulling out pairs from data frame

Sorry. Typo. Corrected version is:

yourframe[!duplicated(yourframe$Individual),]

-- Bert

On Tue, Aug 13, 2013 at 9:05 AM, Bert Gunter bgun...@gene.com wrote:

?duplicated

yourframe[!duplicated(yourframe)$Individual,]

-- Bert

On Tue, Aug 13, 2013 at 8:12 AM, Kripa R kripa...@hotmail.com wrote:

I manipulated the code you sent and it works perfectly, thanks!
.kripa

Date: Tue, 13 Aug 2013 08:10:53 -0700
From: smartpink...@yahoo.com
Subject: Re: [R] pulling out pairs from data frame
To: kripa...@hotmail.com
CC: r-help@r-project.org

Hi,
The conditions are still not clear.

dat2 <- dat1[dat1$Individual %in% reps,]
dat2
#  SameName Individual Age Gender
#1        1          4  80      M
#2        2         15  56      F
#4        4         15  56      F
#6        6          4  80      M

A.K.
From: Kripa R kripa...@hotmail.com
To: arun smartpink...@yahoo.com
Cc: R help r-help@r-project.org
Sent: Tuesday, August 13, 2013 10:56 AM
Subject: RE: [R] pulling out pairs from data frame

Oops! Ok. So I have this file:

SampleName Individual Age Gender
1 4 80 M
2 15 56 F
3 1 75 F
4 15 56 F
5 2 58 F
6 4 80 M

And I want to pull out paired samples, so the resulting file would look something like this:

SampleName Individual Age Gender
1 4 80 M
2 15 56 F
4 15 56 F
6 4 80 M

.kripa

Date: Mon, 12 Aug 2013 18:36:08 -0700
From: smartpink...@yahoo.com
Subject: Re: [R] pulling out pairs from data frame
To: kripa...@hotmail.com
CC: r-help@r-project.org

Hi,
The question is not clear so not sure this is what you wanted.

dat1 <- read.table(text="
SameName Individual Age Gender
1 4 80 M
2 15 56 F
3 1 75 F
4 15 56 F
5 2 58 F
6 4 80 M
", sep="", header=TRUE, stringsAsFactors=FALSE)
reps <- c(4,15)
dat1$Newcol <- as.numeric(dat1$Individual %in% reps)
dat1
#  SameName Individual Age Gender Newcol
#1        1          4  80      M      1
#2        2         15  56      F      1
#3        3          1  75      F      0
#4        4         15  56      F      1
#5        5          2  58      F      0
#6        6          4  80      M      1

A.K.

----- Original Message -----
From: Kripa R kripa...@hotmail.com
To: r-help@r-project.org r-help@r-project.org
Sent: Monday, August 12, 2013 6:59 PM
Subject: [R] pulling out pairs from data frame

Hello everyone, I'm having trouble pulling out paired samples from a data set... I have the following:

reps <- c(4,15) #the variable reps is a list of all paired samples

data
SameName Individual Age Gender
1 4 80 M
2 15 56 F
3 1 75 F
4 15 56 F
5 2 58 F
6 4 80 M

I'd like to make a new variable with only the samples that have pairs. Any suggestions would be greatly appreciated.

Thanks!
.kripa
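For readers following the thread, the duplicated() idea can be checked on the six-row example without needing the hand-written reps vector. This is a minimal sketch, assuming "paired" simply means that an Individual value occurs more than once; the names paired_ids and paired are mine.

```r
# The six-row example from the thread
dat1 <- data.frame(
  SampleName = 1:6,
  Individual = c(4L, 15L, 1L, 15L, 2L, 4L),
  Age        = c(80L, 56L, 75L, 56L, 58L, 80L),
  Gender     = c("M", "F", "F", "F", "F", "M"),
  stringsAsFactors = FALSE
)

# duplicated() flags the second and later occurrences of a value,
# so these are the Individual ids that appear at least twice
paired_ids <- unique(dat1$Individual[duplicated(dat1$Individual)])

# Keep every row (first occurrence included) whose id is in that set
paired <- dat1[dat1$Individual %in% paired_ids, ]
print(paired)
```

Compared with dat1[!duplicated(dat1$Individual), ], which keeps one row per individual, this keeps all rows of individuals that have a pair.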
Re: [R] Create rows for columns in dataframe
Hi,

My desired output for my sample!! using dput():

structure(list(ID = c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15", "16", "17", "18", "19", "20", "21", "22", "23", "24", "25", "26", "27", "28", "29", "30", "31", "32", "33", "34", "35", "36", "37", "38", "39", "40", "41", "42", "43", "44", "45", "46", "47", "48"),
DSYSRTKY = c("10005", "10005", "10005", "10005", "10203", "10203", "10203", "10203", "10315", "10315", "10315", "10315", "10315", "10315", "10315", "10315", "10315", "10315", "10315", "10315", "10315", "10315", "10315", "10315", "10315", "10315", "10315", "10315", "10327", "10327", "10327", "10327", "10327", "10327", "10327", "10327", "10327", "10327", "10327", "10327", "10327", "10327", "10327", "10327", "10327", "10327", "10327", "10327"),
CODE = c("71535", "78900", "V1251", "V454", "45341", "4019", "72400", "V1011", "42831", "5990", "8052", "4241", "4019", "311", "2724", "71680", "4168", "7804", "V066", "6930", "41400", "V4581", "40291", "4280", "5990", "V4986", "5939", "3109", "41401", "6826", "7850", "4019", "2720", "49390", "2859", "79029", "V1582", "486", "51881", "5119", "42789", "7823", "41400", "V4581", "40390", "5859", "49390", "2724"),
PRIMAIRY = c("TRUE", "FALSE", "FALSE", "FALSE", "TRUE", "FALSE", "FALSE", "FALSE", "TRUE", "FALSE", "FALSE", "FALSE", "FALSE", "FALSE", "FALSE", "FALSE", "FALSE", "FALSE", "FALSE", "FALSE", "FALSE", "FALSE", "FALSE", "FALSE", "TRUE", "FALSE", "FALSE", "FALSE", "TRUE", "FALSE", "FALSE", "FALSE", "FALSE", "FALSE", "FALSE", "FALSE", "FALSE", "TRUE", "FALSE", "FALSE", "FALSE", "FALSE", "FALSE", "FALSE", "FALSE", "FALSE", "FALSE", "FALSE")),
.Names = c("ID", "DSYSRTKY", "CODE", "PRIMAIRY"), row.names = c(NA, 48L), class = "data.frame")

So the 'DSYSRTKY' (10005) has 4 code fields filled so you get 4 rows. The next one also 4, the third one 16. Anyway, just take a look at the sample. I think this will help trying to make clear what my desired result is!

Regards
Derk

-- View this message in context: http://r.789695.n4.nabble.com/Create-rows-for-columns-in-dataframe-tp4673607p4673646.html
[R] Problem with zero-inflated negative binomial model in sediment river dynamics
Dear All,

I am running a negative binomial model in R using the package pscl in order to estimate bed sediment movements versus river discharge. Currently we have deployed 4 different plates to test if a combination of more than one plate would better describe the sediment movements when the river discharge changes over time.

My data are positively skewed and zero-inflated. I ran both zero-inflated Poisson and zero-inflated negative binomial regressions and compared them using the Vuong test, which showed that the negative binomial works better than a simple zero-inflated Poisson. My models look like:

1) plate1 ~ river discharge
2) (plate 1 + plate 2) ~ river discharge
3) (plate 1 + plate 2 + plate 3) ~ river discharge
4) (plate 1 + plate 2 + plate 3 + plate 4) ~ river discharge

My main problem, as I am new to this type of model, is that I get a different sign for the coefficient of discharge in the output of the zero-inflated negative binomial model (please see below). What does this mean? Also, how could I compare the different models (1-4), i.e. what tells me which is performing best?

Thank you very much in advance for any comments and suggestions!!

Kind Regards,
Valentina

Call:
zeroinfl(formula = plate1 ~ discharge, data = datafit_plates, dist = "negbin", EM = TRUE)

Pearson residuals:
    Min      1Q  Median      3Q     Max
-0.6770 -0.3564 -0.2101 -0.0814 12.3421

Count model coefficients (negbin with log link):
             Estimate Std. Error z value Pr(>|z|)
(Intercept)  2.557066   0.036593   69.88   <2e-16 ***
discharge    0.064698   0.001983   32.63   <2e-16 ***
Log(theta)  -0.775736   0.012451  -62.30   <2e-16 ***

Zero-inflation model coefficients (binomial with logit link):
            Estimate Std. Error z value Pr(>|z|)
(Intercept) 13.01011    0.22602   57.56   <2e-16 ***
discharge   -1.64293    0.03092  -53.14   <2e-16 ***

Theta = 0.4604
Number of iterations in BFGS optimization: 1
Log-likelihood: -6.933e+04 on 5 Df
[R] Regular repeats
Hi,

Many apologies for the simplicity (hopefully!) of this request - I can't find it on the forum, but it may have been asked in the past.

I have a data frame consisting of ~2000 rows. I simply want to take the average of the first 6, then the next 6, then the next 6 until the end of the table. The command

mean(mole[1:6, c("PercentPI")])

gets me the first 6 rows (column is PercentPI), but I don't know how to increase the rows incrementally.

Thanks in advance.
J

-- View this message in context: http://r.789695.n4.nabble.com/Regular-repeats-tp4673653.html
[R] DIALLEL ANALYSIS
Sir,

I have installed the plant breeding library. But when I import the file in R and give the command

data(fulldial)

the following warning message is shown:

Warning message:
In data(fulldial) : data set fulldial not found

Please guide me. My data is as follows:

MALE FEMALE YIELD
1 1 53.333
1 2 52.333
1 3 54.333
1 4 56.333
1 5 52.667
2 1 52.667
2 2 51.333
2 3 55.333
2 4 52.333
2 5 54
3 1 53.667
3 2 51.667
3 3 52.667
3 4 55.333
3 5 54.667
4 1 57.333
4 2 53.333
4 3 56
4 4 54.667
4 5 51.667
5 1 56.667
5 2 54.333
5 3 51.333
5 4 54
5 5 55.333

Is there any mistake in the data entry? Please tell me. Please send me the code and solve the question; I shall be thankful to you.
Re: [R] getting rid of .Rhistory and .RData
The following should help:

What does R ask you each time you quit R? Answer no.

Start R with

R --no-save

-Don

--
Don MacQueen
Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062

On 8/13/13 9:15 AM, Jannis bt_jan...@yahoo.de wrote:

Dear R users,

occasionally I find .Rhistory and/or .RData files cluttered around in my file structure. Is there a way to tell R not to save such files? Or to use one central location where to save them (if they are of any use)? I have looked through options() to no avail.

Cheers
Jannis
Re: [R] How to store and manipulate survey data like this?
On 08/13/2013 12:17 PM, Walter Anderson wrote:

I have to process a set of survey data with questions that are formatted like this:

1) Pick your top three breeds (pick 3)
1 Rottweiler
2 Pit Bull
3 German Shepard
4 Poodle
5 Border Collie
6 Dalmation
7 Mixed Breed

and the answers are formatted like this:

Respondent, Question1
1, 1,4,7
2, 2,7,5
3, 6,3,5
4,
...

Any suggestions on how to preprocess the file to be able to do things like frequency analysis for breeds?

Here's how I would get started:

survey <- read.csv("survey.csv", as.is=TRUE)
survey
  Respondent Question1
1          1     1,4,7
2          2     2,7,5
3          3     6,3,5
4          4

TripleOrNAs <- function(x) {if (length(x) == 3) x else c(NA, NA, NA)}
options <- lapply(strsplit(survey$Question1, ","), TripleOrNAs)
options <- matrix(unlist(options), ncol=3, byrow=TRUE)
survey2 <- cbind(survey, options)
names(survey2) <- c(names(survey), paste("Q1.Opt", 1:3, sep="."))
survey2
  Respondent Question1 Q1.Opt.1 Q1.Opt.2 Q1.Opt.3
1          1     1,4,7        1        4        7
2          2     2,7,5        2        7        5
3          3     6,3,5        6        3        5
4          4                 NA       NA       NA
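Once the answers are split, the frequency analysis asked about in the question is a short step further. The following is a sketch only: the data frame is typed in by hand instead of being read from survey.csv, the breed labels are copied from the question as written, and the names picks and freq are mine.

```r
# Stand-in for read.csv("survey.csv", as.is = TRUE); respondent 4 gave no answer
survey <- data.frame(
  Respondent = 1:4,
  Question1  = c("1,4,7", "2,7,5", "6,3,5", ""),
  stringsAsFactors = FALSE
)

# Answer codes 1..7 in the order listed in the question
breeds <- c("Rottweiler", "Pit Bull", "German Shepard", "Poodle",
            "Border Collie", "Dalmation", "Mixed Breed")

# Split every answer on commas and pool all picks into one integer vector;
# an empty answer contributes nothing
picks <- as.integer(unlist(strsplit(survey$Question1, ",")))

# A factor with all seven levels makes unpicked breeds show up with count 0
freq <- table(factor(picks, levels = seq_along(breeds), labels = breeds))
print(freq)
```

Pooling the picks ignores the order in which a respondent listed them; if rank matters, the Q1.Opt.1 to Q1.Opt.3 columns built above can be tabulated one at a time instead.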
Re: [R] Create rows for columns in dataframe
According to your first post,

NewDataFrame <- data.frame(ID=integer(), DSYSRTKY=integer(), CODE=character(), PRIMAIRY=logical())

The new output dataset: Out1

str(Out1)
'data.frame': 48 obs. of 4 variables:
 $ ID      : chr "1" "2" "3" "4" ...
 $ DSYSRTKY: chr "10005" "10005" "10005" "10005" ...
 $ CODE    : chr "71535" "78900" "V1251" "V454" ...
 $ PRIMAIRY: chr "TRUE" "FALSE" "FALSE" "FALSE" ...

I guess you wanted DSYSRTKY to be numeric and PRIMAIRY to be logical.

res1 <- do.call(rbind, lapply(seq_len(nrow(dat1)), function(i) {
  x1 <- as.character(unlist(dat1[i,-1]))
  CODE <- x1[x1 != ""]
  PRIMAIRY <- x1[x1 != ""] == head(x1, 1)
  DSYSRTKY = as.numeric(as.character(dat1[i,1]))
  data.frame(DSYSRTKY, CODE, PRIMAIRY, stringsAsFactors=FALSE)
}))
res1$ID <- row.names(res1)
res2 <- res1[,c(4,1:3)]
str(res2)
#'data.frame': 48 obs. of 4 variables:
# $ ID      : chr "1" "2" "3" "4" ...
# $ DSYSRTKY: num 1e+08 1e+08 1e+08 1e+08 1e+08 ...
# $ CODE    : chr "71535" "78900" "V1251" "V454" ...
# $ PRIMAIRY: logi TRUE FALSE FALSE FALSE TRUE FALSE ...

head(res2)
#  ID DSYSRTKY  CODE PRIMAIRY
#1  1    10005 71535     TRUE
#2  2    10005 78900    FALSE
#3  3    10005 V1251    FALSE
#4  4    10005  V454    FALSE
#5  5    10203 45341     TRUE
#6  6    10203  4019    FALSE

head(Out1)
#  ID DSYSRTKY  CODE PRIMAIRY
#1  1    10005 71535     TRUE
#2  2    10005 78900    FALSE
#3  3    10005 V1251    FALSE
#4  4    10005  V454    FALSE
#5  5    10203 45341     TRUE
#6  6    10203  4019    FALSE

A.K.

----- Original Message -----
From: Dark i...@software-solutions.nl
To: r-help@r-project.org
Sent: Tuesday, August 13, 2013 12:16 PM
Subject: Re: [R] Create rows for columns in dataframe

So the 'DSYSRTKY' (10005) has 4 code fields filled so you get 4 rows. The next one also 4, the third one 16.
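The layout of the input dat1 is not shown in this part of the thread, so the idea of "one output row per filled code field" is illustrated here on an invented wide table. Everything about the layout below is an assumption for illustration: the column names PRIM and CODE1 to CODE3, the use of "" for an unused field, and the rule that PRIMAIRY is TRUE where a code equals the PRIM column.

```r
# Hypothetical wide input: one row per record, unused code fields left blank
wide <- data.frame(
  DSYSRTKY = c(10005, 10203),
  PRIM     = c("71535", "45341"),
  CODE1    = c("71535", "45341"),
  CODE2    = c("78900", "4019"),
  CODE3    = c("V1251", ""),
  stringsAsFactors = FALSE
)
code_cols <- c("CODE1", "CODE2", "CODE3")

# Build one small data frame per input row, keeping only filled code fields
rows <- lapply(seq_len(nrow(wide)), function(i) {
  codes <- unlist(wide[i, code_cols], use.names = FALSE)
  codes <- codes[codes != ""]
  data.frame(
    DSYSRTKY = wide$DSYSRTKY[i],
    CODE     = codes,
    PRIMAIRY = codes == wide$PRIM[i],
    stringsAsFactors = FALSE
  )
})

# Stack the pieces and add a running ID as the first column
long <- do.call(rbind, rows)
long$ID <- seq_len(nrow(long))
long <- long[, c("ID", "DSYSRTKY", "CODE", "PRIMAIRY")]
print(long)
```

This follows the same do.call(rbind, lapply(...)) pattern as the reply above, with the blank-field filter and the primary-code comparison spelled out on named columns.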
Re: [R] Regular repeats
What about something like this:

tmp <- data.frame(var1 = rnorm(36), ind = gl(6,6))
with(tmp, tapply(var1, ind, mean))

You can see that your version of mean(tmp[1:6, c("var1")]) gives the same as mine for the first 6 rows.

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of jsf1982
Sent: Tuesday, August 13, 2013 12:46 PM
To: r-help@r-project.org
Subject: [R] Regular repeats

Hi,

Many apologies for the simplicity (hopefully!) of this request - I can't find it on the forum, but it may have been asked in the past.

I have a data frame consisting of ~2000 rows. I simply want to take the average of the first 6, then the next 6, then the next 6 until the end of the table. The command

mean(mole[1:6, c("PercentPI")])

gets me the first 6 rows (column is PercentPI), but I don't know how to increase the rows incrementally.

Thanks in advance.
J

-- View this message in context: http://r.789695.n4.nabble.com/Regular-repeats-tp4673653.html
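The gl(6,6) index in the reply fits exactly 36 rows. For a table of arbitrary length, a block index can be computed from the row number instead. This is a sketch under assumptions: mole is simulated here with 20 rows, and the name block is mine.

```r
set.seed(1)
# Simulated stand-in for the poster's data frame (20 rows, not ~2000)
mole <- data.frame(PercentPI = runif(20))

# Rows 1-6 get block 0, rows 7-12 block 1, and so on;
# the last block is shorter when nrow is not a multiple of 6
block <- (seq_len(nrow(mole)) - 1) %/% 6

# Mean of PercentPI within each block of six consecutive rows
block_means <- tapply(mole$PercentPI, block, mean)
print(block_means)
```

With 20 rows this gives four means, the last one based on only two rows, so it is worth checking whether a short final block should be kept.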
Re: [R] Memory limit on Linux?
From: Kevin E. Thorpe [mailto:kevin.tho...@utoronto.ca]
Sent: Monday, August 12, 2013 11:00 AM
Subject: Re: [R] Memory limit on Linux?

What does ulimit -a report on both of these machines?

Greetings,

Sorry for the delay. Other fires demanded more attention...

For the system in which memory seems to allocate as needed:

$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 386251
max locked memory       (kbytes, -l) 32
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 386251
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

For the system in which memory seems to hang around 5-7GB:

$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 2066497
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) unlimited
cpu time               (seconds, -t) unlimited
max user processes              (-u) 1024
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

I can also confirm the same behavior on a Scientific Linux system, though the difference besides CentOS/RHEL is that the Scientific is at an earlier version of 6 (6.2 to be exact). The Scientific system has the same ulimit configuration as the problem box.

I could be mistaken, but here are the differences I see in the ulimits:

pending signals: shouldn't matter
max locked memory: The Scientific/CentOS system is higher so I don't think this is it.
stack size: Again, higher on Scientific/CentOS.
max user processes: Seems high to me, but I don't see how this is capping a memory limit.

Am I missing something? Any help is greatly appreciated.

Thank you!
Chris Stackpole
[R] Convert list with missing values to dataFrame
I have a dataFrame

sID <- c("a", "1,2,3", "b", "4,5,6")
rID <- c("shr1125", "bwr331", "bwr330", "vjhr1022")
tmp <- data.frame(cbind(sID, rID))

but I need to split tmp$sID into three different columns, filling locations where tmp$sID has only one value with NA. I can split tmp$sID by the comma

tmp.1 <- strsplit(tmp$sID, ",")

but I can't figure out how to convert the resulting list into a dataFrame. Ideally, tmp will become four columns wide, something like

sID.a sID.b sID.c rID
NA    NA    a     shr1125
1     2     3     bwr331
NA    NA    b     bwr330
4     5     6     vjhr1022

Thoughts or suggestions? I tried

havecomma <- grep(',', tmp$sID)
for (i in 1:nrow(tmp)) {
  if (!(tmp[i,] %in% havecomma)) {
    tmp$sID[i] <- paste(', ,', tmp$sID[i], sep="")
  }
}

and thought that I might be able to force the list into a dataframe once each component had three items, but it just seemed to apply the paste() function to everything, which gave me a list with varying numbers of items. I'm stuck.

Thanks for your help -

SR
Steven H. Ranney
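One possible way to get from the strsplit() list to the four-column layout described above is to pad each piece on the left with NA and bind the pieces row-wise. This is a sketch, not an answer taken from the thread: the data frame is built with stringsAsFactors = FALSE so that sID is character, and the names parts, width, padded and result are mine.

```r
tmp <- data.frame(
  sID = c("a", "1,2,3", "b", "4,5,6"),
  rID = c("shr1125", "bwr331", "bwr330", "vjhr1022"),
  stringsAsFactors = FALSE
)

# One character vector per row, of length 1 or 3 in this example
parts <- strsplit(tmp$sID, ",")
width <- max(lengths(parts))

# Left-pad shorter pieces with NA so every piece has the same length,
# then turn the result into a matrix with one row per original row
padded <- t(sapply(parts, function(p) {
  c(rep(NA_character_, width - length(p)), p)
}))
colnames(padded) <- paste("sID", letters[seq_len(width)], sep = ".")

result <- data.frame(padded, rID = tmp$rID, stringsAsFactors = FALSE)
print(result)
```

All sID columns stay character here because "a" and "b" share a column with "3" and "6"; converting to numeric would turn the letters into NA.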
Re: [R] Memory limit on Linux?
On 08/13/2013 03:06 PM, Stackpole, Chris wrote:

I could be mistaken, but here are the differences I see in the ulimits:

pending signals: shouldn't matter
max locked memory: The Scientific/CentOS system is higher so I don't think this is it.
stack size: Again, higher on Scientific/CentOS.
max user processes: Seems high to me, but I don't see how this is capping a memory limit.

Am I missing something? Any help is greatly appreciated.

Thank you!
Chris Stackpole

It appears that at the shell level, the differences are not to blame. It has been a long time, but years ago in HP-UX, we needed to change an actual kernel parameter (this was for S-Plus 5 rather than R back then). Despite the ulimits being acceptable, there was a hard limit in the kernel. I don't know whether such things have been (or can be) built in to your problem machine. If it is a multiuser box, it could be that limits have been set to prevent a user from gobbling up all the memory.

The other thing to check is if R has/can be compiled with memory limits.

Sorry I can't be of more help.

--
Kevin E. Thorpe
Head of Biostatistics, Applied Health Research Centre (AHRC)
Li Ka Shing Knowledge Institute of St. Michael's
Assistant Professor, Dalla Lana School of Public Health
University of Toronto
email: kevin.tho...@utoronto.ca Tel: 416.864.5776 Fax: 416.864.3016
Re: [R] Problem with zero-inflated negative binomial model in sediment river dynamics
Lauria:

For historical reasons, the logistic regression (binomial with logit link) portion of a zero-inflated count model is usually structured to predict the probability of the 0 counts rather than the nonzero (=1) counts, so the coefficients will be the negative of what you expect based on the count model portion (as in your output). It is simple to interpret the logistic regression portion as the probability of the nonzero counts by just taking the negative of the coefficient estimates provided for the probability of the zero counts.

Brian

Brian S. Cade, PhD
U. S. Geological Survey
Fort Collins Science Center
2150 Centre Ave., Bldg. C
Fort Collins, CO 80526-8818
email: ca...@usgs.gov brian_c...@usgs.gov
tel: 970 226-9326

On Tue, Aug 13, 2013 at 9:06 AM, Lauria, Valentina valentina.lau...@nuigalway.ie wrote:

My main problem as I am new to these type of models is that I get a different sign for the coefficent of discharge in the output of the zero-inflated negative binomial model (please see below). What does this mean?
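The sign flip described in the reply can be checked numerically with the coefficients from the quoted zeroinfl() output. This is only an illustrative sketch: the discharge values are invented, the variable names are mine, and the zero-inflation part is read as the probability of belonging to the excess-zero component.

```r
# Coefficients copied from the model output quoted in this thread
zero_coef  <- c(intercept = 13.01011, discharge = -1.64293)  # logit link
count_coef <- c(intercept = 2.557066, discharge = 0.064698)  # log link

# Hypothetical discharge values, chosen only for illustration
discharge <- c(5, 8, 11)

# Linear predictor and probability of an excess zero
eta_zero      <- zero_coef[["intercept"]] + zero_coef[["discharge"]] * discharge
p_excess_zero <- plogis(eta_zero)

# Negating the coefficients gives the complementary probability directly
p_not_excess <- plogis(-eta_zero)

# Mean of the negative binomial count component (log link)
mu_count <- exp(count_coef[["intercept"]] + count_coef[["discharge"]] * discharge)

# Overall expected count of a zero-inflated model: (1 - pi) * mu
expected <- p_not_excess * mu_count

print(data.frame(discharge, p_excess_zero, p_not_excess, mu_count, expected))
```

Read this way, the two signs tell the same story: as discharge rises, excess zeros become less likely and the count mean grows.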
Re: [R] Create rows for columns in dataframe
You could also try:

##Out1 is the output dataset
Out1$PRIMAIRY <- as.logical(Out1$PRIMAIRY) #changing the class
#dat1 input dataset
vec1 <- paste(dat1[,1], dat1[,2], colnames(dat1)[2], sep=".")
res2 <- reshape(dat1, idvar="newCol", varying=list(2:26), direction="long")
res3 <- res2[order(res2[,4]),]
res4 <- res3[res3[,3]!="", -4]
vec2 <- paste(res4[,1], res4[,3], paste0("C", res4[,2]), sep=".")
res4$PRIMAIRY <- vec2 %in% vec1
row.names(res4) <- 1:nrow(res4)
res4$ID <- row.names(res4)
res4[,c(1,3)] <- lapply(res4[,c(1,3)], as.character)
res5 <- res4[,c(5,1,3,4)]
colnames(res5)[3] <- "CODE"
identical(res5, Out1)
#[1] TRUE

A.K.

----- Original Message -----
From: arun smartpink...@yahoo.com
To: R help r-help@r-project.org
Sent: Tuesday, August 13, 2013 2:45 PM
Subject: Re: [R] Create rows for columns in dataframe

According to your first post,

NewDataFrame <- data.frame(ID=integer(), DSYSRTKY=integer(), CODE=character(), PRIMAIRY=logical())

The new output dataset: Out1

str(Out1)
'data.frame': 48 obs. of 4 variables:
 $ ID      : chr "1" "2" "3" "4" ...
 $ DSYSRTKY: chr "10005" "10005" "10005" "10005" ...
 $ CODE    : chr "71535" "78900" "V1251" "V454" ...
 $ PRIMAIRY: chr "TRUE" "FALSE" "FALSE" "FALSE" ...

I guess you wanted DSYSRTKY to be numeric and PRIMAIRY to be logical.

res1 <- do.call(rbind, lapply(seq_len(nrow(dat1)), function(i) {
  x1 <- as.character(unlist(dat1[i,-1]))
  CODE <- x1[x1 != ""]
  PRIMAIRY <- x1[x1 != ""] == head(x1, 1)
  DSYSRTKY = as.numeric(as.character(dat1[i,1]))
  data.frame(DSYSRTKY, CODE, PRIMAIRY, stringsAsFactors=FALSE)
}))
res1$ID <- row.names(res1)
res2 <- res1[,c(4,1:3)]
str(res2)
#'data.frame': 48 obs. of 4 variables:
# $ ID      : chr "1" "2" "3" "4" ...
# $ DSYSRTKY: num 1e+08 1e+08 1e+08 1e+08 1e+08 ...
# $ CODE    : chr "71535" "78900" "V1251" "V454" ...
# $ PRIMAIRY: logi TRUE FALSE FALSE FALSE TRUE FALSE ...
head(res2) # ID DSYSRTKY CODE PRIMAIRY #1 1 10005 71535 TRUE #2 2 10005 78900 FALSE #3 3 10005 V1251 FALSE #4 4 10005 V454 FALSE #5 5 10203 45341 TRUE #6 6 10203 4019 FALSE head(Out1) # ID DSYSRTKY CODE PRIMAIRY #1 1 10005 71535 TRUE #2 2 10005 78900 FALSE #3 3 10005 V1251 FALSE #4 4 10005 V454 FALSE #5 5 10203 45341 TRUE #6 6 10203 4019 FALSE A.K. - Original Message - From: Dark i...@software-solutions.nl To: r-help@r-project.org Cc: Sent: Tuesday, August 13, 2013 12:16 PM Subject: Re: [R] Create rows for columns in dataframe Hi, My desired output for my sample!! using dput(): structure(list(ID = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48), DSYSRTKY = c(10005, 10005, 10005, 10005, 10203, 10203, 10203, 10203, 10315, 10315, 10315, 10315, 10315, 10315, 10315, 10315, 10315, 10315, 10315, 10315, 10315, 10315, 10315, 10315, 10315, 10315, 10315, 10315, 10327, 10327, 10327, 10327, 10327, 10327, 10327, 10327, 10327, 10327, 10327, 10327, 10327, 10327, 10327, 10327, 10327, 10327, 10327, 10327), CODE = c(71535, 78900, V1251, V454, 45341, 4019, 72400, V1011, 42831, 5990, 8052, 4241, 4019, 311, 2724, 71680, 4168, 7804, V066, 6930, 41400, V4581, 40291, 4280, 5990, V4986, 5939, 3109, 41401, 6826, 7850, 4019, 2720, 49390, 2859, 79029, V1582, 486, 51881, 5119, 42789, 7823, 41400, V4581, 40390, 5859, 49390, 2724), PRIMAIRY = c(TRUE, FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE)), .Names = c(ID, DSYSRTKY, CODE, PRIMAIRY), row.names = c(NA, 48L), class = data.frame) So the 'DSYSRTKY' (10005) has 4 code fields filled so you get 4 rows. The next one also 4, the third one 16. 
Anyway, just take a look at the sample. I think this will help trying to make clear what my desired result is!

Regards
Derk

-- View this message in context: http://r.789695.n4.nabble.com/Create-rows-for-columns-in-dataframe-tp4673607p4673646.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] Convert list with missing values to dataFrame
Try,

sID <- c("a", "1,2,3", "b", "4,5,6")
tmp1 <- strsplit(sID, ',')
tmp2 <- lapply(tmp1, function(x) if (length(x)==1) c('','',x) else x )
tmp3 <- matrix(unlist(tmp2), ncol=3, byrow=TRUE)
rID <- c("shr1125", "bwr331", "bwr330", "vjhr1022")
newdf <- data.frame(cbind(tmp3, rID))

You'll need to name the first three columns.

As an aside, note that you don't need the cbind in your data.frame(cbind(sID,rID)) because data.frame(sID,rID) does just as well. But cbind is needed in my example, because tmp3 is a matrix.

-Don

-- Don MacQueen Lawrence Livermore National Laboratory 7000 East Ave., L-627 Livermore, CA 94550 925-423-1062

On 8/13/13 12:09 PM, Steven Ranney steven.ran...@gmail.com wrote:

I have a dataFrame

sID <- c("a", "1,2,3", "b", "4,5,6")
rID <- c("shr1125", "bwr331", "bwr330", "vjhr1022")
tmp <- data.frame(cbind(sID,rID))

but I need to split tmp$sID into three different columns, filling locations where tmp$sID has only one value with NA. I can split tmp$sID by the comma

tmp.1 <- strsplit(tmp$sID, ",")

but I can't figure out how to convert the resulting list into a dataFrame. Ideally, tmp will become four columns wide, something like

sID.a sID.b sID.c rID
NA    NA    a     shr1125
1     2     3     bwr331
NA    NA    b     bwr330
4     5     6     vjhr1022

Thoughts or suggestions? I tried

havecomma <- grep(',', tmp$sID)
for( i in 1:nrow(tmp)){
  if (!(tmp[i,] %in% havecomma)){
    tmp$sID[i] <- paste(', ,', tmp$sID[i], sep="")
  }
}

and thought that I might be able to force the list into a dataframe once each component had three items, but it just seemed to apply the paste() function to everything which gave me a list with varying numbers of items. I'm stuck.

Thanks for your help -

SR
Steven H. Ranney
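For readers following this thread, the padding idea can be sketched in base R without hard-coding the number of blanks: split each string, left-pad the pieces with NA to a common width, and bind the rows. This is only a sketch; the helper name `pad_left` and the generated column names are illustrative choices, not something from the posts above.

```r
sID <- c("a", "1,2,3", "b", "4,5,6")
rID <- c("shr1125", "bwr331", "bwr330", "vjhr1022")

# Hypothetical helper: left-pad a character vector with NA up to length n
pad_left <- function(x, n) c(rep(NA_character_, n - length(x)), x)

parts  <- strsplit(sID, ",", fixed = TRUE)
width  <- max(lengths(parts))            # widest entry decides the column count
padded <- do.call(rbind, lapply(parts, pad_left, n = width))
colnames(padded) <- paste("sID", letters[seq_len(width)], sep = ".")

newdf <- data.frame(padded, rID = rID, stringsAsFactors = FALSE)
newdf
```

The sID columns come out as character; wrap them in as.numeric() or type.convert() only if every non-missing entry is known to be a number, which is not the case in this sample.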
[R] post-hoc test for aovp() function
Hello. I am using the aovp() function from the library lmPerm with one factor (group: 3 levels) controlling for 2 covariates. I now want to conduct a post-hoc test using the same model. Unfortunately, I did not find an appropriate test which works with 2 covariates. I would be grateful for any suggestions. Thank you. Julia

-- View this message in context: http://r.789695.n4.nabble.com/post-hoc-test-for-aovp-function-tp4673672.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] Regular repeats
On 13-08-2013, at 18:46, jsf1982 jamie.free...@ucl.ac.uk wrote:

Hi, Many apologies for the simplicity (hopefully!) of this request - I can't find it on the forum, but it may have been asked in the past. I have a data frame consisting of ~2000 rows. I simply want to take the average of the first 6, then the next 6, then the next 6 until the end of the table. The command mean(mole[1:6,c("PercentPI")]) gets me the first 6 rows (column is PercentPI), but I don't know how to increase the rows incrementally.

Something like this

N <- 27
dd <- data.frame(A=rnorm(1:N), index=gl(6,6,N))
aggregate(dd$A, by=list(dd$index), mean)

Berend
Re: [R] Regular repeats
Hi,

You could try:

set.seed(24)
dat1 <- as.data.frame(matrix(sample(1:50,29*6,replace=TRUE),ncol=6))
((seq_len(nrow(dat1))-1)%/%6)+1
# [1] 1 1 1 1 1 1 2 2 2 2 2 2 3 3 3 3 3 3 4 4 4 4 4 4 5 5 5 5 5

#For a particular column:
aggregate(dat1[,5],list(((seq_len(nrow(dat1))-1)%/%6)+1),FUN=mean)
#  Group.1 x
#1 1 38.16667
#2 2 29.5
#3 3 23.16667
#4 4 21.16667
#5 5 20.6

#or for the whole columns
aggregate(dat1,list(((seq_len(nrow(dat1))-1)%/%6)+1),FUN=mean)
#  Group.1 V1 V2 V3 V4 V5 V6
#1 1 28.3 17.5 12.7 35.0 38.16667 30.16667
#2 2 26.16667 31.3 35.3 19.7 29.5 24.8
#3 3 24.0 11.8 20.0 25.5 23.16667 20.8
#4 4 18.3 23.3 23.7 20.3 21.16667 21.16667
#5 5 22.6 30.4 17.4 21.8 20.6 24.4

#or
library(plyr)
res1 <- ddply(dat1,.(((seq_len(nrow(dat1))-1)%/%6)+1),summarize,MeanV1=mean(V1))
colnames(res1)[1] <- "Group"
res1
#  Group MeanV1
#1 1 28.3
#2 2 26.16667
#3 3 24.0
#4 4 18.3
#5 5 22.6

A.K.

- Original Message -
From: jsf1982 jamie.free...@ucl.ac.uk
To: r-help@r-project.org
Sent: Tuesday, August 13, 2013 12:46 PM
Subject: [R] Regular repeats

Hi, Many apologies for the simplicity (hopefully!) of this request - I can't find it on the forum, but it may have been asked in the past. I have a data frame consisting of ~2000 rows. I simply want to take the average of the first 6, then the next 6, then the next 6 until the end of the table. The command mean(mole[1:6,c("PercentPI")]) gets me the first 6 rows (column is PercentPI), but I don't know how to increase the rows incrementally. Thanks in advance. J

-- View this message in context: http://r.789695.n4.nabble.com/Regular-repeats-tp4673653.html Sent from the R help mailing list archive at Nabble.com.
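The integer-division trick used in this thread can also be sketched with tapply() for a single column. The names `mole` and `PercentPI` come from the original question; the data here are made up, and 29 rows are used on purpose so that the last block is incomplete.

```r
set.seed(1)
# Stand-in data: 29 rows, so the last block holds only 5 rows
mole <- data.frame(PercentPI = runif(29))

# Block index: rows 1-6 -> 1, rows 7-12 -> 2, and so on
block <- (seq_len(nrow(mole)) - 1) %/% 6 + 1

# One mean per block of (up to) 6 consecutive rows
block_means <- tapply(mole$PercentPI, block, mean)
block_means
```

Unlike gl(6, 6, N), which cycles through only six levels, this index keeps growing with the number of rows, so it should carry over to a table of about 2000 rows unchanged.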
Re: [R] Convert list with missing values to dataFrame
Hi,

You could try:

tmp[,1] <- as.character(tmp[,1])
tmp[,1][-grep(",",tmp[,1])] <- paste0(",,",tmp[,1][-grep(",",tmp[,1])])
tmp2 <- data.frame(read.table(text=tmp[,1],sep=",",header=FALSE,stringsAsFactors=FALSE),rID=tmp[,2],stringsAsFactors=FALSE)
colnames(tmp2)[1:3] <- paste("sID",letters[1:3],sep=".")
tmp2
#  sID.a sID.b sID.c rID
#1 NA NA a shr1125
#2 1 2 3 bwr331
#3 NA NA b bwr330
#4 4 5 6 vjhr1022

BTW,

data.frame(sID,rID,stringsAsFactors=FALSE) #cbind is not needed. In this case, it is okay,
#  sID rID
#1 a shr1125
#2 1,2,3 bwr331
#3 b bwr330
#4 4,5,6 vjhr1022

#But if they were of different class:
str(data.frame(cbind(sID,Col2=1:4),stringsAsFactors=FALSE))
#'data.frame': 4 obs. of 2 variables:
# $ sID : chr "a" "1,2,3" "b" "4,5,6"
# $ Col2: chr "1" "2" "3" "4"
str(data.frame(sID,Col2=1:4,stringsAsFactors=FALSE))
#'data.frame': 4 obs. of 2 variables:
# $ sID : chr "a" "1,2,3" "b" "4,5,6"
# $ Col2: int 1 2 3 4

A.K.

- Original Message -
From: Steven Ranney steven.ran...@gmail.com
To: r-help@r-project.org
Sent: Tuesday, August 13, 2013 3:09 PM
Subject: [R] Convert list with missing values to dataFrame

I have a dataFrame

sID <- c("a", "1,2,3", "b", "4,5,6")
rID <- c("shr1125", "bwr331", "bwr330", "vjhr1022")
tmp <- data.frame(cbind(sID,rID))

but I need to split tmp$sID into three different columns, filling locations where tmp$sID has only one value with NA. I can split tmp$sID by the comma

tmp.1 <- strsplit(tmp$sID, ",")

but I can't figure out how to convert the resulting list into a dataFrame. Ideally, tmp will become four columns wide, something like

sID.a sID.b sID.c rID
NA    NA    a     shr1125
1     2     3     bwr331
NA    NA    b     bwr330
4     5     6     vjhr1022

Thoughts or suggestions?
I tried

havecomma <- grep(',', tmp$sID)
for( i in 1:nrow(tmp)){
  if (!(tmp[i,] %in% havecomma)){
    tmp$sID[i] <- paste(', ,', tmp$sID[i], sep="")
  }
}

and thought that I might be able to force the list into a dataframe once each component had three items, but it just seemed to apply the paste() function to everything which gave me a list with varying numbers of items. I'm stuck.

Thanks for your help -

SR
Steven H. Ranney
[R] Runtime error in R
Hi everyone:

I am running some code in R and I get the following message when using large files (larger than 2 GB):

Runtime error! This application has requested the Runtime to terminate it in an unusual way. Please contact the application's support team for more information.

Another person posted a similar situation years back involving a large data.table, but no solution was proposed. Has anyone come across this problem? Is there a fix?

In 64-bit R, the message appears immediately after running the code. In 32-bit R, the code starts but then reaches the RAM limit. This makes me suspect that the error in 64-bit R is related to a default limit on how big the files can be. Any help will be highly appreciated.

Cheers,
Camilo

Camilo Mora, Ph.D. Department of Geography, University of Hawaii Currently available in Colombia Phone: Country code: 57 Provider code: 313 Phone 776 2282 From the USA or Canada you have to dial 011 57 313 776 2282 http://www.soc.hawaii.edu/mora/
[R] Regression of categorical data
I have a set of survey data in which respondents state their preference among three categories through three questions:

1) a or b?
2) b or c?
3) a or c?

I want to obtain a weight for each category, something like X(a) + Y(b) + Z(c) = 100%. I am at a loss as to how to calculate this from the data. Any help would be appreciated!
Re: [R] Runtime error in R
It would seem that in going to a 64 bit architecture you have not escaped your memory problems. Such problems are highly varied in details, so you would need to be much more specific about how you are encountering this problem before anyone could help. Read the Posting Guide and make a reproducible example and provide the output of sessionInfo just before the problem occurs.

Note that operating-system-specific solutions are sometimes necessary, but algorithmic solutions are usually the most powerful at scaling to larger sizes... that is, change your code or use a different tool for part or all of the work. It can become crucial to understand every step of the processing you are doing in such cases.

---
Jeff Newmiller
DCN: jdnew...@dcn.davis.ca.us
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.

Camilo Mora cm...@dal.ca wrote:

Hi everyone: I am running a code in R and I get the following message after using large files (files larger than 2GB): Runtime error! this application has requested the Runtime to terminate it in an usual way. Please contact the application's support team for more information Another person posted a similar situation years back with the use of a large data.table. but no solution was proposed. Does anyone has come across this problem? is there a fix?. in R at 64bit, the message appear immediately after running the code. At 32bit, the code start but I run into the issue that the code reaches the RAM limit. This makes me to suspect that the error in 64bit is related to a default value on how big the files could be. Anyway, any help will be highly appreciated. Cheers, Camilo Camilo Mora, Ph.D.
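To illustrate the "change your code" advice in general terms, here is a minimal sketch of processing a delimited file in chunks through a connection, so that only a few rows are in memory at a time. It does not diagnose the runtime error reported above; the file, its columns (`id`, `value`) and the chunk size are all made up for the illustration.

```r
# Write a small stand-in file; in practice `path` would point to the large file
path <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:1000, value = (1:1000) / 10), path, row.names = FALSE)

chunk_size <- 250   # rows per chunk (assumed value; tune to the available RAM)
con <- file(path, open = "r")

# The first line holds the column names; strip the quotes write.csv adds
header <- gsub('"', "", strsplit(readLines(con, n = 1), ",")[[1]])

total  <- 0
n_rows <- 0
repeat {
  lines <- readLines(con, n = chunk_size)
  if (length(lines) == 0) break          # end of file reached
  chunk <- read.csv(text = lines, header = FALSE, col.names = header)
  total  <- total + sum(chunk$value)     # keep only running summaries
  n_rows <- n_rows + nrow(chunk)
}
close(con)

total / n_rows   # mean of `value`, computed without loading the whole file
```

Whether this helps depends on the analysis: it works when the result can be accumulated chunk by chunk, which is the kind of algorithmic change the reply is pointing at.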
Re: [R] How to store and manipulate survey data like this?
On 08/13/2013 11:41 AM, Siraaj Khandkar wrote:

On 08/13/2013 12:17 PM, Walter Anderson wrote:

I have to process a set of survey data with questions that are formatted like this;

1) Pick your top three breeds (pick 3)
1 Rottweiler
2 Pit Bull
3 German Shepard
4 Poodle
5 Border Collie
6 Dalmation
7 Mixed Breed

and the answers are formatted like this:

Respondent, Question1
1, 1,4,7
2, 2,7,5
3, 6,3,5
4, ...

Any suggestions on how to preprocess the file to be able to do things like frequency analysis for breeds?

Here's how I would get started:

survey <- read.csv("survey.csv", as.is=TRUE)
survey
  Respondent Question1
1          1     1,4,7
2          2     2,7,5
3          3     6,3,5
4          4

TripleOrNAs <- function(x) {if (length(x) == 3) x else c(NA, NA, NA)}
options <- lapply(strsplit(survey$Question1, ","), TripleOrNAs)
options <- matrix(unlist(options), ncol=3, byrow=TRUE)
survey2 <- cbind(survey, options)
names(survey2) <- c(names(survey), paste("Q1.Opt", 1:3, sep="."))
survey2
  Respondent Question1 Q1.Opt.1 Q1.Opt.2 Q1.Opt.3
1          1     1,4,7        1        4        7
2          2     2,7,5        2        7        5
3          3     6,3,5        6        3        5
4          4               <NA>     <NA>     <NA>

Thank you!
Re: [R] How to store and manipulate survey data like this?
Hi, You could try: dat2- read.table(text=' Respondent, Question1 1, 1,4,7 2, 2,7,5 3, 6,3,5 4, ',sep=,,header=TRUE,stringsAsFactors=FALSE) library(stringr) dat2New-cbind(dat2,do.call(rbind,lapply( str_split(str_trim(dat2[,2]),,),as.numeric))) colnames(dat2New)[3:5]- paste(Q1,colnames(dat2New)[3:5],sep=.) dat2New # Respondent Question1 Q1.1 Q1.2 Q1.3 #1 1 1,4,7 1 4 7 #2 2 2,7,5 2 7 5 #3 3 6,3,5 6 3 5 #4 4 NA NA NA A.K. - Original Message - From: Walter Anderson wandrso...@gmail.com To: r-help@r-project.org Cc: Sent: Tuesday, August 13, 2013 12:17 PM Subject: [R] How to store and manipulate survey data like this? I have to process a set of survey data with questions that are formatted like this; 1) Pick your top three breeds (pick 3) 1 Rottweiler 2 Pit Bull 3 German Shepard 4 Poodle 5 Border Collie 6 Dalmation 7 Mixed Breed and the answers are formatted like this: Respondent, Question1 1, 1,4,7 2, 2,7,5 3, 6,3,5 4, ... Any suggestions on how to preprocess the file to be able to do things like frequency analysis for breeds? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
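Neither reply gets as far as the frequency analysis the question asks about. One possible sketch in base R: pool all the picked codes and tabulate them against the breed list. The data frame below is typed in by hand from the sample in the question (the real file layout may differ), and the breed spellings are kept as posted.

```r
breeds <- c("Rottweiler", "Pit Bull", "German Shepard", "Poodle",
            "Border Collie", "Dalmation", "Mixed Breed")

# Hand-typed copy of the sample answers; respondent 4 gave no answer
survey <- data.frame(Respondent = 1:4,
                     Question1  = c("1,4,7", "2,7,5", "6,3,5", ""),
                     stringsAsFactors = FALSE)

# One code per element; an empty answer contributes nothing
codes <- as.integer(unlist(strsplit(survey$Question1, ",", fixed = TRUE)))

# Label the codes so that breeds nobody picked still appear with count 0
picks <- factor(codes, levels = seq_along(breeds), labels = breeds)
freq  <- table(picks)
freq
```

Dividing by the number of respondents who answered, for example freq / sum(survey$Question1 != ""), would turn the counts into the share of respondents picking each breed.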
Re: [R] Understanding S4 method dispatch
Hi Hadley,

I suspect that the dispatch algorithm doesn't realize that selection is ambiguous in your example. For 2 reasons:

(1) When it does realize it, it notifies the user:

setClass("A", NULL)
setGeneric("f", function(x, y) standardGeneric("f"))
setMethod("f", signature("A", "ANY"), function(x, y) "A-ANY")
setMethod("f", signature("ANY", "A"), function(x, y) "ANY-A")
a <- new("A")

Then:

f(a, a)
Note: method with signature ‘A#ANY’ chosen for function ‘f’, target signature ‘A#A’. "ANY#A" would also be valid
[1] "A-ANY"

(2) When dispatch is ambiguous, the first method lexicographically in the ordering should be selected (according to ?Methods). So it should be A#A, not B#B.

So it looks like a bug to me...

Cheers,
H.

On 08/13/2013 06:08 AM, Hadley Wickham wrote:

Hi all,

Any insight into the code below would be appreciated - I don't understand why two methods which I think should have equal distance from the call don't.

Thanks!
Hadley

# Create simple class hierarchy
setClass("A", NULL)
setClass("B", "A")
a <- new("A")
b <- new("B")

setGeneric("f", function(x, y) standardGeneric("f"))
setMethod("f", signature("A", "A"), function(x, y) "A-A")
setMethod("f", signature("B", "B"), function(x, y) "B-B")

# These work as I expect
f(a, a)
f(b, b)

setClass("AB", contains = c("A", "B"))
ab <- new("AB")

# Why does this return B-B? Shouldn't both methods be an equal distance?
f(ab, ab)

# These both return distance 1, as I expected
extends("AB", "A", fullInfo=TRUE)@distance
extends("AB", "B", fullInfo=TRUE)@distance
# So why is signature("B", "B") closer than signature("A", "A")

-- Hervé Pagès Program in Computational Biology Division of Public Health Sciences Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N, M1-B514 P.O. Box 19024 Seattle, WA 98109-1024 E-mail: hpa...@fhcrc.org Phone: (206) 667-5791 Fax: (206) 667-1319
Re: [R] Lattice: bwplot - changing box colors in legend and plot when using panel.groups = function... and panel = panel.superpose
I had a similar problem and found when looking inside one of the lattice functions that the legend colours are controlled by the superpose series, e.g. superpose.line, superpose.polygon etc. in trellis.par.set/get or par.settings

Duncan

Duncan Mackay Department of Agronomy and Soil Science University of New England Armidale NSW 2351 Email: home: mac...@northnet.com.au

At 02:28 14/08/2013, you wrote:

I think I understand your question. You need to make sure that you are setting the right parameters in your theme. Use trellis.par.get() to have a look at the MANY possible settings. For example, in your case, to have the boxplots and rectangles be the same color:

my.theme <- list(
  box.umbrella = list(col = "black"),
  box.rectangle = list(fill = rep(c("black", "black"),2)),
  box.dot = list(col = "black", pch = 3, cex=2),
  plot.symbol = list(cex = 1, col = 1, pch= 0), #outlier size and color
  par.xlab.text = font.settings,
  par.ylab.text = font.settings,
  axis.text = font.settings,
  #strip.shingle=list(col=c("red","blue")),
  superpose.symbol=list(fill=c("red","blue")), # boxplots
  #superpose.fill=list(col=c("red","blue")),
  superpose.polygon=list(col=c("red","blue")), # legend
  par.sub=font.settings)

Kevin Wright

On Tue, Aug 13, 2013 at 9:00 AM, Anna Zakrisson Braeunlich anna.zakris...@su.se wrote:

Hi, Yes, I have searched stack overflow. My issue is to simply change coloring in boxes and legend in my bwplot. I have done this many times in lattice, but now I have been tweaking the plot somewhat and I can no longer apply the color changes. I would really appreciate some help. A.
Zakrisson

Here is some dummy data and my script:

mydata <- data.frame(factor1 = factor(rep(LETTERS[1:6], each = 80)),
                     factor2 = factor(rep(c(1:2), each = 16)),
                     var1 = rnorm(120, mean = rep(c(0, 3, 5), each = 40), sd = rep(c(1, 2, 3), each = 20)))

font.settings <- list( font = 1, cex = 1, fontfamily = "serif")

my.theme <- list(
  box.umbrella = list(col = "black"),
  box.rectangle = list(fill = rep(c("black", "black"),2)),
  box.dot = list(col = "black", pch = 3, cex=2),
  plot.symbol = list(cex = 1, col = 1, pch= 0), #outlier size and color
  par.xlab.text = font.settings,
  par.ylab.text = font.settings,
  axis.text = font.settings,
  par.sub=font.settings)

bwplot(var1 ~ factor1, data = mydata, groups = factor2,
       box.width = 1/3, #width of the boxes
       auto.key = list(points = FALSE, rectangles = TRUE, space = "right", title="Year", cex.title=1),
       panel = panel.superpose,
       ylab = "var1", xlab="factor1",
       par.settings = my.theme,
       panel.groups = function(x, y, ..., group.number) {
         panel.bwplot(x + (group.number-1.8)/3, y, ...)
       })

Anna Zakrisson Braeunlich PhD student Department of Ecology, Environment and Plant Sciences Stockholm University Svante Arrheniusv. 21A SE-106 91 Stockholm Sweden/Sverige Lives in Berlin. For paper mail: Katzbachstr. 21 D-10965, Berlin - Kreuzberg Germany/Deutschland E-mail: anna.zakris...@su.se Tel work: +49-(0)3091541281 Mobile: +49-(0)15777374888 LinkedIn: http://se.linkedin.com/pub/anna-zakrisson-braeunlich/33/5a2/51b
-- Kevin Wright
Re: [R] help with adding SD to graph
Among the many solutions, here is the one using phenology package:

library(phenology)
plot_errbar(1:100, rnorm(100, 1, 2), xlab="axe x", ylab="axe y", bty="n", xlim=c(1,100), errbar.x=2, errbar.y=rnorm(100, 1, 0.1))

or

x <- 1:100
plot_errbar(x=1:100, rnorm(100, 1, 2), xlab="axe x", ylab="axe y", bty="n", xlim=c(1,100), x.minus=x-2, x.plus=x+2)

Sincerely
Marc Girondot

On 12/08/13 15:41, Hedera wrote:

Hello, I really need help, I am completely new in using R and many things were possible to figure out but not my last problem. I created a dotchart with the dotchart2 command. On the y axis are my 16 groups and plotted is the mean of the data from each group. And now I want to add the SD for every mean data point. Sure, I can calculate the SD (with removing the NAs that are present in my data by na.rm=TRUE) but I don't know how to combine the graph with the SD. I found the arrows function but it looks weird when I try to plot:

arrows(x0=m,y0=c(1:16),y1=c(1:16)-1,x1=m+1 length=0)

- m is the calculated mean of my data with removed NAs and 167 because of the 16 treatments

Has somebody please, please any (easy) suggestion how to do it? I hope to get some help. Thank you very much for reading this.

-- View this message in context: http://r.789695.n4.nabble.com/help-with-adding-SD-to-graph-tp4673558.html Sent from the R help mailing list archive at Nabble.com.
-- __ Marc Girondot, Pr Laboratoire Ecologie, Systématique et Evolution Equipe de Conservation des Populations et des Communautés CNRS, AgroParisTech et Université Paris-Sud 11 , UMR 8079 Bâtiment 362 91405 Orsay Cedex, France Tel: 33 1 (0)1.69.15.72.30 Fax: 33 1 (0)1.69.15.73.53 e-mail: marc.giron...@u-psud.fr Web: http://www.ese.u-psud.fr/epc/conservation/Marc.html Skype: girondot
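For the arrows() route the original poster was attempting, a minimal base-R sketch might look like the following. It swaps in base dotchart() for dotchart2 (which comes from the Hmisc package), and the data, group count and column names are invented for the illustration. The key point is that the bar for each group runs horizontally from mean - SD to mean + SD at that group's y position.

```r
set.seed(42)
# Made-up data: 16 groups, 10 observations each, with a few NAs
dat <- data.frame(group = factor(rep(1:16, each = 10)),
                  value = rnorm(160, mean = rep(1:16, each = 10)))
dat$value[c(3, 57, 120)] <- NA

m <- tapply(dat$value, dat$group, mean, na.rm = TRUE)  # group means
s <- tapply(dat$value, dat$group, sd,   na.rm = TRUE)  # group SDs

pdf(NULL)  # null device so the sketch also runs non-interactively; drop to see the plot
# Widen the x axis so the bars fit; ungrouped dotchart() puts group i at y = i
dotchart(as.numeric(m), labels = names(m), xlim = range(m - s, m + s),
         xlab = "mean +/- SD")
arrows(x0 = m - s, y0 = seq_along(m), x1 = m + s, y1 = seq_along(m),
       angle = 90, code = 3, length = 0.03)  # flat caps at both ends
dev.off()
```

In the call from the question, y1 differs from y0 and x1 is m+1, which draws slanted segments of fixed length rather than horizontal bars scaled by the SD; that is probably why the result looked odd.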
Re: [R] grImport/ghostscript problems
Hi Listers

I have been trying to import a .ps graphic file into R using the grImport package but I keep getting the following error message

Error in PostScriptTrace("fish.ps") : status 127 in running command 'gswin32c.exe -q -dBATCH -dNOPAUSE -sDEVICE=pswrite -sOutputFile=C:\Users\ahalford\AppData\Local\Temp\Rtmp6BOVDe\fileffc30613d6 -sstdout=fish.ps.xml capturefish.ps'

Any advice appreciated.

Andy

-- Andrew Halford Ph.D Adjunct Research Scientist University of Guam Curtin University Ph: +61 (0) 468 419 473
Re: [R] grImport/ghostscript problems
Hello,

What is the result of sessionInfo()?

Regards,
Pascal

2013/8/14 Andrew Halford andrew.half...@gmail.com

Hi Listers I have been trying to import a .ps graphic file into R using the grImport package but I keep getting the following error message Error in PostScriptTrace(fish.ps) : status 127 in running command 'gswin32c.exe -q -dBATCH -dNOPAUSE -sDEVICE=pswrite -sOutputFile=C:\Users\ahalford\AppData\Local\Temp\Rtmp6BOVDe\fileffc30613d6 -sstdout=fish.ps.xml capturefish.ps' Any advice appreciated. Andy -- Andrew Halford Ph.D Adjunct Research Scientist University of Guam Curtin University Ph: +61 (0) 468 419 473
[R] Grap Element from Web Page
Dear R Helpers,

I would like to pull the CIK number from the web page

http://www.sec.gov/cgi-bin/browse-edgar?CIK=MSFT&Find=Search&owner=exclude&action=getcompany

If you put this web page into your browser you will see the CIK number in red on the left side of the page near the top. When I try the basic

require(scrapeR)
require(XML)
require(RCurl)
doc <- htmlTreeParse("http://www.sec.gov/cgi-bin/browse-edgar?CIK=MSFT&Find=Search&owner=exclude&action=getcompany")
str(doc)

I get a large number of items in the data frame that I don't know how to interpret. Both

tables <- readHTMLTable(doc)

and

list <- xmlToList(doc)

result in errors. Any (positive) guidance would be much appreciated.

--John J. Sparks, Ph.D.