[R] snowfall
Hello,

Just wondering why I am unable to run this in parallel. A dput of my dataset is attached at the end. Please use to create my data object.

I want to run this function in parallel (not sure if this is an efficient implementation):

#Function to calculate the time to maturity for the option
require(fCalendar, quietly=TRUE)    #Trying to calculate the trading days
require(fractalrock, quietly=TRUE)  #Just to calculate the trading days
myFinCenter = "Asia/Singapore"
getTimeToMaturity <- function(x){
  tryCatch({
    toDt <- as.Date(as.character(x['EXPIRY_DT']), "%Y-%m-%d")    #Expiry Date
    fromDt <- as.Date(as.character(x['TIMESTAMP']), "%Y-%m-%d")  #Trade Timestamp
    NoOfDays <- NROW(getTradingDates(toDt, fromDt))
    return(NoOfDays/252)
  }, error = function (ex){
    #print(paste("Error in", toDt, fromDt))
    NoOfDays <- 0
    return(NoOfDays/252)
  })
}

Question: The following two lines work but the third and parallel one doesn't ... why?

1)
apply(dNiftyOpt, 1, getTimeToMaturity) #Works
         1          2          3          4          5          6          7          8          9         10
0.02380952 0.01984127 0.07936508 0.02380952 0.01984127 0.01190476     0.0278 0.02380952 0.01984127 0.01190476
        11         12         13         14         15         16         17         18         19         20
0.02380952 0.01984127 0.02380952 0.02380952 0.01984127 0.02380952 0.01984127 0.02380952 0.02380952     0.0278

2)
library(snowfall)
sfInit()
snowfall 1.84 initialized: sequential execution, one CPU.
sfApply(dNiftyOpt, 1, getTimeToMaturity) #Works
         1          2          3          4          5          6          7          8          9         10
0.02380952 0.01984127 0.07936508 0.02380952 0.01984127 0.01190476     0.0278 0.02380952 0.01984127 0.01190476
        11         12         13         14         15         16         17         18         19         20
0.02380952 0.01984127 0.02380952 0.02380952 0.01984127 0.02380952 0.01984127 0.02380952 0.02380952     0.0278
sfStop()

DOESN'T WORK:

3)
sfInit( parallel=TRUE, cpus=4 );
sfApply(dNiftyOpt, 1, getTimeToMaturity) #Added the time to maturity. DOESN'T WORK?
 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
sfStop();

My dataset:

dput(dNiftyOpt)
structure(list(
  INSTRUMENT = c("OPTIDX", "OPTIDX", "OPTIDX", "OPTIDX", "OPTIDX", "OPTIDX", "OPTIDX", "OPTIDX", "OPTIDX", "OPTIDX",
    "OPTIDX", "OPTIDX", "OPTIDX", "OPTIDX", "OPTIDX", "OPTIDX", "OPTIDX", "OPTIDX", "OPTIDX", "OPTIDX"),
  SYMBOL = c("NIFTY", "NIFTY", "NIFTY", "NIFTY", "NIFTY", "NIFTY", "NIFTY", "NIFTY", "NIFTY", "NIFTY",
    "NIFTY", "NIFTY", "NIFTY", "NIFTY", "NIFTY", "NIFTY", "NIFTY", "NIFTY", "NIFTY", "NIFTY"),
  EXPIRY_DT = c("2004-01-29", "2004-01-29", "2004-01-29", "2004-01-29", "2004-01-29", "2004-01-29", "2004-01-29",
    "2004-01-29", "2004-01-29", "2004-01-29", "2004-01-29", "2004-01-29", "2004-01-29", "2004-01-29",
    "2004-01-29", "2004-01-29", "2004-01-29", "2004-01-29", "2004-01-29", "2004-01-29"),
  STRIKE_PR = c(1780, 1780, 1800, 1800, 1800, 1800, 1800, 1800, 1800, 1800,
    1820, 1820, 1820, 1830, 1830, 1830, 1830, 1840, 1840, 1850),
  OPTION_TYP = c("PE", "PE", "CE", "CE", "CE", "CE", "PE", "PE", "PE", "PE",
    "CE", "CE", "PE", "CE", "CE", "PE", "PE", "CE", "PE", "CE"),
  SETTLE_PR = c(27.4, 5.7, 152.95, 28.6, 70.45, 111.35, 14.75, 39.2, 8.6, 2.35,
    20.4, 54.2, 50.15, 18.35, 47.25, 51.75, 15.5, 14.95, 57.95, 26.3),
  TIMESTAMP = c("2004-01-22", "2004-01-23", "2004-01-02", "2004-01-22", "2004-01-23", "2004-01-27", "2004-01-21",
    "2004-01-22", "2004-01-23", "2004-01-27", "2004-01-22", "2004-01-23", "2004-01-22", "2004-01-22",
    "2004-01-23", "2004-01-22", "2004-01-23", "2004-01-22", "2004-01-22", "2004-01-21"),
  Underlying = c(1770.5, 1847.55, 1946.05, 1770.5, 1847.55, 1904.7, 1824.6, 1770.5, 1847.55, 1904.7,
    1770.5, 1847.55, 1770.5, 1770.5, 1847.55, 1770.5, 1847.55, 1770.5, 1770.5, 1824.6),
  UnderlyingVol = c(0.293906144944403, 0.331877179605752, 0.129552369208600, 0.293906144944403,
    0.331877179605752, 0.348918971622834, 0.276334860399362, 0.293906144944403,
    0.331877179605752, 0.348918971622834, 0.293906144944403, 0.331877179605752,
    0.293906144944403, 0.293906144944403, 0.331877179605752, 0.293906144944403,
    0.331877179605752, 0.293906144944403, 0.293906144944403, 0.276334860399362)),
  .Names = c("INSTRUMENT", "SYMBOL", "EXPIRY_DT", "STRIKE_PR", "OPTION_TYP", "SETTLE_PR", "TIMESTAMP",
    "Underlying", "UnderlyingVol"),
  row.names = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20),
  class = "data.frame")

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
[R] From polynomial to function
Dear all,

I would like to use Legendre polynomials, which is pretty easy in R:

x <- legendre.polynomials(2)[[3]]
x
-0.5 + 1.5*x^2
str(x)
Class 'polynomial'  num [1:3] -0.5 0 1.5

As you can see from the code above, str(x) reports that x is of class polynomial. I want to use that polynomial as a function, so that I can pass it to integrate(f, lower=, upper=).

Could you please tell me whether it is possible to do that in R?

Thank you in advance for your help.

Best regards,
Alex
Re: [R] ThinkCell type waterfall charts in R?
On 01/12/2011 05:54 AM, ang wrote:
> Hi Jim,
>
> I looked through the plotrix documentation, and the waterfall plot comes from the stackpoly function, right? I'm not sure if I can modify stackpoly to create the plot I want, since stackpoly is a line plot and fills the area under with color. I haven't played with all the options yet, but what I was looking for was more similar to staircase.plot, but instead of horizontal bars, they would be vertical columns.
>
> Would you happen to know any packages or existing plots that could be easily modified to do this?

Hi Adrian,

Try using the dir argument as "e" or "w".

Jim
Re: [R] From polynomial to function
Alaios wrote:
> x <- legendre.polynomials(2)[[3]]
> x
> -0.5 + 1.5*x^2
> str(x)
> Class 'polynomial'  num [1:3] -0.5 0 1.5
>
> As you can see from the code above str(x) returns that x is of class polynomial. I want to use that polynomial as a function. The reason for that is that I would be grateful if I can feed that kind of function inside integrate(f,lower=,upper=)

You can use the ‘as.function’ function. But if you’re just going to integrate the polynomial anyway, why don’t you just use the ‘integral’ function? It’s much more accurate than using numerical integration.

BTW, to see which functions handle ‘polynomial’ objects, use

methods(class="polynomial")

Output:

 [1] as.character.polynomial* as.function.polynomial*  coef.polynomial*
 [4] deriv.polynomial*        GCD.polynomial*          integral.polynomial*
 [7] LCM.polynomial*          lines.polynomial*        Math.polynomial*
[10] Ops.polynomial*          plot.polynomial*         points.polynomial*
[13] predict.polynomial*      print.polynomial*        solve.polynomial*
[16] summary.polynomial*      Summary.polynomial*

--
Karl Ove Hufthammer
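Karl's two suggestions can be compared in a small base-R sketch. It assumes only the coefficient vector printed by str(x) above (-0.5, 0, 1.5, in increasing powers of x); the helper name poly_fun is made up for illustration and is not part of any package. With the polynomial class loaded, the as.function and integral methods listed above would play the same roles.

```r
# Coefficients of the polynomial from the question, in increasing powers of x:
# -0.5 + 0*x + 1.5*x^2
coefs <- c(-0.5, 0, 1.5)

# Hypothetical helper: turn a coefficient vector into a vectorised R function
poly_fun <- function(coefs) {
  function(x) {
    # Horner's scheme, starting from the highest power
    out <- rep(0, length(x))
    for (a in rev(coefs)) out <- out * x + a
    out
  }
}

f <- poly_fun(coefs)

# Numerical integration over [-1, 1]
num <- integrate(f, lower = -1, upper = 1)$value

# Exact integration: the antiderivative has coefficients a_k / (k + 1)
# and a zero constant term
anti <- poly_fun(c(0, coefs / seq_along(coefs)))
exact <- anti(1) - anti(-1)

c(numerical = num, exact = exact)
```

The function returned by poly_fun accepts a vector of x values, which is what integrate() requires of its integrand.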
Re: [R] plot: skip a range of axis
On 01/12/2011 03:46 PM, Yuan Jian wrote:
> Hi,
> I am using plot to show scatter points in 2-D. In my data, there is no data between -1 and +1 on the x-axis. I want to skip this region, i.e. the x axis becomes [-Inf:-1, 1:Inf]. Can anyone tell me how to do this?

Hi Yu,

Try the gap.plot function in the plotrix package.

Jim
[R] vector or list of matrices corresponding to an observation
Dear all,

For each observation I observe several joint distributions of two multinomial random variables (5x5 matrices). Right now, the data are arranged so that I have 20 columns for each joint distribution by observation, which is not practical. I would like to work with matrices that are indexed by observation, just as I work with vectors and matrices for regression-like analysis.

How should I proceed? With a list of lists of matrices? And how can I build it in a systematic way (n=3000, and I observe up to 10 joint distributions per observation)? I'm not familiar with such a procedure and I don't know where I should start reading.

Thanks!

Best,
SL
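One possible starting point is sketched below, under assumptions that are not in the post: the data are made up, there is one joint distribution per observation, and its 5x5 cells are stored in 25 columns in column-by-column order (the post mentions 20 columns, so the real layout may differ). It shows two containers, a list of matrices and a 3-d array; with several distributions per observation, a list of such lists or a 4-d array would follow the same pattern.

```r
set.seed(42)
n_obs <- 6   # the post has n = 3000; kept small here
k <- 5       # each joint distribution is k x k

# Made-up wide data: one row per observation, k * k cell columns
wide <- as.data.frame(matrix(runif(n_obs * k * k), nrow = n_obs))

# Option 1: a list holding one k x k matrix per observation
mat_list <- lapply(seq_len(n_obs), function(i) {
  matrix(unlist(wide[i, ], use.names = FALSE), nrow = k, ncol = k)
})

# Option 2: a 3-d array indexed as arr[row, col, observation]
arr <- array(t(as.matrix(wide)), dim = c(k, k, n_obs))

# Both containers hold the same matrix for a given observation
all.equal(mat_list[[3]], arr[, , 3])

# Apply a function over observations, e.g. the row margins of each table
row_margins <- apply(arr, 3, rowSums)   # k rows, one column per observation
```

The array form is convenient for apply() over the observation index, while the list form allows a different number of matrices per observation.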
Re: [R] list concatenation
Bert Gunter gunter.ber...@gene.com writes:
> Lists are (isomorphic to) trees with (possibly) labelled nodes. A completely general solution in which two trees have possibly different topologies and different labels would therefore involve identifying the paths to leaves on each tree, e.g. via depth first search using recursion, and unioning leaves with the same paths (which could be quickly found in R via match() on the paths). This is a standard exercise in a data structures course.
>
> Considerable simplification could be effected if tree topologies and/or labels are identical or have other restrictions on them. However, you have not made it clear in your post whether this is the case (it is in your example).

Thanks so much to all of you for your very helpful suggestions, which helped me solve my problem. The tree topologies are indeed identical, so the suggested solutions did work, but just for me to learn: can somebody point me to what the general solution mentioned by Bert would look like?

Cheers,
Georg
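A minimal sketch of the general idea follows. It is one possible reading of Bert's description, not his code: instead of collecting the leaf paths explicitly and calling match() on them, it recurses over the union of node labels, which merges leaves that share a path. The function name merge_trees is hypothetical, and the sketch assumes that every list node is fully named and that a given path is either a leaf in both lists or a sub-list in both.

```r
# Hypothetical helper: merge two nested, fully named lists ("trees") by label
merge_trees <- function(a, b) {
  # A leaf on either side: concatenate the values found at this path
  if (!is.list(a) || !is.list(b)) return(c(a, b))
  labels <- union(names(a), names(b))
  out <- lapply(labels, function(nm) {
    if (is.null(a[[nm]])) {
      b[[nm]]                           # path exists only in b
    } else if (is.null(b[[nm]])) {
      a[[nm]]                           # path exists only in a
    } else {
      merge_trees(a[[nm]], b[[nm]])     # path exists in both: recurse
    }
  })
  names(out) <- labels
  out
}

# Two trees with different topologies
x <- list(a = 1:2, b = list(c = "u", d = 3))
y <- list(a = 3L, b = list(c = "v", e = TRUE), f = "new")

z <- merge_trees(x, y)
str(z)
```

Leaves found under the same path (here a and b$c) are concatenated with c(), while branches present in only one tree are copied over unchanged.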
Re: [R] Problems creating a PNG file for a dendrogram: Error in plot.window(...) : need finite 'xlim' values
That was a simple solution. Turns out you were correct, plot(p) was the problem. Simply removing it and everything worked perfectly. Thanks Bill, Peter and David for your help.

On Jan 11, 2011, at 8:29 PM, David Winsemius wrote:
> On Jan 11, 2011, at 9:27 PM, David Winsemius wrote:
>> On Jan 11, 2011, at 7:01 PM, Richard Vlasimsky wrote:
>>> Has anyone successfully created a PNG file for a dendrogram?
>>>
>>> I am able to successfully launch and view a dendrogram in Quartz. However, the dendrogram is quite large (too large to read on a computer screen), so I am trying to save it to a file (1000x4000 pixels) for viewing in other apps. However, whenever I try to initiate a PNG device, I get a "need finite 'xlim' values" error.
>>>
>>> Here is some example code to illustrate my point:
>>>
>>> cor.matrix <- cor(mydata, method="pearson", use="pairwise.complete.obs");
>>> distance <- as.dist(1.0-cor.matrix);
>>> hc <- hclust(distance);
>>> p <- plot(hc);
>>> plot(p);
>>> #This works! Plot is generated in quartz no problem.
>>>
>>> #Now, try this:
>>> png(filename="delme.png", width=4000, height=1000);
>>> cor.matrix <- cor(mydata, method="pearson", use="pairwise.complete.obs");
>>> distance <- as.dist(1.0-cor.matrix);
>>> hc <- hclust(distance);
>>> p <- plot(hc);
>>> plot(p);
>>> #Error in plot.window(...) : need finite 'xlim' values
>>> #In addition: Warning messages:
>>> #1: In min(x) : no non-missing arguments to min; returning Inf
>>> #2: In max(x) : no non-missing arguments to max; returning -Inf
>>> #3: In min(x) : no non-missing arguments to min; returning Inf
>>> #4: In max(x) : no non-missing arguments to max; returning -Inf
>>
>> I'm not sure the other two answers address the problems I found. When I try to set up a png file with the parameters width=4000,height=1000, on a Mac I initially got no plot with what is an otherwise valid command. But after successfully getting plotting to a png device the logjam appeared broken.
>>
>> Try:
>>
>> graphics.off()
>> dev.list()
>> #NULL
>> png(filename="delme.png", width=4000, height=1000);
>> plot(hc)
>> dev.off()
>>
>> (Of course I used dev.off() which you did not, but even adding dev.off() was not enough to get success, at least initially. I don't understand the suggestion to get rid of plot(hc) or the suggestion that hclust() returns NULL. That's certainly not how I read the help page and examples for hclust.)
>
> I guess it's true to say that I misunderstood when Venables and Langfelder _didn't_ say either of those things. They said that plot(plot(hc)) was not needed and I have now seen the light. The missing ingredient in the OP's frustrations is still dev.off().
>
>>> This is the exact same code, only a prior call to png() causes the seemingly unrelated xlim to fail. Why is this?
>>>
>>> Thanks,
>>> Richard Vlasimsky
>>
>> David Winsemius, MD
>> West Hartford, CT
>
> David Winsemius, MD
> West Hartford, CT
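The resolution of this thread can be condensed into a short sketch. The built-in USArrests data stand in for the poster's mydata, and the output path in tempdir() is an arbitrary choice; both are assumptions made only so that the example runs on its own.

```r
# Built-in data stand in for the poster's `mydata`
hc <- hclust(dist(USArrests))

# Arbitrary output location for this sketch
out_file <- file.path(tempdir(), "dendrogram.png")

png(filename = out_file, width = 4000, height = 1000)
plot(hc)     # draws the dendrogram; call it once, do not plot() its return value
dev.off()    # closes the device, which is what writes the file to disk

file.exists(out_file)
```

The two points made in the thread are both visible here: plot(hc) is called once, without a second plot() on its result, and dev.off() closes the png device before the file is used.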
Re: [R] how to change strip text of effect plot
On Wed, Jan 12, 2011 at 10:36 AM, Joshua Wiley jwiley.ps...@gmail.com wrote:
> Hmm, I felt like it was the clearest, most direct way. Then again, things of this nature (override defaults using arguments rather than changing what you feed the functions) do seem to be a common request both for lattice and ggplot2.
>
> In any case, my memory of the lattice book and a quick search of the archives suggest that the typical way to do this when dealing with lattice functions such as xyplot() would be to define a custom strip function. For example:
>
> strip = function(..., factor.levels) strip.default(..., factor.levels = c("low", "medium", "high"))
>
> However, if I am not mistaken, you are plotting an object of class efflist, which dispatches plot.efflist which eventually dispatches plot.eff. In the bowels of plot.eff is the call to xyplot(), with its own strip function already defined. I did not see a way to override this, nor did I see an option to pass an argument to it.
>
> It is trivial to edit the function's code to work with such a change. You could probably create a local copy of it, make the necessary changes, and then just ensure that your object gets dispatched to your revised function rather than the package's original. Even easier, I suppose:
>
> debug(plot)
> plot(eff.cowles, 'neuroticism:ex2', factor.names=F)
> ## follow along until it has just created the object plot
> ## then overwrite that with your desired changes (factor.levels)
> ## before it is printed to the screen
> plot <- xyplot(eval(parse(text = paste("fit ~", predictors[x.var], "|",
>       paste(predictors[-x.var], collapse = "*")))),
>   strip = function(..., factor.levels) strip.default(...,
>     factor.levels = c("low", "medium", "high"),
>     strip.names = c(factor.names, TRUE)),
>   panel = function(x, y, subscripts, x.vals, rug, lower, upper, has.se, ...) {
>     llines(x, y, lwd = 2, col = colors[1], ...)
>     if (rug) lrug(x.vals)
>     if (has.se) {
>       llines(x, lower[subscripts], lty = 2, col = colors[2])
>       llines(x, upper[subscripts], lty = 2, col = colors[2])
>     }
>     if (has.thresholds) {
>       panel.abline(h = thresholds, lty = 3)
>       panel.text(rep(current.panel.limits()$xlim[1], length(thresholds)),
>         thresholds, threshold.labels, adj = c(0, 0), cex = 0.75)
>       panel.text(rep(current.panel.limits()$xlim[2], length(thresholds)),
>         thresholds, threshold.labels, adj = c(1, 0), cex = 0.75)
>     }
>   },
>   ylim = ylim, ylab = ylab,
>   xlab = if (missing(xlab)) predictors[x.var] else xlab,
>   x.vals = x.vals, rug = rug, main = main,
>   lower = x$lower, upper = x$upper, has.se = has.se, data = x,
>   scales = list(y = list(at = tickmarks$at, labels = tickmarks$labels),
>     alternating = alternating), ...)
>
> Alternately still, you could rename the factor levels in the object eff.cowles, which is sort of in between changing your data and changing the strip label defaults.
>
> I can understand why it seems like there should be a simpler solution, but I honestly do not see one. The factor.levels argument does not even work inside xyplot(), it needs to be in the strip function, and that is nested far away from plot(effobject). I do not see any documentation that suggests a way, nor do I see any formal arguments to facilitate it. Then again, I'm not an expert in lattice or the effects package.

One generally useful fact is that if you can get hold of the trellis object being plotted, you may be able to change the strip labels before plotting it. For example,

foo <- trellis.last.object()

or in this case

foo <- plot(eff.cowles[['neuroticism:ex2']], factor.names=F)

(this returns the object, but also forces a plot), followed by

dimnames(foo)
$ex2
[1] "(1.98,8.99]" "(8.99,16]"   "(16,23]"

dimnames(foo)$ex2 <- c("low", "medium", "high")
foo

But I would tend to agree with Josh, that specifying labels in the cut() call is the more natural approach. It is not the job of graphics functions to modify (characteristics of) your data.

-Deepayan

> Best regards,
> Josh
>
> On Tue, Jan 11, 2011 at 8:06 PM, Wincent ronggui.hu...@gmail.com wrote:
>> Sure that is one way to go. Since it is possible to pass lattice arguments to that high level function, and there should be ways to relabel/customize the strip text in the lattice plotting system, I would think such an indirect method a last resort.
>>
>> Thank you.
>> Ronggui
>>
>> On 12 January 2011 11:51, Joshua Wiley jwiley.ps...@gmail.com wrote:
>>> Hi,
>>> I am guessing this is not what you meant by on the fly, but I think it will be by far the easiest way. Plotting an effects object is a high level plot with a lot of defaults and automation built in to make your life simple. The cost is that it is less flexible---you work its way, not vice versa. If you want the factor named high, just label it that way to begin with. If you think it makes the graphs more
Re: [R] Degrees of freedom
Hi:

Look at the links in the following blog entry:
http://blog.lib.umn.edu/moor0554/canoemoore/2010/09/lmer_p-values_lrt.html

and this discussion, found on the R wiki:
http://rwiki.sciviews.org/doku.php?id=guides:lmer-tests

Also see Ben Bolker's GLMM wiki page, which discusses many of the unresolved foundational issues in (generalized) linear mixed models:
http://glmm.wikidot.com/faq

Welcome to the jungle :)

HTH,
Dennis

On Tue, Jan 11, 2011 at 9:09 PM, Umit Tokac u...@fsu.edu wrote:
> Hello,
> I have a little problem about degrees of freedom in R. If you can help me, I will be happy. I used the nlme function to analyze my data and run the linear mixed effects model in R. I did the linear mixed effects analysis in SAS and SPSS as well. However, R gave different degrees of freedom than SAS and SPSS did. Can you help me to learn what the reason is for obtaining different degrees of freedom from R?
>
> Thanks
>
> Umit Tokac
> Graduate Student
> Measurement and Statistics
> Florida State University
> Tel:(850)345-7487
[R] R syntax for 95% prediction interval on a left-truncated normal variable
Hi all,

I am searching for an R procedure to calculate a 95% prediction interval (i.e. the interval in which subsequent observations of the variable will fall, not a confidence interval) for a variable that is the natural logarithm of a ratio that is always greater than or equal to 1. The log-transformed variable is therefore left-truncated at 0, and I think that it can be expected to be normal. Can anyone help?

Thank you,
Fabio

Dott. Fabio Colombo
Coordinatore tecnico
Università degli Studi di Milano, Dipartimento di Scienze e Tecnologie Veterinarie per la Sicurezza Alimentare, Laboratorio di Identificazione di Specie (LIS)
Via A. Grasselli, 7, 20131 Milano, Italy
Phone: +390250318504
Fax: +390250318501
E-mail: fabio.colo...@unimi.it
Re: [R] how to change strip text of effect plot
Dear Deepayan, Josh, and Ronggui,

I've recently changed plot.eff() so that it returns an object, normally printed by print.plot.eff(). You can therefore manipulate the lattice object, as Deepayan suggests. This change is currently in the development version of the effects package on R-Forge, not yet on CRAN.

I agree with both Josh and Deepayan that it would be simpler for you just to name the levels of the factor with the labels you want.

Best,
John

John Fox
Senator William McMaster Professor of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
http://socserv.mcmaster.ca/jfox
[R] graphics: 3D regression plane
Hello Masters,

wishing you all a great 2011. I was also going to ask if anyone knows a quick and efficient way to plot a regression plane (z~x*y).

I have tried the regr2.plot{HH} function but it is only an educational tool and has poor graphical properties.

I also tried to run the following script on a fictitious longitudinal problem, with poor results:

set.seed(1234)
id <- c(rep(1,3), rep(2,4), rep(3,2))   # subjects
y <- rchisq(9, df=20)                   # response
k <- rnorm(9, 4, 2)                     # x
time <- as.Date(c("03/07/1981","15/11/1981","03/04/1983","08/12/1979",
                  "30/12/1979","08/03/1980","12/08/1980","12/08/1973","28/03/1975"),
                format="%d/%m/%Y")
fac <- c("m","m","m","f","f","f","f","m","m")   # sex
d1 <- as.vector(by(time, factor(id), min))
t0 <- as.Date(d1, origin=as.Date("1970-01-01")); t0
A <- data.frame(id=c(1,2,3), t=t0)
B <- data.frame(id=id, tempo=time)
C <- merge(A, B); C
rd <- as.vector(C$tempo - C$t); rd   # time centered on sbj specific first occurrence
mod <- lm(y ~ rd*k)
newax <- expand.grid(
  days = giorni <- seq(min(rd), max(rd), length=100),
  expl = esplic <- seq(min(k), max(k), length=100)
)
fit <- predict(mod, data.frame(rd=giorni, k=esplic))
graph <- persp(x=giorni, y=esplic, fit, expand=0.5, ticktype="detailed", theta=-45)
# error: z argument not valid

I would be grateful if someone would give me some suggestions.

Thank you again and happy new year

Federico Bonofiglio
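A likely reason for the persp() error in the script above is that predict() is given two vectors of length 100 and so returns 100 fitted values, whereas persp() expects z to be a matrix with one row per x value and one column per y value. The sketch below shows one way to build such a matrix. It uses made-up data in place of the poster's rd, k and y (an assumption, so that it runs on its own), and a coarser 50 x 50 grid.

```r
set.seed(1234)

# Made-up data standing in for the poster's rd (days), k (covariate), y (response)
n  <- 30
rd <- runif(n, 0, 600)
k  <- rnorm(n, 4, 2)
y  <- 20 + 0.01 * rd + 1.5 * k + 0.002 * rd * k + rnorm(n)
mod <- lm(y ~ rd * k)

giorni <- seq(min(rd), max(rd), length.out = 50)
esplic <- seq(min(k), max(k), length.out = 50)

# Predict on every (rd, k) combination. expand.grid() varies its first
# argument fastest, which matches the column-major filling of matrix(),
# so z[i, j] is the fitted value at giorni[i], esplic[j].
grid <- expand.grid(rd = giorni, k = esplic)
z <- matrix(predict(mod, newdata = grid),
            nrow = length(giorni), ncol = length(esplic))

persp(x = giorni, y = esplic, z = z, expand = 0.5, ticktype = "detailed",
      theta = -45, xlab = "days", ylab = "k", zlab = "fitted y")
```

The column names of the grid must match the predictor names used in the model formula, otherwise predict() cannot find the variables.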
[R] Issue loading and executing own function.R with JRI, any ideas?
Hey,

how did you manage to load and call your own function with JRI? I managed to execute the built-in R functions with JRI, but when I call r.eval(load(path-to-file)) or r.eval(source(path-to-file)), my Java program terminates :(

Thanks for your help!

--
View this message in context: http://r.789695.n4.nabble.com/Issue-loading-and-executing-own-function-R-with-JRI-any-ideas-tp3213756p3213756.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] Multilevel pseudo maximum likelihood
Caterina,

Did you get an answer to this question? I'm trying to do something similar.

Jason

--
View this message in context: http://r.789695.n4.nabble.com/Multilevel-pseudo-maximum-likelihood-tp878413p3213583.html
Sent from the R help mailing list archive at Nabble.com.
[R] Sum by column
Dear List,

I have a question of convenience. I am looking to sum the values of one column based on another column. An example may help explain better!

ED         ECOCODE
21.809467  AA0101
36.229566  PA1201
51.861284  PA1201
11.36232   PA1201
27.264634  PA1201
12.261986  PA1201
46.519313  PA1201
7.815376   PA1201
2.810428   PA1201
13.478372  PA1201
35.670182  PA1301
27.128715  AT0801
19.010294  AT1201
15.475368  AT1201
18.597983  AT0101
29.292615  AT0101
6.749846   AT0101
14.981488  AT0101
14.93511   AT0101
14.93511   AT0101
21.040785  AT0101
8.271615   AT0101
12.94232   AT0101
6.749846   AT0101
15.484412  AT0101
29.644494  AT0101
43.211212  AT0101

So for AA0101 it would be = 21.809467,
for AT1201 it would be = 19.010294+15.475368, etc.

I would then like to be able to output a table with ECOCODE in one column and the sum of ED in the other.

This is stored in a dataframe called ecoregion. I understand people like having code to change, but I have none as I am a relative beginner! Sorry in advance!

Thanks
Peter
Re: [R] plot: skip a range of axis
Thanks Jim,

I found that gap.plot separates the x axis or y axis into two boxes. Do you know any plot tool that can skip a range on the x-axis or y-axis without lines?

regards
Yu

--- On Wed, 12/1/11, Jim Lemon j...@bitwrit.com.au wrote:
> On 01/12/2011 03:46 PM, Yuan Jian wrote:
>> Hi,
>> I am using plot to show scatter points in 2-D. In my data, there is no data between -1 and +1 on the x-axis. I want to skip this region, i.e. the x axis becomes [-Inf:-1, 1:Inf]. Can anyone tell me how to do this?
>
> Hi Yu,
> Try the gap.plot function in the plotrix package.
>
> Jim
[R] Weighted Likelihood Estimation of NIG Dist.
Hi All,

I've put together a script which gives me the parameters of a modified NIG distribution. Now I'd like to include a weight vector within the maximization function. The code below gives me parameter estimates which are implicitly equally weighted:

nig.par.fit <- try(optim(vega, negllh, hessian = se, pdf = density.nig,
                         tmp.data = data, transf = transform,
                         const.pars = vars[!opt.pars], silent = silent,
                         par.names = names(vars), ...))

vega are the parameters to be fitted. This seems to work reasonably for my equally weighted observations, but I have not been able to find or adjust the function to include a weight vector (w_i). I'm trying not to alter the density function. Is there a way of doing this with optim, or is there another optimizer I should be looking at?

Many thanks,
J

--
View this message in context: http://r.789695.n4.nabble.com/Weighted-Likelihood-Estimation-of-NIG-Dist-tp3213608p3213608.html
Sent from the R help mailing list archive at Nabble.com.
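One common way to weight observations without touching the density is to weight each term of the log-likelihood, i.e. to minimise -sum(w_i * log f(x_i; theta)). The sketch below illustrates this with optim(). It is only an illustration under stated assumptions: a normal density stands in for the poster's density.nig, the data and weights are simulated, and neg_wllh is a hypothetical replacement for negllh, whose real signature is not shown in the post.

```r
set.seed(1)
x <- rnorm(200, mean = 2, sd = 1.5)   # simulated data
w <- runif(200)                       # made-up observation weights w_i

# Weighted negative log-likelihood: -sum_i w_i * log f(x_i; par).
# The density function is passed in unchanged; only the sum is weighted.
neg_wllh <- function(par, data, weights, pdf) {
  -sum(weights * pdf(data, par, log = TRUE))
}

# Stand-in density: normal, with the sd on the log scale so it stays positive
dens <- function(x, par, log = FALSE) {
  dnorm(x, mean = par[1], sd = exp(par[2]), log = log)
}

fit <- optim(c(0, 0), neg_wllh, data = x, weights = w, pdf = dens)

# For the normal case the weighted MLE of the mean is the weighted mean,
# which gives a quick sanity check on the fit
c(estimate = fit$par[1], weighted_mean = weighted.mean(x, w))
```

Extra named arguments (data, weights, pdf) are forwarded by optim() to the objective function, which is the same mechanism the call in the post already relies on for pdf = density.nig.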
[R] non-parametric discriminant analysis
Hi,

I used linear discriminant analysis (lda) to classify and to cross-validate samples of two plant species based on morphometric data and to identify the variables that best discriminate between the two species. Because some of the variables are not, and cannot be transformed to be, normally distributed, I would like to use a non-parametric method.

Are there any R packages that provide methods for non-parametric discriminant analysis? Would randomForest (http://cran.r-project.org/web/packages/randomForest) be appropriate and recommended?

Thanks for help and best regards
Walter Durka

--
Dr. Walter Durka
Department Biozönoseforschung
Department of community ecology
Helmholtz-Zentrum für Umweltforschung GmbH - UFZ
Helmholtz Centre for Environmental Research - UFZ
Theodor-Lieser-Str. 4 / 06120 Halle / Germany
walter.du...@ufz.de / http://www.ufz.de/index.php?en=798
phone +49 345 558 5314 / Fax +49 345 558 5329
Re: [R] Sum by column
There are two functions you need to become familiar with:

    ?tapply
    ?ave

If you want the summed values placed in another column of the same data frame, use ave. If you want a new (somewhat shorter) structure, use tapply with sum as the function, e.g.:

    tapply(ecoregion$ED, ecoregion$ECOCODE, sum)

-- David.

On Jan 12, 2011, at 5:38 AM, Peter Francis wrote:

> Dear List,
>
> I have a question of convenience. I am looking to sum the values of one column based on another column - an example may help explain better!
>
> ED         ECOCODE
> 21.809467  AA0101
> 36.229566  PA1201
> 51.861284  PA1201
> 11.36232   PA1201
> 27.264634  PA1201
> 12.261986  PA1201
> 46.519313  PA1201
> 7.815376   PA1201
> 2.810428   PA1201
> 13.478372  PA1201
> 35.670182  PA1301
> 27.128715  AT0801
> 19.010294  AT1201
> 15.475368  AT1201
> 18.597983  AT0101
> 29.292615  AT0101
> 6.749846   AT0101
> 14.981488  AT0101
> 14.93511   AT0101
> 14.93511   AT0101
> 21.040785  AT0101
> 8.271615   AT0101
> 12.94232   AT0101
> 6.749846   AT0101
> 15.484412  AT0101
> 29.644494  AT0101
> 43.211212  AT0101
>
> So for AA0101 it would be = 21.809467, for AT1201 it would be = 19.010294+15.475368, etc.
>
> I would then like to be able to output a table with ECOCODE in one column and the sum of ED in the other. This is stored in a dataframe called ecoregion. I understand people like having code to change, but I have none as I am a relative beginner! Sorry in advance!
>
> Thanks, Peter

David Winsemius, MD
West Hartford, CT
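The difference between the two functions named in the reply above can be seen on a few rows of the posted data. This is a small sketch: the four-row data frame is a subset of the poster's values, and the names sums, ED_total and sum_table are chosen here for illustration.

```r
# A few rows from the posted data
ecoregion <- data.frame(
  ED      = c(21.809467, 19.010294, 15.475368, 35.670182),
  ECOCODE = c("AA0101", "AT1201", "AT1201", "PA1301")
)

# tapply: one sum per group, returned as a shorter named vector
sums <- tapply(ecoregion$ED, ecoregion$ECOCODE, sum)

# ave: the group sum repeated on every row, ready to be stored as a new column
ecoregion$ED_total <- ave(ecoregion$ED, ecoregion$ECOCODE, FUN = sum)

# Two-column table with ECOCODE and the summed ED, as asked for in the question
sum_table <- data.frame(ECOCODE = names(sums), ED = as.vector(sums))
sum_table
```

Note that FUN must be named in the call to ave, because unnamed arguments after the first are treated as grouping variables.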
Re: [R] Sum by column
Hi Peter,

R has some fairly flexible ways of passing the values of one variable (X), split by another (the INDEX), to different FUNctions. Here is an example using your data:

    ## your email data, in convenient form
    dat <- structure(list(ED = c(21.809467, 36.229566, 51.861284, 11.36232, 27.264634, 12.261986, 46.519313, 7.815376, 2.810428, 13.478372, 35.670182, 27.128715, 19.010294, 15.475368, 18.597983, 29.292615, 6.749846, 14.981488, 14.93511, 14.93511, 21.040785, 8.271615, 12.94232, 6.749846, 15.484412, 29.644494, 43.211212), ECOCODE = structure(c(1L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 6L, 3L, 4L, 4L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("AA0101", "AT0101", "AT0801", "AT1201", "PA1201", "PA1301"), class = "factor")), .Names = c("ED", "ECOCODE"), class = "data.frame", row.names = c(NA, -27L))

    ## look at the structure of the data
    str(dat)

    ## inside of dat (to avoid typing its name repeatedly)
    ## find the sum of ED at each level of ECOCODE
    with(dat, tapply(X = ED, INDEX = ECOCODE, FUN = sum, na.rm = TRUE))

    ## should give something like
       AA0101    AT0101    AT0801    AT1201    PA1201    PA1301
     21.80947 236.83684  27.12871  34.48566 209.60328  35.67018

For documentation, look at:

    ?tapply
    ## similar in many ways though sometimes slightly more/less convenient
    ?by

Hope that helps, Josh

--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/
[R] flexmix: predictions on new data from flexmix object
Dear R Users, R Core Team,

I am wondering how to predict the probability of an event for new data from a fitted finite mixture. I have read the documentation of the flexmix package and the example applications provided on CRAN, but I could not find how to predict the final probability of the mixture (for each individual) with new data, i.e. data different from those used to build the model, other than by doing it manually. I am looking for a simpler solution.

I may have missed something, but basically I am fitting a 2-component mixture model with logistic weights and logistic components (with different explanatory variables, so there is no identifiability problem). The flexmix object is then passed to predict() with the 'newdata' argument, and the predictions for the new data obviously differ depending on the component to which the new observations are assigned.

My question is: how can I access the information on the clustering of the new observations? If I understood correctly, the predictions are predictions of the event probability for each individual and each component, but they are not predictions of cluster assignment.

Thank you very much in advance for your answers.

Sincerely, Xavier M.
[R] Integrate and subdivisions limit
Dear all,

I have some issues with integrate in R and would like to request your help. I am trying to calculate the integral of f(x)*g(x), where f(x) is a step function and g(x) is a polynomial. If f(x) changes its value only a few times (5 or 6 'steps'), everything is calculated correctly (I verified the results on scrap paper), but if f(x) takes something like 800 different values I receive the error "Error in integrate: number of subdivisions reached".

I did some checks on the internet and found that I can increase the number of subdivisions (this is a parameter of integrate()). Thus I raised it from 100 to 1000 (and then to 10000).

A. Does this make the error of the result larger, or does it only stress the computer?

B. When the number was raised to 10,000 I started getting the error message "roundoff error was detected". What do you think I should do to solve that?

I would like to thank you in advance for your help.

Best Regards, Alex
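One approach that is sometimes used for this kind of integrand, sketched here under the assumption that the positions of the steps are known, is to integrate piece by piece: on each interval the step function is constant, so only the smooth polynomial is handed to integrate(). The breakpoints, step heights and polynomial below are made up for illustration.

```r
# Assumed setup: 800 steps on [0, 1] with known breakpoints and heights
breaks  <- seq(0, 1, length.out = 801)
heights <- sin(seq_len(800))          # value of the step function on each interval
g <- function(x) x^2 + 1              # the polynomial factor

# On interval i the integrand is heights[i] * g(x), which is smooth,
# so integrate() needs very few subdivisions per piece
pieces <- vapply(seq_len(800), function(i) {
  heights[i] * integrate(g, breaks[i], breaks[i + 1])$value
}, numeric(1))
total <- sum(pieces)

# Check against the closed form, using the antiderivative of g
G <- function(x) x^3 / 3 + x
exact <- sum(heights * diff(G(breaks)))
c(piecewise = total, exact = exact)
```

Splitting at the jumps avoids asking the adaptive rule to locate 800 discontinuities on its own, which is what drives the subdivision count up.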
[R] Fwd: vector or list of matrices corresponding to an observation
You're right, Jim. I have now started to work with a list of matrices. I'm not sure that it is clean and nice code, but it works. I was wondering whether there is a concise but advanced tutorial on list() and associated functions?

Stephane

---------- Forwarded message ----------
From: jim holtman jholt...@gmail.com
Date: 2011/1/12
Subject: Re: [R] vector or list of matrices corresponding to an observation
To: SL sl...@yahoo.fr

An actual example of your data would be useful, along with how you might want to access it. You can create a matrix of 'list' objects and these could contain your matrices, but not knowing what the data looks like, or how you intend to use it, makes it hard to provide a solution.
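As a rough illustration of the two structures mentioned in this exchange, the sketch below builds a named list of matrices and a matrix whose cells are list elements. The object names and contents are invented for the example.

```r
# A named list holding one matrix per observation (sizes may differ)
obs <- list(
  a = matrix(1:4, nrow = 2),
  b = matrix(1:6, nrow = 2),
  c = diag(3)
)

# Apply a function to every matrix in the list
dims   <- t(sapply(obs, dim))   # one row of (nrow, ncol) per observation
totals <- sapply(obs, sum)      # one number per observation

# Access a single element of one matrix
obs[["b"]][2, 3]

# A 2 x 2 matrix of mode "list": each cell can hold an arbitrary object
grid <- matrix(vector("list", 4), nrow = 2, ncol = 2)
grid[[1, 2]] <- diag(2)
is.matrix(grid[[1, 2]])
```

lapply() returns a list and sapply() simplifies to a vector or matrix where it can; the help pages ?list, ?lapply and ?"[[" cover most of the remaining details.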
[R] Outputting csv file from dataframe with columns in a particular order
I have a dataframe with columns ID, date, estimate, actual (but not necessarily in that order - I do a merge somewhere and that somehow messes up the order of the columns). How can I output it to a csv file with the columns in the order that I want?

Thanks.
Re: [R] debug biglm response error on bigglm model
Thank you, Greg. The issue was in the simulation logic, where one of the values was not changing correctly for some iterations...

On Jan 10, 3:20 pm, Greg Snow greg.s...@imail.org wrote:

Not sure, but one possible candidate problem is that in your simulations one iteration ended up with fewer levels of a factor than the overall dataset, and that caused the error.

There is no recode function in the default packages, and there are at least 6 recode functions in other packages; we cannot tell which one you were using from the code below.

-- Gregory (Greg) L. Snow Ph.D.
Statistical Data Center, Intermountain Healthcare
greg.s...@imail.org 801.408.8111

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Mike Harwood
Sent: Monday, January 10, 2011 6:29 AM
To: r-h...@r-project.org
Subject: [R] debug biglm response error on bigglm model

G'morning

What does the error message "Error in x %*% coef(object) : non-conformable arguments" indicate when calculating the response values for newdata with a model from bigglm (in package biglm), and how can I debug it? I am attempting to do Monte Carlo simulations, which may explain the loop in the code that follows. After the code I have included the output, which shows that the simulations are changing the response and input values, and that there are not any atypical values for the factors in the seventh iteration. At the end of the output is the aforementioned error message. Finally, I have included the model from biglm.

Thanks in advance!
Code:
===
iter <- nrow(nov.2010)
predict.nov.2011 <- vector(mode='numeric', length=iter)
for (i in 1:iter) {
    iter.df <- nov.2010
    ##-- Update values of dynamic variables --
    iter.df$age <- iter.df$age + 12
    iter.df$pct_utilize <- iter.df$pct_utilize + mc.util.delta[i]
    iter.df$updated_varname1 <- ceiling(iter.df$updated_varname1 + mc.varname1.delta[i])
    if(iter.df$state=="WI") iter.df$varname3 <- iter.df$varname3 + mc.wi.varname3.delta[i]
    if(iter.df$state=="MN") iter.df$varname3 <- iter.df$varname3 + mc.mn.varname3.delta[i]
    if(iter.df$state=="IL") iter.df$varname3 <- iter.df$varname3 + mc.il.varname3.delta[i]
    if(iter.df$state=="US") iter.df$varname3 <- iter.df$varname3 + mc.us.varname3.delta[i]
    ##--- Bin Variables --
    iter.df$bin_varname1 <- as.factor(recode(iter.df$updated_varname1, "300:499 = '300 - 499'; 500:549 = '500 - 549'; 550:599 = '550 - 599'; 600:649 = '600 - 649'; 650:699 = '650 - 699'; 700:749 = '700 - 749'; 750:799 = '750 - 799'; 800:849 = 'GE 800'; else = 'missing'; "))
    iter.df$bin_age <- as.factor(recode(iter.df$age, "0:23 = ' 24mo.'; 24:72 = '24 - 72mo.'; 72:300 = '72 - 300mo'; else = 'missing'; "))
    iter.df$bin_util <- as.factor(recode(iter.df$pct_utilize, "0.0:0.2 = ' 0 - 20%'; 0.2:0.4 = ' 20 - 40%'; 0.4:0.6 = ' 40 - 60%'; 0.6:0.8 = ' 60 - 80%'; 0.8:1.0 = ' 80 - 100%'; 1.0:1.2 = '100 - 120%'; else = 'missing'; "))
    iter.df$bin_varname2 <- as.factor(recode(iter.df$varname2_prop, "0:70 = ' 70%'; 70:85 = ' 70 - 85%'; 85:95 = ' 85 - 95%'; 95:110 = '95 - 110%'; else = 'missing'; "))
    iter.df$bin_varname1 <- relevel(iter.df$bin_varname1, 'missing')
    iter.df$bin_age <- relevel(iter.df$bin_age, 'missing')
    iter.df$bin_util <- relevel(iter.df$bin_util, 'missing')
    iter.df$bin_varname2 <- relevel(iter.df$bin_varname2, 'missing')
    #~ print(head(iter.df))
    if (i >= 6 && i <= 8) {
        print('-')
        browser()
        print(i)
        print(table(iter.df$bin_varname1))
        print(table(iter.df$bin_age))
        print(table(iter.df$bin_util))
        print(table(iter.df$bin_varname2))
        #~ debug(predict.nov.2011[i] <-
        #~ sum(predict(logModel.1, newdata=iter.df, type='response')))
    }
    predict.nov.2011[i] <- sum(predict(logModel.1, newdata=iter.df, type='response'))
    print(predict.nov.2011[i])
}

Output
==
[1] 36.56073
[1] 561.4516
[1] 4.83483
[1] 5.01398
[1] 7.984146
[1] "-"
Called from: top level
Browse[1]>
[1] 6

  missing 300 - 499 500 - 549 550 - 599 600 - 649 650 - 699 700 - 749 750 - 799    GE 800
      842       283       690      1094      1695      3404      6659     18374     21562

   missing      24mo. 24 - 72mo. 72 - 300mo
        16       2997      19709      31881

missing 0 - 20% 20 - 40% 40 - 60% 60 - 80% 80 - 100% 100 - 120%
17906
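Greg's suggestion in this thread, that one iteration may end up with fewer factor levels than the data used to fit the model, can be illustrated with base R alone. The sketch below is not the poster's model: the data frames train, new_bad and new_ok are invented, and model.matrix() stands in for the design matrix that a predict method builds internally.

```r
# Training data with a three-level factor
train <- data.frame(bin = factor(c("a", "b", "c", "a")))
all_levels <- levels(train$bin)

# New data in which level "c" happens not to occur
new_bad <- data.frame(bin = factor(c("a", "b")))

# Same values, but the full set of levels is declared explicitly
new_ok <- data.frame(bin = factor(c("a", "b"), levels = all_levels))

# Number of design-matrix columns: intercept plus one dummy per non-reference level
cols_train <- ncol(model.matrix(~ bin, train))
cols_bad   <- ncol(model.matrix(~ bin, new_bad))
cols_ok    <- ncol(model.matrix(~ bin, new_ok))
c(train = cols_train, new_bad = cols_bad, new_ok = cols_ok)
```

When the design matrix for the new data has fewer columns than there are coefficients, a product such as x %*% coef(object) cannot be formed, which matches the "non-conformable arguments" message. Creating the binned factors with factor(..., levels = ...) inside the loop is one way to keep the columns aligned.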
Re: [R] extracting more information from optim in R?
I have no experience with writing C code, but if I have such problems in R code, I add a line to my function which prints the values to the console, e.g.:

    fr <- function(x) {   ## Rosenbrock Banana function
        x1 <- x[1]
        x2 <- x[2]
        cat(paste(x1, x2, "\n"))
        100 * (x2 - x1 * x1)^2 + (1 - x1)^2
    }
    optim(c(-1.2, 1), fr)

Whether the same goes for C, I don't know.

Bart

-- View this message in context: http://r.789695.n4.nabble.com/extracting-more-information-from-optim-in-R-tp3213439p3214066.html Sent from the R help mailing list archive at Nabble.com.
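A variation on the printing idea above is to store every evaluated point instead of printing it, so the path of the optimizer can be inspected afterwards. This is only a sketch: the helper make_traced is a name invented here, not a function from any package.

```r
# Rosenbrock Banana function, as in the example above
fr <- function(x) 100 * (x[2] - x[1]^2)^2 + (1 - x[1])^2

# Wrap an objective so that each call appends (parameters, value) to a history
make_traced <- function(fn) {
  history <- list()
  traced <- function(x, ...) {
    value <- fn(x, ...)
    history[[length(history) + 1]] <<- c(x, value)
    value
  }
  list(fn = traced, history = function() do.call(rbind, history))
}

tr  <- make_traced(fr)
fit <- optim(c(-1.2, 1), tr$fn)   # default method is Nelder-Mead

evals <- tr$history()             # one row per function evaluation
colnames(evals) <- c("x1", "x2", "value")
head(evals)
```

optim also has a control = list(trace = ...) option that prints progress information for its built-in methods, which may be enough when only a coarse view is needed.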
Re: [R] Sum by column
David and Josh,

Thanks very much for your help, it is much appreciated.

Peter
[R] Bootstrapping to Correct Standard Errors in Two-Stage Least Square Estimation
Dear friends,

I want to estimate an equation using two-stage least squares, but I suspect that the model suffers from autocorrelation. Can someone please advise how to implement a bootstrap in R in order to calculate the correct standard errors?

Thank you.

Kind regards, Thanaset

-- View this message in context: http://r.789695.n4.nabble.com/Bootstrapping-to-Correct-Standard-Errors-in-Two-Stage-Least-Square-Estimation-tp3214080p3214080.html Sent from the R help mailing list archive at Nabble.com.
[R] Basic Stars Plot - help ..
Hi there Rers,

I am trying a very basic stars plot:

    x <- matrix(c(1,4,3,1.1,2,3,4,3,1,1,5,2), ncol = 3, byrow = TRUE, dimnames = list(c("a","b","c","d"), c("x","y","z")))
    stars(x, draw.segments = TRUE, radius = TRUE)

Can anyone explain what I am seeing there? EACH of my plots should have 3 coloured sectors, no? One each for x, y and z. How come I am seeing only two sectors?

Also, the length of each radius shows the ratio compared to the other values in the same column (e.g. x), correct? (And there is no direct relationship between the values of a single row, or is there?)

I am using R 1.12.1 for Linux.

Many Thanks, JP
[R] How to define values for color distribution in the package gplots and function heatmap.2
Hi,

This question is about the package gplots and the function heatmap.2. I'm not a programmer and I did not understand the answers I found when I googled it. Thank you in advance.

A similar question was asked in 2005 under the title "heatmap color distribution" and got the answer to use breaks. When I try to use the example code, I get the error "Error in image.default(1:nc, 1:nr, x, xlim = 0.5 + c(0, nc), ylim = 0.5 + : must have one more break than colour".

Basically, I have several sets of gene expression data and want to be able to compare these between patients. An example: when I use heatmap.2 with one gene set (with the colors red-black-green), green is 3 and red is 9, and in between the colors fade to black. When I then use another gene set, green is 5 and red is 11. I need to be able to define the limits so that I can compare between heatmaps.

Thank you again, Fredrik
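The error quoted in the question comes from image(), which requires exactly one more break than there are colours. The sketch below shows that relationship with base graphics only; the small expr matrix and the limits 3 and 11 are made up, and it is an assumption here that the same col and breaks vectors would then be passed to heatmap.2 through its col and breaks arguments.

```r
# Made-up expression values for illustration
expr <- matrix(c(3, 5, 7, 9, 11, 4, 6, 8, 10, 5, 7, 9), nrow = 3, byrow = TRUE)

n_col <- 20
cols  <- colorRampPalette(c("green", "black", "red"))(n_col)

# Fixed limits shared by every heatmap: n_col colours need n_col + 1 breaks
brks <- seq(3, 11, length.out = n_col + 1)

# image() does not plot values outside the breaks, so clamp them to the limits
clamped <- pmin(pmax(expr, min(brks)), max(brks))

image(t(clamped), col = cols, breaks = brks, axes = FALSE)
```

Because the breaks are fixed rather than derived from each data set, the same value maps to the same colour in every plot, which is what makes separate heatmaps comparable.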
Re: [R] Outputting csv file from dataframe with columns in a particular order
Hi!

Let's say your data.frame is called df and that you want column 1, then column 4, then 3 and then 2:

    df <- data.frame(ID=LETTERS[1:5], date=rnorm(5), estimate=rnorm(5), actual=rnorm(5))
    write.csv(df[c(1,4,3,2)], file="df.csv")

HTH, Ivan

--
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calan...@uni-hamburg.de
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php
Re: [R] snowfall
You forgot to load the required packages on the client nodes with

    sfLibrary(fCalendar)
    sfLibrary(fractalrock)

and you really should not use tryCatch without evaluating the errors for yourself.

Best wishes, Uwe Ligges
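The second point in the reply above, that a tryCatch which silently returns 0 hides the real cause, can be shown without any cluster. In this sketch time_to_maturity is a hypothetical stand-in for getTimeToMaturity, and helper_from_unloaded_package deliberately does not exist, mimicking a worker on which the package was never loaded.

```r
# Hypothetical stand-in: the helper it calls is missing in this session
time_to_maturity <- function(x) {
  tryCatch(
    NROW(helper_from_unloaded_package(x)) / 252,
    error = function(e) {
      # Report the real cause and return NA instead of a plausible-looking 0
      warning("time_to_maturity failed: ", conditionMessage(e), call. = FALSE)
      NA_real_
    }
  )
}

# Collect the warning text so it can be inspected after the call
caught <- character(0)
result <- withCallingHandlers(
  time_to_maturity("2004-01-29"),
  warning = function(w) {
    caught <<- c(caught, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)
result
caught
```

Returning NA together with the error text makes a missing package on the workers visible immediately, whereas a vector of zeros looks like a valid result.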
Re: [R] Outputting csv file from dataframe with columns in a particular order
On 2011-01-12 07:16, analys...@hotmail.com wrote:

> I have a dataframe with columns ID, date, estimate, actual (but not necessarily in that order - I do a merge somewhere and that somehow messes up the order of the columns). How can I output it to a csv file with the columns in the order that I want?

Let's say that your data.frame is DF.

    mynames <- c("ID", "date", "estimate", "actual")
    write.csv(DF[, mynames], ...)

Peter Ehlers
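A complete round trip of the by-name approach above might look like the following sketch. The data values are invented, and a temporary file is used so that nothing is written to the working directory.

```r
# A data frame whose columns are in a scrambled order, as after a merge
DF <- data.frame(
  actual   = c(1.2, 3.4),
  ID       = c("A", "B"),
  estimate = c(1.0, 3.5),
  date     = c("2011-01-11", "2011-01-12")
)

# Select columns by name in the desired order, then write without row names
wanted   <- c("ID", "date", "estimate", "actual")
out_file <- tempfile(fileext = ".csv")
write.csv(DF[, wanted], file = out_file, row.names = FALSE)

# The header line of the file shows the column order that was written
readLines(out_file, n = 1)
```

Selecting by name rather than by position keeps working even if a later merge shuffles the columns again.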
[R] Odp: Outputting csv file from dataframe with columns in a particular order
Hi

r-help-boun...@r-project.org wrote on 12.01.2011 16:16:16:

> I have a dataframe with columns ID, date, estimate, actual (but not necessarily in that order - I do a merge somewhere and that somehow messes up the order of the columns).

If you have a data frame with column order a, b, c, d and you want b, c, d, a, just order the columns accordingly:

    write.table(df[, c("b", "c", "d", "a")], ...)

Regards, Petr

> How can I output it to a csv file with the columns in the order that I want? Thanks.
Re: [R] Sum by column
Or with ddply:

    library(plyr)
    dat <- structure(list(ED = c(21.809467, 36.229566, 51.861284, 11.36232, 27.264634, 12.261986, 46.519313, 7.815376, 2.810428, 13.478372, 35.670182, 27.128715, 19.010294, 15.475368, 18.597983, 29.292615, 6.749846, 14.981488, 14.93511, 14.93511, 21.040785, 8.271615, 12.94232, 6.749846, 15.484412, 29.644494, 43.211212), ECOCODE = structure(c(1L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 6L, 3L, 4L, 4L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("AA0101", "AT0101", "AT0801", "AT1201", "PA1201", "PA1301"), class = "factor")), .Names = c("ED", "ECOCODE"), class = "data.frame", row.names = c(NA, -27L))
    dat
    ddply(dat, "ECOCODE", summarise, EDsummed = sum(ED))

Felipe D. Carrillo
Supervisory Fishery Biologist
Department of the Interior
US Fish & Wildlife Service
California, USA
http://www.fws.gov/redbluff/rbdd_jsmp.aspx
Re: [R] graphics: 3D regression plane
On Jan 12, 2011, at 6:10 AM, Federico Bonofiglio wrote:

> Hello Masters, wishing you all a great 2011
>
> I was also going to ask if anyone knows a quick and efficient way to plot a regression plane (z~x*y).

There are many. There are limitations to using the ?? operator, in that it only brings up functions that are installed on your machine, but when I enter:

    ??3D

... on my machine it nominates a variety of functions from these packages: ca car emdbook grDevices HH igraph lattice locfit misc3d plotrix raster rgl rpanel scatterplot3d sm sna spatstat spancs TeachingDemos vcdExtra

If you installed the sos package you would have search access to all of the functions in CRAN packages (and maybe more). There are also a variety of graphic galleries:

http://research.stowers-institute.org/efg/R/
http://addictedtor.free.fr/graphiques/allgraph.php
http://rgm2.lab.nig.ac.jp/RGM2/images.php?show=allpageID=1108

> I have tried the regr2.plot{HH} function but it is only an educational tool and has poor graphical properties.

Ah, a critic. And a very non-specific one at that.

> I also tried to run the following script on a fictitious longitudinal problem, with poor results

Because of poor programming and failure to read the manuals.

> set.seed(1234)
> id <- c(rep(1,3), rep(2,4), rep(3,2))   # subjects
> y <- rchisq(9, df=20)   # response
> k <- rnorm(9, 4, 2)   # x
> time <- as.Date(c("03/07/1981","15/11/1981","03/04/1983","08/12/1979","30/12/1979","08/03/1980","12/08/1980","12/08/1973","28/03/1975"), format="%d/%m/%Y")
> fac <- c("m","m","m","f","f","f","f","m","m")   # sex
> d1 <- as.vector(by(time, factor(id), min))
> t0 <- as.Date(d1, origin=as.Date("1970-01-01")); t0
> A <- data.frame(id=c(1,2,3), t=t0)
> B <- data.frame(id=id, tempo=time)
> C <- merge(A, B); C
> rd <- as.vector(C$tempo - C$t); rd   # time centered on sbj specific first occurrence
> mod <- lm(y ~ rd*k)
> newax <- expand.grid(days = giorni <- seq(min(rd), max(rd), length=100), expl = esplic <- seq(min(k), max(k), length=100))
> fit <- predict(mod, data.frame(rd=giorni, k=esplic))
> graph <- persp(x=giorni, y=esplic, fit, expand=0.5, ticktype="detailed", theta=-45)
> # error: z argument not valid
>
> I would be grateful if someone would give me some suggestions.

First suggestion would be to re-read the predict help page. You are throwing together symbols in a manner not expected by predict. The argument to newdata is invalid because you did not construct your newax dataframe correctly, resulting in only 100 predicted points (at the original data). newax should have had column names that match the variables in the model. This is what you got:

    str(newax)
    'data.frame': 10000 obs. of 2 variables:
     $ days: num 0 6.45 12.91 19.36 25.82 ...
     $ expl: num 0.499 0.499 0.499 0.499 0.499 ...
     - attr(*, "out.attrs")=List of 2
      ..$ dim : Named int 100 100
      .. ..- attr(*, "names")= chr "days" "expl"
      ..$ dimnames:List of 2
      .. ..$ days: chr "days= 0.00" "days= 6.454545" "days= 12.909091" "days= 19.363636" ...
      .. ..$ expl: chr "expl=0.4985331" "expl=0.5615784" "expl=0.6246238" "expl=0.6876691" ...

Generally it is a bad idea to use <- inside data.frame(). I'm not sure if it's illegal, but it certainly is confusing.

And the predict result might have had the correct length had you used the newax dataframe, but it needed to be passed to persp as a properly dimensioned matrix:

    ?persp

At the end of your constructed example try this instead:

    mod <- lm(y ~ rd*k)
    newax <- expand.grid(rd = seq(min(rd), max(rd), length=100), k = seq(min(k), max(k), length=100))
    fit <- predict(mod, newax)
    graph <- persp(x=seq(min(rd), max(rd), length=100), y=seq(min(k), max(k), length=100), z=matrix(fit, 100, 100), expand=0.5, ticktype="detailed", theta=-45)

persp is not a lattice plotting function, so it does its plotting by side-effects. It does return a value, but it is only a transformation matrix, and I do not see that you have intentions to use the graph object, but who knows.

> Thank u again and happy new year
> Federico Bonofiglio

David Winsemius, MD
West Hartford, CT
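The pattern described in this reply (build a grid whose column names match the model terms, predict on it, reshape to a matrix, pass to persp) can be tried on simulated data. Everything in the sketch below, including the variable names x1, x2 and the coefficients, is invented for illustration.

```r
set.seed(42)
n  <- 60
x1 <- runif(n, 0, 10)
x2 <- runif(n, 0, 5)
z  <- 1 + 0.5 * x1 - 0.3 * x2 + 0.2 * x1 * x2 + rnorm(n, sd = 0.5)
mod <- lm(z ~ x1 * x2)

# Grid axes, and a grid whose column names match the predictors in the model
gx <- seq(min(x1), max(x1), length.out = 40)
gy <- seq(min(x2), max(x2), length.out = 40)
grid <- expand.grid(x1 = gx, x2 = gy)

# expand.grid varies x1 fastest, so filling the matrix column by column
# puts the prediction for (gx[i], gy[j]) into zhat[i, j], as persp expects
zhat <- matrix(predict(mod, newdata = grid), nrow = length(gx), ncol = length(gy))

persp(gx, gy, zhat, theta = -45, expand = 0.5, ticktype = "detailed",
      xlab = "x1", ylab = "x2", zlab = "fitted z")
```

The order of the arguments to expand.grid matters here: swapping them would transpose the fitted surface relative to the axes.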
Re: [R] Basic Stars Plot - help ..
On 12.01.2011 15:53, JP wrote:

Hi there Rers

I am trying a very basic stars plot:

x <- matrix(c(1,4,3,1.1,2,3,4,3,1,1,5,2), ncol = 3, byrow = TRUE, dimnames=list(c("a","b","c","d"), c("x","y","z")))
stars(x, draw.segments = TRUE, radius=TRUE)

Can anyone explain what I am seeing there? EACH of my plots should have 3 coloured sectors, no? One each for x, y and z. How come I am seeing only two sectors? Also, the lengths of the radii show the ratio compared to the other values of the columns (e.g. x), correct? (And there is no direct relationship between the values of a single row, or is there?)

I am using R 1.12.1 for Linux,

I guess you forgot that the auto-scaling scales the radii to [0,1] separately for each of the three columns of the matrix.

Uwe Ligges

Many Thanks
JP
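To see what that scaling does to JP's matrix, the columns can be rescaled by hand. This is a sketch that assumes stars() with its default scale = TRUE maps each column to [0,1] as its help page describes; the rescaling below is written out manually and is not the internal code of stars().

```r
x <- matrix(c(1, 4, 3, 1.1, 2, 3, 4, 3, 1, 1, 5, 2), ncol = 3, byrow = TRUE,
            dimnames = list(c("a", "b", "c", "d"), c("x", "y", "z")))

# Rescale every column independently so that its minimum becomes 0
# and its maximum becomes 1.
scaled <- apply(x, 2, function(col) (col - min(col)) / (max(col) - min(col)))
print(round(scaled, 3))

# Number of sectors with radius 0 in each star (row).
zero_sectors <- rowSums(scaled == 0)
print(zero_sectors)
```

Every row holds the column minimum for one variable, so under this scaling each star has one sector of radius zero, which would account for seeing only two sectors.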
[R] adonis, amova and haplotype frequency
Dear All, I'd like to perform adonis (from the vegan package) rather than amova (in ade4) on some haplotype data, as I have crossed factors. Is there a simple way to tweak the source to allow weights (haplotype frequencies) in a similar way to amova?

Best
Simon
Re: [R] R not recognized in command line
I'm sorry for the late reply.

The output of echo %PATH%:

C:\GTK\bin; C:\Program Files\MiKTeX 2.8\miktex\bin ;C:\Windows\system32 ;C:\Windows ;C:\Windows\System32\Wbem; C:\Windows\System32\WindowsPowerShell\v1.0\; C:\Program Files\MATLAB\R2008a\bin; C:\Program Files\MATLAB\R2008a\bin\win32; C:\Program Fil es\QuickTime\QTSystem\; C:\MinGW\bin;c:\Program Files\Microsoft SQL Server\100\To ols\Binn\; c:\Program Files\Microsoft SQL Server\100\DTS\Binn\; C:\Program Files\ R\R-2.12.1\bin\; C:\Program Files\Python27; C:\Program Files\SSH Communications Security\SSH Secure Shell;C:\Python26;C:\Python26\Scripts

2011/1/8 Uwe Ligges lig...@statistik.tu-dortmund.de

OK, then let's see if you managed to change the PATH, or let us see what is incorrect there. Therefore, please do the following:

1. open a Windows command shell (what you call a DOS window)
2. type echo %PATH%
3. Send us the output.

Uwe Ligges

On 08.01.2011 19:29, Aaditya Nanduri wrote:

Mr. Gregory: I may have to resort to a roundabout method like yours. I just can't seem to make it work. Thank you for your help.

Mr. Spector: Every time I changed the path, I closed all the DOS windows. Yet, R is not recognized as a command. Also, I just want to say, you have an awesome last name.

--
Aaditya Nanduri
aaditya.nand...@gmail.com
Re: [R] Formatted output with alternating format at different rows
thx a lot Jim. sprintf solved my problem.

Ray

On Mon, Jan 3, 2011 at 3:40 PM, jim holtman jholt...@gmail.com wrote:

'sprintf' is your friend:

dummy3 = c(1.1, 2.2, 3.3)
dummy4 = c(4.4, 5.5, 6.6, 7.7)
dummy2 = c(8.8, 9.9)
cat(sprintf("%5.1f%6.2f%7.3f\n", dummy3[1], dummy3[2], dummy3[3]))
  1.1  2.20  3.300
cat(sprintf("%5.2f%6.3f%7.4f%8.5f\n", dummy4[1], dummy4[2], dummy4[3], dummy4[4]))
 4.40 5.500 6.6000 7.70000
cat(sprintf("%5.3f%6.4f\n", dummy2[1], dummy2[2]))
8.8009.9000
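When the number of values and their formats change from row to row, the format string can be assembled from vectors instead of being typed by hand. A minimal sketch along the lines of Jim's answer; format_row is a hypothetical helper name, not a base R function.

```r
# Build one "%w.df" conversion per value and apply them in a single sprintf call.
# values, widths and digits must all have the same length.
format_row <- function(values, widths, digits) {
  stopifnot(length(values) == length(widths),
            length(values) == length(digits))
  fmt <- paste0("%", widths, ".", digits, "f", collapse = "")
  do.call(sprintf, c(list(fmt), as.list(values)))
}

row3 <- format_row(c(1.1, 2.2, 3.3), widths = 5:7, digits = 1:3)
row2 <- format_row(c(8.8, 9.9), widths = c(5, 6), digits = c(3, 4))
cat(row3, "\n", row2, "\n", sep = "")
```

The two calls reproduce the first and third formats from the message above, so each row can carry its own widths and precisions.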
Re: [R] R not recognized in command line
-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Aaditya Nanduri
Sent: Wednesday, January 12, 2011 8:44 AM
To: Uwe Ligges
Cc: r-help@r-project.org; spec...@stat.berkeley.edu
Subject: Re: [R] R not recognized in command line

I'm sorry for the late reply.

The output of echo %PATH%:

C:\GTK\bin; C:\Program Files\MiKTeX 2.8\miktex\bin ;C:\Windows\system32 ;C:\Windows ;C:\Windows\System32\Wbem; C:\Windows\System32\WindowsPowerShell\v1.0\; C:\Program Files\MATLAB\R2008a\bin; C:\Program Files\MATLAB\R2008a\bin\win32; C:\Program Fil es\QuickTime\QTSystem\; C:\MinGW\bin;c:\Program Files\Microsoft SQL Server\100\To ols\Binn\; c:\Program Files\Microsoft SQL Server\100\DTS\Binn\; C:\Program Files\ R\R-2.12.1\bin\; C:\Program Files\Python27; C:\Program Files\SSH Communications Security\SSH Secure Shell;C:\Python26;C:\Python26\Scripts

For your system, the path to R should probably be

C:\Program Files\R\R-2.12.1\bin\i386

Hope this is helpful,

Dan

Daniel Nordlund
Bothell, WA USA
Re: [R] R not recognized in command line
On 12.01.2011 18:13, Daniel Nordlund wrote:

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Aaditya Nanduri
Sent: Wednesday, January 12, 2011 8:44 AM
To: Uwe Ligges
Cc: r-help@r-project.org; spec...@stat.berkeley.edu
Subject: Re: [R] R not recognized in command line

I'm sorry for the late reply.

The output of echo %PATH%:

C:\GTK\bin; C:\Program Files\MiKTeX 2.8\miktex\bin ;C:\Windows\system32 ;C:\Windows ;C:\Windows\System32\Wbem; C:\Windows\System32\WindowsPowerShell\v1.0\; C:\Program Files\MATLAB\R2008a\bin; C:\Program Files\MATLAB\R2008a\bin\win32; C:\Program Fil es\QuickTime\QTSystem\; C:\MinGW\bin;c:\Program Files\Microsoft SQL Server\100\To ols\Binn\; c:\Program Files\Microsoft SQL Server\100\DTS\Binn\; C:\Program Files\ R\R-2.12.1\bin\; C:\Program Files\Python27; C:\Program Files\SSH Communications Security\SSH Secure Shell;C:\Python26;C:\Python26\Scripts

For your system, the path to R should probably be

C:\Program Files\R\R-2.12.1\bin\i386

Right. Additionally, the output above is wrong anyway: entries must not end in \, and blanks before or after ; are not allowed.

Uwe

Hope this is helpful,

Dan

Daniel Nordlund
Bothell, WA USA
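Both remarks can be checked from within R by splitting the PATH into its entries, flagging entries padded with blanks, and testing each directory for the executable. A minimal sketch; find_on_path is a hypothetical helper written for this illustration, not part of R.

```r
# Split a PATH-style string and report, for every entry, whether it is
# padded with blanks and whether the given executable exists in it.
find_on_path <- function(exe,
                         path = Sys.getenv("PATH"),
                         sep = .Platform$path.sep) {
  entries <- strsplit(path, sep, fixed = TRUE)[[1]]
  trimmed <- gsub("^\\s+|\\s+$", "", entries)
  data.frame(
    entry   = entries,
    padded  = entries != trimmed,                     # leading/trailing blanks
    has_exe = file.exists(file.path(trimmed, exe)),   # executable found here?
    stringsAsFactors = FALSE
  )
}

# On the poster's Windows machine one would look for "R.exe":
res <- find_on_path("R.exe")
print(res[res$has_exe | res$padded, ])
```

If no row has has_exe equal to TRUE, the directory that actually holds R.exe (bin\i386 in Dan's suggestion) is missing from the PATH.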
[R] Require
I think that the quietly argument in require isn't working:

require('JumboShrimp', quietly=TRUE)
Warning in library(package, lib.loc = lib.loc, character.only = TRUE, logical.return = TRUE, :
  there is no package called 'JumboShrimp'

By the way, the behavior is the same with options(warn=0) or options(warn=1).

I'm using R 2.12 (2010-10-15) on a Windows 7 machine.
Re: [R] Require
On 12.01.2011 18:53, Gene Leynes wrote:

I think that the quietly argument in require isn't working:

require('JumboShrimp', quietly=TRUE)
Warning in library(package, lib.loc = lib.loc, character.only = TRUE, logical.return = TRUE, :
  there is no package called 'JumboShrimp'

?require says: If TRUE, no message confirming package loading is printed, and most often, no errors/warnings are printed if package loading fails.

It does not say that it keeps quiet if the package does not even exist on your machine. If you really want to suppress such important warnings, use

suppressWarnings(require('JumboShrimp', quietly=TRUE))

By the way, the behavior is the same with options(warn=0) or options(warn=1)

... which I would expect from reading ?options. You have to use options(warn=-1) to suppress.

Best,
Uwe Ligges

I'm using R 2.12 (2010-10-15) on a Windows 7 machine.
Re: [R] Require
Gene Leynes gleyne...@gmail.com writes:

I think that the quietly argument in require isn't working:

require('JumboShrimp', quietly=TRUE)
Warning in library(package, lib.loc = lib.loc, character.only = TRUE, logical.return = TRUE, :
  there is no package called 'JumboShrimp'

Isn't quietly meant to suppress a message if loading is successful?

Matthew
[R] Don´t know what test i have to use
Hello, I'm starting my PhD and I am stuck because I have little knowledge of R and statistics. I have a model of this kind:

binary response variable: prevalence of infection (0/1)
3 categorical independent variables: sex, month and name of the area

I was trying a full model like this, before simplification:

model <- aov(prevalencia ~ sex*month*area)

but the Fligner test told me that I don't have homoscedasticity, so I suppose I should try glm, with a model

model2 <- glm(prevalencia ~ edad*sexo*mes*zona, binomial)

Is that correct? Where must I put the link (logit)?

Thanks very much

--
View this message in context: http://r.789695.n4.nabble.com/Don-t-know-what-test-i-have-to-use-tp3214491p3214491.html
Sent from the R help mailing list archive at Nabble.com.
[R] Metafor vs Meta vs Spreadsheet: wrong numbers
Hello,

I experimented with the Metafor and Meta packages with a view to replacing Excel for meta-analysis. I performed the first worked example provided in Michael Borenstein's book Introduction to Meta-Analysis with Excel, Metafor and Meta. The numbers given by my spreadsheet, which I validated against Borenstein's book, correspond quite closely to those given by Meta, but are different from those obtained using Metafor. For the fixed effect, I infer that the differences are related to numerical issues, but for the random effect, the numbers are considerably different. Unfortunately, I could not find where I went wrong. I would be grateful if someone would have a look at my calculations.

Here are the meta-analysis commands:

### USING METAFOR
library(metafor)
( dat <- escalc(m1i=m1i, sd1i=sd1i, n1i=n1i, m2i=m2i, sd2i=sd2i, n2i=n2i, measure="SMD", data=metaData, append=T) ) # COMPUTE EFFECT SIZE
( res <- rma.uni(yi, vi, data=dat, method="HE", level=95) ) ### RANDOM EFFECT
( res <- rma.uni(yi, vi, data=dat, method="FE", level=95) ) ### FIXED EFFECT

### USING META
( res <- metacont(metaData[,3], metaData[,1], metaData[,2], metaData[,6], metaData[,4], metaData[,5],
                  studlab=rownames(metaData), sm="SMD", level = 0.95, level.comb = 0.95,
                  comb.fixed=TRUE, comb.random=TRUE, label.e="Experimental", label.c="Control",
                  bylab=rownames(metaData)) )

The whole R script is temporarily available at http://bit.ly/eYesbZ
The spreadsheet is temporarily available at http://bit.ly/fAYWPo

Kind regards,

S.-É. Parent, Eng., Ph.D.
Department of Soils and Agrifood Engineering, Université Laval
Canada
Re: [R] R not recognized in command line
Although it could easily be user error, I never got Rpy or Rpy2 working in any sort of reliable way. However, I did learn a couple of things about the Windows PATH.

First, (as others have mentioned) it's easiest to modify the PATH through the Windows GUI that comes up when you right click My Computer; from there you can modify the Environment Variables.

Second, here are examples of the syntax for adding things to the PATH from a DOS command line:

path = %PATH%;C:\Program Files\R\R-2.12.1\bin
path = %PATH%;C:\Rtools\bin
path = %PATH%;C:\Rtools\MinGW\bin
path = %PATH%;C:\Rtools\perl\bin
set R_HOME=C:\Program Files\R\R-2.12.1\

(Note that set takes no blanks around the = sign; with blanks, the variable name itself would end in a space.)

Third, I found this R command useful for checking the path from within R:

Sys.getenv()[['PATH']]

On Wed, Jan 12, 2011 at 10:43 AM, Aaditya Nanduri aaditya.nand...@gmail.com wrote:

I'm sorry for the late reply.

The output of echo %PATH%:

C:\GTK\bin; C:\Program Files\MiKTeX 2.8\miktex\bin ;C:\Windows\system32 ;C:\Windows ;C:\Windows\System32\Wbem; C:\Windows\System32\WindowsPowerShell\v1.0\; C:\Program Files\MATLAB\R2008a\bin; C:\Program Files\MATLAB\R2008a\bin\win32; C:\Program Fil es\QuickTime\QTSystem\; C:\MinGW\bin;c:\Program Files\Microsoft SQL Server\100\To ols\Binn\; c:\Program Files\Microsoft SQL Server\100\DTS\Binn\; C:\Program Files\ R\R-2.12.1\bin\; C:\Program Files\Python27; C:\Program Files\SSH Communications Security\SSH Secure Shell;C:\Python26;C:\Python26\Scripts

2011/1/8 Uwe Ligges lig...@statistik.tu-dortmund.de

OK, then let's see if you managed to change the PATH, or let us see what is incorrect there. Therefore, please do the following:

1. open a Windows command shell (what you call a DOS window)
2. type echo %PATH%
3. Send us the output.

Uwe Ligges

On 08.01.2011 19:29, Aaditya Nanduri wrote:

Mr. Gregory: I may have to resort to a roundabout method like yours. I just can't seem to make it work. Thank you for your help.

Mr. Spector: Every time I changed the path, I closed all the DOS windows. Yet, R is not recognized as a command. Also, I just want to say, you have an awesome last name.

--
Aaditya Nanduri
aaditya.nand...@gmail.com
Re: [R] A question on dummy variable
Thanks Gabor and others for your input. I admit that I should have included some reproducible code showing what I wanted. It was on my mind, but I held back because this was not an R-related query but rather a general statistics one.

Here I am using dummy variables in a ***time series context***. Please assume the following artificial time series along with the quarterly dummies:

library(zoo)
# my time series
MyTimeSeries <- zooreg(101:126, start=as.yearqtr(as.Date("2005-01-01")), frequency=4)

# creation of quarterly dummy
### dummy1
dummy1 <- zooreg(Reduce(rbind, rep(list(diag(4)), 7)), start=as.yearqtr(as.Date("2005-01-01")), frequency=4)
dummy1 <- merge(dummy1, MyTimeSeries, all=F)[,1:4]
colnames(dummy1) <- paste("dummy", 1:4, sep="")
### dummy2
dummy2 <- dummy1 - 1/4
### dummy3
dummy3 <- dummy1
dummy3[dummy3 == 0] = -1/(4-1)

# Time series with quarterly dummy
TS_with_dummy1 <- cbind(MyTimeSeries, dummy1[,-4])
TS_with_dummy2 <- cbind(MyTimeSeries, dummy2[,-4])
TS_with_dummy3 <- cbind(MyTimeSeries, dummy3[,-4])
TS_with_dummy1
TS_with_dummy2
TS_with_dummy3

As in my previous post, there are 3 types of quarterly dummies here: dummy1, dummy2, and dummy3. I used to use the dummy1 definition for all my time series analysis. Later, in the vars package, I noticed the 2nd type of definition. The 3rd definition I came across somewhere on the net (I can't recall where at this time).

My question was: which is the centred dummy variable? According to the help page of the vars package, the 2nd one is the centred dummy. However, I am searching for the definition of centred dummy variables in the context of time series analysis. Therefore I would like to know why the 2nd one is called a centred dummy, and why people prefer it to the standard dummy definition (i.e. dummy1). Can you please explain?

Thanks and regards,

-Original Message-
From: Gabor Grothendieck [mailto:ggrothendi...@gmail.com]
Sent: 12 January 2011 05:47
To: Christofer Bogaso
Cc: r-help@r-project.org
Subject: Re: [R] A question on dummy variable

On Tue, Jan 11, 2011 at 3:18 PM, Christofer Bogaso bogaso.christo...@gmail.com wrote:

Dear all, I would like to ask one question related to statistics, more specifically on defining dummy variables. As of now, I have come across 3 different kinds of dummy variables (assuming I am working with seasonal dummies, and the number of seasons is 4):

dummy1 <- diag(4)
for(i in 1:3) dummy1 <- rbind(dummy1, diag(4))
dummy1 <- dummy1[,-4]
dummy2 <- dummy1
dummy2[dummy2 == 0] = -1/(4-1)
dummy3 <- dummy1 - 1/4

head(dummy1)
     [,1] [,2] [,3]
[1,]    1    0    0
[2,]    0    1    0
[3,]    0    0    1
[4,]    0    0    0
[5,]    1    0    0
[6,]    0    1    0

head(dummy2)
       [,1]   [,2]   [,3]
[1,]  1.000 -0.333 -0.333
[2,] -0.333  1.000 -0.333
[3,] -0.333 -0.333  1.000
[4,] -0.333 -0.333 -0.333
[5,]  1.000 -0.333 -0.333
[6,] -0.333  1.000 -0.333

head(dummy3)
      [,1]  [,2]  [,3]
[1,]  0.75 -0.25 -0.25
[2,] -0.25  0.75 -0.25
[3,] -0.25 -0.25  0.75
[4,] -0.25 -0.25 -0.25
[5,]  0.75 -0.25 -0.25
[6,] -0.25  0.75 -0.25

Now I want to know which type of dummy definition is called the centered dummy, and why it is called so. Is it equivalent to use any of the above definitions (at least the 2nd and 3rd)? It would really be very helpful if somebody could offer any suggestion and clarification.

The contrasts of your dummy1 matrix are contr.SAS contrasts in R. (The default contrasts in R are contr.treatment, which are the same as contr.SAS except that contr.SAS uses the last level as the base whereas treatment contrasts use the first level as the base.)

options(contrasts = c("contr.SAS", "contr.poly"))
f <- gl(4, 1, 16)
M <- model.matrix( ~ f )
all( M[, -1] == dummy1) # TRUE

Centered contrasts are ones which have been centered, i.e. the mean of each column has been subtracted from that column. This is equivalent to saying that the column sums are zero.

The means of the three columns of dummy1 are c(1/4, 1/4, 1/4), so if we subtract 1/4 from dummy1 we get a centered contrasts matrix. That is precisely what you did to get dummy3. We can check that dummy3 is centered:

colSums(dummy3) # 0 0 0

dummy2 is just a scaled version of dummy3. In fact dummy2 equals dummy3 / .75, so it's not fundamentally different. Its columns still sum to zero, so it's still centered.

all( dummy2 == dummy3 / .75) # TRUE
colSums(dummy2) # 0 0 0 except for floating point error

--
Statistics Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
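The centering argument can be replayed on plain matrices, without zoo. A short sketch, assuming four complete years of quarterly data as in the original question; the object names centered and rescaled are chosen here for illustration.

```r
# 16 x 3 matrix of ordinary 0/1 quarterly dummies (fourth quarter as base).
dummy1 <- do.call(rbind, rep(list(diag(4)), 4))[, -4]

# Centering: subtract each column's mean (1/4 here) from that column.
centered <- sweep(dummy1, 2, colMeans(dummy1))

# Rescaling the centered dummies by 1/0.75 gives the 1 and -1/3 coding.
rescaled <- centered / 0.75

print(colMeans(dummy1))
print(colSums(centered))
print(colSums(rescaled))
print(sort(unique(as.vector(rescaled))))
```

Both the -1/4 and the -1/3 codings have zero column sums, which is the sense of "centered" in Gabor's reply. Note that this holds exactly only when every quarter occurs equally often; with an incomplete final year the column means are no longer 1/4.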
Re: [R] Require
I read the help first, and read it again now, but I still don't see why the warning was generated. The help seems to state clearly that the user can suppress warnings: "most often, no errors/warnings are printed if package loading fails". If the package doesn't exist, then it would fail to load, right? When that failure happens, I thought I could suppress the warning message with the quietly argument.

Thank you for pointing out the suppressWarnings function; that one is new to me, and I suppose it will work here. However, it seems that it shouldn't be necessary with quietly=TRUE.

By the way, I wanted to suppress the confusing message generated by R and put in a simple recommendation that the user should try installing the package.

2011/1/12 Uwe Ligges lig...@statistik.tu-dortmund.de

On 12.01.2011 18:53, Gene Leynes wrote:

I think that the quietly argument in require isn't working:

require('JumboShrimp', quietly=TRUE)
Warning in library(package, lib.loc = lib.loc, character.only = TRUE, logical.return = TRUE, :
  there is no package called 'JumboShrimp'

?require says: If TRUE, no message confirming package loading is printed, and most often, no errors/warnings are printed if package loading fails.

It does not say that it keeps quiet if the package does not even exist on your machine. If you really want to suppress such important warnings, use

suppressWarnings(require('JumboShrimp', quietly=TRUE))

By the way, the behavior is the same with options(warn=0) or options(warn=1)

... which I would expect from reading ?options. You have to use options(warn=-1) to suppress.

Best,
Uwe Ligges

I'm using R 2.12 (2010-10-15) on a Windows 7 machine.
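What Gene describes, replacing R's warning with a simple recommendation to install the package, can be built on the suppressWarnings() call Uwe suggested. A minimal sketch; load_or_advise is a hypothetical helper name and the wording of the message is made up here.

```r
# Try to attach a package; on failure, print a short hint instead of
# R's own warning. Returns TRUE or FALSE invisibly.
load_or_advise <- function(pkg) {
  ok <- suppressWarnings(
    require(pkg, character.only = TRUE, quietly = TRUE)
  )
  if (!ok) {
    message("Package '", pkg, "' is not available; ",
            "try install.packages(\"", pkg, "\")")
  }
  invisible(ok)
}

load_or_advise("JumboShrimp")   # prints only the short hint
```

Because the result is returned invisibly, a script can still branch on it, for example if (!load_or_advise("somePackage")) stop("cannot continue").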
Re: [R] Don´t know what test i have to use
On Jan 12, 2011, at 12:51 PM, gaiarrido wrote:

Hello, I'm starting my PhD and I am stuck because I have little knowledge of R and statistics. I have a model of this kind:

binary response variable: prevalence of infection (0/1)
3 categorical independent variables: sex, month and name of the area

I was trying a full model like this, before simplification:

model <- aov(prevalencia ~ sex*month*area)

but the Fligner test told me that I don't have homoscedasticity, so I suppose I should try glm, with a model

model2 <- glm(prevalencia ~ edad*sexo*mes*zona, binomial)

Is that correct? Where must I put the link (logit)?

Why not read the section on binomial in the help page for glm? There you will learn what the default link is for binomial.

--
David

Thanks very much

David Winsemius, MD
West Hartford, CT
Re: [R] Don´t know what test i have to use
Hi,

That is basically correct. You can specify the link as logit (see my example), but that is the default, so you do not strictly need to in this case. I would encourage you to keep your variables (prevalencia, edad, sexo, mes) stored in a data frame, in which case you would add the data = argument to glm().

model2 <- glm(prevalencia ~ edad * sexo * mes * zona, family = binomial(link = "logit"), data = your_dataframe)

Also, you might take a look at ?predict.glm; it has some examples with binomial data based on the wonderful book by Drs. Venables and Ripley.

Oh, and finally, if you have 12 levels of months, ? levels of zones, and 2 levels of sex, you might not want the 4-way interactions that you will get by default from using the '*' operator inside a formula. Unless you have a theory that there is an additional effect of being a middle aged female in the month of July for zone 8, but not

Cheers,
Josh

On Wed, Jan 12, 2011 at 9:51 AM, gaiarrido gaiarr...@usal.es wrote:

Hello, I'm starting my PhD and I am stuck because I have little knowledge of R and statistics. I have a model of this kind:

binary response variable: prevalence of infection (0/1)
3 categorical independent variables: sex, month and name of the area

I was trying a full model like this, before simplification:

model <- aov(prevalencia ~ sex*month*area)

but the Fligner test told me that I don't have homoscedasticity, so I suppose I should try glm, with a model

model2 <- glm(prevalencia ~ edad*sexo*mes*zona, binomial)

Is that correct? Where must I put the link (logit)?

Thanks very much

--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/
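Josh's two suggestions, keeping the variables in a data frame and limiting the order of the interactions, might look like the sketch below. The data are simulated purely for illustration (the level names, sample size and infection probabilities are assumptions, and edad is left out), so the fitted coefficients mean nothing.

```r
set.seed(1)
n <- 300

# Simulated stand-in for the real survey data.
d <- data.frame(
  sexo = factor(sample(c("f", "m"), n, replace = TRUE)),
  mes  = factor(sample(c("jan", "may", "sep"), n, replace = TRUE)),
  zona = factor(sample(c("z1", "z2", "z3"), n, replace = TRUE))
)
d$prevalencia <- rbinom(n, size = 1, prob = ifelse(d$sexo == "m", 0.40, 0.25))

# (a + b + c)^2 expands to all main effects plus all two-way interactions,
# leaving out the three-way term that a * b * c would add.
model2 <- glm(prevalencia ~ (sexo + mes + zona)^2,
              family = binomial(link = "logit"), data = d)

print(family(model2))
print(attr(terms(model2), "term.labels"))
```

Writing family = binomial alone gives the same fit, since logit is the default link for the binomial family.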
Re: [R] Integrate and subdivisions limit
Dear all, I have some issues with integrate in R, so I would like to request your help. I am trying to calculate the integral of f(x)*g(x). f(x) is a step function while g(x) is a polynomial. If f(x) changes its value only a few times (5 or 6 'steps'), everything is calculated correctly (I verified the results on scrap paper), but if f(x) takes around 800 different values I receive the error

Error in integrate number of subdivisions reached

I did some checks on the internet and found that I can increase the number of subdivisions (this is a parameter of integrate()). Thus I raised it from 100 to 1000 (and then to 10000).

A. Does this make the resulting error larger, or does it only stress the computer?
B. When the number was raised to 10.000 I started getting the error message "roundoff error was detected".

What do you think I should do to solve that? I would like to thank you in advance for your help.

Best Regards
Alex

There's obviously a more numerically stable approach. If g(x) is a polynomial, you know its polynomial antiderivative. Take that and sum over all intervals where the step function is constant.

Example: g(x) = 1 constant, integrate x^2 over 0..1 in 10000 subdivisions. The antiderivative is 1/3 * x^3, thus

g <- function(x) 1
x <- seq(0, 1, len=10001)
sum((x[2:10001]^3 - x[1:10000]^3)*g(2:10001))/3  #= 0.333

The antiderivative of a polynomial a_1 x^n + ... + a_{n+1}, given as a vector c(a1,...), can also be determined automatically; no manual work is necessary.

Hans Werner
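The approach Hans Werner outlines can be sketched in a few lines. This is an illustration under stated assumptions rather than code from the thread: the helper names are made up, the step function is described by its breakpoints and the constant height on each interval, and polynomial coefficients are given highest power first, as in the reply.

```r
# Antiderivative of a_1 x^n + ... + a_{n+1}; coefficients highest power first.
poly_antiderivative <- function(coef) {
  c(coef / rev(seq_along(coef)), 0)   # constant of integration set to 0
}

# Evaluate a polynomial (highest power first) by Horner's scheme.
poly_eval <- function(coef, x) {
  res <- numeric(length(x))
  for (a in coef) res <- res * x + a
  res
}

# Integral of step(x) * poly(x), where the step function takes the value
# heights[i] on the interval from breaks[i] to breaks[i + 1].
integrate_step_poly <- function(breaks, heights, coef) {
  stopifnot(length(breaks) == length(heights) + 1)
  G <- poly_eval(poly_antiderivative(coef), breaks)
  sum(heights * diff(G))
}

# The example from the reply: step function 1, polynomial x^2, 10000 pieces.
breaks <- seq(0, 1, length.out = 10001)
print(integrate_step_poly(breaks, rep(1, 10000), c(1, 0, 0)))
```

Each interval contributes exactly height * (G(right) - G(left)), so there is no subdivision limit to hit, and the cost grows only linearly with the number of steps.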
[R] 2d plot with modification of plotting symbol to indicate third dimension.
I would like to plot 3-dimensional data on a two-dimensional scatter-plot. Is there a way I can automatically modify the plot symbol (e.g. changing size or color) to indicate the value of a third variable? E.g. how can I plot weight vs. age and indicate the value of muscle mass for each weight-age pair by making the size of the plotted point proportional to the subject's muscle mass?

Thanks,
John

John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)
Re: [R] 2d plot with modification of plotting symbol to indicate third dimension.
You don't give an example, but in general you can use a vector for cex with the values proportional to the third variable. Same goes for color: col can be a vector, not just a single value.

This has been discussed before on-list, and fairly recently.

Sarah

On Wed, Jan 12, 2011 at 2:19 PM, John Sorkin jsor...@grecc.umaryland.edu wrote:

I would like to plot 3-dimensional data on a two-dimensional scatter-plot. Is there a way I can automatically modify the plot symbol (e.g. changing size or color) to indicate the value of a third variable? E.g. how can I plot weight vs. age and indicate the value of muscle mass for each weight-age pair by making the size of the plotted point proportional to the subject's muscle mass?

Thanks,
John

--
Sarah Goslee
http://www.functionaldiversity.org
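A minimal sketch of Sarah's suggestion, using simulated age, weight and muscle-mass values as stand-ins for John's data; scale_cex is a hypothetical helper that maps the third variable onto a usable range of symbol sizes, and the range 0.5 to 3 is an arbitrary choice.

```r
# Map a numeric vector linearly onto symbol sizes between lo and hi.
scale_cex <- function(v, lo = 0.5, hi = 3) {
  r <- range(v, na.rm = TRUE)
  if (r[1] == r[2]) return(rep((lo + hi) / 2, length(v)))  # constant input
  lo + (hi - lo) * (v - r[1]) / (r[2] - r[1])
}

# Simulated data for illustration only.
set.seed(42)
age    <- runif(30, 20, 80)
weight <- rnorm(30, 75, 10)
muscle <- rnorm(30, 30, 5)

# cex accepts one value per point, so each symbol gets its own size.
plot(age, weight, cex = scale_cex(muscle), pch = 21, bg = "grey80",
     xlab = "Age", ylab = "Weight")
```

cex scales the linear size of the symbol, so the symbol's area grows with the square of the mapped value; passing sqrt-transformed values to the helper is one way to make area track the third variable instead.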
[R] metafor/ meta-regression
Hi,

I have tried to do a meta-regression with the metafor package, and I would like to get the standardized coefficients for each variable. However, with the command

res <- rma.uni(yi, vi, method="REML", mods=~cota+DL+uso+gadiente+idade, data=turbidez)

I only get the unstandardized coefficients (estimate) of the multiple regression. What do I need to do?

Thanks

Fernanda Melo Carneiro
contact: (62) 3521-1480 and 8121-7374
www.ecoevol.ufg.br
Laboratório de Ecologia Teórica e Síntese (UFG)
Re: [R] Don´t know what test i have to use
Thanks very much, both. I'm starting to play with it. I was a little afraid because it is part of my job, but now I find it great fun.

Josh, I have data for just 3 representative months, and it cannot be ruled out a priori that the rate of change across the months differs between the 2 sexes.

Thanks again

--
View this message in context: http://r.789695.n4.nabble.com/Don-t-know-what-test-i-have-to-use-tp3214491p3214638.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] data frame subset too slow
Sorry for the late response. I was away on vacation and was unable to keep working on the code. Anyway, I was unable to provide the *str* of that specific data since it all lives inside a big package with lots of inputs/outputs. Glancing quickly through the code, I narrowed the problem down (and made a bad guess) to the data frame. But it turned out that the data frame was not the reason. After carefully checking through the package, I found that there is a double for loop. I replaced that double for loop, and now instead of running ~13 hrs, the package runs ~13 min for a similar dataset.

Thanks for all your help,

D.

On 12/30/10 11:40 AM, jim holtman wrote:
> If you want the data in the first column of the dataframe, then you should be using '[['. Notice what comes back in each of these cases:
>
>     str(dat)
>     'data.frame': 8 obs. of 5 variables:
>      $ sample.1.200..n..TRUE.: int 25 199 70 124 93 157 49 137 192 57 ...
>      $ runif.n.  : num 0.7725 0.0263 0.0728 0.7594 0.2792 ...
>      $ runif.n..1: num 0.4304 0.8608 0.0882 0.5666 0.1721 ...
>      $ runif.n..2: num 0.3797 0.1191 0.0481 0.3297 0.0649 ...
>      $ runif.n..3: num 0.0895 0.0441 0.0403 0.9679 0.3986 ...
>     str(dat[1])
>     'data.frame': 8 obs. of 1 variable:
>      $ sample.1.200..n..TRUE.: int 25 199 70 124 93 157 49 137 192 57 ...
>     str(dat[[1]])
>      int [1:8] 25 199 70 124 93 157 49 137 192 57 ...
>     str(dat$sample.1.200..n..TRUE)
>      int [1:8] 25 199 70 124 93 157 49 137 192 57 ...
>     str(dat[,1])
>      int [1:8] 25 199 70 124 93 157 49 137 192 57 ...
>
> You will get different classes of values. We would really need to see the output of 'str' on your data structures to see what might be happening. Your data is not that big and most subsetting/extractions should take less than a second unless there is something funny in your data. So provide the 'str' so we can see.
>
> On Thu, Dec 30, 2010 at 11:28 AM, Duke duke.li...@gmx.com wrote:
>> Hi Jim,
>> Is it really a problem for me to use [1] instead of [[1]]? Will this make it run slower? Also, if I use dat$V1 %in% list$V1, will it be fine?
>> Anyway, my data and list are basically gene lists (tab delimited):
>>
>>     $ head test.txt
>>     Xkr4 chr1 - 3204562 3661579 3206102 3661429 3 3204562,3411782,3660632, 3207049,3411982,3661579,
>>     Rp1 chr1 - 4280926 4399322 4283061 4399268 4 4280926,4341990,4342282,4399250, 4283093,4342162,4342918,4399322,
>>     Rp1_2 chr1 - 4333587 4350395 4334680 4342906 4 4333587,4341990,4342282,4350280, 4340172,4342162,4342918,4350395,
>>     Sox17 chr1 - 4481008 4486494 4481796 4483487 5 4481008,4483180,4483852,4485216,4486371, 4482749,4483547,4483944,4486023,4486494,
>>     Mrpl15 chr1 - 4763278 4775807 4764532 4775758 5 4763278,4767605,4772648,4774031,4775653, 4764597,4767729,4772814,4774186,4775807,
>>     Mrpl15_2 chr1 - 4763278 4775807 4775807 4775807 4 4763278,4767605,4772648,4775653, 4764597,4767729,4772814,4775807,
>>
>>     $ head list.txt
>>     GeneNames Chr Start End
>>     0610007C21Rik chr5 31351012 31356996
>>     0610007L01Rik chr5 130695613 130719635
>>     0610007L01Rik_2 chr5 130698204 130719635
>>     0610007P08Rik chr13 63916627 64001609
>>     0610007P08Rik_2 chr13 63916641 63970963
>>     0610007P14Rik chr12 87156404 87165495
>>
>> Thanks,
>> D.
>>
>> On 12/30/10 11:13 AM, jim holtman wrote:
>>> You should be using dat[[1]]. Here is an example with 8 rows that takes about 0.02 seconds to get the subset. Provide an 'str' of what your data looks like.
>>>
>>>     n <- 8  # rows to create
>>>     dat <- data.frame(sample(1:200, n, TRUE), runif(n), runif(n), runif(n), runif(n))
>>>     lst <- data.frame(sample(1:100, n, TRUE), runif(n), runif(n), runif(n), runif(n))
>>>     str(dat)
>>>     'data.frame': 8 obs. of 5 variables:
>>>      $ sample.1.200..n..TRUE.: int 39 116 69 163 51 125 144 32 28 4 ...
>>>      $ runif.n.  : num 0.519 0.793 0.549 0.77 0.272 ...
>>>      $ runif.n..1: num 0.691 0.89 0.783 0.467 0.357 ...
>>>      $ runif.n..2: num 0.705 0.254 0.584 0.998 0.279 ...
>>>      $ runif.n..3: num 0.873 1 0.678 0.702 0.455 ...
>>>     str(lst)
>>>     'data.frame': 8 obs. of 5 variables:
>>>      $ sample.1.100..n..TRUE.: int 38 83 38 70 77 44 81 55 32 1 ...
>>>      $ runif.n.  : num 0.0621 0.7374 0.074 0.4281 0.0516 ...
>>>      $ runif.n..1: num 0.879 0.294 0.146 0.884 0.58 ...
>>>      $ runif.n..2: num 0.648 0.745 0.825 0.507 0.799 ...
>>>      $ runif.n..3: num 0.2523 0.1679 0.9728 0.0478 0.0967 ...
>>>     system.time({
>>>     +   dat.sub <- dat[dat[[1]] %in% lst[[1]],]
>>>     + })
>>>        user  system elapsed
>>>        0.02    0.00    0.01
>>>     str(dat.sub)
>>>     'data.frame': 39803 obs. of 5 variables:
>>>      $ sample.1.200..n..TRUE.: int 39 69 51 32 28 4 69 3 48 69 ...
>>>      $ runif.n.  : num 0.5188 0.5494 0.2718 0.5566 0.0893 ...
>>>      $ runif.n..1: num 0.691 0.783 0.357 0.619 0.717 ...
>>>      $ runif.n..2: num 0.705
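The difference between '[' and '[[' discussed in this thread can be checked on a small made-up data frame. The following is only a sketch: the column names id and x and the seed are assumptions for illustration, not taken from the poster's data.

```r
# Toy data frames; column names and seed are illustrative assumptions.
set.seed(1)
n   <- 8
dat <- data.frame(id = sample(1:200, n, TRUE), x = runif(n))
lst <- data.frame(id = sample(1:100, n, TRUE), x = runif(n))

class(dat[1])     # a one-column data.frame
class(dat[[1]])   # the integer vector stored in that column

# Matching the column vectors gives one TRUE/FALSE per row ...
keep <- dat[[1]] %in% lst[[1]]
# ... whereas matching the one-column data frames compares them as
# whole list elements and returns a single value.
whole <- dat[1] %in% lst[1]

dat.sub <- dat[keep, ]
```

Only the '[[' form yields a row-wise logical vector that can be used to subset dat.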
Re: [R] 2d plot with modification of plotting symbol to indicate third dimension.
Look at the symbols function for some options for doing what you suggest (you can also search for "bubble plot" to find a couple of other implementations). If you want to go a bit further than what symbols does for you, then look at the my.symbols function in the TeachingDemos package.

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of John Sorkin
Sent: Wednesday, January 12, 2011 12:19 PM
To: r-help@r-project.org
Subject: [R] 2d plot with modification of plotting symbol to indicate third dimension.

I would like to plot 3-dimensional data on a two-dimensional scatter-plot. Is there a way I can automatically modify the plot symbol (e.g. changing size or color) to indicate the value of a third variable? E.g. how can I plot weight vs. age and indicate the value of muscle mass for each weight-age pair by making the plot point proportional to the subject's muscle mass?

Thanks,
John

John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
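A minimal sketch of the symbols() suggestion, using base graphics only. The age, weight and muscle values are made up for illustration, and scaling the radius so that circle area tracks muscle mass is one possible choice rather than anything prescribed in the thread.

```r
# Made-up example data (not from the original post).
age    <- c(25, 34, 47, 52, 63, 71)
weight <- c(68, 82, 75, 90, 77, 64)
muscle <- c(32, 38, 30, 41, 29, 22)

# Radius chosen so that circle AREA is proportional to muscle mass.
radius <- sqrt(muscle / pi)

pdf(NULL)  # null device so the sketch also runs non-interactively
symbols(age, weight, circles = radius, inches = 0.25,
        bg = "grey80", xlab = "Age", ylab = "Weight")
invisible(dev.off())
```

In an interactive session, drop the pdf(NULL) and dev.off() lines to see the plot on screen.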
[R] aggregating date data
I tried a date-by-date forecast of a time series and it seems to be too wild. How can I aggregate the dates into weeks or months as required? Thanks.

The input looks like

    ID    datadate (YYYY-MM-DD)    value_for_day

and I want to be able to change it to

    ID    dataweek     value_for_week

or

    ID    datamonth    value_for_month
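One possible way to do this in base R is to derive a week or month label from each date and then aggregate on it. This is only a sketch: the data values are invented, the column names follow the layout in the question, and summing the daily values is an assumption (a mean may suit a forecast better).

```r
# Invented daily data in the layout described above.
daily <- data.frame(
  ID       = rep(c("A", "B"), each = 4),
  datadate = as.Date(c("2011-01-03", "2011-01-04", "2011-01-10", "2011-02-01",
                       "2011-01-03", "2011-01-05", "2011-01-31", "2011-02-02")),
  value    = c(1, 2, 3, 4, 10, 20, 30, 40),
  stringsAsFactors = FALSE
)

# Month label such as "2011-01"; week label is the Monday that starts the week.
daily$datamonth <- format(daily$datadate, "%Y-%m")
daily$dataweek  <- as.character(cut(daily$datadate, "week"))

# Sum the daily values within each ID and period.
monthly <- aggregate(value ~ ID + datamonth, data = daily, FUN = sum)
weekly  <- aggregate(value ~ ID + dataweek,  data = daily, FUN = sum)
```

Replace FUN = sum with FUN = mean (or another summary) depending on what the weekly or monthly value should represent.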
Re: [R] Outputting csv file from dataframe with columns in a particular order
Thanks to all who responded.

On Jan 12, 10:34 am, Peter Ehlers ehl...@ucalgary.ca wrote:
> On 2011-01-12 07:16, analys...@hotmail.com wrote:
>> I have a dataframe with columns ID, date, estimate, actual (but not necessarily in that order - I do a merge somewhere and that somehow messes up the order of the columns).
>> How can I output it to a csv file with the columns in the order that I want?
>
> Let's say that your data.frame is DF.
>
>     mynames <- c("ID", "date", "estimate", "actual")
>     write.csv(DF[, mynames], )
>
> Peter Ehlers
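The column-reordering idea above can be tried end to end as follows. This is a sketch with a made-up DF; the temporary output file and row.names = FALSE are assumptions added for the example.

```r
# Made-up data frame whose columns are in the "wrong" order.
DF <- data.frame(
  actual   = c(10, 12),
  ID       = c(1, 2),
  estimate = c(9.5, 12.2),
  date     = as.Date(c("2011-01-11", "2011-01-12"))
)

# Desired column order, then index the data frame by name before writing.
mynames  <- c("ID", "date", "estimate", "actual")
out_file <- tempfile(fileext = ".csv")   # hypothetical output path
write.csv(DF[, mynames], file = out_file, row.names = FALSE)

# Read the header back to confirm the order.
written_names <- names(read.csv(out_file))
```

Indexing by a character vector both selects and orders the columns, so the order produced by merge() no longer matters.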
[R] help in calculating ar on ranked vector
I was using ar() from the stats package to calculate the autoregressive coefficient. It works on vector z, but it will not work on vector

    rz <- rank(z, ties.method = "average")

What did I miss? Any info will be greatly appreciated.

TIA
Re: [R] Help with Data Transformation
Hi John,

Thank you for your patience. I was away for a State certification exam yesterday, so am just getting back to this. Reading through your response, I believe I wasn't clear enough about what I'm trying to do. Your description seems to rearrange the matrix without grouping the analytical results for a single sample onto a single line, as I had hoped. I may have confused things by attempting to send a truncated/simplified dataset.

Restatement of needs:
* I have 863 individual samples. The following columns contain invariant results for each sample:
  - Transect,Offset,Location,fldsampid,CLP_ID,sacode,matrix,LTCCODE,
    Northing,Easting,CRDUNITS,Event,LOGDATE,sbd,sed.
  - Sorting can make use of fldsampid as these values are entirely unique to each sample.
* Each sample is associated with one or more of the following 48 analytical parameters:
  - AG,AL,ALK,ALKB,ALKC,AS,B,BA,BE,BR,CA,CD,CL,CO,CR,CU,
    DOC,FE,Hg,HG,HGACIDLAB,HGEXTINO,HGEXTORG,HGNONMOB,HGSEMIMOB,K,
    MEHG,MG,MN,MO,NH3N,NI,NO2N,NO3,NO3N,PB,PO4,S,SB,SE,SO4,
    SOLID,SSC,TL,TOC,V,Zn,ZN
  - These are currently stored in the PARLABEL column.
* For each sample ID I would like to create a single line;
  - Extract each PARLABEL to use as a column name; and
  - Place the Result in the appropriate column.
* I can subset the data so that prccode, Lab, EXMCODE, Analysis, PARVQ, RL, EPA_FLAGS, and units are irrelevant to the issue.

The following snippet should illustrate the absolute minimum needs:

INPUT
fldsampid  | PARLABEL   | Result
-----------+------------+--------
fldsampid1 | PARLABEL-a | value-8
fldsampid1 | PARLABEL-b | value-5
fldsampid1 | PARLABEL-x | value-2
fldsampid1 | PARLABEL-y | value-0
fldsampid2 | PARLABEL-a | value-9
fldsampid2 | PARLABEL-c | value-8
fldsampid3 | PARLABEL-a | value-2
fldsampid3 | PARLABEL-d | value-8
fldsampid3 | PARLABEL-w | value-3
fldsampid3 | PARLABEL-x | value-9
fldsampid3 | PARLABEL-y | value-6

OUTPUT
fldsampid  | PARLABEL-a | PARLABEL-b | PARLABEL-w | PARLABEL-x | PARLABEL-y
-----------+------------+------------+------------+------------+-----------
fldsampid1 | value-8    | value-5    | NA         | value-2    | value-0
fldsampid2 | value-9    | NA         | NA         | NA         | NA
fldsampid3 | value-2    | NA         | value-3    | value-9    | value-6

If it would help I could attach a 31kb file written with

    write.table(Units_NG.L, file="Units_NG.L", quote=FALSE, sep="\t")

This subset has 97 individual samples and 3 PARLABELS distributed across 249 individual lines.

Added Responses:

1. The structure of my actual input file appears to be correct per the following (I had sent you a separate extraction from an Excel file). Strings are factors, numbers are num or int, and a date was changed via as.Date():

    'data.frame': 19694 obs. of 25 variables:
     $ Transect : Factor w/ 78 levels "FLR01","FLR02",..: 1 1 1 1 1 1 1 1 1 1 ...
     $ Offset   : Factor w/ 16 levels "0","A","B","C",..: 1 1 1 1 1 1 1 1 1 1 ...
     $ Location : Factor w/ 246 levels "FLR010","FLR01A",..: 1 1 1 1 1 1 1 1 1 1 ...
     $ fldsampid: Factor w/ 863 levels "FLR010-ANE1",..: 1 1 1 1 1 1 1 1 1 1 ...
     $ CLP_ID   : Factor w/ 586 levels "","MY6591","MY6593",..: 1 1 1 1 1 1 1 1 1 1 ...
     $ sacode   : Factor w/ 2 levels "FD","N": 2 2 2 2 2 2 2 2 2 2 ...
     $ matrix   : Factor w/ 6 levels "SE","SO","TA",..: 3 3 3 3 3 3 3 3 3 3 ...
     $ LTCCODE  : Factor w/ 4 levels "BH","LK","RE",..: 4 4 4 4 4 4 4 4 4 4 ...
     $ Northing : num 2444733 2444733 2444733 2444733 2444733 ...
     $ Easting  : num 5684613 5684613 5684613 5684613 5684613 ...
     $ CRDUNITS : Factor w/ 1 level "FT": 1 1 1 1 1 1 1 1 1 1 ...
     $ Event    : int 1 1 1 1 1 1 1 1 1 1 ...
     $ LOGDATE  :Class 'Date' num [1:19694] -717743 -717743 -717743 -717743 -717743 ...
     $ sbd      : num 0 0 0 0 0 0 0 0 0 0 ...
     $ sed      : num 0 0 0 0 0 0 0 0 0 0 ...
     $ prccode  : Factor w/ 5 levels "INO","MET","MI",..: 2 2 2 2 2 2 2 2 2 2 ...
     $ Lab      : Factor w/ 5 levels "A4SW","BRLS",..: 2 2 2 2 2 2 2 2 2 2 ...
     $ EXMCODE  : Factor w/ 5 levels "FLDFLT","METHOD",..: 2 2 2 2 2 2 2 2 2 2 ...
     $ Analysis : Factor w/ 23 levels "A2320","A2540G",..: 10 11 12 12 12 12 12 12 12 12 ...
     $ PARLABEL : Factor w/ 48 levels "AG","AL","ALK",..: 27 20 1 2 6 7 8 9 12 14 ...
     $ PARVQ    : Factor w/ 3 levels "=","ND","TR": 1 1 2 1 1 1 1 2 1 1 ...
     $ Result   : num 20.6 24.7 5 14900 60 100 4930 4 182 80 ...
     $ RL       : num 3.1 0.77 10 5750 160 790 160 8 10 80 ...
     $ EPA_FLAGS: Factor w/ 10 levels "","J","J-","J+",..: 4 1 7 3 2 2 1 7 1 2 ...
     $ units    : Factor w/ 3 levels "ug/kg","ug/L",..: 1 1 1 1 1 1 1 1 1 1 ...

2. etc... Sorry to confuse you, this was to indicate additional columns.

Guy Jett
ITSI, A Gilbane Company
(925) 946-3340 Direct
(925) 457-4168 ITSI Cell
gj...@itsi.com

-----Original Message-----
From: John Kane [mailto:jrkrid...@yahoo.ca]
Sent: Monday, January 10, 2011 4:29 PM
To: r-help@r-project.org; Guy
[R] navigating in lists
Dear list members,

I am stuck with navigating in a rather complicated list object. In general I would need a solution to access all first (or other) elements of the different sublists in one list:

    test=list(a=list(1,2),b=list(3,4),c=list(5,6))

like:

    test[[1:3]][[1]]

which should result in c(1,3,5).

Is there any way to access lists in such a way? Using unlist would create quite complicated objects.

Cheers
Jannis
Re: [R] plot: skip a range of axis
You could always create a new vector, something like

    Xprime <- if x<0 x else x-2   #not valid R code

Thus mapping +1 to -1, and shifting everything else down. Fixing the x-tick labels is left as a homework problem :-)

I'm assuming from your description that there are no y-values corresponding to -1<x<1, so plotting y vs Xprime won't lose any of your data.

Carl

From: Yuan Jian jayuan2008_at_yahoo.com
Date: Wed, 12 Jan 2011 04:15:38 -0800 (PST)

> thanks Jim,
> I found gap.plot separates the x axis or y axis into two boxes. Do you know any plot tool that can skip a range in the x-axis or y-axis without lines?
> regards
> YU
>
> On Wed, 12/1/11, Jim Lemon jim_at_bitwrit.com.au wrote:
> From: Jim Lemon jim_at_bitwrit.com.au
> Subject: Re: [R] plot: skip a range of axis
> To: Yuan Jian jayuan2008_at_yahoo.com
> Cc: r-help_at_r-project.org
> Received: Wednesday, 12 January, 2011, 10:26 AM
>
>> On 01/12/2011 03:46 PM, Yuan Jian wrote:
>>> Hi,
>>> I am using plot to show scatter points in 2-D. In my data, there is no data between -1 and +1 on the x-axis. I want to skip this region, i.e. the x axis becomes [-Inf:-1, 1:Inf]. Can anyone tell me how to do this?
>>
>> Hi Yu,
>> Try the gap.plot function in the plotrix package.
>> Jim
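The shifted-axis idea above can be written in valid R with ifelse(). This is only a sketch on made-up x and y values; the tick positions and the "-1|1" label at the join are illustrative choices, not part of the original reply.

```r
# Made-up data with nothing between -1 and +1 on the x axis.
x <- c(-5, -3, -1.5, -1, 1, 1.5, 3, 5)
y <- x^2

# Shift the positive half left by 2 so that +1 lands on -1.
xprime <- ifelse(x < 0, x, x - 2)

pdf(NULL)  # null device so the sketch also runs non-interactively
plot(xprime, y, xaxt = "n", xlab = "x (gap between -1 and 1 removed)")
# Tick positions are on the shifted scale; labels show the original values.
axis(1, at = c(-5, -3, -1, 1, 3), labels = c("-5", "-3", "-1|1", "3", "5"))
invisible(dev.off())
```

The single tick labelled "-1|1" marks the point where the two halves of the axis meet.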
Re: [R] 2d plot with modification of plotting symbol to indicate third dimension.
Google on "R Graph Gallery" to find examples with code. You can also almost certainly RSiteSearch() on appropriate keys to find a pre-existing function (which someone may provide you on the list).

Also: ?symbols

-- Bert

On Wed, Jan 12, 2011 at 11:19 AM, John Sorkin jsor...@grecc.umaryland.edu wrote:
> I would like to plot 3-dimensional data on a two-dimensional scatter-plot. Is there a way I can automatically modify the plot symbol (e.g. changing size or color) to indicate the value of a third variable? E.g. how can I plot weight vs. age and indicate the value of muscle mass for each weight-age pair by making the plot point proportional to the subject's muscle mass?
> Thanks,
> John

--
Bert Gunter
Genentech Nonclinical Biostatistics
467-7374
http://devo.gene.com/groups/devo/depts/ncb/home.shtml
Re: [R] navigating in lists
    sapply(test, '[[', 1)
    a b c
    1 3 5

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Jannis
Sent: Wednesday, January 12, 2011 1:50 PM
To: r-help@r-project.org
Subject: [R] navigating in lists

Dear list members,

I am stuck with navigating in a rather complicated list object. In general I would need a solution to access all first (or other) elements of the different sublists in one list:

    test=list(a=list(1,2),b=list(3,4),c=list(5,6))

like:

    test[[1:3]][[1]]

which should result in c(1,3,5).

Is there any way to access lists in such a way? Using unlist would create quite complicated objects.

Cheers
Jannis
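The sapply() answer works because '[[' is an ordinary function that can be applied to every sublist, with the index passed as an extra argument. A short sketch using the list from the question; the vapply() variant is an addition for illustration and simply makes the expected result type explicit.

```r
test <- list(a = list(1, 2), b = list(3, 4), c = list(5, 6))

# First element of every sublist: `[[` is called as `[[`(sublist, 1).
firsts <- sapply(test, `[[`, 1)

# Second element of every sublist, with the result type stated up front.
seconds <- vapply(test, function(s) s[[2]], numeric(1))
```

Both results keep the names a, b and c of the outer list.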
Re: [R] Help with Data Transformation
Hi:

This seems like a problem that is well suited for the cast() function in package reshape. Here's a toy example:

    library(reshape)
    df <- data.frame(idnum = rep(101:104, c(3, 2, 4, 3)),
                     lab = unlist(sapply(c(3, 2, 4, 3), function(x) sample(LETTERS[1:6], x))),
                     y = rpois(12, 7))
    df
       idnum lab  y
    1    101   C  7
    2    101   F 11
    3    101   E  4
    4    102   B 10
    5    102   A  7
    6    103   E  6
    7    103   D  9
    8    103   B  3
    9    103   F 12
    10   104   C 10
    11   104   D  9
    12   104   A  5
    cast(df, idnum ~ lab, value = 'y')
      idnum  A  B  C  D  E  F
    1   101 NA NA  7 NA  4 11
    2   102  7 10 NA NA NA NA
    3   103 NA  3 NA  9  6 12
    4   104  5 NA 10  9 NA NA

Since you have multiple 'invariant' variables, you could use something like

    cast(df, ... ~ lab, value = 'y')

This would make all variables other than lab the 'rows' and the values of lab separate 'columns', with the value of the variable y inserted in the appropriate locations and NA fill otherwise.

cast() is an alternative to the reshape() function in base R.

HTH,
Dennis
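For comparison, the same long-to-wide step can be sketched with reshape() from base R, which this thread mentions as the alternative. The data below mirrors the INPUT snippet from the question with invented numeric results and shortened labels; treating every column other than PARLABEL and Result as constant within a sample is an assumption.

```r
# Long-format toy data modelled on the INPUT snippet (values invented).
long <- data.frame(
  fldsampid = c("s1", "s1", "s1", "s1", "s2", "s2", "s3", "s3", "s3", "s3", "s3"),
  PARLABEL  = c("a", "b", "x", "y", "a", "c", "a", "d", "w", "x", "y"),
  Result    = c(8, 5, 2, 0, 9, 8, 2, 8, 3, 9, 6),
  stringsAsFactors = FALSE
)

# One row per sample, one Result.<PARLABEL> column per parameter,
# NA where a sample has no value for that parameter.
wide <- reshape(long,
                idvar     = "fldsampid",
                timevar   = "PARLABEL",
                v.names   = "Result",
                direction = "wide")
```

With v.names given explicitly, any additional columns in the data frame are treated as constant within each fldsampid and carried along unchanged.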
Re: [R] Don´t know what test i have to use
... But I would think that month should be treated as a cyclical quantity, not as a factor with 12 independent levels, e.g. by transforming month to sin(2*pi*monthNumber/12). This assumes 1 year periodicity, which might not be right, of course. Time series methods could obviously be relevant here.

Given the possible importance of such periodicity and the relative complexity of the methodology necessary to deal with it properly, you might benefit from consulting your local statistician for help.

-- Bert

On Wed, Jan 12, 2011 at 10:43 AM, Joshua Wiley jwiley.ps...@gmail.com wrote:
> Hi,
>
> That is basically correct. You can specify the link as logit (see my example), but that is the default so you do not strictly need to in this case. I would encourage you to keep your variables (prevalencia, edad, sexo, mes) stored in a data frame, in which case you would add the data = argument to glm().
>
>     model2 <- glm(prevalencia ~ edad * sexo * mes * zona,
>                   family = binomial(link = "logit"), data = your_dataframe)
>
> Also, you might take a look at ?predict.glm; it has some examples with binomial data based on the wonderful book by Drs. Venables and Ripley.
>
> Oh, and finally, if you have 12 levels of months, ? levels of zones, and 2 levels of sex, you might not want the 4-way interactions that you will get by default from using the '*' operator inside a formula. Unless you have a theory that there is an additional effect of being a middle aged female in the month of July for zone 8, but not
>
> Cheers,
> Josh
>
> On Wed, Jan 12, 2011 at 9:51 AM, gaiarrido gaiarr...@usal.es wrote:
>> Hello,
>> I'm starting my PhD and I have had to stop because I have little knowledge of R and statistics.
>> I've got a model of this kind:
>> binary response variable: prevalence of infection (0/1)
>> 3 categorical independent variables: sex, month and name of the area
>>
>> I was trying a full model like this, before the simplification:
>>
>>     model <- aov(prevalencia~sex*month*area)
>>
>> but the Fligner test told me that I haven't got homoscedasticity, so I suppose I should try glm, with a model
>>
>>     model2 <- glm(prevalencia~edad*sexo*mes*zona,binomial)
>>
>> Is that correct? Where must I put the link (logit)?
>> Thanks very much
>>
>> --
>> View this message in context: http://r.789695.n4.nabble.com/Don-t-know-what-test-i-have-to-use-tp3214491p3214491.html
>> Sent from the R help mailing list archive at Nabble.com.
>
> --
> Joshua Wiley
> Ph.D. Student, Health Psychology
> University of California, Los Angeles
> http://www.joshuawiley.com/

--
Bert Gunter
Genentech Nonclinical Biostatistics
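To make the cyclical-month suggestion concrete, here is a small sketch on simulated data. Everything in it is an assumption for illustration: the simulated prevalence, the sample size, and the use of a cosine term alongside the sine term (a common companion that lets the seasonal peak fall in any month; the thread itself only mentions the sine transform).

```r
# Simulated data; none of these values come from the original poster.
set.seed(42)
n <- 600
d <- data.frame(
  mes  = sample(1:12, n, TRUE),
  sexo = factor(sample(c("F", "M"), n, TRUE))
)

# Encode month as a point on the yearly cycle.
d$sin_mes <- sin(2 * pi * d$mes / 12)
d$cos_mes <- cos(2 * pi * d$mes / 12)

# Simulated 0/1 infection status with a seasonal effect.
d$prevalencia <- rbinom(n, 1, plogis(-1 + 0.8 * d$sin_mes))

# Binomial GLM with the (default) logit link stated explicitly.
fit <- glm(prevalencia ~ sexo + sin_mes + cos_mes,
           family = binomial(link = "logit"), data = d)
```

This uses two seasonal coefficients instead of the eleven that a 12-level month factor would need, at the cost of assuming a smooth one-year cycle.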
[R] How to disable using enter key to exit the browser in debugging mode
Dear R,

How can I disable using the enter key to exit browser() in debug mode? I would love to have this option because it is so annoying to jump out of the debugging mode unexpectedly when I don't want to. I guess some of us have encountered at least one of these situations:

1. Accidentally pressed the enter key within the browser.
2. Copied and pasted a piece of debugging code containing empty lines to the prompt within the debugging mode.
3. If I paste a piece of code to the prompt to debug as follows, it will eventually jump out before I can do anything.

    ### copy starting from this line ##
    test <- function() {
      x <- 5
      browser()
      y <- 4
    }
    test()
    ### end of copy at this line ##

Any suggestions are most welcome!

Feng

--
Feng Li
Department of Statistics
Stockholm University
106 91 Stockholm, Sweden
http://feng.li/
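One setting that appears relevant here is the browserNLdisabled option documented in ?options, which controls whether an empty line (newline) acts as a synonym for "n" in the browser. The sketch below only sets the option and defines the test function from the question; whether it covers every situation listed above has not been checked here.

```r
# Disable "newline means n" inside browser(); see ?options.
options(browserNLdisabled = TRUE)

test <- function() {
  x <- 5
  browser()   # with the option set, a bare Enter should no longer step on
  y <- 4
  x + y
}

# test()   # run interactively; type c (or n) explicitly to continue
# options(browserNLdisabled = FALSE)   # restores the default behaviour
```

Putting the options() call in a startup file such as .Rprofile would apply it to every session.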
[R] RNetCDF: retrieving variable names and units
Dear List,

does anybody have experience with the RNetCDF package? I manage to open a connection and copy data from a ncdf file, but would need a way to automatically retrieve variable names (ideally all of them from one file) and units from the file.

Any ideas?

Jannis
Re: [R] RNetCDF: retrieving variable names and units
Hi Jannis,

although I don't know how you'd do that with RNetCDF, with the ncdf package it's pretty easy:

    ncid = open.ncdf( 'file.nc' )
    nvars = ncid$nvars
    for( ivar in 1:nvars )
        print(paste("var number", ivar, "is named", ncid$var[[ivar]]$name,
                    "and has units", ncid$var[[ivar]]$units))

Regards,

--Dave

Jannis wrote:
> Dear List,
> does anybody have experience with the RNetCDF package? I manage to open a connection and copy data from a ncdf file, but would need a way to automatically retrieve variable names (ideally all of them from one file) and units from the file.
> Any ideas?
> Jannis

--
David W. Pierce
Division of Climate, Atmospheric Science, and Physical Oceanography
Scripps Institution of Oceanography
(858) 534-8276 (voice) / (858) 534-8561 (fax)
dpie...@ucsd.edu
Re: [R] RNetCDF: retrieving variable names and units
There are a number of functions in the package to inquire about the file contents. See library(help = RNetCDF).

For example:

    library(RNetCDF)
    nc <- open.nc("file.nc")
    var.inq.nc(nc, 0)
    $id
    [1] 0

    $name
    [1] "longitude_U"

    $type
    [1] "NC_DOUBLE"

    $ndims
    [1] 1

    $dimids
    [1] 1

    $natts
    [1] 0

You can then read from the file with something like this:

    obj0 <- var.inq.nc(nc, 0)
    dat <- var.get.nc(nc, obj0$name, start = ...)

Going with the ncdf or ncdf4 package is probably better (I just happen to be more familiar with RNetCDF).

Cheers, Mike.

On Thu, Jan 13, 2011 at 9:00 AM, Jannis bt_jan...@yahoo.de wrote:
> Dear List,
> does anybody have experience with the RNetCDF package? I manage to open a connection and copy data from a ncdf file, but would need a way to automatically retrieve variable names (ideally all of them from one file) and units from the file.
> Any ideas?
> Jannis

--
Michael Sumner
Institute for Marine and Antarctic Studies, University of Tasmania
Hobart, Australia
e-mail: mdsum...@gmail.com
[R] syntax for extending a line in a script??
Hello,

A hopefully simple question. I use 'R' through emacs, but I suspect the following would occur with any manner of text editor:

- my editor has a normally quite handy feature where it will automatically indent to the appropriate level when I start a new line. However, this occasionally creates cases where there is no friendly way to break a long line of code into two lines which still function as one command. Therefore, I need a nice way to be able to flag 'R' to know that the code is continuing on the next line.

Let me explain via example:

    numericColumns <- names(listOfDataFrames[[myDF]][,columnsOI])
        [sapply(listOfDataFrames[[myDF]][,columnsOI], is.numeric) ]

As you can see in this case, I would *like* these 2 lines of code to be read as 1 line, but since the names(blah) command is sufficiently a command on its own, 'R' sees this as a completed line of code. I could try to break it up at different points, but emacs (and other text editors) takes a guess as to the most intelligent way to indent, so that if I were to write something like:

    numericColumns <- names(listOfDataFrames[[myDF]][,columnsOI]) [sapply(
        listOfDataFrames[[myDF]][,columnsOI], is.numeric) ]

it would actually indent something more like this:

    numericColumns <- names(listOfDataFrames[[myDF]][,columnsOI]) [sapply(
                                        listOfDataFrames[[myDF]][,columnsOI], is.numeric) ]

and as you can see, that doesn't help the issue of preventing the code from wrapping around (and therefore doesn't help readability).

Is there some simple way to flag that the next line is continuing? Something like python's \ at the end of a line? I tried wrapping the whole thing in curly braces { } but that didn't work, either.

Thanks!
Mike
[R] speed up subsetting with certain conditions
Hi folks,

I am working on a project that requires subsetting a found file based on a known file. The known file contains several lines like the ones below:

    chr1 3237546 3237547 rs52310428 0 +
    chr1 3237549 3237550 rs52097582 0 +
    chr2 4513326 4513327 rs29769280 0 +
    chr2 4513337 4513338 rs33286009 0 +

where the first column can be chr2, chr1, chr12 etc. The second and third are numbers (coordinates). The found file contains lines like:

    chr1 3213435 G C
    chr1 3237547 T C
    chr1 3237549 G T
    chr2 4513326 A G
    chr2 4513337 C G

where the first column, again, can be chr1, chr2, chr12 etc. and the second is a number. What I have to do is separate the found file into two files: one (foundY) contains the lines that have the same first column as some line of the known file and whose second column lies in the range given by columns 2 and 3 of that line, and one (foundN) contains the lines that do not meet this condition. For the two examples above, foundN will be the first line, and foundY will be the next 4 lines.

What I came up with is this algorithm:
* get the unique items in the first column of the found file (chr1, chr2, chr12, chr13 etc.)
* for each unique item, take the subsets of the known file and the found file that have the same first column, then scan each item in the found subset to see if any known line meets the condition

The code is below:

    ## CODE START ###
    # import known and found files to data frames
    known <- read.table( "known.txt", sep="\t", header=FALSE )
    found <- read.table( "found.txt", sep="\t", header=FALSE, fill=TRUE )
    # get the uniq item in first column of found file
    found.Chr <- as.character(found[!duplicated(found[[1]]),1])
    # create two empty result data frames
    foundN <- found[0,]
    foundY <- found[0,]
    # scan for each of the uniq items
    for ( iChr in found.Chr ) {
      # subset of known and found with specific item
      found.iChr <- found[found[[1]]==iChr,]
      known.iChr <- known[known[[1]]==iChr,]
      # scan through all found subset items
      if ( nrow(known.iChr)>0 ) {
        for ( i in 1:nrow(found.iChr) ) {
          if ( nrow(known.iChr[known.iChr[[3]]>=found.iChr[i,2] & known.iChr[[2]]<=found.iChr[i,2],])==0 ) {
            foundN <- rbind( foundN, found.iChr[i,] )
          } else {
            foundY <- rbind( foundY, found.iChr[i,] )
          }
        }
      }
    }
    ## CODE END ###

The code works well, but I have tested it only on small known and found files. When trying with larger files (the known file can contain ~15 million lines, the found file ~15k lines), it takes hours to run. I want to speed up the process, and I believe there must be a better algorithm to do this with R. My questions are:
* does anybody have a better algorithm, comments or suggestions?
* I read (google) that matrices work faster than data frames. Can I use matrices for this case? (Are matrices for numbers only?)
* I read (google) that I should avoid rbind, and preallocate the data frame for faster speed. How would I do that in this case?

Thank you very much in advance,

Best,
D.
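One way to avoid both the inner loop and the repeated rbind() calls is sketched below. It is not the poster's method: it sorts the known intervals of each chromosome by start, keeps a running maximum of the end coordinates, and uses findInterval() to locate, for every found position, the last interval that starts at or before it. The column names chr, start, end and pos are hypothetical (the real files have no header), and the data are the example lines from the question.

```r
# Example data from the question; column names are hypothetical.
known <- data.frame(
  chr   = c("chr1", "chr1", "chr2", "chr2"),
  start = c(3237546, 3237549, 4513326, 4513337),
  end   = c(3237547, 3237550, 4513327, 4513338),
  stringsAsFactors = FALSE
)
found <- data.frame(
  chr = c("chr1", "chr1", "chr1", "chr2", "chr2"),
  pos = c(3213435, 3237547, 3237549, 4513326, 4513337),
  stringsAsFactors = FALSE
)

# TRUE for each found row whose position lies inside at least one known
# interval (both ends inclusive) on the same chromosome.
in_known <- function(found, known) {
  hit <- logical(nrow(found))
  for (ch in unique(found$chr)) {
    k <- known[known$chr == ch, ]
    if (nrow(k) == 0) next
    k       <- k[order(k$start), ]
    max_end <- cummax(k$end)             # furthest end among intervals so far
    rows    <- which(found$chr == ch)
    idx     <- findInterval(found$pos[rows], k$start)  # last start <= pos
    ok      <- idx > 0                   # 0 means pos is before every start
    hit[rows[ok]] <- max_end[idx[ok]] >= found$pos[rows[ok]]
  }
  hit
}

hit    <- in_known(found, known)
foundY <- found[hit, ]
foundN <- found[!hit, ]
```

The result is built with a single logical index, so no data frame grows inside a loop. In this sketch, found rows on a chromosome that is absent from the known file end up in foundN.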
Re: [R] syntax for extending a line in a script??
On 1/12/2011 2:46 PM, Mike Williamson wrote: Hello, A hopefully simple question. I use 'R' through emacs, but I suspect the following would occur with any manner of text editor: - my editor has a normally quite handy feature where it will automatically indent to the appropriate level when I start a new line. However, this occasionally creates cases where there is no friendly way to break a long line of code into two lines which still function as one command. Therefore, I need a nice way to be able to flag 'R' to know that the code is continuing on the next line. Let me explain via example: numericColumns- names(listOfDataFrames[[myDF]][,columnsOI]) [sapply(listOfDataFrames[[myDF]][,columnsOI], is.numeric) ] You can put the right hand side of the assignment in parentheses. Then even with the same breaks, the first line is not complete, so R will continue parsing. An emacs still indents reasonably. (I added a second line break to try and avoid email wrapping affecting things). numericColumns - (names(listOfDataFrames[[myDF]][,columnsOI]) [sapply(listOfDataFrames[[myDF]][,columnsOI], is.numeric)]) As you can see in this case, I would *like* for these 2 lines of code to be read as 1 line, but since the names(blah) command is sufficiently a command on its own, 'R' see this as a completed line of code. I could try to break it up at different points, but emacs (and other text editors) takes a guess as to the most intelligent way to indent, so that if I were to write something like: numericColumns- names(listOfDataFrames[[myDF]][,columnsOI]) [sapply( listOfDataFrames[[myDF]][,columnsOI], is.numeric) ] it would actually indent something more like this: numericColumns- names(listOfDataFrames[[myDF]][,columnsOI]) [sapply( listOfDataFrames[[myDF]][,columnsOI], is.numeric) ] and as you can see, that doesn't help the issue of preventing the code from wrapping around (and therefore doesn't help readability). Is there some simple way to flag that the next line is continuing? 
Something like python's \ at the end of a line? I tried wrapping the whole thing around curly braces { } but that didn't work, either. Putting the right hand side in curly braces might work too. That would turn it into a code block, which should evaluate to whatever the last statement in the code block is (which in this case is the only statement). I wouldn't be surprised if there is some case where curly braces might lead to a different result; parentheses shouldn't (but I may be wrong). Thanks! Mike -- Brian S. Diggs, PhD Senior Research Associate, Department of Surgery Oregon Health Science University __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] speed up subsetting with certain conditions
On 1/12/2011 2:52 PM, Duke wrote: Hi folks, I am working on a project that requires subsetting of a found file based on some known file. The known file contains several lines like below: chr132375463237547rs523104280+ chr132375493237550rs520975820+ chr245133264513327rs297692800+ chr245133374513338rs332860090+ where the first column can be chr2, chr1, chr12 etc... The second and third are numbers (cordinates). The found file contains lines like: chr13213435GC chr13237547TC chr13237549GT chr24513326AG chr24513337CG where the first column, again, can be chr1, chr2, chr12 etc... and the second is a number. What I have to do is to separate the found file to two files: one (foundY) contains lines that have the same first column and the second column in range of the two columns 2 and 3 of any line of known file, and one (foundN) contains lines that do not meet the previous condition. For the two examples above, foundN will be the first line, and foundY will be the next 4 lines. What I came up with is this algorithm: * get the uniq item in the first column of found file (chr1, chr2, chr12, chr13 etc...) 
* for each of the uniq item, set subset of the known file and the found file that have same first column, then scanning each item in the known subset to see if any line meets any condition The code is like below: ## CODE START### # import known and found files to data frames known - read.table( known.txt, sep=\t, header=FALSE ) found - read.table( found.txt, sep=\t, header=FALSE, fill=TRUE ) # get the uniq item in first column of found file found.Chr - as.character(found[!duplicated(found[[1]]),1]) # create two empty result data frames foundN - found[0,] foundY - found[0,] # scan for each of the uniq items for ( iChr in found.Chr ) { # subset of known and found with specific item found.iChr - found[found[[1]]==iChr,] known.iChr - known[known[[1]]==iChr,] # scan through all found subset items if ( nrow(known.iChr)0 ) { for ( i in 1:nrow(found.iChr) ) { if ( nrow(known.iChr[known.iChr[[3]]=found.iChr[i,2] known.iChr[[2]]=found.iChr[i,2],])==0 ) { foundN - rbind( foundN, found.iChr[i,] ) } else { foundY - rbind( foundN, found.iChr[i,] ) } } } } ## CODE END### The code works well, but I tested it for only small known and found files. When trying with larger files (the known file can contains ~ 15 million lines, the found ~ 15k lines), it takes like hrs to run. I want to speed up the process, and I believe there must be a better algorithm to do this with R. My questions are: * any body has a better algorithm or comments or suggestion? The Bioconductor project has many tools for dealing with sequence-related data. 
With the data

k <- read.table(textConnection(
"chr1 3237546 3237547 rs52310428 0 +
chr1 3237549 3237550 rs52097582 0 +
chr2 4513326 4513327 rs29769280 0 +
chr2 4513337 4513338 rs33286009 0 +"))
f <- read.table(textConnection(
"chr1 3213435 G C
chr1 3237547 T C
chr1 3237549 G T
chr2 4513326 A G
chr2 4513337 C G"))

One might use the GenomicRanges package as

library(GenomicRanges)
kgr <- with(k, GRanges(V1, IRanges(V2, V3, names=V4), V6, score=V5))
fgr <- with(f, GRanges(V1, IRanges(V2, width=1), V3=V3, V4=V4))
olaps <- findOverlaps(fgr, kgr)
idx <- countOverlaps(fgr, kgr) != 0

resulting in

idx
[1] FALSE TRUE TRUE TRUE TRUE

This will be fast. One could write foundY with as.data.frame(fgr[idx]) (maybe a little editing) but likely one would want to stay in R / Bioc and do something more interesting... See http://bioconductor.org/install/index.html

Martin

* I read (google) that matrices work faster than data frame. Can I use matrices for this case? (is matrices for numbers only?)
* I read (google) that I should avoid rbind, and prelocate data frame for faster speed. How would I do that in this case?

Thank you very much in advance, Bests, D.

--
Dr. Martin Morgan, PhD
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
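On the rbind and preallocation questions quoted above, here is a minimal base-R sketch. It is not from the thread and is not meant to replace the GenomicRanges approach. The idea is to compute one logical flag per found row and then split the data frame once, instead of growing two data frames row by row. The helper name in_known_range and the toy data frames are made up for illustration; the column names follow the V1, V2, V3 layout that read.table produces.

```r
# Hypothetical helper: TRUE for each row of `found` whose position (V2)
# falls inside at least one [V2, V3] interval of `known` on the same V1.
in_known_range <- function(found, known) {
  known_by_chr <- split(known, as.character(known$V1))
  vapply(seq_len(nrow(found)), function(i) {
    k <- known_by_chr[[as.character(found$V1[i])]]
    !is.null(k) && any(k$V2 <= found$V2[i] & found$V2[i] <= k$V3)
  }, logical(1))
}

# Toy data mirroring the example lines in the post.
known <- data.frame(V1 = c("chr1", "chr1", "chr2", "chr2"),
                    V2 = c(3237546, 3237549, 4513326, 4513337),
                    V3 = c(3237547, 3237550, 4513327, 4513338),
                    stringsAsFactors = FALSE)
found <- data.frame(V1 = c("chr1", "chr1", "chr1", "chr2", "chr2"),
                    V2 = c(3213435, 3237547, 3237549, 4513326, 4513337),
                    stringsAsFactors = FALSE)

idx <- in_known_range(found, known)
foundY <- found[idx, ]    # rows inside a known range
foundN <- found[!idx, ]   # everything else
```

Each found row still triggers a scan of the matching known subset, so with ~15 million known lines an interval-based tool such as GenomicRanges will probably scale better; the sketch only shows how to avoid repeated rbind calls.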
Re: [R] syntax for extending a line in a script??
On Jan 12, 2011, at 5:46 PM, Mike Williamson wrote: Hello, A hopefully simple question. I use 'R' through emacs, but I suspect the following would occur with any manner of text editor: - my editor has a normally quite handy feature where it will automatically indent to the appropriate level when I start a new line. However, this occasionally creates cases where there is no friendly way to break a long line of code into two lines which still function as one command. Therefore, I need a nice way to be able to flag 'R' to know that the code is continuing on the next line. Let me explain via example: My practice is to use the opening of a paired code delimiter like [ or ( at the end of a line as I have modified your code to show: numericColumns - names(listOfDataFrames[[myDF]][,columnsOI])[ sapply(listOfDataFrames[[myDF]] [,columnsOI], is.numeric) ] As you can see in this case, I would *like* for these 2 lines of code to be read as 1 line, but since the names(blah) command is sufficiently a command on its own, 'R' see this as a completed line of code. I could try to break it up at different points, but emacs (and other text editors) takes a guess as to the most intelligent way to indent, so that if I were to write something like: numericColumns - names(listOfDataFrames[[myDF]][,columnsOI]) [sapply( listOfDataFrames[[myDF]][,columnsOI], is.numeric) ] it would actually indent something more like this: numericColumns - names(listOfDataFrames[[myDF]][,columnsOI]) [sapply( listOfDataFrames[[myDF]][,columnsOI], is.numeric) ] and as you can see, that doesn't help the issue of preventing the code from wrapping around (and therefore doesn't help readability). Is there some simple way to flag that the next line is continuing? Something like python's \ at the end of a line? I tried wrapping the whole thing around curly braces { } but that didn't work, either. Thanks! 
Mike Telescopes and bathyscaphes and sonar probes of Scottish lakes, Tacoma Narrows bridge collapse explained with abstract phase-space maps, Some x-ray slides, a music score, Minard's Napoleanic war: The most exciting frontier is charting what's already here. -- xkcd -- Help protect Wikipedia. Donate now: http://wikimediafoundation.org/wiki/Support_Wikipedia/en [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
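A small self-contained illustration of the point made in this thread: R keeps reading onto the next line as long as the expression so far is incomplete, for example because a bracket or parenthesis is still open. The data frame df below is invented for the example and stands in for listOfDataFrames[[myDF]][,columnsOI].

```r
df <- data.frame(a = 1:3, b = letters[1:3], c = c(0.5, 1.5, 2.5),
                 stringsAsFactors = FALSE)

# The line ends with an open "[", so the parser continues on the next line.
numericColumns <- names(df)[
  sapply(df, is.numeric)
]

# Wrapping the right-hand side in parentheses has the same effect,
# even when the break falls before the "[".
numericColumns2 <- (names(df)
                    [sapply(df, is.numeric)])
```

Without the parentheses, the second form would be parsed as two separate statements, which is the behaviour described in the original question.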
Re: [R] aggredating date data
?aggregate

I would convert your date character string into a date object (as.POSIXct), then extract month or week numbers from that vector (?format) and use them as indices for the aggregate function. There may be more elegant ways, though.

HTH
Jannis

--- analys...@hotmail.com analys...@hotmail.com wrote on Wed, 12.1.2011:

From: analys...@hotmail.com analys...@hotmail.com
Subject: [R] aggredating date data
To: r-help@r-project.org
Date: Wednesday, 12 January 2011, 20:20

I tried a date by date forecast of a time series and it seems to be too wild. How can I aggregate the date into weeks or months as required? Thanks.

The input looks like

ID datadate(-MM-DD) value_for_day
-- - --- -- --

and I want to be able to change it to

ID dataweek value_for_week
or
ID datamonth value_for_month

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
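The suggestion above could be sketched roughly as follows. The data frame daily and its column names are invented, and as.Date is used here instead of as.POSIXct because the values are whole days; treat it as one possible reading of the advice rather than the exact code Jannis had in mind.

```r
# Invented daily data in the ID / date / value layout described above.
daily <- data.frame(
  ID    = "A",
  date  = as.Date("2011-01-01") + 0:59,
  value = 1,
  stringsAsFactors = FALSE
)

# Derive grouping keys from the dates with format().
daily$month <- format(daily$date, "%Y-%m")
daily$week  <- format(daily$date, "%Y-%U")  # %U: week of year, weeks start on Sunday

# Sum the daily values within each ID and period.
monthly <- aggregate(value ~ ID + month, data = daily, FUN = sum)
weekly  <- aggregate(value ~ ID + week,  data = daily, FUN = sum)
```

Replacing sum with mean (or another summary) changes how the days in each period are combined.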
[R] Grouped bars in barplot
Dear all, I am trying to make a barplot with clustered pairs of bars, using class=numeric data and the following command: barplot(c(bline_precip[10,9], bline_runoff[10,9], cccma_precip[10,9], cccma_runoff[10,9], csiro_precip[10,9], csiro_runoff[10,9], ipsl_precip[10,9], ipsl_runoff[10,9], mpi_precip[10,9], mpi_runoff[10,9], ncar_precip[10,9], ncar_runoff[10,9], ukmo_precip[10,9], ukmo_runoff[10,9]), beside=TRUE, space=c(0,2)) This results in all bars being packed tightly together, but with no gap between each pair. I suspect the problem is something to do with the data not being a matrix, but I've tried using as.matrix for each data element and this doesn't seem to work. If any one has any suggestions I'd be very grateful to hear them. Also, I'm hoping to put a label beneath each pair of bars on the x-axis, in the centre. At present I can only get labels to appear directly underneath a single bar, as opposed to the centre of the pair of bars. Does anyone have any suggestions for solving this? Many thanks for any help offered. Steve __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] navigating in lists
Or, if for some reason the lists differ in length...

test=list(a=list(1,2),b=list(3,4),c=list(5,6,7))
picker <- function(x, i) { if(length(x)>=i) x[[i]] else NA }
pick <- function(list,i) { sapply(list, function(x) picker(x, i)) }

pick(test, 1)
a b c
1 3 5
pick(test, 2)
a b c
2 4 6
pick(test, 3)
a b c
NA NA 7
pick(test, 4)
a b c
NA NA NA

-Erik Gregory
Student Assistant, California EPA
CSU Sacramento, Mathematics

- Original Message
From: Greg Snow greg.s...@imail.org
To: Jannis bt_jan...@yahoo.de; r-help@r-project.org r-help@r-project.org
Sent: Wed, January 12, 2011 1:17:54 PM
Subject: Re: [R] navigating in lists

sapply(test, '[[', 1)
a b c
1 3 5

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Jannis
Sent: Wednesday, January 12, 2011 1:50 PM
To: r-help@r-project.org
Subject: [R] navigating in lists

Dear list members,

I am stuck with navigating in a rather complicated list object. In general I would need a solution to access all first (or other) elements of the different sublists in one list:

test=list(a=list(1,2),b=list(3,4),c=list(5,6))

like: test[[1:3]][[1]]

which should result in c(1,3,5)

Is there any way to access lists in such a way? Using unlist would create quite complicated objects.

Cheers
Jannis

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Multivariate autoregressive models with lasso penalization
I wish to estimate sparse causal networks from simulated time series data. Although there's some discussion about this problem in the literature (at least a few authors have used lasso and l(1,2) regularization to enforce sparsity in multivariate autoregressive models, e.g., http://user.cs.tu-berlin.de/~nkraemer/papers/grplasso_causality.pdf), I can't find any R packages with these capabilities. Has anyone in the R community experimented with such or put code out for this problem? Many thanks. John -- John M. Drake, Ph.D. Associate Professor University of Georgia Odum School of Ecology Athens, GA 30602-2202 phone: 706.583.5539 fax: 706.542.4819 email: jdr...@uga.edu skype: john.drake.uga web: http://dragonfly.ecology.uga.edu/drakelab [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] speed up subsetting with certain conditions
On 1/12/11 6:12 PM, Martin Morgan wrote: The Bioconductor project has many tools for dealing with sequence-related data. With the data k - read.table(textConnection( chr132375463237547rs523104280+ chr132375493237550rs520975820+ chr245133264513327rs297692800+ chr245133374513338rs332860090+)) f - read.table(textConnection( chr13213435GC chr13237547TC chr13237549GT chr24513326AG chr24513337CG)) One might use the GenomicRanges package as library(GenomicRanges) kgr - with(k, GRanges(V1, IRanges(V2, V3, names=V4), V6, score=V5)) fgr - with(f, GRanges(V1, IRanges(V2, width=1), V3=V3, V4=V4)) olaps - findOverlaps(fgr, kgr) idx - countOverlaps(fgr, kgr) != 0 resulting in idx [1] FALSE TRUE TRUE TRUE TRUE This will be fast. Thanks so much for your suggestion Martin. I had Bioconductor installed but I honestly do not know all its applications. Anyway, I am testing GenomicRanges with my data now. I will report back when I get the result. One could write foundY with as.data.frame(fgr[idx]) (maybe a little editing) but likely one would want to stay in R / Bioc and do something more interesting... I suppose foundN - as.data.frame(fgr[!idx]) and foundY - as.data.frame(fgr[idx]) as you suggested, but I dont really understand your last comment :). Thanks, D. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] syntax for extending a line in a script??
Thanks to Brian David! For some reason, I'd thought to use {, but not (. I guess I can chalk that up to a slow brain. Interestingly (and I didn't bother to figure out why), the { didn't work for me. But I tried a few cases, and ( always seems to work, so far. Regards, Mike Telescopes and bathyscaphes and sonar probes of Scottish lakes, Tacoma Narrows bridge collapse explained with abstract phase-space maps, Some x-ray slides, a music score, Minard's Napoleanic war: The most exciting frontier is charting what's already here. -- xkcd -- Help protect Wikipedia. Donate now: http://wikimediafoundation.org/wiki/Support_Wikipedia/en On Wed, Jan 12, 2011 at 6:08 PM, Brian Diggs dig...@ohsu.edu wrote: On 1/12/2011 2:46 PM, Mike Williamson wrote: Hello, A hopefully simple question. I use 'R' through emacs, but I suspect the following would occur with any manner of text editor: - my editor has a normally quite handy feature where it will automatically indent to the appropriate level when I start a new line. However, this occasionally creates cases where there is no friendly way to break a long line of code into two lines which still function as one command. Therefore, I need a nice way to be able to flag 'R' to know that the code is continuing on the next line. Let me explain via example: numericColumns- names(listOfDataFrames[[myDF]][,columnsOI]) [sapply(listOfDataFrames[[myDF]][,columnsOI], is.numeric) ] You can put the right hand side of the assignment in parentheses. Then even with the same breaks, the first line is not complete, so R will continue parsing. An emacs still indents reasonably. (I added a second line break to try and avoid email wrapping affecting things). 
numericColumns - (names(listOfDataFrames[[myDF]][,columnsOI]) [sapply(listOfDataFrames[[myDF]][,columnsOI], is.numeric)]) As you can see in this case, I would *like* for these 2 lines of code to be read as 1 line, but since the names(blah) command is sufficiently a command on its own, 'R' see this as a completed line of code. I could try to break it up at different points, but emacs (and other text editors) takes a guess as to the most intelligent way to indent, so that if I were to write something like: numericColumns- names(listOfDataFrames[[myDF]][,columnsOI]) [sapply( listOfDataFrames[[myDF]][,columnsOI], is.numeric) ] it would actually indent something more like this: numericColumns- names(listOfDataFrames[[myDF]][,columnsOI]) [sapply( listOfDataFrames[[myDF]][,columnsOI], is.numeric) ] and as you can see, that doesn't help the issue of preventing the code from wrapping around (and therefore doesn't help readability). Is there some simple way to flag that the next line is continuing? Something like python's \ at the end of a line? I tried wrapping the whole thing around curly braces { } but that didn't work, either. Putting the right hand side in curly braces might work too. That would turn it into a code block, which should evaluate to whatever the last statement in the code block is (which in this case is the only statement). I wouldn't be surprised if there is some case where curly braces might lead to a different result; parentheses shouldn't (but I may be wrong). Thanks! Mike -- Brian S. Diggs, PhD Senior Research Associate, Department of Surgery Oregon Health Science University [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to disable using enter key to exit the browser in debugging mode
That also drives me crazy! I don't have that problem when I use the StatEt plug-in for Eclipse. Of course, using a new IDE is a big undertaking, but I can assure you: it's worth it! This is just one small benefit. On Wed, Jan 12, 2011 at 4:00 PM, Feng Li m...@feng.li wrote: Dear R, How can I disable using enter key to exit the browser() in debug mode? I would love to have this option because it is so annoying to jump out of the debugging mode unexpectedly when I don't want to. I guess some of us have encouraged at least one of these situations, 1, Accidentally pressed the enter key within the browser. 2, Copy and paste a piece of debugging code containing empty lines to the prompt within the debugging mode. 3, If I paste a piece of code to the prompt to debug as follows, it will eventually jump out before I can do anything. ### copy starting from this line ## test - function() { x- 5 browser() y-4 } test() end of copy at this line Any suggestions are most welcome! Feng -- Feng Li Department of Statistics Stockholm University 106 91 Stockholm, Sweden http://feng.li/ [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
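Base R also has an option that appears to target exactly this: browserNLdisabled, described in ?options as controlling whether a bare newline is treated as a synonym for n in the browser. A minimal sketch, assuming an R version that provides this option:

```r
# Remember the old setting so it can be restored later.
old <- options(browserNLdisabled = TRUE)

# With this set, an empty line at the Browse[n]> prompt should no longer
# step through the code; n, c, and Q still work as usual.
getOption("browserNLdisabled")

# options(old)  # uncomment to restore the previous behaviour
```

Putting the options() call in a startup file such as .Rprofile would make it apply to every session, if that is wanted.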
[R] 2d plot with modification of plotting symbol to indicate third dimension.
I would like to plot 3-dimensional data on a two-dimensional scatter-plot. Is there a way I can automatically modify the plot symbol (e.g. changing size or color) to indicate the value of a third variable? E.g. How can I plot weight vs. age and indicate the value of muscle mass for each value weight-age pair by making the plot point proportional to the subject's muscle mass? Thanks, John John David Sorkin M.D., Ph.D. Chief, Biostatistics and Informatics University of Maryland School of Medicine Division of Gerontology Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 (Phone) 410-605-7119 (Fax) 410-605-7913 (Please call phone number above prior to faxing) Confidentiality Statement: This email message, including any attachments, is for th...{{dropped:6}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
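One way this could be done in base graphics, as a sketch with invented data: scale the cex argument of plot() by the third variable, or let symbols() draw circles whose radii come from the data. The variable names and the scaling range (cex between 1 and 3) are arbitrary choices for the example.

```r
# Invented data: age, weight and muscle mass for ten subjects.
set.seed(1)
age    <- round(runif(10, 20, 80))
weight <- round(rnorm(10, 75, 10), 1)
muscle <- round(runif(10, 20, 40), 1)

# Scale symbol size to muscle mass (cex between 1 and 3).
size <- 1 + 2 * (muscle - min(muscle)) / (max(muscle) - min(muscle))

plot(age, weight, cex = size, pch = 21, bg = "grey80",
     xlab = "Age", ylab = "Weight")

# Alternative: symbols() draws circles with radii taken from the data.
# sqrt() makes the circle area, not the radius, proportional to muscle mass.
symbols(age, weight, circles = sqrt(muscle), inches = 0.2,
        xlab = "Age", ylab = "Weight")
```

Colour can be mapped in the same spirit, for example by passing a vector of colours built from cut(muscle, ...) to the col or bg argument.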
[R] Openbugs and rbugs on mac with wine
Hello list, I’ve been trying to get OpenBUGS running on my mac using the wine emulator. I can run Openbugs just fine by doing: wine ~/OpenBUGS312/OpenBUGS.exe In the terminal, so OpenBUGS works. When I try to run the schools example using rbugs(), the OpenBUGS process starts in wine, but it just sits there, no log, no script, no output of any sort. The rbugs () call makes the init, data, model and script file, but there seems to be a problem with R piping the script to OpenBUGS, here is my example library(rbugs) data(schools) J - nrow(schools) y - schools$estimate y - rnorm(length(y)) sigma.y - schools$sd schools.data - list (J, y, sigma.y) ## schools.data - list(J=J, y=y, sigma.y=sigma.y) inits - function() {list (theta=rnorm(J,0,100), mu.theta=rnorm(1,0,100), sigma.theta=runif(1,0,100))} parameters - c(theta, mu.theta, sigma.theta) schools.bug - file.path(.path.package(rbugs), bugs/model, schools.bug) file.show(schools.bug) #This almost runs, it makes all files, but doesn't run the script schools.sim - rbugs(data=schools.data, inits, parameters, schools.bug, n.chains=3, n.iter=1,seed=123, workingDir=/Users/ozd504/Documents/, bugsWorkingDir=/Users/ozd504/Documents/, useWine=TRUE, wine=/opt/local/bin/wine, bugs = /Users/ozd504/OpenBUGS312/ OpenBUGS.exe,OpenBugs=T, debug=TRUE) This Returns an error saying that bugs terminated before the coda could be written I can also send a screen shot of what happens if anyone is interested. Any help would be most appreciated. 
Here is my sessionInfo() R version 2.12.1 (2010-12-16) Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit) locale: [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] R2WinBUGS_2.1-16 coda_0.14-2 lattice_0.19-13 rbugs_0.4-9 loaded via a namespace (and not attached): [1] grid_2.12.1 tools_2.12.1 Thanks, Corey Corey Sparks Assistant Professor Department of Demography and Organization Studies University of Texas at San Antonio 501 West Durango Blvd Monterey Building 2.270C San Antonio, TX 78207 210-458-3166 corey.sparks 'at' utsa.edu __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Grouped bars in barplot
Tena koe Steve Convert your data into a matrix: dataMat - matrix(c(bline_precip[10,9], bline_runoff[10,9], cccma_precip[10,9], cccma_runoff[10,9], csiro_precip[10,9], csiro_runoff[10,9], ipsl_precip[10,9], ipsl_runoff[10,9], mpi_precip[10,9], mpi_runoff[10,9], ncar_precip[10,9], ncar_runoff[10,9], ukmo_precip[10,9], ukmo_runoff[10,9]), nrow=2) Since I don't know the nature of your data I will use a simple example: dataMat - matrix(1:14, nrow=2) barplot(dataMat, beside=TRUE, space=c(0,2)) tTicks - barplot(dataMat, beside=TRUE, space=c(0,2)) tTicks - tapply(tTicks, rep(1:7, each=2), mean) axis(1, tTicks, letters[1:7]) Is that what you want? HTH Peter Alspach -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r- project.org] On Behalf Of Steve Murray Sent: Thursday, 13 January 2011 12:14 p.m. To: r-help@r-project.org Subject: [R] Grouped bars in barplot Dear all, I am trying to make a barplot with clustered pairs of bars, using class=numeric data and the following command: barplot(c(bline_precip[10,9], bline_runoff[10,9], cccma_precip[10,9], cccma_runoff[10,9], csiro_precip[10,9], csiro_runoff[10,9], ipsl_precip[10,9], ipsl_runoff[10,9], mpi_precip[10,9], mpi_runoff[10,9], ncar_precip[10,9], ncar_runoff[10,9], ukmo_precip[10,9], ukmo_runoff[10,9]), beside=TRUE, space=c(0,2)) This results in all bars being packed tightly together, but with no gap between each pair. I suspect the problem is something to do with the data not being a matrix, but I've tried using as.matrix for each data element and this doesn't seem to work. If any one has any suggestions I'd be very grateful to hear them. Also, I'm hoping to put a label beneath each pair of bars on the x- axis, in the centre. At present I can only get labels to appear directly underneath a single bar, as opposed to the centre of the pair of bars. Does anyone have any suggestions for solving this? Many thanks for any help offered. 
Steve __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. The contents of this e-mail are confidential and may be subject to legal privilege. If you are not the intended recipient you must not use, disseminate, distribute or reproduce all or any part of this e-mail or attachments. If you have received this e-mail in error, please notify the sender and delete all material pertaining to this e-mail. Any opinion or views expressed in this e-mail are those of the individual sender and may not represent those of The New Zealand Institute for Plant and Food Research Limited. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] easy loop question
Hi everyone,

I am new to R and programming. I have tried to remove the out-of-range values in some variables using a loop:

1)
var <- names(est8vo[, 77:83]) # I got the variable names
var
[1] "p16.1" "p16.2" "p16.3" "p16.4" "p16.5" "p16.6" "p16.7"

for (i in 1:7) {
  var.i <- var[i]
  est8vo$var.i[ est8vo$var.i==3] <- 99
}

I got this error:

Error in `$<-.data.frame`(`*tmp*`, var.i, value = numeric(0)) : replacement has 0 rows, data has 215700

2) The second step would be to define the factors, but I got the same error:

for (i in 1:7) {
  var.i <- var[i]
  est8vo$var.i <- factor(est8vo$var.i, levels=c(0, 1, 2, 99), labels=c("vacío", "sí", "no", "doble marca") )
}

I don't know how to do it. Thank you in advance!

Sebastian

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
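A likely cause of the error, offered as a sketch rather than a confirmed diagnosis: est8vo$var.i looks for a column literally named var.i, because $ does not evaluate its right-hand side. Indexing with [[ ]] uses the value stored in the variable instead. The small data frame below is a stand-in for est8vo, and the out-of-range test (== 3) is copied from the post.

```r
# Stand-in for est8vo with two of the p16.* columns.
est8vo <- data.frame(p16.1 = c(0, 1, 2, 3), p16.2 = c(3, 1, 0, 2))
vars <- names(est8vo)

for (v in vars) {
  # [[v]] selects the column whose name is stored in v.
  est8vo[[v]][est8vo[[v]] == 3] <- 99
  # Convert the recoded column to a factor with the labels from the post.
  est8vo[[v]] <- factor(est8vo[[v]],
                        levels = c(0, 1, 2, 99),
                        labels = c("vacío", "sí", "no", "doble marca"))
}
```

In the real data the loop would run over names(est8vo)[77:83] rather than over every column.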
[R] Rotated, Right-Justified Labels for Shortened Tick Marks
Hello R-help,

I'm trying to make a fairly simple plot axis that goes something like this:

  plot(-10:10, -10:10, yaxt='n')
  axis(side=2, las=1, hadj=1, tck=-.01, cex.axis=.6)

...but as you can see, the labels are not close enough to the y-axis (where I want them, to save space for publication). Can anybody help me figure out how to move these labels over to the right a bit?

Thanks,
-D
Re: [R] aggredating date data
I like the zoo package, and there are several helpful examples.

  library(zoo)

You can easily convert your data into a zoo object. I was actually just doing this using this function:

  LoadReturnData=function(x){
    ret = read.csv(x)
    ret = zoo(ret[ , -1], as.Date(ret[ , 1]))
    colnames(ret) = toupper(colnames(ret))
    return(ret)
  }
  fnd = LoadReturnData('/Data/SomeSpecialData.csv')

My data is already in weeks, and aggregating to months is easy using as.yearmon:

  MonthIndex=as.yearmon(index(fnd))
  aggregate(.~MonthIndex, data=fnd, sum)

If you have daily data and you need weeks, then you'll have to create a vector to indicate the week, like the MonthIndex above, e.g. for 365 days:

  WeekIndex = rep(1:53, each=7, length.out=365)

On Wed, Jan 12, 2011 at 2:20 PM, analys...@hotmail.com wrote:

I tried a date by date forecast of a time series and it seems to be too wild. How can I aggregate the data into weeks or months as required? Thanks.

The input looks like

  ID   datadate(YYYY-MM-DD)   value_for_day

and I want to be able to change it to

  ID   dataweek   value_for_week

or

  ID   datamonth   value_for_month
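The aggregation described above can be sketched on synthetic daily data. This is an illustration under assumptions, not the poster's code: the series is made up, it calls the zoo method as aggregate(x, by, FUN) rather than the formula form, and for weeks it derives the index from the dates with cut() instead of the fixed rep(1:53, ...) vector, so that gaps or a different start date do not shift the week boundaries.

```r
library(zoo)

# Synthetic daily series (hypothetical data): 60 days starting 2011-01-01,
# every day has value 1 so the totals are easy to check by eye
dates <- seq(as.Date("2011-01-01"), by = "day", length.out = 60)
daily <- zoo(rep(1, 60), dates)

# Monthly totals: as.yearmon() maps each date to its year-month
monthly <- aggregate(daily, as.yearmon(index(daily)), sum)

# Weekly totals: cut() maps each date to the Monday that starts its week
weekStart <- as.Date(cut(index(daily), "week"))
weekly <- aggregate(daily, weekStart, sum)
```

Here monthly holds one value per calendar month and weekly one value per week, each indexed by the period it covers.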
[R] What does the shell() command do?
Dear R community,

I am trying to understand what the shell() function does. An example is:

  xfile <- shell(paste("dir/b ", paste(directory.folder, file.name, sep="")), intern=T)

I'm afraid I wasn't able to completely understand the explanation in the help files. Thanks for your help!

Leanne.
[R] standard errors in johansen test
Dear all,

I have a question. How can I get the standard errors of alpha and beta when using ca.jo to test for cointegration? In the paper by Bernhard Pfaff (Kronberg im Taunus), "VAR, SVAR and SVEC Models: Implementation Within R Package vars", pp. 24-25, the standard errors are listed in Table 5, following the code:

  R> vecm.r1 <- cajorls(vecm, r = 1)

I tried this in R on my Mac, but it failed. Thanks.

--
Best Regards
Walter
an ACCA Affiliate (Association of Chartered Certified Accountants)
I COME FROM CHINA
[R] unicodepdf font problem
Dear List,

I would like to print a plot to a pdf file. The problem is that the character \U0171 is replaced by a plain 'u' (i.e. without accents) in the pdf file.

Example:

  # this works fine
  plot(1, type="n")
  text(1, 1, "print \U0171")

  # this fails
  pdf("trial.pdf")
  plot(1, type="n")
  text(1, 1, "print \U0171")
  dev.off()

I found an earlier post at http://www.mail-archive.com/r-help@r-project.org/msg65541.html, but it is too hard to understand at my R level. Any help is appreciated.

Regards,
Denes
[R] question about svm(e1071)
Dear all,

I ran an svm calculation using the e1071 library on microarray data (http://www.iu.a.u-tokyo.ac.jp/~kadota/R/data_Singh_RMA_3274.txt). Then I shuffled the data samples and ran the svm calculation again. The results of the two calculations were different (in SV, coefs and weights). I have attached the script below. Could you please tell me why this happens? If possible, please tell me how to make them equal.

Best regards,
Hiro

### Script start ###
library(e1071)
data <- read.table('http://www.iu.a.u-tokyo.ac.jp/~kadota/R/data_Singh_RMA_3274.txt', header=TRUE, row.names=1, sep="\t", quote="")
data.cl <- rep(NA, ncol(data))
data.cl[grep('Normal', colnames(data))] <- 'Normal'
data.cl[grep('Tumour', colnames(data))] <- 'Tumour'
s <- sample(ncol(data))
m   <- svm(x=t(data),     y=factor(data.cl),    scale=T, type="C-classification", kernel="linear")
m.s <- svm(x=t(data[,s]), y=factor(data.cl[s]), scale=T, type="C-classification", kernel="linear")
w   <- t(m$coefs)   %*% m$SV
w.s <- t(m.s$coefs) %*% m.s$SV
# SV and coefs are slightly different
sum(abs(m$SV[order(rownames(m$SV)),] - m.s$SV[order(rownames(m.s$SV)),]))
sum(abs(m$coefs[order(rownames(m$SV))] - m.s$coefs[order(rownames(m.s$SV))]))
# ranks of the weights are not identical
all(rank(w)==rank(w.s))
### Script end ###
[R] Repeating value occurence
How can I achieve this in R using the seq or rep function?

  c(-1,0,1,0,-1,0,1,0,-1,0)

The values range between -1 and 1, and I want it such that there could be n points between -1 and 1.

Anyone? Please help.

Thanks
Rusty
Re: [R] Help with Data Transformation - RESOLVED
Hi Dennis,

SOLVED!!! My thanks to you, John, and the others who chimed in. It took a little more digging before it finally worked, but it is working! Here's a little more of what I did and the ultimate resolution:

I installed the reshape2 library from CRAN.

Executed

  (.packages())
   [1] "reshape2"  "stats"     "graphics"  "grDevices" "datasets"  "rcom"      "rscproxy"  "utils"     "methods"   "base"

confirming presence.

Executed your toy script (stepwise, so as not to include the output as input). Received the following result at the end:

  cast(df, idnum ~ lab, value = 'y')
  Error: could not find function "cast"

The package 'reshape2' documentation indicates that either acast() or dcast() is used, and gives no example of cast() itself. I resolved this by trying both acast() and dcast(). Both yielded the same screen results.

Turning to my own data, I was able to successfully execute the operation on one of my subsets AFTER cleaning up some results with duplicate PARLABELs. (Duplicates turn the resulting matrix into a count of values, but the docs give an indication of the resolution.) So it looks like I'm on my way after some further data clean-up. I am using dcast() as I wish to stay 2-dimensional (personal limitations on brain()).

Cheers and thanks again to all,

Guy
ITSI, A Gilbane Company
(925) 946-3340 Direct
(925) 457-4168 ITSI Cell
gj...@itsi.com

From: Dennis Murphy [mailto:djmu...@gmail.com]
Sent: Wednesday, January 12, 2011 1:19 PM
To: Guy Jett
Cc: r-help@r-project.org
Subject: Re: [R] Help with Data Transformation

Hi:

This seems like a problem that is well suited for the cast() function in package reshape2.
Here's a toy example:

  library(reshape2)
  df <- data.frame(idnum = rep(101:104, c(3, 2, 4, 3)),
                   lab = unlist(sapply(c(3, 2, 4, 3), function(x) sample(LETTERS[1:6], x))),
                   y = rpois(12, 7))
  df
     idnum lab  y
  1    101   C  7
  2    101   F 11
  3    101   E  4
  4    102   B 10
  5    102   A  7
  6    103   E  6
  7    103   D  9
  8    103   B  3
  9    103   F 12
  10   104   C 10
  11   104   D  9
  12   104   A  5

  cast(df, idnum ~ lab, value = 'y')
    idnum  A  B  C  D  E  F
  1   101 NA NA  7 NA  4 11
  2   102  7 10 NA NA NA NA
  3   103 NA  3 NA  9  6 12
  4   104  5 NA 10  9 NA NA

Since you have multiple 'invariant' variables, you could use something like

  cast(df, . ~ lab, value = 'y')

This would make all variables other than lab the 'rows', the values of lab as separate 'columns', with the value of the variable y inserted in the appropriate locations, NA fill otherwise. cast() is an alternative to the reshape() function in base R.

HTH,
Dennis

On Wed, Jan 12, 2011 at 12:19 PM, Guy Jett gj...@itsi.com wrote:

Hi John,

Thank you for your patience. I was away for a State certification exam yesterday, so am just getting back to this. Reading through your response, I believe I wasn't clear enough about what I'm trying to do. Your description seems to rearrange the matrix without grouping the analytical results for a single sample onto a single line, as I had hoped. I may have confused things by attempting to send a truncated/simplified dataset.

Restatement of needs:

* I have 863 individual samples. The following columns contain invariant results for each sample:
  - Transect, Offset, Location, fldsampid, CLP_ID, sacode, matrix, LTCCODE, Northing, Easting, CRDUNITS, Event, LOGDATE, sbd, sed.
  - Sorting can make use of fldsampid as these values are entirely unique to each sample.
* Each sample is associated with one or more of the following 48 analytical parameters:
  - AG, AL, ALK, ALKB, ALKC, AS, B, BA, BE, BR, CA, CD, CL, CO, CR, CU, DOC, FE, Hg, HG, HGACIDLAB, HGEXTINO, HGEXTORG, HGNONMOB, HGSEMIMOB, K, MEHG, MG, MN, MO, NH3N, NI, NO2N, NO3, NO3N, PB, PO4, S, SB, SE, SO4, SOLID, SSC, TL, TOC, V, Zn, ZN
  - These are currently stored in the PARLABEL column.
* For each sample ID I would like to create a single line;
  - Extract each PARLABEL to use as a column name; and
  - Place the Result in the appropriate column.
* I can subset the data so that prccode, Lab, EXMCODE, Analysis, PARVQ, RL, EPA_FLAGS, and units are irrelevant to the issue.

The following snippet should illustrate the absolute minimum needs:

INPUT

  fldsampid  | PARLABEL   | Result
  -----------+------------+--------
  fldsampid1 | PARLABEL-a | value-8
  fldsampid1 | PARLABEL-b | value-5
  fldsampid1 | PARLABEL-x | value-2
  fldsampid1 | PARLABEL-y | value-0
  fldsampid2 | PARLABEL-a | value-9
  fldsampid2 | PARLABEL-c | value-8
  fldsampid3 | PARLABEL-a | value-2
  fldsampid3 | PARLABEL-d | value-8
  fldsampid3 | PARLABEL-w | value-3
  fldsampid3 | PARLABEL-x | value-9
  fldsampid3 | PARLABEL-y | value-6

OUTPUT

  fldsampid  | PARLABEL-a | PARLABEL-b | PARLABEL-w | PARLABEL-x | PARLABEL-y
  -----------+------------+------------+------------+------------+-----------
  fldsampid1 | value-8    | value-5    | NA         | value-2    |
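The long-to-wide step this thread settles on (one row per fldsampid, one column per PARLABEL) can be sketched with dcast() from reshape2, the function Guy reports using. The data frame below is a made-up stand-in modelled on the INPUT snippet; the short sample IDs, labels and numeric values are placeholders, not the real dataset.

```r
library(reshape2)

# Long-format input: one row per (sample, parameter) pair (hypothetical values)
long <- data.frame(
  fldsampid = c("s1", "s1", "s1", "s1", "s2", "s2"),
  PARLABEL  = c("a", "b", "x", "y", "a", "c"),
  Result    = c(8, 5, 2, 0, 9, 8)
)

# Rows are identified by the left-hand side of the formula, columns by the
# right-hand side; value.var names the column that fills the cells
wide <- dcast(long, fldsampid ~ PARLABEL, value.var = "Result")
wide
```

Parameters that a sample lacks come out as NA. If a sample/parameter pair occurs more than once, dcast() falls back to counting the values, which matches the behaviour Guy describes; passing an aggregation function through the fun.aggregate argument (mean, for instance, though which one is appropriate depends on the data) is one way around that.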
Re: [R] easy loop question
On Jan 12, 2011, at 10:54 PM, Sebastián Daza wrote:

> Hi everyone,
> I am new in R and programming. I have tried to remove the values out of range in some variables using a loop:
>
> 1) var <- names(est8vo[, 77:83])  # I got the variable names
> var
> [1] "p16.1" "p16.2" "p16.3" "p16.4" "p16.5" "p16.6" "p16.7"
>
> for (i in 1:7) {
>   var.i <- var[i]
>   est8vo$var.i[ est8vo$var.i==3] <- 99

You CANNOT use names like that. (It makes no sense to supply a vector argument to $.) If you want to change every instance of 3 within a group of columns, see if this works:

  est8vo[, 77:83] <- sapply(est8vo[, 77:83], function(x) ifelse(x==3, 99, x))

> }
>
> I got this error:
> Error in `$<-.data.frame`(`*tmp*`, var.i, value = numeric(0)) :
>   replacement has 0 rows, data has 215700
>
> 2) The second step would be to define the factors, but I got the same error:

  est8vo[, 77:83] <- lapply(est8vo[, 77:83], factor, labels=c("vacío", "sí", "no", "doble marca"))

> for (i in 1:7) {
>   var.i <- var[i]
>   est8vo$var.i <- factor(est8vo$var.i,

Wrong. Wrong. Wrong. And please forget about using $ inside loops on the left-hand side. It is not designed for that.

>     levels=c(0, 1, 2, 99),
>     labels=c("vacío", "sí", "no", "doble marca") )
> }
>
> I don't know how to do it. Thank you in advance!
> Sebastian

David Winsemius, MD
West Hartford, CT
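David's point about $ can be sketched on a small stand-in data frame. This is an illustration, not part of his reply: est and its two columns are made up, and the loop uses [[ ]], which accepts a column name held in a variable, whereas est$v always looks for a column literally named "v". The factor labels are the ones from the question (Spanish for empty, yes, no, double mark).

```r
# Small stand-in for est8vo (hypothetical data; the real one has 215700 rows)
est <- data.frame(p16.1 = c(0, 1, 3, 2), p16.2 = c(3, 3, 1, 0))
vars <- names(est)

for (v in vars) {
  # [[ ]] evaluates v and selects the column with that name
  est[[v]][est[[v]] == 3] <- 99
  # Explicit levels keep the label mapping fixed even if a value is absent
  est[[v]] <- factor(est[[v]], levels = c(0, 1, 2, 99),
                     labels = c("vacío", "sí", "no", "doble marca"))
}

str(est)
```

The same two steps can be written without a loop by assigning the result of lapply() over the selected columns back into the data frame; lapply() is used rather than sapply() because sapply() simplifies its result to a matrix, which discards the factor class.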