[R] Help with reshape/reShape and indexing
Dear R Helpers,

I have trouble applying reShape and reshape although I have read the documentation and several posts, so I would very much appreciate your help on the two points below.

I have a dataframe

    df = data.frame(Name=c("a", "a", "a", "b", "b", "c"), X1=c(12, 13, 14, 20, 25, 30), X2=c(200, 250, 300, 600, 700, 900))
    df
      Name X1  X2
    1    a 12 200
    2    a 13 250
    3    a 14 300
    4    b 20 600
    5    b 25 700
    6    c 30 900

First, I need to add a column to this dataframe that numbers the rows within each Name entry. The resulting df should look like:

    df.index = data.frame(Name=c("a", "a", "a", "b", "b", "c"), X1=c(12, 13, 14, 20, 25, 30), X2=c(200, 250, 300, 600, 700, 900), Index=c(1,2,3,1,2,1))
    df.index
      Name X1  X2 Index
    1    a 12 200     1
    2    a 13 250     2
    3    a 14 300     3
    4    b 20 600     1
    5    b 25 700     2
    6    c 30 900     1

How can I do this?

Secondly, I would like to reshape this dataframe into the form:

    df2
       1  2  3
    a 12 13 14
    b 20 25 NA
    c 30 NA NA

Since the df is sorted by Name and X2, the available X1 values should populate the rows of df2 from left to right (i.e. if only one value is available, it is written in the first column and the remaining columns get NAs).

If I could generate the Index column, I think I could accomplish this with:

    df2 = reShape(df.index$X1, id=df.index$Name, colvar=df.index$Index)
    colnames(df2) = c("V1", "V2", "V3")

However, is there a way to get to df2 without using the Index column and still have the NAs written as described above?

Thank you so much for your help on these two issues.

With best regards,
Dana Sevak

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] Overlay cdf
Here are some ideas you might like to consider

    par(mar = c(5,4,2,4)+0.1, yaxs = "r")
    Sample <- rgamma(1000,2.5,.8)
    hist(Sample, main = "", freq = FALSE, ylim = c(0,1))
    pu <- par("usr")[1:2]
    x <- seq(pu[1], pu[2], len = 5000)
    y <- pgamma(x, 2.5, 0.8)
    par(new = TRUE)
    plot(x, y, type = "l", axes = FALSE, ann = FALSE, col = "red")
    lines(x, dgamma(x, 2.5, 0.8), col = "darkgreen")
    axis(4, col = "red")
    mtext(side = 4, text = "Cumulative probability", col = "red", line = 2.5)
    x0 <- c(0, sort(Sample))
    p0 <- 0:1000/1000
    lines(x0, p0, type = "S", col = "blue")

Bill Venables
http://www.cmis.csiro.au/bill.venables/

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of beetle2
Sent: Wednesday, 13 May 2009 3:23 PM
To: r-help@r-project.org
Subject: [R] Overlay cdf

Hi,

Is it possible to overlay a cumulative distribution function on a histogram of a gamma distribution?

I have a gamma sample

    Sample = rgamma(1000,2.5,.8)+1.5
    hist(Sample)

regards
--
View this message in context: http://www.nabble.com/Overlay-cdf-tp23515551p23515551.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] creating a postscript file with two xyplots
Liati <liats80 at hotmail.com> writes:

I would like to create one postscript file with two different xyplots (which

    library(lattice)
    postscript("myps.ps")
    xyplot(1~1, main="Plot 1")
    xyplot(2~3, main="Plot 2")
    dev.off()

Dieter
Re: [R] import HTML tables
Dimitri Szerman-2 wrote:

Hello, I was wondering if there is a function in R that imports tables directly from an HTML document.

The XML package can do this:

http://markmail.org/message/cyicoa3htme4gei2

Duncan Temple Lang: The htmlParse() and htmlTreeParse() functions in the XML package use the non-strict HTML parser in libxml2 and so the HTML document can be malformed.

Dieter
Re: [R] adonis help - (non-parametric (permutational) manova)
On Tue, 2009-05-12 at 15:53 -0400, stephen sefick wrote:

I am trying to apply this technique (M.J Anderson 2001) to a dataset of aquatic insect abundances. There is a sample in the unrestored and restored segment of a stream for every time period. I would like to compare the centroids of the distance matrices for the treatments up (unrestored) and dn (restored) to see if there is a difference in insect communities between the treatments. I will not include the raw data in this posting as it is large for posting to the list; however, I would be happy to provide it off list if it would make this easier (and reproducible).

my environmental matrix (or factor matrix, I am not sure of the terminology) is set up like this:

    date site
    0104 dn
    0104 dn
    0106 dn
    0106 dn
    0203 dn
    0203 dn
    0503 dn
    0503 dn
    0704 dn
    0704 dn
    0803 dn
    0803 dn
    0804 dn
    0804 dn
    0805 dn
    0805 dn
    1005 dn
    1005 dn
    1102 dn
    1102 dn
    1204 dn
    1204 dn
    0104 up
    0104 up
    0106 up
    0106 up
    0203 up
    0203 up
    0503 up
    0503 up
    0704 up
    0704 up
    0803 up
    0803 up
    0804 up
    0804 up
    0805 up
    0805 up
    1005 up
    1005 up
    1102 up
    1102 up
    1204 up
    1204 up

my site x species matrix is called a, so here is the call to adonis:

    adonis(a~site, data=b, strata=b[,"date"], Permutations=999)

I don't think the permutations will be stratified correctly - you want them to represent a time series, yes? 'strata' is meant to define in vegan. Samples within the strata are permuted. So if you only have two samples per unique time point (1 for up and 1 for dn), the effect of setting strata to the date variable will be to permute only pairs of samples.

Work has begun (and stalled for a little while - my fault) on providing a wider range of restricted permutation tests. The function permuted.index2 in vegan can generate permutations for time series (or other ordered observations), but you'd have to a) edit adonis in place to use permuted.index2 and work out how to set up the call to this function correctly so that it returns the permutation structure adonis wants. Then check it does what it says it does - there is at least one bug that I know of but I'm not fixing it as the development version on my local machine has completely changed the way the permutation schemes are specified.

Contact me off-list if you would like some help with this, though as I'm teaching for two weeks, I won't be able to look at it until later.

For now, perhaps you could just ignore the time-series aspects and run the analysis without strata, but require a far lower p-value than you might normally use to reflect the fact that the permutations do not take into account correlations between time points.

HTH

Is this the correct way of testing the null hypothesis that: There is no difference in community structure between treatments.

Thank you very much in advance, and anything that you need to make this easier please don't hesitate to ask.

regards,

--
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Re: [R] Do you use R for data manipulation?
Warren Young wrote:

Farrel Buchinsky wrote:

Is R an appropriate tool for data manipulation and data reshaping and data organizing? I think so but someone who recently joined our group thinks not. The new recruit believes that python or another language is a far better tool for developing data manipulation scripts that can be then used by several members of our research group. Her assessment is that R is useful only when it comes to data analysis and working with statistical models.

It's hard to shift people's individual preferences, but impressive objective comparisons are easy to come by. Ask her how many lines it would take to do this trivial R task in Python:

    data <- read.csv('original-data.csv')
    write.csv(data * 10, 'scaled-data.csv')

you might want to learn that this is a question of appropriate libraries. in r, read.csv and write.csv reside in the package utils. in python, you'd use numpy:

    from numpy import loadtxt, savetxt
    savetxt('scaled.csv', loadtxt('original.csv', delimiter=',')*10, delimiter=',')

this makes 2 lines, together with importing the library.

R's ability to do something to an entire data structure -- or a slice of it, or some other subset -- in a single operation is very useful when cleaning up data for presentation and analysis.

but this is really *hardly* r-specific. you can do that in many, many languages, be assured. just peek out.

Also point out how easy it is to get data *out* of R, as above, not just into it, so you can then hack on it in Python, if that's the better language for further manipulation.

If she gives you static about how a few more lines are no big deal, remind her that it's well established that bug count is always a simple function of line count. This fact has been known since the 70's.

that's a slogan, esp. when you think of how compact (but unreadable, and thus error-prone) code written in perl can be. often, more lines of code make it easier to maintain, and thus avoid bugs.

While making your points, remember that she has a good one, too: R is not the only good language out there. You should learn Python while she's learning R.

+1
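For anyone who wants to try the two-line R task from this thread without having the files at hand, here is a minimal sketch. It assumes an all-numeric table; the temporary file names and the toy data are made up for illustration and stand in for 'original-data.csv' and 'scaled-data.csv'.

```r
# Hypothetical toy data standing in for 'original-data.csv'
infile  <- tempfile(fileext = ".csv")
outfile <- tempfile(fileext = ".csv")
write.csv(data.frame(x = 1:3, y = c(0.5, 1.5, 2.5)), infile, row.names = FALSE)

# The task from the thread: read, scale every cell by 10, write back
data <- read.csv(infile)
write.csv(data * 10, outfile, row.names = FALSE)

# Read the result back to check it
scaled <- read.csv(outfile)
print(scaled)
```

Note that write.csv() takes the data first and the file name second; row.names = FALSE keeps the row labels out of the output file.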
Re: [R] adonis help - (non-parametric (permutational) manova)
Apologies, I seem to have deleted the important part of a sentence below:

On Wed, 2009-05-13 at 09:37 +0100, Gavin Simpson wrote:

<snip />

    adonis(a~site, data=b, strata=b[,"date"], Permutations=999)

I don't think the permutations will be stratified correctly - you want them to represent a time series, yes? 'strata' is meant to define in vegan.

Should have said: 'strata' is meant to define groups of samples, or blocks, in vegan.

G

Samples within the strata are permuted. So if you only have two samples per unique time point (1 for up and 1 for dn), the effect of setting strata to the date variable will be to permute only pairs of samples.

Work has begun (and stalled for a little while - my fault) on providing a wider range of restricted permutation tests. The function permuted.index2 in vegan can generate permutations for time series (or other ordered observations), but you'd have to a) edit adonis in place to use permuted.index2 and work out how to set up the call to this function correctly so that it returns the permutation structure adonis wants. Then check it does what it says it does - there is at least one bug that I know of but I'm not fixing it as the development version on my local machine has completely changed the way the permutation schemes are specified.

Contact me off-list if you would like some help with this, though as I'm teaching for two weeks, I won't be able to look at it until later.

For now, perhaps you could just ignore the time-series aspects and run the analysis without strata, but require a far lower p-value than you might normally use to reflect the fact that the permutations do not take into account correlations between time points.

HTH

Is this the correct way of testing the null hypothesis that: There is no difference in community structure between treatments.

Thank you very much in advance, and anything that you need to make this easier please don't hesitate to ask.

regards,

--
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
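To see why strata on date only ever swaps pairs of samples, a small base R sketch may help. It is an illustration of permuting within blocks, not vegan's own code; the three-date design and the helper name permute_within are made up for the example.

```r
# Toy design (hypothetical): three dates, one "dn" and one "up" sample per date
design <- data.frame(
  date = rep(c("0104", "0106", "0203"), times = 2),
  site = rep(c("dn", "up"), each = 3)
)

# Permute row indices only within each stratum (here: within each date)
permute_within <- function(strata) {
  idx <- seq_along(strata)
  ave(idx, strata, FUN = function(i) i[sample.int(length(i))])
}

set.seed(1)
p <- permute_within(design$date)

# Each sample can only trade places with the other sample from the same date
data.frame(date = design$date,
           original = design$site,
           permuted = design$site[p])
```

With two samples per date there are only two possible arrangements inside each block, so the permutations never mix samples from different dates.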
Re: [R] survival curves for time dependent covariates (was consultation)
At 14:50 12.05.2009, Terry Therneau wrote:

I'm writing to ask you how can I do Survival Curves using Time-dependent covariates? Which packages do I need to install?

This is a very difficult problem statistically. That is, there are not many good ideas for what SHOULD be done. Hence, there are no packages. Almost everything you find in an applied paper (e.g. a medical journal) is wrong.

Terry Therneau

Dear Terry,

just in case it is not too much work for you, maybe you could give some references to examples of wrong applications in applied medical papers.

Thanks,
Heinz
[R] Segmentation fault in package rJava on CentOS server
Hello,

I just installed rJava on

    [r...@ug13 ~]# R --version
    R version 2.9.0 (2009-04-17)

running on a

    [r...@ug13 ~]# cat /etc/redhat-release
    CentOS release 5.3 (Final)

This is the output of

    [r...@ug13 ~]# R CMD javareconf
    Java interpreter : /usr/bin/java
    Java version     : 1.4.2_18
    Java home path   : /usr/java/j2sdk1.4.2_18/jre
    Java compiler    : /usr/bin/javac
    Java headers gen.: /usr/bin/javah
    Java archive tool: /usr/bin/jar
    Java library path: $(JAVA_HOME)/lib/i386/client:$(JAVA_HOME)/lib/i386:$(JAVA_HOME)/../lib/i386
    JNI linker flags : -L$(JAVA_HOME)/lib/i386/client -L$(JAVA_HOME)/lib/i386 -L$(JAVA_HOME)/../lib/i386 -ljvm
    JNI cpp flags    : -I$(JAVA_HOME)/../include -I$(JAVA_HOME)/../include/linux

Package rJava got properly installed (there were a number of warnings, though, in the installation process). However,

    library(rJava)
    .jinit()

     *** caught segfault ***
    address 0xc, cause 'memory not mapped'

    Traceback:
     1: .External("RinitJVM", boot.classpath, parameters, PACKAGE = "rJava")
     2: .jinit()

Whenever I try to interact with Java from R (I am interested in the RJDBC package), I get the same segmentation fault at the .jinit call, specifically when .jinit calls RinitJVM.

Any ideas?

Best regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com
Re: [R] Help with reshape/reShape and indexing
Dana Sevak wrote:

Dear R Helpers,

I have trouble applying reShape and reshape although I have read the documentation and several posts, so I would very much appreciate your help on the two points below.

I have a dataframe

    df = data.frame(Name=c("a", "a", "a", "b", "b", "c"), X1=c(12, 13, 14, 20, 25, 30), X2=c(200, 250, 300, 600, 700, 900))
    df
      Name X1  X2
    1    a 12 200
    2    a 13 250
    3    a 14 300
    4    b 20 600
    5    b 25 700
    6    c 30 900

First, I need to add a column to this dataframe that numbers the rows within each Name entry. The resulting df should look like:

    df.index = data.frame(Name=c("a", "a", "a", "b", "b", "c"), X1=c(12, 13, 14, 20, 25, 30), X2=c(200, 250, 300, 600, 700, 900), Index=c(1,2,3,1,2,1))
    df.index
      Name X1  X2 Index
    1    a 12 200     1
    2    a 13 250     2
    3    a 14 300     3
    4    b 20 600     1
    5    b 25 700     2
    6    c 30 900     1

How can I do this?

Secondly, I would like to reshape this dataframe into the form:

    df2
       1  2  3
    a 12 13 14
    b 20 25 NA
    c 30 NA NA

This does it more or less your way:

    ds <- split(df, df$Name)
    ds <- lapply(ds, function(x){x$Index <- seq_along(x[,1]); x})
    df2 <- unsplit(ds, df$Name)
    tapply(df2$X1, df2[,c("Name", "Index")], function(x) x)

although there may exist much easier ways ...

Uwe Ligges

Since the df is sorted by Name and X2, the available X1 values should populate the rows of df2 from left to right (i.e. if only one value is available, it is written in the first column and the remaining columns get NAs).

If I could generate the Index column, I think I could accomplish this with:

    df2 = reShape(df.index$X1, id=df.index$Name, colvar=df.index$Index)
    colnames(df2) = c("V1", "V2", "V3")

However, is there a way to get to df2 without using the Index column and still have the NAs written as described above?

Thank you so much for your help on these two issues.

With best regards,
Dana Sevak
Re: [R] Response surface plot
Hi,

thank you for that. A while ago I came across a presentation outlining the ideas for such a function, but I can't remember who gave it or where.

Thanks again,

Tim

----- Original Message -----
From: Duncan Murdoch <murd...@stats.uwo.ca>
Date: Tuesday, May 12, 2009 4:26 pm
Subject: Re: [R] Response surface plot
To: Tim Carnus <tim.car...@ucd.ie>
Cc: r-help@r-project.org

On 5/12/2009 8:43 AM, Tim Carnus wrote:

Dear List,

I am trying to plot in R a graph similar to the attached one from the minitab manual. I have a response Y and three components which systematically vary in their proportions. I have found in R methods/packages to plot ternary plots (e.g. plotrix) but nothing which can extend it to a response surface in 3-D.

Any help appreciated,

I'm not aware of anyone who has done this. The way to do the surface in rgl would be to construct a mesh of triangles using tmesh3d, and set the color of each vertex as part of the material argument. It's a little tricky to get the colors right when they vary by vertex, but the code below gives an example.

I would construct the mesh by starting with one triangle and calling subdivision3d, but you may want more control over them.

For example:

    library(rgl)

    # First create a flat triangle and subdivide it
    triangle <- c(0,0,0,1, 1,0,0,1, 0.5, sqrt(3)/2, 0, 1)
    mesh <- tmesh3d( triangle, 1:3, homogeneous=TRUE)
    mesh <- subdivision3d(mesh, 4, deform=FALSE, normalize=TRUE)

    # Now get the x and y coordinates and compute the surface height
    x <- with(mesh, vb[1,])
    y <- with(mesh, vb[2,])
    z <- x^2 + y^2
    mesh$vb[3,] <- z

    # Now assign colors according to the height; remember that the
    # colors need to be in the order of mesh$it, not vertex order.
    vcolors <- rainbow(100)[99*z+1]
    tricolors <- vcolors[mesh$it]
    mesh$material = list(color=tricolors)

    # Now draw the surface, and a rudimentary frame behind it.
    shade3d(mesh)
    triangles3d(matrix(triangle, byrow=TRUE, ncol=4), col="white")
    quads3d(matrix(c(1,0.5,0.5,1, 0,sqrt(3)/2, sqrt(3)/2,0, 0,0,1,1), ncol=3), col="white")
    bg3d("gray")

Duncan Murdoch
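When the data arrive as three component proportions rather than as x and y, they first have to be placed on the triangle. The following base R sketch shows one way to do that, assuming the same corner layout as the flat triangle in Duncan's example: (0, 0), (1, 0) and (0.5, sqrt(3)/2). The function name ternary_to_xy and the assignment of components to corners are choices made here for illustration only.

```r
# Map three-component mixtures (a, b, c) onto the triangle with corners
# (0, 0) for pure a, (1, 0) for pure b and (0.5, sqrt(3)/2) for pure c
ternary_to_xy <- function(a, b, c) {
  total <- a + b + c          # normalise so the proportions sum to 1
  b <- b / total
  c <- c / total
  cbind(x = b + c / 2, y = c * sqrt(3) / 2)
}

# Three pure components and the equal-parts mixture (the centroid)
ternary_to_xy(a = c(1, 0, 0, 1), b = c(0, 1, 0, 1), c = c(0, 0, 1, 1))
```

The resulting x and y could then play the role of the mesh coordinates, with the measured response Y taking the place of the height z.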
Re: [R] Multiple plot margins
Andre Nathan wrote:

Hello

I'm plotting 6 graphs using mfrow = c(2, 3). In these plots, only graphs in the first column have titles for the y axis, and only the ones in the last row have titles for the x axis.

I'd like all plots to be of the same size, and I'm trying to keep them as near each other as possible, but I'm having the following problem.

If I make a single call to par(mar = ...), to leave room on the left and bottom for the axes titles, a lot of space will be wasted because not all graphs need titles; however, if I make one call of par(mar = ...) per plot, to have finer control of the margins, the first column and last row plots will be smaller than the rest, because the titles use up some of their space.

I thought that setting large enough values for oma would do what I want, but it doesn't appear to work if mar is too small.

To illustrate better what I'm trying to do:

    l +-+ +-+ +-+
    a | | | | | |
    b | | | | | |
    e | | | | | |
    l +-+ +-+ +-+

    l +-+ +-+ +-+
    a | | | | | |
    b | | | | | |
    e | | | | | |
    l +-+ +-+ +-+
     label label label

where the margins between each plot should be narrow. Should I just plot the graphs without axis titles and then use text() to manually position them?

Can't you do it with lattice / grid? If not, example:

    par(mfrow = c(2,3), mar = c(0,0,0,0), oma = c(5,5,0,0), xpd=NA)
    plot(1, xaxt="n", xlab="", ylab="A")
    plot(1, xaxt="n", yaxt="n", xlab="", ylab="")
    plot(1, xaxt="n", yaxt="n", xlab="", ylab="")
    plot(1, xlab="I", ylab="B")
    plot(1, xlab="II", ylab="", yaxt="n")
    plot(1, xlab="III", ylab="", yaxt="n")

Uwe Ligges

Thanks in advance,
Andre
[R] where does the null come from?
    m = matrix(1:4, 2)
    apply(m, 1, cat, '\n')
    # 1 3
    # 2 4
    # NULL

why the null?

vQ
Re: [R] Help with reshape/reShape and indexing
one way is the following:

    df.index <- df
    df.index$Index <- ave(seq_along(df$Name), df$Name, FUN = seq_along)
    df.index

    df2 <- reshape(df.index[c("Name", "Index", "X1")], timevar = "Index", idvar = "Name", direction = "wide")
    df2

I hope it helps.

Best,
Dimitris

Dana Sevak wrote:

Dear R Helpers,

I have trouble applying reShape and reshape although I have read the documentation and several posts, so I would very much appreciate your help on the two points below.

I have a dataframe

    df = data.frame(Name=c("a", "a", "a", "b", "b", "c"), X1=c(12, 13, 14, 20, 25, 30), X2=c(200, 250, 300, 600, 700, 900))
    df
      Name X1  X2
    1    a 12 200
    2    a 13 250
    3    a 14 300
    4    b 20 600
    5    b 25 700
    6    c 30 900

First, I need to add a column to this dataframe that numbers the rows within each Name entry. The resulting df should look like:

    df.index = data.frame(Name=c("a", "a", "a", "b", "b", "c"), X1=c(12, 13, 14, 20, 25, 30), X2=c(200, 250, 300, 600, 700, 900), Index=c(1,2,3,1,2,1))
    df.index
      Name X1  X2 Index
    1    a 12 200     1
    2    a 13 250     2
    3    a 14 300     3
    4    b 20 600     1
    5    b 25 700     2
    6    c 30 900     1

How can I do this?

Secondly, I would like to reshape this dataframe into the form:

    df2
       1  2  3
    a 12 13 14
    b 20 25 NA
    c 30 NA NA

Since the df is sorted by Name and X2, the available X1 values should populate the rows of df2 from left to right (i.e. if only one value is available, it is written in the first column and the remaining columns get NAs).

If I could generate the Index column, I think I could accomplish this with:

    df2 = reShape(df.index$X1, id=df.index$Name, colvar=df.index$Index)
    colnames(df2) = c("V1", "V2", "V3")

However, is there a way to get to df2 without using the Index column and still have the NAs written as described above?

Thank you so much for your help on these two issues.

With best regards,
Dana Sevak

--
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014
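On the last part of the question, getting df2 without an explicit Index column: one possible base R sketch is to split X1 by Name and pad each piece with NA on the right. This is only one of several ways and assumes, as in the post, that the rows are already in the desired order within each Name. The data frame is rebuilt from the values printed in the original post.

```r
df <- data.frame(Name = c("a", "a", "a", "b", "b", "c"),
                 X1 = c(12, 13, 14, 20, 25, 30),
                 X2 = c(200, 250, 300, 600, 700, 900))

# One vector of X1 values per Name, in row order
vals <- split(df$X1, df$Name)

# Pad every vector with NA up to the longest one, then stack them as rows
width <- max(sapply(vals, length))
df2 <- t(sapply(vals, function(v) c(v, rep(NA, width - length(v)))))
colnames(df2) <- paste0("V", seq_len(width))
df2
```

The result is a numeric matrix with one row per Name; as.data.frame(df2) converts it if a data frame is needed.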
Re: [R] questions on rpart (tree changes when rearrange the order of covariates?!)
Yuanyuan wrote:

Greetings,

I am using rpart for classification with the "class" method. The test data is the Indian diabetes data from package mlbench. I first fitted a classification tree using the original data, and then exchanged the order of Body mass and Plasma glucose, which are the strongest/most important variables in the growing phase. The second tree is a little different from the first one. The misclassification tables are different too. I did not change the data, so why are the results so different?

Well, at some splits two variables may yield the same reduction of the splitting criterion; in that case the variable that comes first might be used, hence another result.

Uwe Ligges

Does anyone know how rpart deals with ties?

Here is the code for running the two trees.

    library(mlbench)
    data(PimaIndiansDiabetes2)
    mydata <- PimaIndiansDiabetes2
    library(rpart)
    fit2 <- rpart(diabetes~., data=mydata, method="class")
    plot(fit2, uniform=T, main="CART for original data")
    text(fit2, use.n=T, cex=0.6)
    printcp(fit2)
    table(predict(fit2, type="class"), mydata$diabetes)
    ## misclassification table: rows are fitted class
    ##       neg pos
    ##   neg 437  68
    ##   pos  63 200
    #Klimt(fit2,mydata)

    pmydata <- data.frame(mydata[,c(1,6,3,4,5,2,7,8,9)])
    fit3 <- rpart(diabetes~., data=pmydata, method="class")
    plot(fit3, uniform=T, main="CART after exchanging mass & glucose")
    text(fit3, use.n=T, cex=0.6)
    printcp(fit3)
    table(predict(fit3, type="class"), pmydata$diabetes)
    ## after exchanging the order of BODY mass and PLASMA glucose
    ##       neg pos
    ##   neg 436  64
    ##   pos  64 204
    #Klimt(fit3,pmydata)

Thanks,

--
Yuanyuan Huang
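The order dependence Uwe describes can be illustrated with a tiny base R example. This is not rpart's internal code, only a sketch of the general "first candidate wins a tie" behaviour; the improvement scores below are invented for the illustration.

```r
# Hypothetical improvement scores: two variables tie for the best split
improvement <- c(glucose = 0.12, mass = 0.12, age = 0.05)

# which.max() returns the first maximum, so column order decides the winner
first_order  <- names(which.max(improvement))
second_order <- names(which.max(improvement[c("mass", "glucose", "age")]))

first_order    # "glucose"
second_order   # "mass"
```

Once a different variable wins at one node, all splits below that node can differ too, which would be consistent with the slightly different misclassification tables reported above.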
Re: [R] where does the null come from?
Hi!

Wacek Kusnierczyk wrote:

    m = matrix(1:4, 2)
    apply(m, 1, cat, '\n')
    # 1 3
    # 2 4
    # NULL

why the null?

Could it be the return value of 'cat'? See ?cat, where:

---snip---
Value

None (invisible NULL).
---snip---

Kind regards,
Kimmo
Re: [R] where does the null come from?
Wacek Kusnierczyk wrote:

    m = matrix(1:4, 2)
    apply(m, 1, cat, '\n')
    # 1 3
    # 2 4
    # NULL

why the null?

It comes from unlist()ing a list of NULLs, which in turn are the return values of cat().

It is arguably a design-buglet not to return list(NULL, NULL), but the internal logic is to unlist() unless the first element is.recursive (and NULL is not) or the return values have different length() (and all are zero).

It _is_, however, in accordance with the documentation (see the Value: section):

--
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalga...@biostat.ku.dk)              FAX: (+45) 35327907
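The mechanism described above can be checked directly. The following short sketch uses only base R; the variable names are arbitrary.

```r
m <- matrix(1:4, 2)

# cat() prints as a side effect and returns NULL for every row
res <- apply(m, 1, function(r) cat(r, "\n"))

is.null(res)                       # the collected NULLs collapse to one NULL
is.null(unlist(list(NULL, NULL)))  # the same collapse, without apply()

# Wrapping the call in invisible() keeps the printed rows and hides the NULL
invisible(apply(m, 1, cat, "\n"))
```

At the prompt, the NULL appears only because the value of apply() is auto-printed; assigning the result or using invisible() suppresses it.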
Re: [R] where does the null come from?
Peter Dalgaard wrote:

Wacek Kusnierczyk wrote:

    m = matrix(1:4, 2)
    apply(m, 1, cat, '\n')
    # 1 3
    # 2 4
    # NULL

why the null?

It comes from unlist()ing a list of NULLs, which in turn are the return values of cat().

yes; i'd think i'd get a list of nulls, but...

It is arguably a design-buglet not to return list(NULL, NULL), but the internal logic is to unlist() unless the first element is.recursive (and NULL is not) or the return values have different length() (and all are zero).

It _is_, however, in accordance with the documentation (see the Value: section):

... i agree the actual outcome is appropriately explained in the docs. i don't think it has no merit, but it's a bit surprising at first.

thanks,
vQ
[R] Help with a cumulative Hazard Ratio plot
Hi R-masters,

I need help making a modified cumulative hazard ratio plot. I need to create an ordinary plot, but with the number of subjects at risk at each tick time, for two different groups, shown at the bottom of the plot (an example is attached).

Do you know a routine for this? Is it possible to write one? If so, with which commands?

Thanks in advance!

--
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil
[R] Multiple ANOVA tests
Hello!!!

I'm trying to do multiple ANOVA tests with R (testing the effect of different factors on the same response). As a result I get many ANOVA tables, and I want to extract a list of the Pr(>F) values from all the tables.

Does anyone have an idea how to do this?

Thanks
Imri
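One possible way to collect the p-values is sketched below, assuming one-way models fitted with aov(), one factor at a time. The data are simulated and the names dat, f1, f2 and f3 are placeholders; with real data only the factors vector and the response name would change.

```r
# Simulated data (hypothetical): one response, three candidate factors
set.seed(42)
dat <- data.frame(
  y  = rnorm(60),
  f1 = gl(3, 20),
  f2 = gl(2, 30),
  f3 = gl(4, 15)
)

factors <- c("f1", "f2", "f3")

# Fit one one-way ANOVA per factor and keep only the p-value of the factor row
pvals <- sapply(factors, function(f) {
  fit <- aov(reformulate(f, response = "y"), data = dat)
  summary(fit)[[1]][["Pr(>F)"]][1]
})
pvals
```

summary() of an aov fit returns a list whose first element is the ANOVA table, so the "Pr(>F)" column can be pulled out by name; the first entry belongs to the factor, the second (NA) to the residuals.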
Re: [R] converting numeric into character strings
Hi Melissa,

unless I miss a point, you should get what you want with (for example)

    y <- paste(b, collapse=",")

Hope this helps.

Olivier

Melissa2k9 wrote:

Hi,

I'm trying to put some numbers into a dataframe. I have a list of numbers (change points in a time series) like this

    [1]  2 11 12 20 21 98 99

but I want R to treat this as a single character string so that it goes into one row and column. Ideally I want the numbers separated by commas, so I would have for example

    Person   Change points (seconds)
    A        2,11,12,20,21,98,99
    B        4,5,89

etc. Is there any way I can get this?

I've tried this: for example, if the command to get the list of numbers was b <- which(a!=s), then I have tried as.character(b), but I just end up with

    [1] 2 11 12 20 21 98 99

which is not what I want, as this is more than one string and is not separated by commas. I also tried paste(b, sep=",") but I end up with the same thing.

Sorry it's a bit confusing to read but any help would be great!

Melissa
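The difference between sep and collapse is easy to see with the numbers from the post. A minimal sketch follows; the data frame layout and the column name ChangePoints are only one guess at what the final table might look like.

```r
b <- c(2, 11, 12, 20, 21, 98, 99)

# sep joins the *arguments* of paste(), so a single vector stays seven strings
length(paste(b, sep = ","))

# collapse joins the elements of the result into one string
y <- paste(b, collapse = ",")
y

# One row per person, change points stored as a single character cell
changes <- data.frame(
  Person = c("A", "B"),
  ChangePoints = c(y, paste(c(4, 5, 89), collapse = ",")),
  stringsAsFactors = FALSE
)
changes
```

If the individual numbers are needed again later, as.numeric(strsplit(y, ",")[[1]]) reverses the operation.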
Re: [R] plotting a grid with color on a map
dxc13 wrote:
Hi all, I have posted similar questions regarding this topic, but I just can't seem to get over the hump and find a straightforward way to do this. I have attached my file as a reference. Basically, the attached file is a 5 degree by 5 degree grid of the world (2592 cells); most of the cells are NA. I just want to be able to plot this grid over a world map and color code the cells. For example, if a cell has a temperature less than 20 degrees it will be blue, 21 to 50 green, 51 to 70 orange, and 71+ red. Any NAs should be colored white. I know how to create a map of the world using map() and add a grid to it using map.grid(), but I can't color code the cells the way I need. Is there a way to do this in R? Thanks again.
dxc13
http://www.nabble.com/file/p23514804/time1test.txt time1test.txt

How about the following, which doesn't need a grid at all?

    library(maps)
    temp <- as.matrix(read.table("time1test.txt"))
    xvals <- c(0, 0, 5, 5, 0)
    yvals <- c(0, 5, 5, 0, 0)
    map("world")
    palette(rainbow(50))
    for (lat in seq(-90, 85, 5))
      for (lon in seq(-180, 175, 5)) {
        col <- temp[(lat + 95)/5, (lon + 185)/5]
        if (!is.na(col))
          # x is longitude, y is latitude on the map
          polygon(lon + xvals, lat + yvals, col = col, border = NA)
      }
    palette("default")

HTH
Ray Brownrigg
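The reply above colors the cells from a rainbow palette. For the fixed classes the question asks for, a minimal sketch with cut() might look as follows. The temperature vector is made up, and the handling of values that fall exactly on a boundary (20, 50, 70) is an assumption, since the question leaves it open.

```r
# Hypothetical cell temperatures, including a missing value.
temp <- c(10, 20, 35, 50, 60, 70, 85, NA)

# Class boundaries and one color per class; intervals are closed on the
# right, so 20 falls in the first class and 50 in the second.
breaks <- c(-Inf, 20, 50, 70, Inf)
cols   <- c("blue", "green", "orange", "red")

# cut() assigns each value to an interval; the integer codes index cols.
cell_col <- cols[as.integer(cut(temp, breaks))]

# Missing cells get white, as requested in the question.
cell_col[is.na(cell_col)] <- "white"
print(cell_col)
```

The resulting vector can be used as the col argument of polygon() in a loop over the grid cells.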
Re: [R] Overlay cdf
You might also use ?curve

    # same example as Bill's
    par(mar = c(5,4,2,4)+0.1, yaxs = "r")
    Sample <- rgamma(1000, 2.5, .8)
    hist(Sample, main = "", freq = FALSE, ylim = c(0,1))
    curve(pgamma(x, 2.5, 0.8), add = TRUE, col = 'red')
    curve(dgamma(x, 2.5, 0.8), add = TRUE, col = 'darkgreen')
    axis(4, col = "red")
    mtext(side = 4, text = "Cumulative probability", col = "red", line = 2.5)
    x0 <- c(0, sort(Sample))
    p0 <- 0:1000/1000
    lines(x0, p0, type = "S", col = "blue")

Regards,
Matthieu
Re: [R] Help with a cumullative Hazrd Ratio plot
?mtext

You may need to adjust the margins. For this I recommend adjusting the mar option in par (see ?par).

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Bernardo Rangel Tura
Sent: Wednesday, May 13, 2009 6:31 AM
To: r-help
Subject: Re: [R] Help with a cumullative Hazrd Ratio plot

On Wed, 2009-05-13 at 07:19 -0300, Bernardo Rangel Tura wrote:
Hi R-masters, I need help making a modified cumulative hazard ratio plot. I need to create a standard plot, but with the number of subjects at risk at each tick time, for two different groups, at the bottom of the plot (I attach an example). Do you know a routine for this? Is it possible to create a routine for this? If so, with which commands? Thanks in advance!

Sorry, I attached the example in JPEG format. In this mail I attach it in PDF format.
--
Bernardo Rangel Tura, M.D., MPH, Ph.D.
National Institute of Cardiology
Brazil
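A rough sketch of the mtext() suggestion follows, assuming the survival package and its lung data set as a stand-in for the poster's data (the grouping by sex, the margin sizes and the line offsets are illustrative choices, and this plots cumulative hazards per group rather than a ratio).

```r
library(survival)

# Stand-in example: two groups defined by sex in the lung data.
fit <- survfit(Surv(time, status) ~ sex, data = lung)

# Enlarge the bottom margin to make room for the extra rows of text.
op <- par(mar = c(8, 4, 2, 2) + 0.1)
plot(fit, fun = "cumhaz", lty = 1:2,
     xlab = "Days", ylab = "Cumulative hazard")

# Numbers at risk at the tick positions of the x axis, one set per group.
ticks   <- axTicks(1)
at_risk <- summary(fit, times = ticks, extend = TRUE)
risk    <- split(at_risk$n.risk, at_risk$strata)

# Write one row of counts per group below the axis label.
for (k in seq_along(risk)) {
  mtext(risk[[k]], side = 1, line = 3 + k, at = ticks, cex = 0.8)
  mtext(names(risk)[k], side = 1, line = 3 + k,
        at = par("usr")[1], adj = 1, cex = 0.8)
}
par(op)
```

The group labels written at the left edge may need a wider left margin, depending on the device size.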
[R] overlap contour
Friends, I have two covariance matrices (m1 and m2) of the same size (150x150). I used the contourplot function to make a contour plot of each one individually (c1 and c2). I am interested in making a single contour plot that overlays the two, so that the portions of the plot above and below the diagonal represent c1 and c2. Could someone suggest how I can do this? Is there a way to combine m1 and m2, write the combined matrix to a file, and plot it to achieve the above? Thanks, Bala
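One way to build such a combined matrix is sketched below with base R's lower.tri(). The two small random covariance matrices stand in for the poster's 150x150 matrices, and the choice to take the diagonal from m1 is arbitrary.

```r
# Stand-in matrices (the real ones would be 150 x 150).
set.seed(1)
n  <- 5
m1 <- cov(matrix(rnorm(20 * n), ncol = n))
m2 <- cov(matrix(rnorm(20 * n), ncol = n))

# Start from m1 (upper triangle and diagonal) and overwrite the part
# below the diagonal with the corresponding entries of m2.
combined <- m1
combined[lower.tri(combined)] <- m2[lower.tri(m2)]

# Write the combined matrix to a file, as asked in the post.
out_file <- file.path(tempdir(), "combined.txt")
write.table(combined, file = out_file, row.names = FALSE, col.names = FALSE)
```

The combined matrix can then be passed to the same plotting call that was used for m1 and m2 individually.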
Re: [R] plotting a grid with color on a map
dxc13 wrote:
Hi all, I have posted similar questions regarding this topic, but I just can't seem to get over the hump and find a straightforward way to do this. I have attached my file as a reference. Basically, the attached file is a 5 degree by 5 degree grid of the world (2592 cells); most of the cells are NA. I just want to be able to plot this grid over a world map and color code the cells. For example, if a cell has a temperature less than 20 degrees it will be blue, 21 to 50 green, 51 to 70 orange, and 71+ red. Any NAs should be colored white. I know how to create a map of the world using map() and add a grid to it using map.grid(), but I can't color code the cells the way I need. Is there a way to do this in R?

Hi dxc13,
This might get you started:

    temp1 <- read.table("time1test.dat", header=TRUE)
    mapcol <- color.scale(as.matrix(temp1[36:1,]), c(0.5,1), c(0.5,0), c(1,0))
    # have to draw the map to get the user coordinates
    map()
    # get the limits of the map
    maplim <- par("usr")
    # transform the temperatures into colors, reversing the row order
    color2D.matplot(temp1[36:1,], cellcolors=mapcol, axes=FALSE)
    # don't erase the current plot
    par(new=TRUE)
    # draw an empty plot with the appropriate axes (I think)
    plot(0, xlim=maplim[1:2], ylim=maplim[3:4], type="n")
    # add the map over the color squares
    map(add=TRUE)

This seems a bit wonky, probably because I haven't adjusted the coordinates. Also, I'm only getting grayscale colors, even though the colors in mapcol aren't gray. Don't know why yet.

Jim
Re: [R] plotting a grid with color on a map
Oops, forgot to include:

    library(plotrix)

Jim
Re: [R] Help with reshape/reShape and indexing
Try this:

    DF$Index <- ave(1:nrow(DF), DF$Name, FUN = seq_along)
    reshape(DF[-3], dir = "wide", idvar = "Name", timevar = "Index")

Also see the reshape package for another similar facility.

On Wed, May 13, 2009 at 2:02 AM, Dana Sevak dana.se...@yahoo.com wrote:
Dear R Helpers, I have trouble applying reShape and reshape although I read the documentation and several posts, so I would very much appreciate your help on the two points below. I have a dataframe

    df = data.frame(Name=c("a", "a", "a", "b", "b", "c"), X1=c(12, 13, 14, 20, 25, 30), X2 = c(200, 250, 300, 600, 700, 4))
    df
      Name X1  X2
    1    a 12 200
    2    a 13 250
    3    a 14 300
    4    b 20 600
    5    b 25 700
    6    c 30 900

First I need to add an additional column to this dataframe that will count the number of rows per each Name entry. The resulting df should look like:

    df.index = data.frame(Name=c("a", "a", "a", "b", "b", "c"), X1=c(12, 13, 14, 20, 25, 30), X2 = c(200, 250, 300, 600, 700, 4), Index=c(1,2,3,1,2,1))
    df.index
      Name X1  X2 Index
    1    a 12 200     1
    2    a 13 250     2
    3    a 14 300     3
    4    b 20 600     1
    5    b 25 700     2
    6    c 30 900     1

How can I do this? Secondly, I would like to reshape this dataframe in the form:

    df2
       1  2  3
    a 12 13 14
    b 20 25 NA
    c 30 NA NA

Since the df is sorted by Name and X2, I would need that the available X1 values populate the resulting rows in df2 from left to right (i.e. if only one value is available, it is written in the first column and the remaining columns get NAs). If I could generate the Index column, I think I could accomplish this with:

    df2 = reShape(df.index$X1, id=df.index$Name, colvar=df.index$Index)
    colnames(df2) = c("V1", "V2", "V3")

However, is there a way to get to df2 without using the Index column and still have the NAs written as described above? Thank you so much for your help on these two issues.
With best regards,
Dana Sevak
Re: [R] questions on rpart (tree changes when rearrange the order of covariates?!)
From: Uwe Ligges Yuanyuan wrote: Greetings, I am using rpart for classification with class method. The test data is the Indian diabetes data from package mlbench. I fitted a classification tree firstly using the original data, and then exchanged the order of Body mass and Plasma glucose which are the strongest/important variables in the growing phase. The second tree is a little different from the first one. The misclassification tables are different too. I did not change the data, but why the results are so different? Well, at some splits the variable that comes first and yields in the same reduction of the entropy criterion as another one might be used, hence another result. Uwe Ligges I recently tried writing adaboost.m1 using rpart, and was surprised that with very small training set (say n=10 or 20), I get a large improvement in test set accuracy if I randomly shuffle the columns in the data at every adaboost iteration. (With twonorm data, we're talking about 25% error vs. 19%, using n=2000 test set.) It turned out to be the way rpart deals with ties--- first come, first win. Without shuffling the columns, rpart almost never pick any variable beyond the 10th. (In twonorm, all variables are equally important, so one would expect roughly equal selection frequency.) I've gotten some pointers from Terry Therneau about where in the code to check. I may try to implement breaking ties at random (as I've done in randomForest). No promises, though... Andy Does anyone know how rpart deal with ties? Here is the codes for running the two trees. 
    library(mlbench)
    data(PimaIndiansDiabetes2)
    mydata <- PimaIndiansDiabetes2
    library(rpart)
    fit2 <- rpart(diabetes ~ ., data = mydata, method = "class")
    plot(fit2, uniform = T, main = "CART for original data")
    text(fit2, use.n = T, cex = 0.6)
    printcp(fit2)
    table(predict(fit2, type = "class"), mydata$diabetes)
    ## misclassification table: rows are fitted class
    ##       neg pos
    ##   neg 437  68
    ##   pos  63 200
    #Klimt(fit2,mydata)

    pmydata <- data.frame(mydata[, c(1,6,3,4,5,2,7,8,9)])
    fit3 <- rpart(diabetes ~ ., data = pmydata, method = "class")
    plot(fit3, uniform = T, main = "CART after exchanging mass and glucose")
    text(fit3, use.n = T, cex = 0.6)
    printcp(fit3)
    table(predict(fit3, type = "class"), pmydata$diabetes)
    ## after exchanging the order of BODY mass and PLASMA glucose
    ##       neg pos
    ##   neg 436  64
    ##   pos  64 204
    #Klimt(fit3,pmydata)

Thanks,
--
Yuanyuan Huang
Re: [R] plotting a grid with color on a map
It was the NAs that fooled color2D.matplot. This gets your colors, although not exactly what you want. Look at the help for color2D.matplot to get that. I think fiddling with the x and y limits on the map() call will get the positions right.

    temp1 <- read.table("time1test.dat", header=TRUE)
    library(plotrix)
    # reverse the row order, as color2D.matplot reverses it
    color2D.matplot(temp1[36:1,], c(0.5,1), c(0.5,0), c(1,0), axes=FALSE)
    # don't erase the above plot
    par(new=TRUE)
    # do a ghost plot with just the axes
    plot(0, xlim=maplim[1:2], ylim=maplim[3:4], type="n")
    # add the map on top in black
    map(add=TRUE, col="black")

Jim
Re: [R] Help with reshape/reShape and indexing
This does it more or less your way:

    ds <- split(df, df$Name)
    ds <- lapply(ds, function(x){x$Index <- seq_along(x[,1]); x})
    df2 <- unsplit(ds, df$Name)
    tapply(df2$X1, df2[,c("Name", "Index")], function(x) x)

although there may exist much easier ways ...

Here's one way with the plyr and reshape package:

    library(plyr)
    df.index <- ddply(df, .(Name), transform, Index = seq_along(X1))
    library(reshape)
    cast(df.index, Name ~ Index, value = "X1")

Hadley
--
http://had.co.nz/
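For the second part of the original question (getting the wide layout without an explicit Index column), a base-R sketch is given below. It relies only on split() and on the fact that indexing a vector past its end yields NA; the variable names vals and k are introduced here for illustration.

```r
# Data from the original question (X2 omitted, since it is not needed).
df <- data.frame(
  Name = c("a", "a", "a", "b", "b", "c"),
  X1   = c(12, 13, 14, 20, 25, 30)
)

# One vector of X1 values per Name, in their original order.
vals <- split(df$X1, df$Name)

# Pad every vector to the length of the longest one: indexing beyond
# the end of a vector returns NA, so short groups are filled with NA.
k   <- max(lengths(vals))
df2 <- t(sapply(vals, function(v) v[seq_len(k)]))
colnames(df2) <- seq_len(k)
print(df2)
```

The result is a matrix with one row per Name and the values filled from left to right.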
Re: [R] silhouette: clustering labels have to be consecutive integers starting
TS == Tao Shi shi...@hotmail.com on Wed, 10 Oct 2007 06:15:53 + writes:

TS Thank you very much, Benilton and Prof. Ripley, for the
TS speedy replies!
TS Looking forward to the fix!
TS Tao

I have finally re-stumbled onto this e-mail thread, and indeed found and fixed the problem. Version 1.12.0 of 'cluster' should become visible within a few days, and will allow calling silhouette(g, dis) on a grouping vector of k different integer values which need *not* necessarily be in 1:k.

Martin Maechler, ETH Zurich

From: Prof Brian Ripley rip...@stats.ox.ac.uk
To: Benilton Carvalho bcarv...@jhsph.edu
CC: Tao Shi shi...@hotmail.com, maech...@stat.math.ethz.ch, r-help@r-project.org
Subject: Re: [R] silhouette: clustering labels have to be consecutive intergers starting from 1?
Date: Wed, 10 Oct 2007 05:33:03 +0100 (BST)

It is a C-level problem in package cluster: valgrind gives

    ==11377== Invalid write of size 8
    ==11377==    at 0xA4015D3: sildist (sildist.c:35)
    ==11377==    by 0x4706D8: do_dotCode (dotcode.c:1750)

This is a matter for the package maintainer (Cc:ed here), not R-help.

On Tue, 9 Oct 2007, Benilton Carvalho wrote:
That happened to me with R-2.4.0 (alpha) and was fixed in R-2.4.0 (final)...
http://tolstoy.newcastle.edu.au/R/e2/help/06/11/5061.html
Then I stopped using it... now the problem seems to be back. The same examples still apply.

This fails:

    require(cluster)
    set.seed(1)
    x <- rnorm(100)
    g <- sample(2:4, 100, rep=T)
    for (i in 1:100){
      print(i)
      tmp <- silhouette(g, dist(x))
    }

and this works:

    require(cluster)
    set.seed(1)
    x <- rnorm(100)
    g <- sample(2:4, 100, rep=T)
    for (i in 1:100){
      print(i)
      tmp <- silhouette(as.integer(factor(g)), dist(x))
    }

and here's the sessionInfo():

    sessionInfo()
    R version 2.6.0 (2007-10-03)
    x86_64-unknown-linux-gnu
    locale:
    LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
    attached base packages:
    [1] stats graphics grDevices utils datasets methods base
    other attached packages:
    [1] cluster_1.11.9

(Red Hat EL 2.6.9-42 smp - AMD opteron 848)

b

On Oct 9, 2007, at 8:35 PM, Tao Shi wrote:
Hi list, When I was using 'silhouette' from the 'cluster' package to calculate clustering performances, R crashed. I traced the problem to the fact that my clustering labels only have 2's and 3's. When I replaced them with 1's and 2's, the problem was solved. Is the function purposely written in this way, so that when I have clustering labels 2 and 3, for example, the function somehow takes the 'missing' cluster 1 into account when it calculates silhouette widths?
Thanks, Tao ## ## sorry about the long attachment R.Version() $platform [1] i386-pc-mingw32 $arch [1] i386 $os [1] mingw32 $system [1] i386, mingw32 $status [1] $major [1] 2 $minor [1] 5.1 $year [1] 2007 $month [1] 06 $day [1] 27 $`svn rev` [1] 42083 $language [1] R $version.string [1] R version 2.5.1 (2007-06-27) library(cluster) cl1 ## clustering labels [1] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 2 2 2 2 2 2 2 2 2 [30] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 [59] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 [88] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 [117] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 [146] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 [175] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 [204] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 x1 ## 1-d input vector [1] 1.5707963 1.5707963 1.5707963 1.5707963 1.5707963 [6] 1.5707963 1.5707963 1.5707963 1.5707963 1.5707963 [11] 1.5707963 1.5707963 1.5707963 1.5707963 1.5707963 [16] 1.5707963 1.5707963 1.5707963 1.5707963 1.5707963 [21] 1.0163758 0.7657763 0.7370084 0.6999689 0.7366476 [26] 0.7883921 0.6925395 0.7729240 0.7202391 0.7910149 [31] 0.7397698 0.7958092 0.6978596 0.7350255 0.7294362 [36] 0.6125713 0.7174000 0.7413046 0.7044205 0.7568104 [41] 0.7048469 0.7334515 0.7143170 0.7002311 0.7540981 [46] 0.7627527 0.7712762 0.8193611 0.7801148
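The workaround quoted in this thread, mapping arbitrary labels to consecutive integers via factor(), can be checked in isolation. The sketch below reuses the thread's simulated x and g; the name g_consec is introduced here for illustration.

```r
library(cluster)

# Same simulated data as in the thread: labels drawn from 2:4, not 1:3.
set.seed(1)
x <- rnorm(100)
g <- sample(2:4, 100, replace = TRUE)

# factor() orders the distinct labels; as.integer() returns their
# positions, so 2, 3, 4 become 1, 2, 3 while the grouping is unchanged.
g_consec <- as.integer(factor(g))
print(table(original = g, relabelled = g_consec))

# Silhouette widths computed on the relabelled grouping.
sil <- silhouette(g_consec, dist(x))
print(summary(sil)$avg.width)
```

According to the message above, cluster 1.12.0 and later should accept g directly, so the relabelling matters mainly for older versions.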
Re: [R] AFT-model with time-dependent covariates
The coding for an AFT model with time-dependent covariates will be very hard, and I don't know of anyone who has done it. (But I don't keep watch of other survival packages, so something might be there). In a Cox model, a subject's risk depends only on the current value of his/her covariates; in an AFT model the risk depends on the entire covariate history. (My 'accelerated age' is the sum of all the extra years I have ever gained). Coding this is not theoretically complex, but would be a pain-in-the-rear amount of bookkeeping. Terry Therneau
Re: [R] plotCI line types for line and for bar
Thank you! Yes, I am using the plotCI from gplots and I want the line connecting the centers to be dashed, just as for the bars. However, changing the type to "p" as you said does not give a dashed line but no line at all (only points).

lehe wrote:
Does anyone have a clue about this question? Thanks in advance!

lehe wrote:
Hi, I was wondering how to specify the line type for the line instead of for the bar. Here is my code:

    plotCI(x=mcra1avg, uiw=stdev1, type="l", col=2, lty=2)

This way, the bars are dashed (lty=2) and red (col=2), and the line connecting the centers of the bars is also red (col=2) but solid (lty=1). How can I make the line connecting the bar centers have the same solid lty as the bar? Thanks and regards!

You neglected to say that you were using the plotCI from gplots (not the one from plotrix, which has slightly different behaviors). Here's my solution (with some data made up -- you didn't give a reproducible example). I assume that you meant above that you wanted the line connecting the centers to be dashed?

    mcra1avg <- 1:3
    stdev1 <- c(0.2,0.1,0.4)
    library(gplots)
    plotCI(x=mcra1avg, uiw=stdev1, type="p", col=2, lty=2)
    lines(mcra1avg, col=2, lty=2)

By the way, it's not at all uncommon to have to wait more than 12 hours for a response on the R list -- the variability is very high ... I would say it's generally good to wait at least 24 hours before bumping ...

Ben Bolker
Re: [R] questions on rpart (tree changes when rearrange the order of covariates)
If two variables have exactly the same split importance, then rpart will use the one that was first in the model statement. So if rpart(group ~ age + height + weight + sex) and at some split point both age and weight gave a split with 20 correct and 9 incorrect, then age would be used to split at that node. Even though the error of the age and weight splits are the same, the set of 9 subjects that were incorrect may be different, i.e., they don't send exactly the same observations to the left and the right. Thus, the rest of the tree from that point on may be different, giving a different fit. For continuous y this rarely happens -- that two splits have exactly the same R^2 -- but it is not uncommon in classification problems. Terry Therneau
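The tie-breaking rule described above can be probed with a small simulated example. The sketch below forces an exact tie in the simplest possible way, by duplicating a predictor, which is an artificial assumption: with real data, tied variables usually split the observations differently, which is what changes the rest of the tree.

```r
library(rpart)

# Simulated two-class data; x2 is an exact copy of x1, so every split
# on x1 has an identical counterpart on x2 (a guaranteed tie).
set.seed(1)
x1 <- rnorm(200)
d  <- data.frame(
  y  = factor(ifelse(x1 + rnorm(200, sd = 0.5) > 0, "pos", "neg")),
  x1 = x1,
  x2 = x1
)

# Same data, two orderings of the predictors in the model formula.
fit_a <- rpart(y ~ x1 + x2, data = d, method = "class")
fit_b <- rpart(y ~ x2 + x1, data = d, method = "class")

# Variable used at the root node of each tree.
root_a <- as.character(fit_a$frame$var[1])
root_b <- as.character(fit_b$frame$var[1])
cat("root split, x1 first:", root_a, "\n")
cat("root split, x2 first:", root_b, "\n")
```

If the first-come rule holds, each tree should report the variable listed first in its formula at the root.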
Re: [R] Looking for a quick way to combine rows in a matrix
Hello, I reviewed my code and this will now work for any number of successive TA rows, I hope:

    b = matrix(1:64, ncol=4)
    rownames(b) = rep(c("AA","AT","TA","TT"), each=4)
    key <- rownames(b)
    key[key == "AT"] <- "TA"
    c <- b
    rownames(c) = key
    for (i in 2:nrow(c)) {
      if (rownames(c)[i] == "TA" && rownames(c)[i-1] == "TA") {
        c[i,] <- colSums(c[i:(i-1),])
        c[i-1,] <- NA
      }
    } # sums the rows and replaces the used rows by NA values
    c <- c[apply(c, 1, function(x) any(!is.na(x))), ] # removes the rows with NA values
    c

Rock

Rocko22 wrote:
In the first reply, what was calculated was the overall means by group (amino acids). It does not work for a larger database. I am really quite new to R, and I worked on your question just to learn how to manipulate data with R. The following seems to work. The code could be made a lot more elegant and straightforward, and it works only when there are no more than two successive TA rows. Let's try with a matrix b that contains more rows than in your example:

    b = matrix(1:32, ncol=4)
    rownames(b) = rep(c("AA","AT","TA","TT"), 2)
    key <- rownames(b)
    key[key == "AT"] <- "TA"
    rownames(b) = key
    for (i in 1:(nrow(b)-1)) {
      if (rownames(b)[i] == "TA" && rownames(b)[i+1] == "TA") {
        b[i,] <- colSums(b[i:(i+1),])
        b[i+1,] <- NA
      }
    } # sums the rows and replaces the used rows by NA values
    b <- b[order(b[,1], na.last=NA), ] # removes the rows with NA values

Of course, the rows are reordered, and that may not be wanted. The ordering was just to remove the NA rows.

Rock :-D
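The loop above can also be expressed without explicit iteration. The following is a sketch, not the poster's method: it labels each run of successive TA rows with one group id and lets base R's rowsum() add the rows within each group. The names is_ta, starts_group and group_id are introduced here for illustration.

```r
# Same test matrix and keys as in the message above.
b   <- matrix(1:64, ncol = 4)
key <- rep(c("AA", "AT", "TA", "TT"), each = 4)
key[key == "AT"] <- "TA"

# A row starts a new group unless it is a TA row directly preceded
# by another TA row; all other rows stay in a group of their own.
is_ta        <- key == "TA"
starts_group <- !(is_ta & c(FALSE, head(is_ta, -1)))
group_id     <- cumsum(starts_group)

# rowsum() adds up the rows that share a group id; reorder = FALSE
# keeps the groups in their order of first appearance.
combined <- rowsum(b, group_id, reorder = FALSE)
rownames(combined) <- key[starts_group]
print(combined)
```

Unlike the NA-and-filter approach, this keeps the original row order and never creates intermediate NA rows.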
[R] R package to fit mixture or cure survival models
Dear All, I am desperately trying to find an R package that fits mixture survival models, also known as cure models. These are survival models where the survival function is improper, which means that a fraction of subjects is expected never to experience the event. A huge literature has been developed for this type of model, but I couldn't find any R package that fits them. Best, Marc
[R] Nagelkerkes R2N
Hello All, as I'm new to R and survival analysis, I've got a question about the Design::validate function.

My code:

    cox <- cph(Surv(t,status) ~ var1 + var2 + var3, data=data, x=TRUE, y=TRUE, surv=TRUE)
    cox.val <- validate(cox, B=10, dxy=TRUE, pr=TRUE)

My output (cox.val):

                      index.orig               training                   test
    Dxy   -0.3639222921368090891 -0.3591157308750822175 -0.3634294047761231106
    R2     1.000                  1.000                  1.000
    Slope  1.000                  1.000                  1.0055508323397084336
    D      0.0232804472888947744  0.0226998668193014774  0.0232190381679612834
    U     -0.607553318187988     -0.610134584621832      0.254159617147094
    Q      0.0233412026207135703  0.0227608802777636665  0.0231936222062465713

                        optimism         index.corrected  n
    Dxy    0.0043136739010409269 -0.36823596603785002657 10
    R2     0.000                  1.                     10
    Slope -0.0055508323397084336  1.00555083233970843359 10
    D     -0.0005191713486598047  0.02379961863755457596 10
    U     -0.864294201768926      0.2567408835809379     10
    Q     -0.0004327419284829055  0.02377394454919647515 10

My question is about the R2: why is the value always 1.0? That doesn't seem like a realistic value to me. So I tried to calculate R2 with my own formula:

    LR <- -2*cox$loglik[2]
    L0 <- -2*cox$loglik[1]
    n <- length(data[,"ID"])
    R2N <- (1-exp(-LR/n)) / (1-exp(L0/n))

R2N calculated that way is -0.00132314024559236. Can anybody help me understand the formula for R2 and why the validate function results in 1.0?

Thanks, Andrea.
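For reference, the textbook form of Nagelkerke's R2 can be computed directly from the two log-likelihoods of a Cox fit. The sketch below uses survival::coxph and the lung data as a stand-in for the poster's cph model, and takes n as the number of observations; both are assumptions, and this is not claimed to reproduce what validate() reports. Note that here LR is the likelihood-ratio statistic (a difference of log-likelihoods) and the denominator uses exp(-L0/n) with L0 = -2 times the null log-likelihood.

```r
library(survival)

# Stand-in model: any Cox fit with a loglik component would do.
fit <- coxph(Surv(time, status) ~ age + sex, data = lung)

ll0 <- fit$loglik[1]   # log partial likelihood of the null model
ll1 <- fit$loglik[2]   # log partial likelihood of the fitted model
n   <- fit$n           # number of observations used in the fit

# Likelihood-ratio statistic: twice the gain in log-likelihood.
LR <- 2 * (ll1 - ll0)

# Cox-Snell R2 and its maximum attainable value.
r2_cs  <- 1 - exp(-LR / n)
r2_max <- 1 - exp(2 * ll0 / n)

# Nagelkerke's R2 rescales Cox-Snell so that its maximum is 1.
r2_n <- r2_cs / r2_max
print(r2_n)
```

With these definitions the value always lies between 0 and 1, because LR is non-negative and the null log-likelihood is negative.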
Re: [R] plotCI line types for line and for bar
lehe wrote:
Thank you! Yes, I am using the plotCI from gplots and I want the line connecting the centers to be dashed, just as for the bars. However changing the type to be "p" as you said does not give dashed line but no line at all (only points).

Yes, but the next line

    lines(mcra1avg,col=2,lty=2)

adds a line with the desired line type. Perhaps one idea about R graphics that would be useful to you is that one often builds up a desired plot by adding pieces sequentially, rather than finding a single plot command that does everything at once.

Ben Bolker
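The idea of building a plot piece by piece can be illustrated with base graphics alone. The sketch below is not the gplots call from the thread: it substitutes arrows() for plotCI() to draw the bars, and reuses the made-up centers and widths from Ben Bolker's example under the new names centers and half_width.

```r
# Made-up data, as in the earlier reply.
centers    <- 1:3
half_width <- c(0.2, 0.1, 0.4)
lower <- centers - half_width
upper <- centers + half_width
idx   <- seq_along(centers)

# 1. Empty plot with axes wide enough for the bars.
plot(idx, centers, type = "n", ylim = range(lower, upper),
     xlab = "Index", ylab = "Value")

# 2. Dashed red interval bars with flat caps at both ends.
arrows(idx, lower, idx, upper, angle = 90, code = 3,
       length = 0.05, col = 2, lty = 2)

# 3. Points at the centers.
points(idx, centers, col = 2)

# 4. Dashed red line connecting the centers.
lines(idx, centers, col = 2, lty = 2)
```

Each layer has its own col and lty, so the line through the centers can be styled independently of the bars.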
Re: [R] strucchange | weighted models
On Tue, 12 May 2009, f.query wrote:

Greetings - I am hoping to use the strucchange package to look for structural breaks in some messy regression data. A series of preliminary analyses indicates that BLUE for these data will involve weighting the data (estimates of a particular population parameter) by a function of the variance of the estimate (say, the inverse of the variance). While I've gone through the docs for strucchange (which are excellent, btw),

Thanks!

I don't see a simple (or obvious) way to apply some sort of 'weighting' to the regressions implemented in the package.

In the old efp()/Fstats()/breakpoints() part there is no easy way. But in the new gefp() function you can use weights. If you want to do breakpoint estimation, I've got some modified code which is not included in the package... let me know if you need it.

hth,
Z

Short of diving into the source (which I could do, but I'm not sure how the various tests would be impacted by weighting of any sort), I was wondering if anyone had dealt with this sort of issue, either with strucchange or with some other approach/package? Thanks in advance...
Re: [R] silhouette: clustering labels have to be consecutive integers starting
Thank you very much, Martin.

Warmest regards,
b

On 13/05/2009, at 09:14, Martin Maechler maech...@stat.math.ethz.ch wrote:

TS == Tao Shi shi...@hotmail.com on Wed, 10 Oct 2007 06:15:53 + writes:

TS Thank you very much, Benilton and Prof. Ripley, for the
TS speedy replies!
TS Looking forward to the fix!
TS Tao

I have finally re-stumbled onto this e-mail thread, and indeed found and fixed the problem. Version 1.12.0 of 'cluster' should become visible within a few days, and will allow calling silhouette(g, dis) on a grouping vector of k different integer values which need *not* necessarily be in 1:k.

Martin Maechler, ETH Zurich

From: Prof Brian Ripley rip...@stats.ox.ac.uk
To: Benilton Carvalho bcarv...@jhsph.edu
CC: Tao Shi shi...@hotmail.com, maech...@stat.math.ethz.ch, r-help@r-project.org
Subject: Re: [R] silhouette: clustering labels have to be consecutive intergers starting from 1?
Date: Wed, 10 Oct 2007 05:33:03 +0100 (BST)

It is a C-level problem in package cluster: valgrind gives

    ==11377== Invalid write of size 8
    ==11377==    at 0xA4015D3: sildist (sildist.c:35)
    ==11377==    by 0x4706D8: do_dotCode (dotcode.c:1750)

This is a matter for the package maintainer (Cc:ed here), not R-help.

On Tue, 9 Oct 2007, Benilton Carvalho wrote:
That happened to me with R-2.4.0 (alpha) and was fixed in R-2.4.0 (final)...
http://tolstoy.newcastle.edu.au/R/e2/help/06/11/5061.html
Then I stopped using it... now the problem seems to be back. The same examples still apply.

This fails:

    require(cluster)
    set.seed(1)
    x <- rnorm(100)
    g <- sample(2:4, 100, rep=T)
    for (i in 1:100){
      print(i)
      tmp <- silhouette(g, dist(x))
    }

and this works:

    require(cluster)
    set.seed(1)
    x <- rnorm(100)
    g <- sample(2:4, 100, rep=T)
    for (i in 1:100){
      print(i)
      tmp <- silhouette(as.integer(factor(g)), dist(x))
    }

and here's the sessionInfo():

    sessionInfo()
    R version 2.6.0 (2007-10-03)
    x86_64-unknown-linux-gnu
    locale:
    LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
    attached base packages:
    [1] stats graphics grDevices utils datasets methods base
    other attached packages:
    [1] cluster_1.11.9

(Red Hat EL 2.6.9-42 smp - AMD opteron 848)

b

On Oct 9, 2007, at 8:35 PM, Tao Shi wrote:
Hi list, When I was using 'silhouette' from the 'cluster' package to calculate clustering performances, R crashed. I traced the problem to the fact that my clustering labels only have 2's and 3's. When I replaced them with 1's and 2's, the problem was solved. Is the function purposely written in this way, so that when I have clustering labels 2 and 3, for example, the function somehow takes the 'missing' cluster 1 into account when it calculates silhouette widths?
Thanks, Tao ## ## sorry about the long attachment R.Version() $platform [1] i386-pc-mingw32 $arch [1] i386 $os [1] mingw32 $system [1] i386, mingw32 $status [1] $major [1] 2 $minor [1] 5.1 $year [1] 2007 $month [1] 06 $day [1] 27 $`svn rev` [1] 42083 $language [1] R $version.string [1] R version 2.5.1 (2007-06-27) library(cluster) cl1 ## clustering labels [1] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 2 2 2 2 2 2 2 2 2 [30] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 [59] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 [88] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 [117] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 [146] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 [175] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 [204] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 x1 ## 1-d input vector [1] 1.5707963 1.5707963 1.5707963 1.5707963 1.5707963 [6] 1.5707963 1.5707963 1.5707963 1.5707963 1.5707963 [11] 1.5707963 1.5707963 1.5707963 1.5707963 1.5707963 [16] 1.5707963 1.5707963 1.5707963 1.5707963 1.5707963 [21] 1.0163758 0.7657763 0.7370084 0.6999689 0.7366476 [26] 0.7883921 0.6925395 0.7729240 0.7202391 0.7910149 [31] 0.7397698 0.7958092 0.6978596 0.7350255 0.7294362 [36] 0.6125713 0.7174000 0.7413046 0.7044205 0.7568104 [41] 0.7048469 0.7334515 0.7143170 0.7002311 0.7540981 [46] 0.7627527 0.7712762 0.8193611 0.7801148 0.9061762 [51] 0.8248195 0.7932630 0.7248037 0.7423547 0.6419314 [56] 0.6001092 0.7572272 0.7631742 0.7085384 0.8710853 [61] 0.6589563 0.7464943 0.7487340 0.7751280 0.7946542 [66] 0.7666081 0.8508109 0.8314308 0.7442471 0.8006093 [71] 0.7949156 0.7852447 0.7630048 0.7104764 0.6768218 [76] 0.6806351 0.7255355 0.7431389 0.7523627 0.7670515 [81] 0.8118214 0.7215615 0.8186164 0.6941610 0.8285453 [86] 0.8395170 0.8088044 0.8182706 0.7550723 0.7948639 [91] 0.7204830 0.7109068 0.7756949
[R] Help with reshape/reShape and indexing
Hi Dana,

-- Forwarded message --
From: Dana Sevak dana.se...@yahoo.com
To: r-help@r-project.org
Date: Tue, 12 May 2009 23:02:00 -0700 (PDT)
Subject: [R] Help with reshape/reShape and indexing

Dear R Helpers, I have trouble applying reShape and reshape although I read the documentation and several posts, so I would very much appreciate your help on the two points below.

There are usually many ways to accomplish any given task in R, and which one you use is a matter of preference. I've settled on using the reshape package for these kinds of tasks. If you're comfortable with the solutions already suggested there's no need to continue reading. Otherwise here's another approach:

I have a dataframe

df = data.frame(Name=c("a", "a", "a", "b", "b", "c"), X1=c(12, 13, 14, 20, 25, 30), X2 = c(200, 250, 300, 600, 700, 4))
df
  Name X1  X2
1    a 12 200
2    a 13 250
3    a 14 300
4    b 20 600
5    b 25 700
6    c 30 900

First I need to add an additional column to this dataframe that will count the number of rows per each Name entry. The resulting df should look like:

df.index = data.frame(Name=c("a", "a", "a", "b", "b", "c"), X1=c(12, 13, 14, 20, 25, 30), X2 = c(200, 250, 300, 600, 700, 4), Index=c(1,2,3,1,2,1))
df.index
  Name X1  X2 Index
1    a 12 200     1
2    a 13 250     2
3    a 14 300     3
4    b 20 600     1
5    b 25 700     2
6    c 30 900     1

How can I do this?

Easy enough with the plyr package (loaded with reshape):

df = data.frame(Name=c("a", "a", "a", "b", "b", "c"), X1=c(12, 13, 14, 20, 25, 30), X2 = c(200, 250, 300, 600, 700, 4))
library(reshape)
df$Index <- ddply(df, "Name", colwise(seq_along))[,1]

Secondly, I would like to reshape this dataframe in the form:

df2
   1  2  3
a 12 13 14
b 20 25 NA
c 30 NA NA

Since the df is sorted by Name and X2, I would need that the available X1 values populate the resulting rows in df2 from left to right (i.e. if only one value is available, it is written in the first column and the remaining columns get NAs).

I don't really understand this. What happened to X2?

Anyway, I would do it like this:

df$X2 <- NULL
m.df <- melt(df, measure.vars="X1")
df.final <- cast(m.df, ... ~ Index)
df.final
  Name variable  1  2  3
1    a       X1 12 13 14
2    b       X1 20 25 NA
3    c       X1 30 NA NA

But I don't see why you want to drop X2, so I would actually do

df = data.frame(Name=c("a", "a", "a", "b", "b", "c"), X1=c(12, 13, 14, 20, 25, 30), X2 = c(200, 250, 300, 600, 700, 4))
df$Index <- ddply(df, "Name", colwise(seq_along))[,1]
df$X2 <- as.character(df$X2)
m.df <- melt(df, measure.vars=c("X1","X2"))
df.final <- cast(m.df, ... ~ Index)
df.final
  Name variable   1   2   3
1    a       X1  12  13  14
2    a       X2 200 250 300
3    b       X1  20  25  NA
4    b       X2 600 700  NA
5    c       X1  30  NA  NA
6    c       X2   4  NA  NA

All the best,
Ista

If I could generate the Index column, I think I could accomplish this with:

df2 = reShape(df.index$X1, id=df.index$Name, colvar=df.index$Index)
colnames(df2) = c("V1", "V2", "V3")

However, is there a way to get to df2 without using the Index column and still have the NAs written as described above? Thank you so much for your help on these two issues.

With best regards,
Dana Sevak

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
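For comparison, both steps can probably also be done in base R without reshape or plyr. This is only a sketch, not the method from the reply above: it uses ave() for the per-Name counter and tapply() for the wide table, and the data frame is retyped from the printed table in the question (with 900 as the last X2 value), so treat the data as an assumption.

```r
# Example data retyped from the printed table in the question (assumed values).
df <- data.frame(Name = c("a", "a", "a", "b", "b", "c"),
                 X1 = c(12, 13, 14, 20, 25, 30),
                 X2 = c(200, 250, 300, 600, 700, 900),
                 stringsAsFactors = FALSE)

# Step 1: running row counter within each Name.
df$Index <- ave(seq_along(df$Name), df$Name, FUN = seq_along)

# Step 2a: wide table using the Index column; empty cells become NA.
df2 <- with(df, tapply(X1, list(Name, Index), function(v) v[1]))

# Step 2b: the same table without the Index column. Each Name's X1 values
# are padded on the right with NA by indexing past the end of the vector.
pieces <- split(df$X1, df$Name)
width <- max(lengths(pieces))
df2_no_index <- t(sapply(pieces, function(v) v[seq_len(width)]))

print(df)
print(df2)
print(df2_no_index)
```

Step 2b relies on the rows already being in the desired order within each Name, which the question says is the case.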
Re: [R] Overlay cdf
Thanks a lot. I found this part

x0 <- c(0, sort(Sample))
p0 <- 0:1000/1000
lines(x0, p0, type = "S", col = "blue")

very helpful, as it seems to plot a step-by-step representation of the values actually sampled from the gamma distribution.
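The step curve built by hand from sort(Sample) and 0:1000/1000 is the empirical CDF of the sample, and base R's ecdf() should give the same curve. A minimal sketch, assuming the same gamma parameters as in the thread (the seed and colours are arbitrary choices):

```r
set.seed(1)  # arbitrary seed, only so the sketch is repeatable
Sample <- rgamma(1000, shape = 2.5, rate = 0.8)

# Empirical CDF as a step function object.
Fn <- ecdf(Sample)

# Histogram on the density scale, with the y axis stretched to 1 so the
# CDF curves fit on the same plot.
hist(Sample, main = "", freq = FALSE, ylim = c(0, 1))

# Empirical CDF (steps) and the theoretical gamma CDF (smooth curve).
lines(Fn, do.points = FALSE, col = "blue")
curve(pgamma(x, shape = 2.5, rate = 0.8), add = TRUE, col = "red")
```

Fn can also be called like a function, e.g. Fn(2) gives the fraction of sampled values that are at most 2.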
[R] converting numeric into character strings
Hi,

I'm trying to put some numbers into a dataframe. I have a list of numbers (change points in a time series) like this:

[1] 2 11 12 20 21 98 99

but I want R to recognise this as a single character string so that it goes in one row and column. Ideally I want the numbers separated by commas, so I would have for example

Person  Change points (seconds)
A       2,11,12,20,21,98,99
B       4,5,89

etc. Is there any way I can get this? Here is what I've tried: if the command to get the list of numbers was b<-which(a!=s), then I have tried as.character(b), but I just end up with

[1] "2" "11" "12" "20" "21" "98" "99"

which is not what I want, as this is more than one string and is not separated by commas. I also tried paste(b,sep=",") but I end up with the same thing.

Sorry it's a bit confusing to read, but any help would be great!

Melissa
Re: [R] plotting a grid with color on a map
Thanks, Jim. This seems to be what I am looking for. I just have to fine-tune the colors to get some distinctive greens, blues, yellows and oranges in there and I should be good to go.

Jim Lemon-2 wrote:

It was the NAs that fooled color2D.matplot. This gets your colors, although not exactly what you want. Look at the help for color2D.matplot to get that. I think fiddling with the x and y limits on the map() call will get the positions right.

temp1<-read.table("time1test.dat",header=TRUE)
library(plotrix)
# reverse the row order, as color2D.matplot reverses it
color2D.matplot(temp1[36:1,],c(0.5,1),c(0.5,0),c(1,0),axes=FALSE)
# don't erase the above plot
par(new=TRUE)
# do a ghost plot with just the axes
plot(0,xlim=maplim[1:2],ylim=maplim[3:4],type="n")
# add the map on top in black
map(add=TRUE,col="black")

Jim
[R] read multiple large files into one dataframe
Hello

Apologies if this is a simple question; I have searched the help and have not managed to work out a solution. Does anybody know an efficient method for reading many text files of the same format into one table/dataframe?

I have around 90 files that contain continuous data over 3 months but that are split into individual days' data, and I need the whole 3 months in one file for analysis. Each day's file contains a large amount of data (approx 30MB each), so I need a memory-efficient method to merge all of the files into one dataframe object. From what I have read I will probably want to avoid using for loops etc?

All files are in the same directory, none have a header row, and each contains around 180,000 rows and the same 25 columns/variables. Any suggested packages/routines would be very useful.

Thanks
Jennifer
[R] Mixture of survivals or cure models
Dear All,

I am desperately trying to find any R package that fits mixture survival models, also known as cure models. These are survival models where the survival function is improper, which means that a fraction of subjects are expected never to experience the event. A huge literature has been developed for this type of model, but I couldn't find any R package that fits it.

Best,
Marc
Re: [R] converting numeric into character strings
Dear Melissa,

Try this:

x <- c(2, 11, 12, 20, 21, 98, 99)
paste(x, collapse=",")
[1] "2,11,12,20,21,98,99"

See ?paste for more information.

HTH,
Jorge
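To get from there to the one-row-per-person table in the question, the same paste(..., collapse=",") call can be applied to each person's vector. A small sketch; the list change_points and the values for person B are taken from the example table in the question, and the column names are made up:

```r
# Change points per person, as in the example table (assumed layout).
change_points <- list(A = c(2, 11, 12, 20, 21, 98, 99),
                      B = c(4, 5, 89))

# sep= only matters when several arguments are pasted together, so this
# still returns one string per number:
separate_strings <- paste(change_points$A, sep = ",")

# collapse= joins the elements of a single vector into one string.
result <- data.frame(
  Person = names(change_points),
  ChangePoints = vapply(change_points, paste, character(1), collapse = ","),
  stringsAsFactors = FALSE
)
print(result)
```

This is why paste(b, sep=",") appeared to do nothing: with a single vector argument there is nothing for sep to separate.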
[R] 3dscatter for linux
Hi,

do you have any suggestions on how to make a 3D scatterplot, BUT under Linux? Worth mentioning is the fact that 'scatterplot3d' does not load under Ubuntu 8.10. Do you know any alternatives? I tried cloud or persp, but the X, Y and Z axes are empirical in my case and cannot be replaced by any seq(...).

Thanks in advance,
robert
Re: [R] Mixture of survivals or cure models
Check out: http://www.math.mun.ca/~ypeng/research/
Re: [R] Mixture of survivals or cure models
Also: http://post.queensu.ca/~pengp/software.html

On Wed, May 13, 2009 at 9:21 AM, Gabor Grothendieck ggrothendi...@gmail.com wrote:

Check out: http://www.math.mun.ca/~ypeng/research/
Re: [R] overlap contour
Bala subramanian-2 wrote:

Friends, I have two covariance matrices (m1 and m2) of the same size (150x150). I used the contourplot function to make contour plots individually (c1 and c2). I am interested in making one contour plot overlapping the two individual contours, so that the portions of the plot above and below the diagonal represent c1 and c2. Could someone suggest how I can do this? Is there any way that I can combine m1 and m2, write the combined matrix to a file, and plot it to achieve the above?

I'm not quite sure what you mean (and this may be why no-one has responded so far). Do you mean

m2[lower.tri(m2)] <- m1[lower.tri(m1)]
contour(m2)

? I can imagine a fancier solution where you use contourLines to extract the contour lines, remove the points that fall on the wrong side of the diagonal, and plot them, but that seems like more work.

Ben Bolker
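A self-contained version of that suggestion might look like the sketch below. The matrices are small random stand-ins for the real 150x150 covariance matrices, the output file is a temporary file, and base contour() is used rather than lattice's contourplot; all of these are assumptions made for illustration.

```r
set.seed(42)  # arbitrary seed for the stand-in data
m1 <- cov(matrix(rnorm(200), ncol = 5))
m2 <- cov(matrix(rnorm(200), ncol = 5))

# Start from m2 (keeps its upper triangle and diagonal), then overwrite
# the part below the diagonal with the corresponding entries of m1.
combined <- m2
combined[lower.tri(combined)] <- m1[lower.tri(m1)]

# Write the combined matrix to a file and plot it.
out_file <- tempfile(fileext = ".txt")
write.table(combined, file = out_file, row.names = FALSE, col.names = FALSE)
contour(combined)
```

Because both inputs are symmetric, no information is lost: the upper triangle carries m2 and the lower triangle carries m1.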
Re: [R] Multiple ANOVA tests
Hi Imri.

You could do this with something like the following. Getting an ANOVA first:

utils::data(npk, package="MASS")
( npk.aov <- aov(yield ~ block + N*P*K, npk) )
summary(npk.aov)
# I want the P value from this summary of the aov object.
# Here is the code:
summary(npk.aov)[[1]]$P
# [1] 0.015938790 0.004371812 0.474904093 0.028795054 0.263165283 0.168647879
# [7] 0.862752086 NA
# The last one is the P value for the residuals, which doesn't exist, so it returns NA.
# So you might want to use:
na.omit(summary(npk.aov)[[1]]$P)

Now you have a vector of P values, and you can do whatever you want with it...

Cheers,
Tal

On Wed, May 13, 2009 at 1:32 PM, Imri bisr...@agri.huji.ac.il wrote:

Hello!!! I'm trying to do multiple ANOVA tests with R (testing the effect of different factors on the same response). As a result I get many ANOVA tables, and I want to extract a list of the Pr(>F) values from all the tables. Maybe someone has an idea how to do this?

Thanks
Imri
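To collect the P values from several separate ANOVA tables in one go, the extraction above can be wrapped in sapply(). The sketch below fits one single-factor model per factor on the npk data (which also ships in the base datasets package); fitting the factors one at a time is an assumption about what "multiple ANOVA tests" means here, and the full column name "Pr(>F)" is used instead of relying on partial matching with $P.

```r
# Factors to test, one model per factor, same response each time.
factors <- c("N", "P", "K")

p_values <- sapply(factors, function(f) {
  # Build the formula yield ~ <factor> and fit the ANOVA.
  fit <- aov(reformulate(f, response = "yield"), data = npk)
  # First row of the ANOVA table is the factor; the last row (residuals)
  # has no P value.
  summary(fit)[[1]][["Pr(>F)"]][1]
})

print(p_values)
```

The result is a named numeric vector, one P value per factor, which can be sorted, filtered or passed to p.adjust() as needed.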
Re: [R] read multiple large files into one dataframe
I'd first try plyr and see if it's efficient enough,

library(plyr)
listOfFiles <- list.files(pattern= ".txt")
d <- ldply(listOfFiles, read.table)
str(d)

alternatively,

d <- do.call(rbind, lapply(listOfFiles, read.table))

HTH,

baptiste

_
Baptiste Auguié
School of Physics, University of Exeter
Stocker Road, Exeter, Devon, EX4 4QL, UK
http://newton.ex.ac.uk/research/emag
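One thing that may help with files of this size is telling read.table the column types up front via colClasses, so it does not have to guess them for every file. The sketch below is self-contained: it first writes three tiny header-less files into a temporary directory as stand-ins for the real daily files, so the directory, file names and the three column types are all made up.

```r
# Stand-in data: three small header-less files in a temporary directory.
data_dir <- file.path(tempdir(), "daily_files")
dir.create(data_dir, showWarnings = FALSE)
for (day in 1:3) {
  piece <- data.frame(id = 1:4, value = day + c(0.1, 0.2, 0.3, 0.4),
                      tag = c("a", "b", "c", "d"))
  write.table(piece, file.path(data_dir, sprintf("day%02d.txt", day)),
              row.names = FALSE, col.names = FALSE, quote = FALSE)
}

# List only files ending in .txt (the dot is escaped in the pattern).
files <- list.files(data_dir, pattern = "\\.txt$", full.names = TRUE)

# Read one file with known column types and no header.
read_one <- function(path) {
  read.table(path, header = FALSE,
             colClasses = c("integer", "numeric", "character"),
             comment.char = "")
}

# Read everything into a list first, then bind once at the end.
combined <- do.call(rbind, lapply(files, read_one))
str(combined)
```

Reading into a list and binding once avoids growing the data frame inside a loop, which is usually the expensive part rather than the loop itself.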
Re: [R] read multiple large files into one dataframe
What types of data are in each file? All numbers, or a mix of numbers and characters? Any missing data or special NA values?

--
Mike Lawrence
Graduate Student
Department of Psychology
Dalhousie University
Re: [R] 3dscatter for linux
threshold wrote:

Hi, do you have any suggestions how to make 3D scatterplot, BUT under linux. Worth mentioning is the fact that 'scatterplot3d' does not load under Ubuntu 8.10. Do you know any alternatives?? I tried cloud or persp but X,Y and Z axes are empirical in my case, and cannot be replaced by any seq(...). Thanks in advance, robert

http://wiki.r-project.org/rwiki/doku.php?id=tips:graphics-3d:graphics-3d

See esp. rgl::plot3d()

Also, cloud() seems to work just fine with irregular x, y, z:

d <- data.frame(x=runif(10),y=runif(10),z=runif(10))
library(lattice)
cloud(z~x*y,data=d)

How/why doesn't scatterplot3d load? I can't find any reference to this on the mailing lists, but maybe I'm missing something. It does fine in Ubuntu 9.04 (intrepid).

Ben Bolker
Re: [R] questions on rpart (tree changes when rearrange the order of covariates?!)
I wonder - isn't this issue one of the reasons to use RandomForests rather than CART?

On Wed, May 13, 2009 at 8:03 AM, Liaw, Andy andy_l...@merck.com wrote:

From: Uwe Ligges

Yuanyuan wrote:

Greetings, I am using rpart for classification with the "class" method. The test data is the Indian diabetes data from package mlbench. I fitted a classification tree first using the original data, and then exchanged the order of body mass and plasma glucose, which are the strongest/most important variables in the growing phase. The second tree is a little different from the first one. The misclassification tables are different too. I did not change the data, so why are the results so different?

Well, at some splits the variable that comes first and yields the same reduction of the entropy criterion as another one might be used, hence another result.

Uwe Ligges

I recently tried writing adaboost.m1 using rpart, and was surprised that with a very small training set (say n=10 or 20), I get a large improvement in test set accuracy if I randomly shuffle the columns in the data at every adaboost iteration. (With twonorm data, we're talking about 25% error vs. 19%, using an n=2000 test set.) It turned out to be the way rpart deals with ties: first come, first win. Without shuffling the columns, rpart almost never picks any variable beyond the 10th. (In twonorm, all variables are equally important, so one would expect roughly equal selection frequency.) I've gotten some pointers from Terry Therneau about where in the code to check. I may try to implement breaking ties at random (as I've done in randomForest). No promises, though...

Andy

Does anyone know how rpart deals with ties? Here is the code for running the two trees.

library(mlbench)
data(PimaIndiansDiabetes2)
mydata<-PimaIndiansDiabetes2
library(rpart)
fit2<-rpart(diabetes~., data=mydata,method="class")
plot(fit2,uniform=T,main="CART for original data")
text(fit2,use.n=T,cex=0.6)
printcp(fit2)
table(predict(fit2,type="class"),mydata$diabetes)
## misclassification table: rows are fitted class
      neg pos
  neg 437  68
  pos  63 200
#Klimt(fit2,mydata)

pmydata<-data.frame(mydata[,c(1,6,3,4,5,2,7,8,9)])
fit3<-rpart(diabetes~., data=pmydata,method="class")
plot(fit3,uniform=T,main="CART after exchanging mass and glucose")
text(fit3,use.n=T,cex=0.6)
printcp(fit3)
table(predict(fit3,type="class"),pmydata$diabetes)
## after exchanging the order of BODY mass and PLASMA glucose
      neg pos
  neg 436  64
  pos  64 204
#Klimt(fit3,pmydata)

Thanks,
-- Yuanyuan Huang

--
Dimitri Liakhovitski
MarketTools, Inc.
dimitri.liakhovit...@markettools.com
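The "first come, first win" behaviour described above can be provoked deliberately with two predictors that are exact copies of each other, so that every candidate split ties. This is a toy sketch on made-up data, not a reanalysis of the Pima data; if the tie rule is as described in the thread, the root split should follow whichever copy appears first in the formula.

```r
library(rpart)

# Made-up data: x2 is an exact copy of x1, so any split on x1 ties with
# the same split on x2.
x1 <- 1:20
d <- data.frame(y = factor(ifelse(x1 > 10, "pos", "neg")),
                x1 = x1,
                x2 = x1)

# Same data, same model, only the order of the predictors differs.
fit_a <- rpart(y ~ x1 + x2, data = d, method = "class")
fit_b <- rpart(y ~ x2 + x1, data = d, method = "class")

# Variable used at the root node of each tree.
root_a <- as.character(fit_a$frame$var[1])
root_b <- as.character(fit_b$frame$var[1])
cat("root split with x1 first:", root_a, "\n")
cat("root split with x2 first:", root_b, "\n")
```

With real data exact ties are rarer, but near-duplicate or highly correlated predictors can have a similar effect on which variable ends up in the tree.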
Re: [R] import HTML tables
Dieter Menne wrote:

Dimitri Szerman-2 wrote:

Hello, I was wondering if there is a function in R that imports tables directly from a HTML document.

The XML package can do this: http://markmail.org/message/cyicoa3htme4gei2

Duncan Temple Lang: The htmlParse() and htmlTreeParse() functions in the XML package use the non-strict HTML parser in libxml2 and so the HTML document can be malformed.

Indeed. Thanks Dieter.

htmlParse() reads the document; getNodeSet allows us to easily find the table or tables of interest. We can find the th and td entries easily using XPath also. The less automated part is how to meaningfully process the content. That is where a human should be involved, deciding whether to trim white space, how to convert text to values, and how to deal with missing cells. We can do a lot by default, but ...

There is a relatively simple function at http://www.omegahat.org/ParseXML/readHTMLTable.R that provides something resembling read.table. It is not well tested because, in the past, I have just used XPath directly; once you know XPath, extracting content from HTML/XML is very straightforward.

D.

Dieter
[R] name size in cluster
I would like to change the size of the names in a cluster dendrogram (not the axis or the header) (package clue). The usual settings (pch, cex.label, font) do not work here.

Thanks in advance!
Johannes
[R] access to the current element of lapply
Dear All,

I would like to use the 'split' function on the dataframe elements contained in a list L. For example:

(df <- data.frame(cbind(c(rep('A',2), rep('B',2)), rep(1:4))))
  X1 X2
1  A  1
2  A  2
3  B  3
4  B  4

(L<-split(df, df$X1))
$A
  X1 X2
1  A  1
2  A  2

$B
  X1 X2
3  B  3
4  B  4

Now, I would like to split EACH data frame according to column 2 (X2).

lapply(L, split, df$X2)
$A
$A$`1`
  X1 X2
1  A  1

$A$`2`
  X1 X2
2  A  2

$A$`3`
[1] X1 X2
<0 rows> (or 0-length row.names)

$A$`4`
[1] X1 X2
<0 rows> (or 0-length row.names)

$B
$B$`1`
  X1 X2
3  B  3

$B$`2`
  X1 X2
4  B  4

$B$`3`
[1] X1 X2
<0 rows> (or 0-length row.names)

$B$`4`
[1] X1 X2
<0 rows> (or 0-length row.names)

Warning messages:
1: In split.default(seq_len(nrow(x)), f, drop = drop, ...) : data length is not a multiple of split variable
2: In split.default(seq_len(nrow(x)), f, drop = drop, ...) : data length is not a multiple of split variable

It works, but it's dirty. How could I do it properly, without the warnings and without 0-row data frames in the output? I thought that accessing the current element of 'lapply' to retrieve the vector in column 2 would work, i.e.:

lapply(L, split, L[[current]][,2])

Is there a way to do something like that in R?

Thanks in advance!

- Martial
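One way to reach "the current element" is to give lapply an anonymous function, whose argument is the current list element, and take the splitting column from that element rather than from the full df. A minimal sketch on the same four-row example (built directly here rather than through cbind); drop = TRUE is added to discard empty groups when X2 is a factor:

```r
df <- data.frame(X1 = c("A", "A", "B", "B"), X2 = 1:4)
L <- split(df, df$X1)

# 'piece' is the current element of L; its own X2 column has the right
# length, so there are no recycling warnings and no 0-row pieces.
nested <- lapply(L, function(piece) split(piece, piece$X2, drop = TRUE))

# Alternative: split on both columns at once, giving a flat list with
# names such as "A.1" and "B.3".
flat <- split(df, list(df$X1, df$X2), drop = TRUE)

str(nested, max.level = 2)
names(flat)
```

The warnings in the original call came from passing the full-length df$X2 (4 values) as the grouping vector for pieces that only have 2 rows each.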
Re: [R] Nagelkerkes R2N
A new version of Design will be posted to CRAN in the next 2 days. After that, update your system, including an update to the survival package. Then re-try.

Your formula is wrong, since R2N cannot be negative. LR should be the likelihood ratio chi-square statistic: -2 times the difference in the two loglik values.

Frank

Andrea Weidacher wrote:
> Hello All,
>
> As I'm new to R and survival analysis, I've got a question about the Design::validate function.
>
> My code:
>
> cox <- cph(Surv(t,status) ~ var1 + var2 + var3, data=data, x=TRUE, y=TRUE, surv=TRUE)
> cox.val <- validate(cox, B=10, dxy=TRUE, pr=TRUE);
>
> My output (cox.val):
>
>       index.orig              training                test
> Dxy   -0.3639222921368090891  -0.3591157308750822175  -0.3634294047761231106
> R2    1.000                   1.000                   1.000
> Slope 1.000                   1.000                   1.0055508323397084336
> D     0.0232804472888947744   0.0226998668193014774   0.0232190381679612834
> U     -0.607553318187988      -0.610134584621832      0.254159617147094
> Q     0.0233412026207135703   0.0227608802777636665   0.0231936222062465713
>
>       optimism                index.corrected          n
> Dxy   0.0043136739010409269   -0.36823596603785002657  10
> R2    0.000                   1.                       10
> Slope -0.0055508323397084336  1.00555083233970843359   10
> D     -0.0005191713486598047  0.02379961863755457596   10
> U     -0.864294201768926      0.2567408835809379       10
> Q     -0.0004327419284829055  0.02377394454919647515   10
>
> My question is about R2: why is the value always 1.0? That doesn't seem like a realistic value to me. So I tried to calculate R2 with my own formula:
>
> LR <- -2*cox$loglik[2]
> L0 <- -2*cox$loglik[1]
> n <- length(data[,ID])
> R2N <- (1-exp(-LR/n)) / (1-exp(L0/n))
>
> R2N calculated that way is -0.00132314024559236.
>
> Can anybody help me to understand the formula for R2 and why the validate function results in 1.0?
>
> Thanks, Andrea.
-- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University
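Frank's correction can be sketched in a few lines of R. This is only an illustration of the formula he describes, not the code used inside Design::validate; the helper name nagelkerke_r2 and the numbers in the usage line are made up, and which n is appropriate for a Cox model (observations or events) is not settled in this thread.

```r
# Nagelkerke's R2 from two log-likelihoods.
# Hypothetical helper for illustration; it is not part of Design or survival.
#   loglik_null : log-likelihood of the null model  (cox$loglik[1] for a cph fit)
#   loglik_model: log-likelihood of the fitted model (cox$loglik[2])
#   n           : sample size used to scale the statistic
nagelkerke_r2 <- function(loglik_null, loglik_model, n) {
  # Likelihood ratio chi-square: -2 times the difference in the two logliks
  lr <- -2 * (loglik_null - loglik_model)
  r2_cox_snell <- 1 - exp(-lr / n)
  # Largest value r2_cox_snell can reach, used to rescale to [0, 1]
  r2_max <- 1 - exp(2 * loglik_null / n)
  r2_cox_snell / r2_max
}

# Made-up numbers, only to show the call
r2 <- nagelkerke_r2(loglik_null = -100, loglik_model = -90, n = 50)
print(r2)
```

Compared with the formula in the question, LR is built from the difference of the two log-likelihoods, and the exponent in the denominator carries a minus sign once L0 is defined as -2 times the null log-likelihood.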
[R] rpart - not for classification?
Hello! A very minor point. I typed help.search("classification"). It found a bunch of things, including randomForest, which makes a lot of sense. I am wondering why rpart was not found. I think it should have been found too. -- Dimitri Liakhovitski MarketTools, Inc. dimitri.liakhovit...@markettools.com
Re: [R] name siz ein cluster
I'm afraid I have no experience with the clue package, but if all else fails you could consider the hclust() function (stats package). You change font size in the conventional way with this.

Cheers, Simon.

----- Original Message -----
From: Penner, Johannes johannes.pen...@mfn-berlin.de
To: r-help@r-project.org
Sent: Wednesday, May 13, 2009 3:08 PM
Subject: [R] name siz ein cluster

> I would like to change the size of the names in a cluster dendrogram (not the axis or the header) (package clue). The normal things (pch, cex.label, font) do not work here.
>
> Thanks in advance!
> Johannes
Re: [R] access to the current element of lapply
On May 13, 2009, at 9:12 AM, Martial Sankar wrote:

> Dear All,
>
> I would like to use the 'split' function on the data frame elements contained in a list L. For example:
>
> (df <- data.frame(cbind(c(rep('A',2), rep('B',2)), rep(1:4))))
>   X1 X2
> 1  A  1
> 2  A  2
> 3  B  3
> 4  B  4
>
> (L <- split(df, df$X1))
> $A
>   X1 X2
> 1  A  1
> 2  A  2
>
> $B
>   X1 X2
> 3  B  3
> 4  B  4
>
> Now I would like to split EACH data frame according to column 2 (X2).
>
> lapply(L, split, df$X2)
> $A
> $A$`1`
>   X1 X2
> 1  A  1
>
> $A$`2`
>   X1 X2
> 2  A  2
>
> $A$`3`
> [1] X1 X2
> <0 rows> (or 0-length row.names)
>
> $A$`4`
> [1] X1 X2
> <0 rows> (or 0-length row.names)
>
> $B
> $B$`1`
>   X1 X2
> 3  B  3
>
> $B$`2`
>   X1 X2
> 4  B  4
>
> $B$`3`
> [1] X1 X2
> <0 rows> (or 0-length row.names)
>
> $B$`4`
> [1] X1 X2
> <0 rows> (or 0-length row.names)
>
> Warning messages:
> 1: In split.default(seq_len(nrow(x)), f, drop = drop, ...) : data length is not a multiple of split variable
> 2: In split.default(seq_len(nrow(x)), f, drop = drop, ...) : data length is not a multiple of split variable
>
> It works, but it's dirty. How could I do it properly, without warnings and without 0-row data frames in the output? I thought that accessing the current element of 'lapply' to retrieve the column 2 vector would work, i.e.:
>
> lapply(L, split, L[[current]][,2])
>
> Is there a way to do something like that in R?
>
> Thanks in advance!
>
> - Martial

# Split on BOTH columns and drop unused levels
L <- split(df, list(df$X1, df$X2), drop = TRUE)

L
$A.1
  X1 X2
1  A  1

$A.2
  X1 X2
2  A  2

$B.3
  X1 X2
3  B  3

$B.4
  X1 X2
4  B  4

Is that what you want?

HTH,

Marc Schwartz
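If the nested list structure is what is wanted, the "current element" of lapply can be reached by passing an anonymous function, whose argument is the data frame being processed. A minimal sketch, assuming the same toy data as in the question (built here directly rather than through cbind):

```r
# Toy data equivalent to the example in the question
df <- data.frame(X1 = c("A", "A", "B", "B"), X2 = 1:4)

# First-level split by X1
L <- split(df, df$X1)

# Second-level split: 'd' is the current element of L, so d$X2 has the
# right length and only the values that occur in that piece
nested <- lapply(L, function(d) split(d, d$X2, drop = TRUE))

str(nested, max.level = 2)
```

Because the splitting vector is taken from each piece rather than from the full data frame, there is no length mismatch, hence no warning, and drop = TRUE removes any unused factor levels that would otherwise yield 0-row data frames.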
Re: [R] read multiple large files into one dataframe
Can you provide reproducible code please? Even a fake example would help.

I would:

1) set up a loop to read in each file from a directory;
2) inside the loop, chop up/aggregate the data, each file in turn, and write each new aggregated file out to a directory using write.table(). This will reduce the memory needed by only including the info you want. Make sure each file is a data frame with the same names;
3) set up a new loop to read in each new small file and rbind them all together to make your new master file.

The R gurus may have a more parsimonious solution.

HTH, Simon.

----- Original Message -----
From: SYKES, Jennifer jennifer.sy...@nats.co.uk
To: r-help@r-project.org
Sent: Wednesday, May 13, 2009 11:45 AM
Subject: [R] read multiple large files into one dataframe

> Hello
>
> Apologies if this is a simple question; I have searched the help and have not managed to work out a solution. Does anybody know an efficient method for reading many text files of the same format into one table/data frame?
>
> I have around 90 files that contain continuous data over 3 months but that are split into individual days' data, and I need the whole 3 months in one file for analysis. Each day's file contains a large amount of data (approx. 30MB each), so I need a memory-efficient method to merge all of the files into one data frame object. From what I have read, I will probably want to avoid using for loops etc.?
>
> All files are in the same directory, none have a header row, and each contains around 180,000 rows and the same 25 columns/variables.
>
> Any suggested packages/routines would be very useful.
>
> Thanks
> Jennifer
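The read, reduce and bind steps suggested above can be sketched as follows. The three tiny files, their column names (id, x, y) and the reduction step are invented for the illustration; the real files have 25 columns and no header, and here the intermediate write.table() step is skipped by keeping the reduced pieces in a list.

```r
# Create three small fake daily files (no header) standing in for the real data
dir <- file.path(tempdir(), "daily")
dir.create(dir, showWarnings = FALSE)
for (i in 1:3) {
  day <- data.frame(id = i, x = rnorm(5), y = runif(5))
  write.table(day, file.path(dir, sprintf("day%02d.txt", i)),
              row.names = FALSE, col.names = FALSE)
}

files <- list.files(dir, pattern = "^day[0-9]+\\.txt$", full.names = TRUE)

# Read each file, keep only what is needed, and bind once at the end
pieces <- lapply(files, function(f) {
  d <- read.table(f, header = FALSE, col.names = c("id", "x", "y"))
  d[, c("id", "x")]   # hypothetical reduction step: drop unneeded columns
})
all_days <- do.call(rbind, pieces)

dim(all_days)
```

Binding once with do.call(rbind, ...) avoids growing the data frame inside the loop, which would copy the accumulated data at every iteration.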
Re: [R] name siz ein cluster
I tried, for example:

plot(mycluster, font=2)

But this changes only the font size of the y-axis.

Regards
Johannes

--
Project Coordinator BIOTA West Amphibians
Museum of Natural History
Dep. of Research (Herpetology)
Invalidenstrasse 43
D-10115 Berlin
Tel: +49 (0)30 2093 8708
Fax: +49 (0)30 2093 8565
http://www.biota-africa.org
http://community-ecology.biozentrum.uni-wuerzburg.de

----- Original Message -----
From: Simon Pickett [mailto:simon.pick...@bto.org]
Sent: Wednesday, 13 May 2009 16:30
To: Penner, Johannes; r-help@r-project.org
Subject: Re: [R] name siz ein cluster

> I'm afraid I have no experience with the clue package, but if all else fails you could consider the hclust() function (stats package). You change font size in the conventional way with this.
>
> Cheers, Simon.
>
> ----- Original Message -----
> From: Penner, Johannes johannes.pen...@mfn-berlin.de
> To: r-help@r-project.org
> Sent: Wednesday, May 13, 2009 3:08 PM
> Subject: [R] name siz ein cluster
>
>> I would like to change the size of the names in a cluster dendrogram (not the axis or the header) (package clue). The normal things (pch, cex.label, font) do not work here.
>>
>> Thanks in advance!
>> Johannes
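For reference, a sketch of how leaf-label size is usually controlled for base R tree objects. It uses stats::hclust and the built-in USArrests data rather than a clue object; whether the clue plot method forwards these arguments is an assumption that would need checking, although converting the result with as.hclust() or as.dendrogram(), where such a method is available, should make the calls below applicable.

```r
# Hierarchical clustering on a small built-in data set
hc <- hclust(dist(USArrests[1:15, ]))

# For hclust objects, cex scales the leaf labels
plot(hc, cex = 0.6)

# For dendrogram objects, label size is set through nodePar
dend <- as.dendrogram(hc)
plot(dend, nodePar = list(lab.cex = 0.6, pch = NA))  # pch = NA hides node symbols
```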
Re: [R] where does the null come from?
out <- apply(m, 1, cat, '\n')
1 3
2 4
out
NULL

On Wed, May 13, 2009 at 5:23 AM, Wacek Kusnierczyk waclaw.marcin.kusnierc...@idi.ntnu.no wrote:
> m = matrix(1:4, 2)
> apply(m, 1, cat, '\n')
> # 1 2
> # 3 4
> # NULL
>
> why the null?
>
> vQ
[R] Histogram + % of cases for a given criteria
Hi all,

I am doing some explorations using a dataset with the following structure (id, value, flag). For instance:

a, 2.2, 1
b, 3.0, 1
c, 2.9, 0
d, 3.1, 1
...

I have plotted a standard histogram using a simple command like:

hist(data$value)

My question: I would like to superimpose a line ([0%-100%] scale) representing the % of values that, for each class of the histogram, have the $flag equal to 1. What strategy do you recommend? Is this easily doable in R?

I hope I made myself clear. Please let me know if not.

Thanks in advance,
--
Sérgio Nunes
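One possible strategy, sketched with simulated data since the real dataset is not shown: compute the histogram without plotting, reuse its breaks to assign each value to a class, take the mean of flag per class, and overlay that on a second axis in the same way as the cdf overlay earlier in this digest. The column names follow the question; the simulated values are assumptions.

```r
set.seed(1)
# Simulated stand-in for the (id, value, flag) data
data <- data.frame(value = rnorm(200, mean = 3, sd = 0.5),
                   flag  = rbinom(200, 1, 0.6))

# Histogram object without drawing it, to get the class breaks
h <- hist(data$value, plot = FALSE)

# Assign each value to a histogram class, using the same conventions as hist()
bin <- cut(data$value, breaks = h$breaks, include.lowest = TRUE)

# Percentage of flag == 1 per class (NA for empty classes)
pct_flag <- 100 * tapply(data$flag, bin, mean)

# Draw the histogram, then overlay the percentages on a 0-100 right-hand axis
par(mar = c(5, 4, 2, 4) + 0.1)
plot(h, main = "", xlab = "value")
par(new = TRUE)
plot(h$mids, pct_flag, type = "b", col = "red",
     xlim = range(h$breaks), ylim = c(0, 100),
     axes = FALSE, ann = FALSE)
axis(4, col = "red")
mtext("% with flag == 1", side = 4, line = 2.5, col = "red")
```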
Re: [R] where does the null come from?
On 13-May-09 14:43:17, Gabor Grothendieck wrote:
> out <- apply(m, 1, cat, '\n')
> 1 3
> 2 4
> out
> NULL

Or, more explicitly, from ?cat:

  Value: None (invisible 'NULL').

Ted.

> On Wed, May 13, 2009 at 5:23 AM, Wacek Kusnierczyk waclaw.marcin.kusnierc...@idi.ntnu.no wrote:
>> m = matrix(1:4, 2)
>> apply(m, 1, cat, '\n')
>> # 1 2
>> # 3 4
>> # NULL
>>
>> why the null?
>>
>> vQ

E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk
Fax-to-email: +44 (0)870 094 0861
Date: 13-May-09 Time: 15:56:04
-- XFMail --
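The two answers above can be checked directly: cat() prints as a side effect and returns NULL invisibly, and apply() has nothing to collect, so its own value is NULL, which the console then auto-prints. A small sketch:

```r
m <- matrix(1:4, 2)

# cat() prints its arguments and returns NULL invisibly
res <- cat("hello\n")
is.null(res)

# Every call returns NULL, so apply() itself returns NULL
out <- apply(m, 1, function(r) cat(r, "\n"))
is.null(out)

# Wrapping the call in invisible() keeps the printing and hides the NULL
invisible(apply(m, 1, cat, "\n"))
```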
Re: [R] read multiple large files into one dataframe
A few points to consider:

- If all the data are numeric, then use matrices instead of data frames.
- With either data frames or matrices, there is no way (that I'm aware of anyway) in R to stack them without making at least one copy in memory.
- Since none of the files has a header row, I would concatenate them into one file outside R (e.g., on *nix, cat * > all.txt) and then read that in. You can also try it inside R with something like read.table(pipe()). You will want to make use of the colClasses argument in read.table() to specify the column types, though, to ensure that read.table() only goes through the input once.
- You're probably better off getting the data into a database (even something like sqlite) and using an R interface to that database.
- 30MB x 90 = 2.7GB. Unless you're on a 64-bit machine with lots of RAM, you're not likely to have much fun with the data even when you manage to get it into R in one piece.

Andy

From: SYKES, Jennifer
> Hello
>
> Apologies if this is a simple question; I have searched the help and have not managed to work out a solution. Does anybody know an efficient method for reading many text files of the same format into one table/data frame?
>
> I have around 90 files that contain continuous data over 3 months but that are split into individual days' data, and I need the whole 3 months in one file for analysis. Each day's file contains a large amount of data (approx. 30MB each), so I need a memory-efficient method to merge all of the files into one data frame object. From what I have read, I will probably want to avoid using for loops etc.?
>
> All files are in the same directory, none have a header row, and each contains around 180,000 rows and the same 25 columns/variables.
>
> Any suggested packages/routines would be very useful.
>
> Thanks
> Jennifer
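Two of the points above, colClasses and using a matrix for all-numeric data, can be illustrated on a toy file. The file contents and the column count of 3 are invented for the example; the real files have 25 columns.

```r
# Write a small all-numeric file without a header
f <- tempfile(fileext = ".txt")
write.table(matrix(round(rnorm(12), 3), ncol = 3), f,
            row.names = FALSE, col.names = FALSE)

# Declaring the column types (and, if known, the row count) spares read.table()
# the work of guessing them
df <- read.table(f, header = FALSE,
                 colClasses = rep("numeric", 3), nrows = 4)

# For purely numeric data, scan() reads straight into a vector that can be
# shaped as a matrix, which is lighter than a data frame
m <- matrix(scan(f, quiet = TRUE), ncol = 3, byrow = TRUE)

sapply(df, class)
dim(m)
```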
[R] Simulation
Dear R users, Can anyone please tell me how to generate a large number of samples in R, given a certain distribution and size? For example, if I want to generate 1000 samples of size n=100 from a N(0,1) distribution, how should I proceed? (I don't want to run rnorm(100,0,1) in R 1000 times.) Thanks for help, Debbie
[R] Problems with randomly generating samples
Dear R users, Can anyone please tell me how to generate a large number of samples in R, given a certain distribution and size? For example, if I want to generate 1000 samples of size n=100 from a N(0,1) distribution, how should I proceed? (I don't want to run rnorm(100,0,1) in R 1000 times.) Thanks for help, Debbie
Re: [R] Simulation
On Wed, May 13, 2009 at 5:13 PM, Debbie Zhang debbie0...@hotmail.com wrote:
> Dear R users, Can anyone please tell me how to generate a large number of samples in R, given a certain distribution and size? For example, if I want to generate 1000 samples of size n=100 from a N(0,1) distribution, how should I proceed? (I don't want to run rnorm(100,0,1) in R 1000 times.)

Why not? It took 0.05 seconds on my 5-year-old laptop.

Gabor

> Thanks for help, Debbie

--
Gabor Csardi gabor.csa...@unil.ch UNIL DGM
Re: [R] Problems with randomly generating samples
-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Debbie Zhang
Sent: Wednesday, May 13, 2009 8:18 AM
To: r-help@r-project.org
Subject: [R] Problems with randomly generating samples

> Dear R users, Can anyone please tell me how to generate a large number of samples in R, given a certain distribution and size? For example, if I want to generate 1000 samples of size n=100 from a N(0,1) distribution, how should I proceed? (I don't want to run rnorm(100,0,1) in R 1000 times.) Thanks for help, Debbie

How about

samples <- rnorm(1000*100,0,1)
dim(samples) <- c(1000,100)

Hope this is helpful,

Dan

Daniel J. Nordlund
Washington State Department of Social and Health Services
Planning, Performance, and Accountability
Research and Data Analysis Division
Olympia, WA 98504-5204
Re: [R] Simulation
What about putting them in a matrix, e.g.,

matrix(rnorm(1000*100), 1000, 100)

I hope it helps.

Best, Dimitris

Debbie Zhang wrote:
> Dear R users, Can anyone please tell me how to generate a large number of samples in R, given a certain distribution and size? For example, if I want to generate 1000 samples of size n=100 from a N(0,1) distribution, how should I proceed? (I don't want to run rnorm(100,0,1) in R 1000 times.) Thanks for help, Debbie

--
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center
Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014
Re: [R] Simulation
If you want k samples of size n, why not generate k*n values and put them in a k-by-n matrix, where you can do what you want to each sample:

k = 10
n = 100
x = matrix(rnorm(k*n), k, n)
rowMeans(x)

If you need to do more complex things to each sample and if k is large enough that you don't want the matrix sitting around in memory while you do these things, you could also check out ?replicate.

On Wed, May 13, 2009 at 12:13 PM, Debbie Zhang debbie0...@hotmail.com wrote:
> Dear R users, Can anyone please tell me how to generate a large number of samples in R, given a certain distribution and size? For example, if I want to generate 1000 samples of size n=100 from a N(0,1) distribution, how should I proceed? (I don't want to run rnorm(100,0,1) in R 1000 times.) Thanks for help, Debbie

--
Mike Lawrence
Graduate Student
Department of Psychology
Dalhousie University

Looking to arrange a meeting? Check my public calendar: http://tr.im/mikes_public_calendar

~ Certainty is folly... I think. ~
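The replicate() route mentioned above might look like the sketch below, in which each sample is summarised as soon as it is drawn, so the full matrix never has to be stored. The choice of the mean as the per-sample statistic is only an example.

```r
set.seed(42)
k <- 1000   # number of samples
n <- 100    # size of each sample

# Draw a sample of size n and keep only its mean, k times over
sample_means <- replicate(k, mean(rnorm(n)))

length(sample_means)
sd(sample_means)   # should be close to 1/sqrt(n) = 0.1
```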
Re: [R] limits
Uwe Ligges lig...@statistik.tu-dortmund.de wrote:
> So you want some software that can do symbolic calculations? In that case use other software. R is designed for numerical analyses.

In particular, if you are looking for good free software, you might try Maxima.

--
Mike Prager, NOAA, Beaufort, NC
* Opinions expressed are personal and not represented otherwise.
* Any use of tradenames does not constitute a NOAA endorsement.
Re: [R] Simulation
On Wed, May 13, 2009 at 4:26 PM, Gábor Csárdi csa...@rmki.kfki.hu wrote:
> On Wed, May 13, 2009 at 5:13 PM, Debbie Zhang debbie0...@hotmail.com wrote:
>> Dear R users, Can anyone please tell me how to generate a large number of samples in R, given a certain distribution and size? For example, if I want to generate 1000 samples of size n=100 from a N(0,1) distribution, how should I proceed? (I don't want to run rnorm(100,0,1) in R 1000 times.)
>
> Why not? It took 0.05 seconds on my 5-year-old laptop.

Second-guessing the user, I think she maybe doesn't want to type in 'rnorm(100,0,1)' 1000 times...

Solution, a for loop:

z = list()
for(i in 1:1000){z[[i]] = rnorm(100,0,1)}

Now inspect the individual bits:

hist(z[[1]])
hist(z[[545]])

If that's the problem, then I suggest she reads an introduction to R...

Barry
Re: [R] Simulation
Dear Debbie,

Here are two options:

# Parameters
N <- 1000
n <- 100

# Option 1
mys <- replicate(N, rnorm(n))
mys

# Option 2
mys2 <- matrix(rnorm(N*n), ncol=N)
mys2

HTH,

Jorge

On Wed, May 13, 2009 at 11:13 AM, Debbie Zhang debbie0...@hotmail.com wrote:
> Dear R users, Can anyone please tell me how to generate a large number of samples in R, given a certain distribution and size? For example, if I want to generate 1000 samples of size n=100 from a N(0,1) distribution, how should I proceed? (I don't want to run rnorm(100,0,1) in R 1000 times.) Thanks for help, Debbie
Re: [R] Help with reshape/reShape and indexing
To all of you who answered me: Thank you so much! Each approach taught me something new and I really appreciate your help! Best regards, Dana Sevak
Re: [R] Problems with randomly generating samples
On 13-May-09 15:18:05, Debbie Zhang wrote:
> Dear R users, Can anyone please tell me how to generate a large number of samples in R, given a certain distribution and size? For example, if I want to generate 1000 samples of size n=100 from a N(0,1) distribution, how should I proceed? (I don't want to run rnorm(100,0,1) in R 1000 times.) Thanks for help, Debbie

One possibility is

nsamples <- 1000
sampsize <- 100
Samples <- matrix(rnorm(nsamples*sampsize,0,1), nrow=nsamples)

Then each row of the matrix Samples will be a sample of size 'sampsize', the i-th can be accessed as Samples[i,], and there are 'nsamples' rows to choose from.

Ted.

E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk
Fax-to-email: +44 (0)870 094 0861
Date: 13-May-09 Time: 16:46:05
-- XFMail --
Re: [R] Simulation
Does every block of 100 numbers in rnorm(100 * 1000, 0, 1) have the N(0,1) distribution?

On Wed, May 13, 2009 at 11:13 PM, Debbie Zhang debbie0...@hotmail.com wrote:
> Dear R users, Can anyone please tell me how to generate a large number of samples in R, given a certain distribution and size? For example, if I want to generate 1000 samples of size n=100 from a N(0,1) distribution, how should I proceed? (I don't want to run rnorm(100,0,1) in R 1000 times.) Thanks for help, Debbie
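A short check that may help with this question: the values returned by rnorm() are independent draws from the same distribution, so any block of 100 of them is itself a N(0,1) sample. With R's default random number generators, one long call even consumes the random stream in the same order as repeated short calls, which the sketch below verifies; the seed value is arbitrary.

```r
# One long call, reshaped so that each column is one sample of size 100
set.seed(123)
one_call <- matrix(rnorm(100 * 1000), nrow = 100)

# The same seed, but 1000 separate calls of size 100
set.seed(123)
many_calls <- replicate(1000, rnorm(100))

# Element-by-element comparison of the two matrices
all(one_call == many_calls)
```

The comparison should come out TRUE under the default generator settings; with a non-default normal.kind the two streams are not guaranteed to line up, although each block would still be a valid N(0,1) sample.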
[R] ode first step
Hi all,

I am trying to estimate the parameters (K1, K2) of a model that describes the adsorption of a molecule onto an adsorbent.

Equation: dq/dt = K1*C*(qm-q) - K2*q

I know the value of 'qm', and I experimentally measure the variables 'q', 'C', and the time 't'.

       t         C         q
1      0 144.05047 0.000
2    565  99.71492 0.1105625
3    988  74.99426 0.1722100
4   1415  58.65572 0.2129545
5   1833  48.34586 0.2386649
6   2257  40.29413 0.2587440
7   2675  32.92470 0.2771216
8   3105  29.57162 0.2854834
9   3552  28.01424 0.2893672
10  3986  25.62167 0.2953337
11  4415  23.62612 0.3003101
12  4841  21.95523 0.3044769
13  5264  21.08464 0.3066480
14  5698  19.68040 0.3101498
15  6509  18.31788 0.3135476
16  6950  17.65868 0.3151915
17  7403  17.00206 0.3168290
18  8130  16.38856 0.3183589
19  9001  15.58544 0.3203617
20  9928  15.27882 0.3211263
21 11899  14.46415 0.3231579
22 16354  13.91779 0.3245204
23 18926  13.82630 0.3247485
24 21602  13.66776 0.3251439
25 24413  13.98560 0.3243513
26 27056  13.87143 0.3246360
27 29844  13.64881 0.3251912

It's a differential equation, so I had a look at the command 'ode' from the deSolve package. I'm stuck early on with the use of the function 'ode', because I don't understand how to define the function 'func' required by 'ode'.

Any help would be appreciated.

Regards

-
Benoit Boulinguiez
Ph.D student
Ecole de Chimie de Rennes (ENSCR) Bureau 1.20
Equipe CIP UMR CNRS 6226 Sciences Chimiques de Rennes
Avenue du Général Leclerc CS 50837
35708 Rennes CEDEX 7
Tel 33 (0)2 23 23 80 83
Fax 33 (0)2 23 23 81 20
http://www.ensc-rennes.fr/
Re: [R] limits
Try the rSymPy or Ryacas packages. In the rSymPy code below, the var command defines x as symbolic to sympy, and then we perform the computation:

library(rSymPy)
Loading required package: rJava
sympy("var('x')")
[1] "x"
sympy("limit(x*x + x + 2, x, 2)")
[1] "8"

Or, using the devel version, define x as symbolic first to sympy and then to R:

library(rSymPy)
source("http://rsympy.googlecode.com/svn/trunk/R/Sym.R")
sympy("var('x')")
[1] "x"
x <- Sym("x")
limit(x*x + x + 2, x, 2)
[1] "8"

Or using Ryacas:

library(Ryacas)
Loading required package: XML
x <- Sym("x")
Limit(x^2+x+2, x, 2)
[1] "Starting Yacas!"
expression(8)

More info is available here, which you should read before using these packages:

http://rsympy.googlecode.com
http://ryacas.googlecode.com

On Tue, May 5, 2009 at 5:39 AM, Hassan Mohamed hassan_hany_fa...@yahoo.com wrote:
> Hey, what is the R function for the mathematical limit? E.g., to calculate and return the value that the expression X^2 + X + 2 approaches as X approaches 2 (X -> 2).
>
> thanks
> hassan
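Without any symbolic package, the same limit can at least be checked numerically in base R by evaluating the expression at points approaching 2 from both sides. This is a sanity check rather than a symbolic computation, and the step sizes are arbitrary:

```r
f <- function(x) x^2 + x + 2

# Points approaching 2 from below and from above
h <- c(1e-3, 1e-6)
left  <- f(2 - h)
right <- f(2 + h)

left
right
```

Both sequences settle towards 8, in agreement with the symbolic results above.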
[R] Centering R output in Sweave/LaTeX
Good Day to All,

When sweaving the following:

\begin{table}
\centering
<<echo=FALSE>>=
ftable(ifmtm$type, ifmtm$gender, ifmtm$marche, ifmtm$nfic, dnn=c("Type","Gender","Ambulant","Visit"))
@
\caption{Four-way cross-tabulation on all data}
\label{tab:crosstab}
\end{table}

the output of ftable is not centered, while the LaTeX caption is. Is there a way to center the R output in this setting?

Thanks for any help and best wishes,

JL
Re: [R] Simulation
Barry Rowlingson wrote:
> On Wed, May 13, 2009 at 4:26 PM, Gábor Csárdi csa...@rmki.kfki.hu wrote:
>> On Wed, May 13, 2009 at 5:13 PM, Debbie Zhang debbie0...@hotmail.com wrote:
>>> Dear R users,
>>>
>>> Can anyone please tell me how to generate a large number of samples in R,
>>> given certain distribution and size. For example, if I want to generate
>>> 1000 samples of size n=100, with a N(0,1) distribution, how should I proceed?
>>> (Since I dont want to do rnorm(100,0,1) in R for 1000 times)
>>
>> Why not? It took 0.05 seconds on my 5 years old laptop.
>
> Second-guessing the user, I think she maybe doesn't want to type in
> 'rnorm(100,0,1)' 1000 times...
>
> Soln - for loop:
>
>   z=list()
>   for(i in 1:1000){z[[i]]=rnorm(100,0,1)}
>
> now inspect the individual bits:
>
>   hist(z[[1]])
>   hist(z[[545]])
>
> If that's the problem, then I suggest she reads an introduction to R...

i'd suggest reading the r inferno by pat burns [1], where he deals with this sort of for-looping lists the way it deserves ;)

vQ

[1] http://www.burns-stat.com/pages/Tutor/R_inferno.pdf
Re: [R] ode first step
Benoit Boulinguiez benoit.boulinguiez at ensc-rennes.fr writes:

> I try to assess the parameters (K1,K2) of a model that describes the
> adsorption of a molecule onto on adsorbent.
>
> equation: dq/dt = K1*C*(qm-q)-K2*q
>
> I know the value of 'qm' and I experimentally measure the variables 'q',
> 'C', and the time 't'.
>
> I'm early stuck on the use of the function 'ode' cause I don't get how to
> define the function 'func' required by 'ode'

Have a look at the lsoda documentation of the earlier package odesolve, which has easier-to-understand examples.

Dieter
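For readers of the archive, here is a minimal sketch of what a 'func' for this equation could look like in deSolve. It is not taken from the thread. The parameter values (K1, K2, and in particular qm, which the poster knows but did not state) are hypothetical, only the first six data rows are used, and the measured C is turned into a function of time by linear interpolation, which is an assumption of this sketch.

```r
# Measured data from the original post (first six rows only, for brevity)
obs <- data.frame(
  t = c(0, 565, 988, 1415, 1833, 2257),
  C = c(144.05047, 99.71492, 74.99426, 58.65572, 48.34586, 40.29413),
  q = c(0, 0.1105625, 0.1722100, 0.2129545, 0.2386649, 0.2587440)
)

# C is measured rather than modelled, so interpolate it between sampling times
# (assumption of this sketch; rule = 2 holds the end values outside the range)
C_of_t <- approxfun(obs$t, obs$C, rule = 2)

# 'func' in the form ode() expects: arguments (time, state, parameters),
# returning a list whose first element is the vector of derivatives
adsorption <- function(t, y, parms) {
  with(as.list(c(y, parms)), {
    dq <- K1 * C_of_t(t) * (qm - q) - K2 * q
    list(dq)
  })
}

# Hypothetical parameter values, chosen only so that the sketch runs
parms <- c(K1 = 1e-5, K2 = 1e-4, qm = 0.35)

# Integrate only if deSolve is installed
if (requireNamespace("deSolve", quietly = TRUE)) {
  sim <- deSolve::ode(y = c(q = 0), times = obs$t,
                      func = adsorption, parms = parms)
  print(head(sim))
}
```

To estimate K1 and K2, one possible next step would be to wrap the ode() call in a function that returns the sum of squared differences between the simulated and measured q, and to pass that function to optim().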
Re: [R] Looking for a quick way to combine rows in a matrix
You can automate this step

  key[key == "AT"] <- "TA"

## create a function to reverse a string -- see strsplit help page for this strReverse function
reverse <- function(x) sapply(lapply(strsplit(x, NULL), rev), paste, collapse="")

key <- rownames(a)
# combine rownames with reverse(rownames)
n <- cbind(key, rev=reverse(key))

     key  rev
[1,] "AA" "AA"
[2,] "AT" "TA"
[3,] "TA" "AT"
[4,] "TT" "TT"

# Now just sort the values in the rows (apply returns column vectors so I also use t() ) and then run do.call on first column
n <- t(apply(n, 1, sort))
do.call(rbind, by(a, n[,1], colSums))

   V1 V2 V3 V4
AA  1  5  9 13
AT  5 13 21 29
TT  4  8 12 16

I often need to combine reverse complement DNA strings, so you could do that too

# DNA complement
comp <- function(x) chartr("ACGT", "TGCA", x)
n <- cbind(key, rev=reverse(comp(key)))
n <- t(apply(n, 1, sort))
do.call(rbind, by(a, n[,1], colSums))

   V1 V2 V3 V4
AA  5 13 21 29
AT  2  6 10 14
TA  3  7 11 15

Chris Stubben

jholtman wrote:
> Try this:
>
> key <- rownames(a)
> key[key == "AT"] <- "TA"
> do.call(rbind, by(a, key, colSums))
>    V2 V3 V4 V5
> AA  1  5  9 13
> TA  5 13 21 29
> TT  4  8 12 16
>
> On Mon, May 11, 2009 at 4:53 PM, Crosby, Jacy R jacy.r.cro...@uth.tmc.edu wrote:
>> I'm working with genotype data in a frequency table:
>>
>> a=matrix(1:16, nrow=4)
>> rownames(a)=c("AA","AT","TA","TT")
>> a
>>    [,1] [,2] [,3] [,4]
>> AA    1    5    9   13
>> AT    2    6   10   14
>> TA    3    7   11   15
>> TT    4    8   12   16
>>
>> 'AT' and 'TA' are essentially the same, and I'd like to combine (add) the
>> rows to reflect this. The final matrix should be:
>>
>>    [,1] [,2] [,3] [,4]
>> AA    1    5    9   13
>> AT    5   13   21   29
>> TT    4    8   12   16
>>
>> Is there a fast way to do this?
>>
>> Thanks in advance!
>>
>> Jacy Crosby
>> jacy.r.cro...@uth.tmc.edu

--
View this message in context: http://www.nabble.com/Looking-for-a-quick-way-to-combine-rows-in-a-matrix-tp23491348p23525634.html
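As a possible alternative to the by()/do.call() approach in this thread, base R's rowsum() adds up the rows of a matrix by group. The sketch below is not from the thread. It rebuilds the matrix 'a' from the original question and makes the grouping key by sorting the letters within each genotype, so that "TA" and "AT" share one key; the helper name 'canon' is made up for illustration.

```r
# The frequency table from the original question
a <- matrix(1:16, nrow = 4)
rownames(a) <- c("AA", "AT", "TA", "TT")

# Canonical key: sort the letters within each genotype, so "TA" becomes "AT"
canon <- vapply(strsplit(rownames(a), ""),
                function(ch) paste(sort(ch), collapse = ""),
                character(1))

# Sum the rows that share a key; result rows are ordered by key
combined <- rowsum(a, group = canon)
combined
```

Here the AT row of 'combined' should hold the sums of the original AT and TA rows (5, 13, 21, 29), matching the result asked for in the question.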
[R] Anova
melt.updn <- structure(list(date = structure(c(11808, 11869, 11961, 11992,
12084, 12173, 12265, 12418, 12600, 12631, 12753, 12996, 13057, 13149,
11808, 11869, 11961, 11992, 12084, 12173, 12265, 12418, 12600, 12631,
12753, 12996, 13057, 13149), class = "Date"), variable = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("unrestored", "restored"),
class = "factor"), value = c(1.34057641541824, 0.918021774919366,
0.905654270934854, 0.305945104043220, 0.58298856330543, 1.36580645291274,
0.874195629894938, 0.87482377014642, 0.930267689669002, 0.41753134369356,
1.09248531450337, 1.72571397293738, 0.305751868168171, 0.584498524462223,
0.983300317501076, 1.27216569968585, 0.730578393573363, 0.88361473836175,
1.16501295544266, 2.08896500025784, 0.664286881841064, 1.03859387871079,
1.39172581649833, 0.323405269371357, 1.00207568577518, 1.54383416626015,
0.611261918697393, 0.848992483196744)), .Names = c("date", "variable",
"value"), row.names = c(NA, -28L), class = "data.frame")

aov(value~variable, data=melt.updn)

I am having problems making sure that I am doing the correct analysis. I am trying to see if there is a difference in the mean of the restored segment versus the unrestored segment (variable in x). These are repeated measures on the same treatments through time. Is there a way to control for the differences in time steps? Any ideas?

thanks for the help,

--
Stephen Sefick

Let's not spend our time and resources thinking about things that are so little or so large that all they really do for us is puff us up and make us feel like gods. We are mammals, and have not exhausted the annoying little problems of being mammals.

-K. Mullis
Re: [R] Multiple plot margins
On Wed, 2009-05-13 at 11:22 +0200, Uwe Ligges wrote:
> If not, example:
>
> par(mfrow = c(2,3), mar = c(0,0,0,0), oma = c(5,5,0,0), xpd=NA)
> plot(1, xaxt="n", xlab="", ylab="A")
> plot(1, xaxt="n", yaxt="n", xlab="", ylab="")
> plot(1, xaxt="n", yaxt="n", xlab="", ylab="")
> plot(1, xlab="I", ylab="B")
> plot(1, xlab="II", ylab="", yaxt="n")
> plot(1, xlab="III", ylab="", yaxt="n")

Thank you. I don't know what I did wrong, but that worked.

Best regards,
Andre
[R] Calling R from .net environment
Hi,

I am currently a .NET programmer and would like to use R as my statistical computation engine. I have already installed RServer250.exe so that I can call R from my .NET programming environment. Unfortunately, I was not able to find RServer250.exe in the R-(D) COM Interface region. If someone could guide me on how to add these COM components and call R code from my application, I would be very grateful.

Regards,
Re: [R] Simulation
On Wed, May 13, 2009 at 5:36 PM, Wacek Kusnierczyk waclaw.marcin.kusnierc...@idi.ntnu.no wrote:
> Barry Rowlingson wrote:
>> Soln - for loop:
>>
>>   z=list()
>>   for(i in 1:1000){z[[i]]=rnorm(100,0,1)}
>>
>> now inspect the individual bits:
>>
>>   hist(z[[1]])
>>   hist(z[[545]])
>>
>> If that's the problem, then I suggest she reads an introduction to R...
>
> i'd suggest reading the r inferno by pat burns [1], where he deals with
> this sort of for-looping lists the way it deserves ;)

I don't think extending a list this way is too expensive. Not like doing 1000 foo=rbind(foo,bar)s to a matrix. The overhead for extending a list should really only be adding a single new pointer to the list pointer structure. The existing list data isn't copied.

Plus lists are more flexible. You can do:

  z=list()
  for(i in 1:1000){
    z[[i]]=rnorm(i,0,1) # generate 'i' samples
  }

and then you can see how the properties of samples of rnorm differ with increasing numbers of samples.

Yes, you can probably vectorize this with lapply or something, but I prefer clarity over concision when dealing with beginners...

Barry
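For comparison, the two loops discussed in this thread could be written without an explicit for loop roughly as follows. This is only a sketch of the lapply-style alternative that Barry mentions. The name 'z_growing' and the set.seed() call are additions for illustration and are not part of the thread.

```r
set.seed(1)  # only so that the sketch is reproducible

# 1000 samples of size 100 from N(0,1), one sample per list element
z <- replicate(1000, rnorm(100, 0, 1), simplify = FALSE)

# samples of increasing size 1, 2, ..., 1000
z_growing <- lapply(1:1000, function(i) rnorm(i, 0, 1))

# the individual samples are accessed exactly as in the loop version
summary(z[[545]])
length(z_growing[[545]])
```

With simplify = FALSE, replicate() returns a list, so z[[1]] and z[[545]] can be passed to hist() just as before.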
Re: [R] Anova
On Wed, 2009-05-13 at 12:43 -0400, stephen sefick wrote:
> melt.updn <- structure(list(date = structure(c(11808, 11869, 11961, 11992,
> 12084, 12173, 12265, 12418, 12600, 12631, 12753, 12996, 13057, 13149,
> 11808, 11869, 11961, 11992, 12084, 12173, 12265, 12418, 12600, 12631,
> 12753, 12996, 13057, 13149), class = "Date"), variable = structure(c(1L,
> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L,
> 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("unrestored", "restored"),
> class = "factor"), value = c(1.34057641541824, 0.918021774919366,
> 0.905654270934854, 0.305945104043220, 0.58298856330543, 1.36580645291274,
> 0.874195629894938, 0.87482377014642, 0.930267689669002, 0.41753134369356,
> 1.09248531450337, 1.72571397293738, 0.305751868168171, 0.584498524462223,
> 0.983300317501076, 1.27216569968585, 0.730578393573363, 0.88361473836175,
> 1.16501295544266, 2.08896500025784, 0.664286881841064, 1.03859387871079,
> 1.39172581649833, 0.323405269371357, 1.00207568577518, 1.54383416626015,
> 0.611261918697393, 0.848992483196744)), .Names = c("date", "variable",
> "value"), row.names = c(NA, -28L), class = "data.frame")
>
> aov(value~variable, data=melt.updn)

You can think of this as a linear model and just use lm:

lm(value~variable, data=melt.updn)

> I am having problems making sure that I am doing the correct analysis.
> I am trying to see if there is a difference in the mean of the
> restored segment versus the unrestored segment (variable in x). These
> are repeated measures on the same treatments through time. Is there a
> way to control for the differences in time steps? Any ideas?
>
> thanks for the help,

One option is to fit this model using generalised least squares:

melt.updn$time <- rep(with(melt.updn[1:14,], date - date[1]) + 1, 2)

## do some plotting to look at potential differences:
require(lattice)
xyplot(value ~ time | variable, data = melt.updn, type = c("p","smooth"))
## so perhaps some evidence of trend,
## different in the two groups possibly
bwplot(value ~ variable, data = melt.updn)
## doesn't look like there is much difference though

require(nlme)
## include fixed time effect to account for any trend for example?
## use a CAR(1) structure allows for different separations in sampling times
lmod <- gls(value ~ variable + time, data = melt.updn,
            corr = corCAR1(form = ~ time | variable))
summary(lmod)
intervals(lmod) ## fitting problems with these dummy data

## test CAR(1) structure - do we need?
lmod2 <- gls(value ~ variable + time, data = melt.updn)
anova(lmod, lmod2) ## no need for the structure here

summary(lmod2) ## looks like no difference in un/restored
anova(lmod2)

Just a few thoughts; without knowing exactly your data and design it is difficult to say more. With only two groups, it is difficult to do more. I also assume these are dummy data; otherwise there really doesn't look like there is any difference between the two groups of samples.

HTH

G

--
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
[R] Checking a (new) package - examples require other package functions
I am creating an R package. I ran R CMD check on the package, and everything passed until it tried to run the examples. Then, the result was:

* checking examples ... ERROR
Running examples in REEMtree-Ex.R failed.
The error most likely occurred in:

> ### * AutoCorrelationLRtest
>
> flush(stderr()); flush(stdout())
>
> ### Name: AutoCorrelationLRtest
> ### Title: Test for autocorrelation in the residuals of a RE-EM tree
> ### Aliases: AutoCorrelationLRtest
> ### Keywords: htest tree models
>
> ### ** Examples
>
> # Estimation without autocorrelation
> simpleEMresult<-RandomEffectsTree(Y~D+t+X, data=simpleREEMdata, random=~1|ID, simpleREEMdata$ID)
Error: couldn't find function "RandomEffectsTree"
Execution halted

The function RandomEffectsTree is defined in the R code for the package. How can I refer to other functions from the package in examples? (I have the Writing R-extensions PDF, so it would be enough to point me to the right page, if the answer is in there and I just missed it.)

Thanks!

Rebecca
[R] replace() help
Can anyone see what I'm doing wrong here (highlighted below)? This is driving me crazy... probably a ')' or something equally moronic...

> genw1[,1]
    A2     A3     A5     A7     A9 A00010 A00012 A00013 A00014 A00015 A00017 A00018 A00019 A00021 A00023 A00024
    CC     CC     CC     CC     CC     CC     CC     CC     CC     CC     CC     CC     CC     CC     CC     CC

Etc...this is a rather large vector

> table(genw1[,1])

   ??    CC    CG
   25 10632     1

> genw2 <- mat.or.vec(nrow(genw1), ncol(genw1))
> rownames(genw2) <- rownames(genw1)
> colnames(genw2) <- colnames(genw1)
> genw2[,1] <- replace(genw1[,1], which(genw1[,1]=="CC"), "HC")
Warning message:
In `[<-.factor`(`*tmp*`, list, value = "HC") :
  invalid factor level, NAs generated

Just for error checking (this is working properly):

> which(genw1[,1]=="CC")
 [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18
[19] 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36

Etc...

And it works here...

> x <- matrix(c('CC', 'CC', '??', 'CG'), nrow=2)
> x
     [,1] [,2]
[1,] "CC" "??"
[2,] "CC" "CG"
> x2 <- mat.or.vec(nrow(x), ncol(x))
> x2[,1] <- replace(x[,1], which(x[,1]=="CC"), "HC")
> x2
     [,1] [,2]
[1,] "HC" "0"
[2,] "HC" "0"
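The warning above names `[<-.factor`, which suggests that genw1[,1] is a factor rather than a character vector, while the working example uses a character matrix. Assuming that is the cause (the thread does not confirm it), the sketch below shows two ways around it on a small hypothetical stand-in vector 'g'. All names here are made up for illustration.

```r
# Hypothetical stand-in for genw1[, 1]: a factor with levels "??", "CC", "CG"
g <- factor(c("CC", "CC", "??", "CG"))

# Option 1: convert to character first, where any replacement string is allowed
g_chr <- replace(as.character(g), which(g == "CC"), "HC")

# Option 2: keep the factor and rename the level itself
g_fac <- g
levels(g_fac)[levels(g_fac) == "CC"] <- "HC"

g_chr
g_fac
```

If the data were read with read.table() or similar, reading the columns as character in the first place (for example with stringsAsFactors = FALSE) would be another way to avoid the factor.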
Re: [R] Calling R from .net environment
Take a look at this article on CodeProject:

http://www.codeproject.com/KB/cs/RtoCSharp.aspx

Cheers,
Dave

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Arun Kumar Saha
Sent: Wednesday, May 13, 2009 10:33 AM
To: r-h...@stat.math.ethz.ch
Subject: [R] Calling R from .net environment

Hi,

Currently I am a .net programmer and would like to use R for my statistical computations engine. I already have installed RServer250.exe so that I could call R from my .net programming environment, however unfortunately, i could not be able to find RServer250.exe in the R-(D) COM Interface region. If someone guide me how to add these COM components and call the R-code through my application, it would be very good to me.

Regards,
[R] Mann-Kendall test
Dear useRs,

I've been trying to run a Mann-Kendall test on my data in order to detect trends. I studied the examples given in the Kendall package and I can understand pretty well how it works on time-series data. However, my data consist of values at different sites per year, as I display below:

          Year 1 | Year 2 | Year 3 | ...
Site 1      x        x        x      ...
Site 2      x        x        x      ...
Site 3      x        x        x      ...
...        ...      ...      ...     ...

(where 'x' represents different values)

There's the MannKendall() function in package 'Kendall' and the tau() function in package 'pheno', and I guess they should do the trend detection I need. The problem is I don't know how to manipulate my data in order to get the results. Should I run the M-K test on each Site, on each Year or on the entire dataset? Also, there are some probabilities I should take into account when running a M-K test, but I can't seem to find out how to obtain them.

Thanks in advance,

Rafael.
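One possible reading of this layout, offered only as a sketch, is a separate trend test for each site across the years. The Mann-Kendall test is based on Kendall's rank correlation between the values and time, so base R's cor.test() with method = "kendall" can stand in for it here. The data below are simulated and hypothetical, the row for Site 1 is sorted purely to force a visible trend, and whether per-site testing suits the poster's design is an assumption.

```r
# Hypothetical data: one row per site, one column per year
set.seed(42)
years <- 2001:2008
sites <- matrix(rnorm(3 * length(years)), nrow = 3,
                dimnames = list(paste("Site", 1:3), as.character(years)))
sites["Site 1", ] <- sort(sites["Site 1", ])  # force an increasing trend at Site 1

# Kendall's tau of each site's values against time, with its p-value
trend <- t(apply(sites, 1, function(v) {
  ct <- cor.test(years, v, method = "kendall")
  c(tau = unname(ct$estimate), p.value = ct$p.value)
}))
trend
```

With the Kendall package installed, the same per-row idea should carry over by calling MannKendall() on each row vector. The p-value column is presumably the kind of probability the post asks about.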
Re: [R] Multiple plot margins
Here is a response to almost exactly the same question from a couple of weeks ago:

http://finzi.psych.upenn.edu/R/Rhelp08/2009-April/196967.html

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Andre Nathan
> Sent: Tuesday, May 12, 2009 11:12 AM
> To: r-help@r-project.org
> Subject: [R] Multiple plot margins
>
> Hello
>
> I'm plotting 6 graphs using mfrow = c(2, 3). In these plots, only
> graphs in the first column have titles for the y axis, and only the
> ones in the last row have titles for the x axis.
>
> I'd like all plots to be of the same size, and I'm trying to keep them
> as near each other as possible, but I'm having the following problem.
>
> If I make a single call to par(mar = ...), to leave room on the left
> and bottom for the axes titles, a lot of space will be wasted because
> not all graphs need titles; however, if I make one call of
> par(mar = ...) per plot, to have finer control of the margins, the
> first column and last row plots will be smaller than the rest, because
> the titles use up some of their space.
>
> I thought that setting large enough values for oma would do what I
> want, but it doesn't appear to work if mar is too small.
>
> To illustrate better what I'm trying to do:
>
> l +-+ +-+ +-+
> a | | | | | |
> b | | | | | |
> e | | | | | |
> l +-+ +-+ +-+
>
> l +-+ +-+ +-+
> a | | | | | |
> b | | | | | |
> e | | | | | |
> l +-+ +-+ +-+
>  label label label
>
> where the margins between each plot should be narrow. Should I just
> plot the graphs without axis titles and then use text() to manually
> position them?
>
> Thanks in advance,
> Andre
[R] simple anova question
Dear R group,

Simple anova question: I am attempting to recreate a figure (figure 10.8 from chapter 10 of Modern Statistics for the Life Sciences). It is an interaction diagram plotting BYIELD (continuous) as a function of BSPACING (categorical), with different lines/colours for another categorical variable, BVARIETY. The data are replicated in four blocks, given by the categorical variable BBLOCK. The corresponding analysis looks like this:

BYIELD~BBLOCK+BSPACING+BVARIETY

What I want to extract from this model is simply the expected value for every possible combination of factors. I can do this by adding the correct combinations of model coefficients, but this seems silly. Surely there is a one-line function for sorting this sort of thing out?

Many thanks,

Allen

--
View this message in context: http://www.nabble.com/simple-anova-question-tp23528280p23528280.html
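One way to get the expected value for every factor combination, sketched here on simulated data because the book's data set is not in the post, is predict() with a newdata grid built by expand.grid(). The factor level names and the simulated yields below are hypothetical. Only the variable names and the model formula come from the post.

```r
# Hypothetical data with the same variable names as in the post
set.seed(1)
d <- expand.grid(BBLOCK   = factor(1:4),
                 BSPACING = factor(c("narrow", "medium", "wide")),
                 BVARIETY = factor(c("A", "B")))
d$BYIELD <- rnorm(nrow(d), mean = 10)

fit <- lm(BYIELD ~ BBLOCK + BSPACING + BVARIETY, data = d)

# Every combination of factor levels, with its expected (fitted) value
combos <- expand.grid(BBLOCK   = levels(d$BBLOCK),
                      BSPACING = levels(d$BSPACING),
                      BVARIETY = levels(d$BVARIETY))
combos$expected <- unname(predict(fit, newdata = combos))
head(combos)
```

For the interaction diagram itself, base R's interaction.plot() could then be given the spacing, the variety, and the expected values from 'combos', possibly after averaging over blocks.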