Re: [R] R-help Digest, Vol 36, Issue 21
Hello, dear R users.

I have already sent a question here, but I am not sure that it was read. I need to visualize a classification of my numerical data based on 2-3 factors. I suppose the best way is a tree, with an arbitrary function at the ends (leaves), or at least with the means of my data at the ends. How can I do that? As far as I can tell, ctree offers binary classification, but is that the only way? Of course, a tree is not the only way; maybe you could suggest others.

Thank you.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] R-help Digest, Vol 36, Issue 21
Evgeniy Kachalin wrote:
> Hello, dear R users. I have already sent a question here, but I am not sure that it was read. I need to visualize a classification of my numerical data based on 2-3 factors. I suppose the best way is a tree, with an arbitrary function at the ends (leaves), or at least with the means of my data at the ends. How can I do that? As far as I can tell, ctree offers binary classification, but is that the only way? Of course, a tree is not the only way; maybe you could suggest others.

Or perhaps the best way is a substitute such as a 'heatmap', but with means in the cells instead of colors, if that is possible.

Sorry for the second letter.
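A minimal base-R sketch of the 'heatmap with means in the cells' idea, under stated assumptions: the data frame `d`, its columns `value`, `f1`, `f2` and the simulated numbers are all hypothetical stand-ins for the poster's data. It tabulates the cell means with tapply() and prints them on top of an image() grid.

```r
# Hypothetical data: one numeric response and two factors
set.seed(1)
d <- data.frame(
  value = rnorm(60, mean = 10),
  f1 = factor(rep(c("a", "b", "c"), each = 20)),
  f2 = factor(rep(c("d", "e"), times = 30))
)

# Mean of 'value' for every combination of the two factors
m <- with(d, tapply(value, list(f1, f2), mean))
print(round(m, 2))

# Heatmap-like display: coloured cells with the mean written in each one
image(seq_len(nrow(m)), seq_len(ncol(m)), m, axes = FALSE,
      xlab = "f1", ylab = "f2")
axis(1, at = seq_len(nrow(m)), labels = rownames(m))
axis(2, at = seq_len(ncol(m)), labels = colnames(m))
text(row(m), col(m), labels = round(m, 2))
```

Any other summary function can replace `mean` in the tapply() call, which covers the 'arbitrary function' part of the question for two factors.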
Re: [R] R-help Digest, Vol 35, Issue 7
Uwe Ligges wrote:
> Evgeniy Kachalin wrote:
>> Hello, dear participants! Could you tell me whether there is a simple and nice way to build a scatter plot for three different types of data (, and o and * - signs, for example) with a legend? For now I can only think of this:
>>
>>   plot(x~y,data=subset(mydata,factor1=='1'), pch='.',col='blue')
>>   points(x~y,data=subset(mydata,factor1=='2'), pch='*',col='green')
>>   points( etc
>>
>> What is the simple and nice way? Thank you very much for your kindness and help.
>
> Example:
>
>   with(iris, plot(Sepal.Length, Sepal.Width, pch = as.integer(Species)))
>   with(iris, legend(7, 4.4, legend = unique(as.character(Species)), pch = unique(as.integer(Species))))

Uwe, sorry for my stupid question. You mean that when pch is built from a factor, plot can recycle it and use it to pick the plotting symbols, so pch=as.integer(Species) gives the codes 1, 2, 3 for the 3 factor levels. But I need symbols 15, 16, 17 and colors red, blue, green. So then I do:

  spec.symb <- iris$Species
  spec.col <- iris$Species
  levels(spec.symb) <- c(15,16,17)
  levels(spec.col) <- c('red','green','blue')

Is that the only way? What is more, 'plot' does not like factors in 'pch', so it has to be:

  plot(x~y,data, pch=as.integer(as.character(spec.symb)))

That's totally crazy...

-- Evgeniy
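One possible shortcut, sketched here on the built-in iris data: keep the desired symbols and colours in plain vectors and index them by the integer codes of the factor. The names `syms`, `cols` and `grp` are illustrative only; nothing here changes the factor's levels.

```r
# Lookup vectors: one symbol and one colour per factor level
syms <- c(15, 16, 17)
cols <- c("red", "blue", "green")

# Integer codes 1, 2, 3 of the factor, one per observation
grp <- as.integer(iris$Species)

# Index the lookup vectors by the codes to get per-point symbols and colours
with(iris, plot(Sepal.Length, Sepal.Width, pch = syms[grp], col = cols[grp]))
legend("topright", legend = levels(iris$Species), pch = syms, col = cols)
```

Because the legend uses levels() and the same lookup vectors, the legend order matches the factor's level order even when the data are not sorted.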
Re: [R] R-help Digest, Vol 35, Issue 7
Hello, dear participants!

Could you tell me whether there is a simple and nice way to build a scatter plot for three different types of data (, and o and * - signs, for example) with a legend? For now I can only think of this:

  plot(x~y,data=subset(mydata,factor1=='1'), pch='.',col='blue')
  points(x~y,data=subset(mydata,factor1=='2'), pch='*',col='green')
  points( etc

What is the simple and nice way? Thank you very much for your kindness and help.

-- Evgeniy Kachalin
Re: [R] Mass 'identify' on 2d-plot
Barry Rowlingson wrote:
> Evgeniy Kachalin wrote:
>> What ability does R have to define an area graphically (with the mouse) and to select all the cases falling in it? 'identify' is OK for 5-10 cases, but what if cases=1000?
>
> You can use 'locator' to let the user click a number of points to define a polygon, and then use one of the point-in-polygon functions provided by one of the spatial packages to work out what's in your polygon. Look at splancs, spatstat, sp - pretty much anything beginning with 'sp' - on CRAN.
>
> In splancs you can just do:
>
>   poly = getpoly()
>
> which lets the user draw a polygon on screen, then:
>
>   inPoly = inpip(xypts,poly)
>   points(xypts[inPoly,], pch=19,col="red")
>
> and that will plot the selected points as solid red dots.
>
> I don't think there's a way to draw a freehand figure on an R plot; you have to go click, click, click, and draw straight lines.

I don't get what 'xypts' is in this case... One step earlier I plotted plot(y~x,data=dat). What is xypts?

-- Evgeniy Kachalin
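On the 'xypts' question: in Barry's snippet it presumably stands for the plotted coordinates as a two-column matrix, for example cbind(dat$x, dat$y). The sketch below is self-contained base R and makes several assumptions: `dat` is simulated, the polygon is fixed rather than clicked (interactively it could come from locator() or splancs' getpoly()), and `in_polygon()` is a hypothetical ray-casting helper used here in place of splancs' inpip().

```r
# Hypothetical data standing in for 'dat' from the thread
set.seed(42)
dat <- data.frame(x = runif(200), y = runif(200))

# 'xypts': the plotted coordinates as a two-column matrix
xypts <- cbind(dat$x, dat$y)

# A fixed square polygon; interactively this would be clicked by the user
poly <- cbind(c(0.2, 0.8, 0.8, 0.2), c(0.2, 0.2, 0.8, 0.8))

# Ray-casting point-in-polygon test; returns one logical per point
in_polygon <- function(pts, poly) {
  n <- nrow(poly)
  inside <- rep(FALSE, nrow(pts))
  j <- n
  for (i in seq_len(n)) {
    xi <- poly[i, 1]; yi <- poly[i, 2]
    xj <- poly[j, 1]; yj <- poly[j, 2]
    # Does a horizontal ray from the point cross the edge (j -> i)?
    crosses <- ((yi > pts[, 2]) != (yj > pts[, 2])) &
      (pts[, 1] < (xj - xi) * (pts[, 2] - yi) / (yj - yi) + xi)
    inside <- xor(inside, crosses)
    j <- i
  }
  inside
}

sel <- in_polygon(xypts, poly)

plot(y ~ x, data = dat)
polygon(poly, border = "red")
points(xypts[sel, , drop = FALSE], pch = 19, col = "red")

# The selected cases as rows of the original data frame
selected <- dat[sel, ]
```

The logical vector `sel` lines up with the rows of `dat`, so the selected cases can be used directly in further analysis.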
[R] Mass 'identify' on 2d-plot
Hello, dear R-users.

I have a 2-d dot plot with two variables: x, y. The dots on this plot are grouped in human-recognisable areas. These areas are neither round nor oval. They are free-form, but still recognisable by an operator.

What ability does R have to define an area graphically (with the mouse) and to select all the cases falling in it? 'identify' is OK for 5-10 cases, but what if cases=1000?

Thank you very much for your advice.

-- Evgeniy Kachalin
Re: [R] Mass 'identify' on 2d-plot
Duncan Temple Lang wrote:
> Barry Rowlingson wrote:
>> Evgeniy Kachalin wrote:
>>> What ability does R have to define an area graphically (with the mouse) and to select all the cases falling in it? 'identify' is OK for 5-10 cases, but what if cases=1000?
>>
>> You can use 'locator' to let the user click a number of points to define a polygon, and then use one of the point-in-polygon functions provided by one of the spatial packages to work out what's in your polygon. Look at splancs, spatstat, sp - pretty much anything beginning with 'sp' - on CRAN.
>>
>> In splancs you can just do:
>>
>>   poly = getpoly()
>>
>> which lets the user draw a polygon on screen, then:
>>
>>   inPoly = inpip(xypts,poly)
>>   points(xypts[inPoly,], pch=19,col="red")
>>
>> and that will plot the selected points as solid red dots.
>>
>> I don't think there's a way to draw a freehand figure on an R plot; you have to go click, click, click, and draw straight lines.
>
> FWIW, there is an experimental package on the Omegahat site (and repository) named RGtkIPrimitives that works with the gtkDevice only to do rubber banding and free-form region identification.

How do I get a GTK device on my WinXP? No way? ;)

-- Evgeniy
[R] Impaired boxplot functionality - mean instead of median
Hello to all users and wizards.

I regularly use the 'boxplot' function or its analogue, 'bwplot' from the 'lattice' library. But as far as I understand, they are badly limited in functionality: they lack the ability to choose what is drawn 'in the middle' (median or mean), what the box means (standard error, 90% or something else), and what the whiskers mean (100%, 99% or something else).

Is there any way to get this? Or is there any other good data visualization function for comparing the means of various data groups?

Ideally I would like to have a slightly more customisable function for doing that. For example, boxplot(a~b,data=d,mid='mean').

-- Evgeniy, ICQ 38317310.
Re: [R] Impaired boxplot functionality - mean instead of median
Martin Maechler wrote:
> Boxplots were invented by John W. Tukey and I think should be counted among the top small but smart achievements from the 20th century. Very wisely he did *not* use mean and standard deviations.
>
> Even though it's possible to draw boxplots that are not boxplots (and people only recently explained how to do this with R on this mailing list), I'm arguing very strongly against this. If I see a boxplot - I'd want it to be a boxplot and not have the silly (please excuse) 10%-90% whiskers which declare 20% of the points as outliers {in the boxplot sense}.
>
> If you want the mean +/- sd plot, do *not* misuse boxplots for them, please!

I analyze genetics data. I have a factor (gene variant, c(1,2,3)) and a quantitative variable corresponding to that factor. How do I visualize this situation and compare the means of the samples for each factor value? If boxplot supported 'mean-in-the-middle', it would fit my needs ideally.

How do I plot a mean +/- SD plot?

There is also another way: rewrite boxplot.stats and replace fivenum there with a self-made function. Then I would need to write a self-made boxplot.formula (or boxplot.default?) function, and none of this would be configurable.

I'm still a novice in R, so I need a simple way to pre-visualize my data and estimate an approximate result.

-- Evgeniy, ICQ 38317310.
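For the 'mean +/- SD plot' question, a minimal base-R sketch follows. The data frame `d`, the column names `variant` and `value`, and the simulated numbers are assumptions made for illustration; group means and standard deviations come from tapply(), and the error bars are drawn with arrows().

```r
# Hypothetical data: a three-level gene variant and a numeric measurement
set.seed(3)
d <- data.frame(
  variant = factor(rep(1:3, each = 15)),
  value = rnorm(45, mean = rep(c(900, 1200, 800), each = 15), sd = 150)
)

# Group-wise mean and standard deviation
m <- tapply(d$value, d$variant, mean)
s <- tapply(d$value, d$variant, sd)
x <- seq_along(m)

# Points at the means, vertical bars from mean - SD to mean + SD
plot(x, m, xaxt = "n", pch = 19,
     xlim = c(0.5, length(m) + 0.5), ylim = range(m - s, m + s),
     xlab = "gene variant", ylab = "mean +/- SD")
axis(1, at = x, labels = names(m))
arrows(x, m - s, x, m + s, angle = 90, code = 3, length = 0.05)
```

Swapping `sd` for a standard-error function changes what the bars show without touching the plotting code.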
Re: [R] Impaired boxplot functionality - mean instead of median
Marc Schwartz (via MN) wrote:
> On Thu, 2005-12-01 at 19:40 +0300, Evgeniy Kachalin wrote:
>> I analyze genetics data. I have a factor (gene variant, c(1,2,3)) and a quantitative variable corresponding to that factor. How do I visualize this situation and compare the means of the samples for each factor value? If boxplot supported 'mean-in-the-middle', it would fit my needs ideally.
>>
>> How do I plot a mean +/- SD plot?
>>
>> There is also another way: rewrite boxplot.stats and replace fivenum there with a self-made function. Then I would need to write a self-made boxplot.formula (or boxplot.default?) function, and none of this would be configurable.
>>
>> I'm still a novice in R, so I need a simple way to pre-visualize my data and estimate an approximate result.
>
> If you want means and SDs, you might want to look at:
>
> 1. plotCI() and plotmeans() in the gplots package

So plotmeans cannot handle something like boxplot(numerical~fact1+fact2). Is there any way to go further?

-- Evgeniy
Re: [R] Impaired boxplot functionality - mean instead of median
Marc Schwartz (via MN) wrote:
>> So plotmeans is incapable of: boxplot(numerical~fact1+fact2). Is there any way further?
>
> I think that somehow we are talking past each other here.
>
> plotmeans() does what it is designed to do, which is to simplify the process of plotting group-wise point estimates and user defined error bars/intervals around the point estimates. In your case, these intervals would be standard deviations around each of the group means as you have indicated. Review the examples in ?plotmeans.
>
> As Martin and others have pointed out, you need to remove boxplots from the equation here, as they were not designed to plot means and standard deviations.

Again, what I'm talking about: plotmeans is incapable of handling the formula. For example, I have two factors: A with levels a, b, c, and B with levels d, e, f.

If I plot boxplot(num~A+B), what do I get? Nine boxes: ad, ae, af, bd, be, bf, cd, ce, cf.

If I plot plotmeans(num~A+B), what do I get? Nothing, because plotmeans cannot combine two factors in their various combinations. Is there a simple way to do it?

Anyway... That's the wrong way; all that is necessary is to have a boxplot with the mean instead of the median. Is there a simple way to do it? Statistical software like Statistica 7.0 offers any possible combination of what a boxplot could mean. Is it possible to have just one modification to R's boxplot?

Thank you for your kind answers. Also, please tell me where I should send replies: to the list address or to those who answer me directly.

-- Evgeniy
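If the single modification wanted is 'mean line instead of median line', one hedged possibility is to let boxplot() compute its statistics without plotting, overwrite the middle row with the group means, and draw the result with bxp(). The data frame `d` and the name `grp_means` below are hypothetical, and, as Martin notes in this thread, the result is no longer a boxplot in Tukey's sense, so it would need a clear caption.

```r
# Hypothetical skewed data with two three-level factors (20 cases per cell)
set.seed(7)
d <- data.frame(
  num = rexp(180, rate = 1 / 10),
  A = factor(rep(c("a", "b", "c"), times = 60)),
  B = factor(rep(c("d", "e", "f"), each = 60))
)

# Compute the usual boxplot statistics without drawing anything
b <- boxplot(num ~ A + B, data = d, plot = FALSE)

# Group means, matched to the box order by name (a.d, b.d, c.d, ...)
grp_means <- tapply(d$num, interaction(d$A, d$B), mean)

# Row 3 of 'stats' holds the medians; replace it with the means
b$stats[3, ] <- grp_means[b$names]

# Draw the boxes from the modified statistics
bxp(b, main = "Middle line shows the mean, not the median")
```

The hinges and whiskers are left untouched, so only the middle line changes; with skewed data the mean line will usually sit above the position where the median was.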
Re: [R] Impaired boxplot functionality - mean instead of median
Frank E Harrell Jr wrote:
>   library(Hmisc)
>   library(lattice)
>   ?panel.bpplot
>   bwplot(, panel=panel.bpplot)
>
> By default, panel.bpplot shows the mean (dot) and median (line) plus several quantiles.
>
> To bother Martin in a friendly way, I think that means can be useful additions - not that they are so useful by themselves, but that when they differ a lot from the median, non-statisticians gain further information about asymmetry. Also, even though the simple box plot is elegant, I sometimes think it has a high ink to information ratio. I have gained a lot from seeing outer quantiles on the plot, and I don't like to show outer points for fear of someone labeling them outliers. For describing raw data distributions, I never find standard deviations useful, however.

  > fa
      doz fabp2
  1   900     2
  4  1500     2
  6  1000     2
  8   750     3
  10  750     1
  11 1750     2
  12  500     3
  > bwplot(doz~factor(fabp2),data=fa,panel=panel.bpplot)
  Error in sort(x, partial = unique(c(lo, hi))) :
    unsupported options for partial sorting

That's NOT a simple way. I need just one change. Is there any good way? $-(

-- Evgeniy
Re: [R] Impaired boxplot functionality - mean instead of median
Wiener, Matthew wrote:
> interaction(A, B) will create a single factor made up of the combinations of the two factors A and B. Perhaps that would let you use plotmeans.
>
> Hope this helps,
>
> Matt Wiener

So you think plotmeans(num~interaction(A,B)) will work? How? There is NO 'num' data for a.d, a.e, a.f etc.
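A small base-R sketch of why no separate 'num' data is needed: interaction(A, B) returns a factor with one entry per observation, so it lines up with `num` row by row and can be used like any single grouping factor. The vectors `A`, `B` and `num` are simulated here, and the plot uses stripchart() plus points() rather than gplots' plotmeans(), so no extra package is assumed.

```r
# Hypothetical data: two three-level factors and a numeric response
set.seed(11)
A <- factor(rep(c("a", "b", "c"), times = 30))
B <- factor(rep(c("d", "e", "f"), each = 30))
num <- rnorm(90, mean = 50, sd = 5)

# One combined factor, same length as 'num', with levels a.d, b.d, c.d, ...
AB <- interaction(A, B)
print(levels(AB))
print(table(AB))

# Mean of 'num' in each of the nine combinations
cell_means <- tapply(num, AB, mean)

# Raw values per combination, with the cell means overlaid in red
stripchart(num ~ AB, vertical = TRUE, method = "jitter",
           pch = 1, col = "grey50", las = 2)
points(seq_along(cell_means), cell_means, pch = 19, col = "red")
```

The same combined factor could then be tried in a formula such as num ~ AB with plotmeans(), as Matt suggests; whether that gives the layout wanted is not checked here.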
Re: [R] Impaired boxplot functionality - mean instead of median
Austin, Matt wrote:
> Check your syntax on the bwplot call.
>
>   fa <- data.frame(doz=sample(500:2000, size=500), fabp2=rep(1:20, 25))
>   bwplot(factor(fabp2) ~ doz, data=fa, panel=panel.bpplot)

Yes, that's almost the same. But there is a huge amount of data on the graphic, too much for assessing a rather simple and small dataset... And also too much for publication in journals, I think. :|

-- Evgeniy