Re: [R] Adding a legend to a (multi-facet) plot produced by ggplot().
How about defining your dataset differently, making the colouring property a variable?

    xxx <- data.frame(x=rep(x, 4), y=c(y2, y3),
                      grp=factor(rep(c("a","b"), each=20, times=2)),
                      type=factor(rep(c("clyde", "irving"), each=40)))

    ggplot(xxx, aes(x, y, colour=type, shape=type)) +
      geom_point() +
      geom_abline(intercept=3, slope=2) +
      facet_wrap(vars(grp)) +
      scale_colour_manual(values=c("blue", "red")) +
      scale_shape_manual(values=c(20,3))

Then you could also plot the four groups separately if you wanted to:

    ggplot(xxx, aes(x, y, colour=type, shape=type)) +
      geom_point() +
      geom_abline(intercept=3, slope=2) +
      facet_grid(rows=vars(type), cols=vars(grp)) +
      scale_colour_manual(values=c("blue", "red")) +
      scale_shape_manual(values=c(20,3))

Antony Unwin
University of Augsburg, Germany

> From: Rolf Turner
> Subject: [R] Adding a legend to a (multi-facet) plot produced by ggplot().
> Date: 1 December 2019 at 01:04:46 CET
> To: R help
>
> I have been struggling to add a legend as indicated in the subject line,
> with no success at all. I find the help to be completely bewildering.
>
> I have attached the code of what I have tried in the context of a simple
> reproducible example.
>
> I have also attached a pdf file of a plot produced with base graphics to
> illustrate roughly what I am after.
>
> I would be grateful if someone could point me in the right direction.
>
> cheers,
>
> Rolf Turner
>
> --
> Honorary Research Fellow
> Department of Statistics
> University of Auckland
> Phone: +64-9-373-7599 ext. 88276

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
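The reply above depends on x, y2 and y3 from the original poster's attachment, which is not reproduced in this thread. The following is only a sketch with made-up stand-ins for those objects (x of length 20, y2 and y3 of length 40 each, which is what the rep() calls in the reply seem to require). It shows the long-format layout and why a single legend appears: colour and shape are both mapped to the same variable, so ggplot2 merges the two guides. The ggplot2 call is skipped if the package is not installed.

```r
# Hypothetical stand-ins for the attached data (assumptions, not the poster's values)
set.seed(1)
x  <- seq(0, 1, length.out = 20)
y2 <- 3 + 2 * rep(x, 2) + rnorm(40, sd = 0.3)  # "clyde": groups a and b stacked
y3 <- 3 + 2 * rep(x, 2) + rnorm(40, sd = 0.6)  # "irving": groups a and b stacked

# Long format: one row per point, with the colouring property as a variable
xxx <- data.frame(
  x    = rep(x, 4),
  y    = c(y2, y3),
  grp  = factor(rep(c("a", "b"), each = 20, times = 2)),
  type = factor(rep(c("clyde", "irving"), each = 40))
)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # colour and shape share one variable, so one combined legend is drawn
  p <- ggplot(xxx, aes(x, y, colour = type, shape = type)) +
    geom_point() +
    geom_abline(intercept = 3, slope = 2) +
    facet_wrap(vars(grp)) +
    scale_colour_manual(values = c("blue", "red")) +
    scale_shape_manual(values = c(20, 3))
  if (interactive()) print(p)
}
```

Each grp/type combination holds 20 rows, so the two facets each show both types.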
[R] [R-pkgs] OutliersO3 version 0.5.3 released
Dear all,

A revised version of OutliersO3 is available on CRAN: <https://cran.r-project.org/web/packages/OutliersO3/index.html>.

The package has been restructured. The default is now that the tolerance level is set individually for each of the (six) outlier methods included. Plots have been added, as have outlier tables and scores for further analysis. It is also possible to draw an O3 plot using your own outlier identification method; see the vignette for more details. There are four vignettes to illustrate the use of the package.

Queries, comments, suggestions are welcome.

Thanks to Michael Friendly, Tae-Rae Kim, Nina Wu, and, in particular, Bill Venables for their comments on the old version.

Regards
Antony

Professor Antony Unwin
Mathematics Institute, University of Augsburg, 86135 Augsburg, Germany

___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages
[R] [R-pkgs] The OutliersO3 package is now on CRAN
Dear all,

The new package OutliersO3 is now available on CRAN: <https://cran.r-project.org/web/packages/OutliersO3/index.html>.

The aim is to graphically compare results of outlier analyses for all possible combinations of variables in a dataset. Various kinds of O3 (Overview of Outliers) plots can be drawn to show which cases are classified as outliers for which combinations of variables. Up to five different methods can be used to identify the potential outliers in a dataset.

There is a vignette:
https://cran.r-project.org/web/packages/OutliersO3/vignettes/O3-vignette.html
and a video of a talk on O3 plots from useR!:
https://channel9.msdn.com/events/useR-international-R-User-conferences/useR-International-R-User-2017-Conference/When-is-an-Outlier-an-Outlier-The-O3-plot?term=unwin

Queries, comments, suggestions are welcome.

Regards
Antony

Professor Antony Unwin
Mathematics Institute, University of Augsburg, 86135 Augsburg, Germany
[R] R Course in Dublin (May 24th-May 26th, 2017) Introductory -> Modern
An R course from introductory to modern will be given by Louis Aslett (Durham University, author of the packages PhaseType and ReliabilityTheory) and Antony Unwin (author of the book “Graphical Data Analysis with R”, CRC Press 2015, http://www.gradaanwr.net).

The course will be held in Dublin at the IPA on Lansdowne Road (next to the Rugby ground) from May 24th to May 26th, 2017.

Details at http://insightsc.ie/training/r-statistical-software/ or send an email to train...@insightsc.ie for further information.

Antony Unwin
Insight Statistical Consulting, Dublin, Ireland
University of Augsburg, Germany
[R] R Course in Dublin (January 30th-February 1st, 2017) Introductory -> Modern
An R course from introductory to modern will be given by Louis Aslett (Oxford University, author of the packages PhaseType and ReliabilityTheory) and Antony Unwin (author of the book “Graphical Data Analysis with R”, CRC Press 2015, http://www.gradaanwr.net).

The course will be held in Dublin from January 30th to February 1st, 2017.

Details at http://insightsc.ie/training/r-statistical-software/

Antony Unwin
Insight Statistical Consulting, Dublin, Ireland
University of Augsburg, Germany
[R] R Course in Dublin (July 20th-22nd, 2016) Introductory -> Modern
An R course from introductory to modern will be given by Louis Aslett (Oxford University, author of the packages PhaseType and ReliabilityTheory) and Antony Unwin (author of the book “Graphical Data Analysis with R”, CRC Press 2015, http://www.gradaanwr.net).

The course will be offered again on September 7th-9th, 2016 in Dublin.

Details at http://insightsc.ie/training/r-statistical-software/

Antony Unwin
Insight Statistical Consulting, Dublin, Ireland
University of Augsburg, Germany
[R] R Course in Dublin (February 3rd-5th)
The course will be given by Louis Aslett (Oxford University, author of the packages PhaseType and ReliabilityTheory) and Antony Unwin (author of the book “Graphical Data Analysis with R”, CRC Press 2015).

Details at http://insightsc.ie/training/r-statistical-software/

Antony Unwin
University of Augsburg, Germany
and Insight Statistical Consulting, Dublin, Ireland
[R] R Course in Dublin (September 14-16)
Details at http://insightsc.ie/training/r-statistical-software/

Antony Unwin
University of Augsburg, Germany
and Insight Statistical Consulting, Dublin, Ireland
[R] R Course in Dublin (April 15-17)
Details at http://insightsc.ie/training/r-statistical-software/

Antony Unwin
University of Augsburg, Germany
and Insight Statistical Consulting, Dublin, Ireland
[R] Bill Venables Workshop
Bill Venables talks R :: Augsburg University, Germany :: 2-3 July 2012

Bill Venables will give a two-day R Workshop in Augsburg on the 2nd and 3rd July 2012, an expanded version of the course which he has been invited to give at this year's useR! meeting in Nashville.

Details: www.math.uni-augsburg.de/termin/R-workshop.html

Organised by the Department of Computer-Oriented Statistics and Data Analysis, University of Augsburg

Antony Unwin
un...@math.uni-augsburg.de
[R] Re. When is *interactive* data visualization useful to use?
Hello Tal,

You asked *When is it helpful to use interactive plots? Either for data exploration (for ourselves) and data presentation (for a client)?*

My answer: It's helpful for checking data quality, for exploration with and without clients, for checking results, and for presenting data.

Notes:

(1) It's difficult to explain interactive data visualization in print; demonstrations are so much more effective.

(2) Interactive data visualization is fun, both for the analyst and, more importantly, for the dataset owners. You not only get better interaction with the data, you get better interaction with the scientists you cooperate with. They are prepared to contribute, because they can understand what is going on. That is not always the case with statistical models.

(3) The key is not animation but direct manipulation. The aim is to be able to directly interact with all statistical objects in a graphic: querying, linking, reordering, reformatting, zooming, whatever.

(4) You write of point-based graphics; what about area-based graphics like histograms, barcharts and mosaicplots? For categorical data the ability to select groups and look at spineplots of other variables to compare proportions is very effective. (And don't forget linking to maps for spatial data.)

(5) You mention outliers. How do you decide what is an outlier? Interactive parallel coordinate plots are extremely useful, either for identifying outliers or for checking ones found with an analytic approach.

(6) Interactive data visualization is not in competition with other approaches, it complements them. Results found with models should be checked graphically and results found graphically should be checked analytically. Your comment about data dredging is important, though why people think this only happens with graphics and not with modelling approaches always puzzles me!

(7) There are often interesting features of a dataset (not just errors and outlier groups) that can be found graphically that would be difficult or impossible to find analytically.

Have a look at Interactive Graphics for Data Analysis: Principles and Examples by Martin Theus and Simon Urbanek (Chapman & Hall). There are some excellent explanations and case studies there.

I could go on (and on), but what you really need is a good demo.

Best regards
Antony

PS Have you reported the bugs you have found in GGobi and Mondrian to the software authors?

Antony Unwin
Professor of Computer-Oriented Statistics and Data Analysis,
Mathematics Institute, University of Augsburg, 86135 Augsburg, Germany
Re: [R] Where has the stats-rosuda-devel mailing list gone?
Oliver,

Apologies for the confusion, there was a server upgrade in the computer centre here which gave us some grief. The list should be fine now.

Best regards
Antony

Antony Unwin
Professor of Computer-Oriented Statistics and Data Analysis,
Mathematics Institute, University of Augsburg, 86135 Augsburg, Germany

> From: o.mann...@auckland.ac.nz
> Date: 14 May 2010 12:51:03 AM CEST
> To: r-help@r-project.org
> Subject: [R] Where has the stats-rosuda-devel mailing list gone?
>
> I require some assistance with JGR, but following the mailing list link
> from http://jgr.markushelbig.org/FAQ.html leads me to
> http://mailman.rz.uni-augsburg.de/mailman/listinfo/stats-rosuda-devel
> which responds with "No such list stats-rosuda-devel".
>
> I was previously subscribed to this mailing list and want to resubscribe,
> but where has it gone?
>
> Many thanks,
>
> Oliver Mannion
> Programmer
> COMPASS - Centre of Methods and Policy Application in the Social Sciences
> www.compass.auckland.ac.nz
> The University of Auckland, New Zealand
> Phone +(649) 373 7999 ext 89760
Re: [R] Visualizing binary response data?
You could also try using interactive graphics in iplots. Linking from a barchart of your binary response variable to your eight continuous predictors in a parallel coordinate plot and to your four categorical predictors in some form of mosaicplot could be very informative.

Graphics are not necessarily the method of choice to select your predictor variables, as Frank Harrell has pointed out. It is also sensible not to rely on modelling alone. Graphic displays can help you better understand your data and models. The two approaches are complementary.

Antony Unwin
University of Augsburg
Germany

On Tue, May 4, 2010 at 9:04 PM, Kim Jung Hwa kimhwamaill...@gmail.com wrote:

> Hi All,
>
> I'm dealing with binary response data for the first time, and I'm confused
> about what kind of graphics I could explore in order to pick relevant
> predictors and their relation with response variable.
>
> I have 8-10 continuous predictors and 4-5 categorical predictors. Can
> anyone suggest what kind of graphics I can explore to see how predictors
> behave w.r.t. response variable...
>
> Any help would be greatly appreciated, thanks,
> Kim
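The linked displays described in the reply need the iplots package and an interactive session. As a static stand-in (a different technique, not the linked iplots graphics themselves), base R's spineplot() can show how the proportion of a binary response varies with one continuous and one categorical predictor. The data frame d and its columns below are simulated and purely hypothetical.

```r
# Simulated, hypothetical data: one continuous and one categorical predictor
set.seed(42)
n <- 200
d <- data.frame(
  x1  = rnorm(n),
  grp = factor(sample(c("low", "mid", "high"), n, replace = TRUE),
               levels = c("low", "mid", "high"))
)
# Binary response whose success probability rises with x1
d$resp <- factor(rbinom(n, 1, plogis(1.5 * d$x1)),
                 levels = 0:1, labels = c("no", "yes"))

# Two static views of the response proportions
show_binary <- function(d) {
  op <- par(mfrow = c(1, 2))
  on.exit(par(op))
  spineplot(resp ~ x1, data = d)   # spinogram: response share across bins of x1
  spineplot(resp ~ grp, data = d)  # spine plot: response share per category
  invisible(NULL)
}

if (interactive()) show_binary(d)
```

With eight continuous and four categorical predictors one would draw one such panel per predictor; unlike the linked plots, these static panels cannot show how a selection in one display relates to the others.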
Re: [R] pairs plots in R
If you want to do efficient exploratory data analysis on this kind of dataset, then interactive graphics with parallel coordinate plots (ipcp in iplots) should help. Of course, it depends what you mean by large. It might be worth looking at the book Graphics of Large Datasets for some ideas.

Antony Unwin
Professor of Computer-Oriented Statistics and Data Analysis,
Mathematics Institute, University of Augsburg, 86135 Augsburg, Germany
Tel: + 49 821 5982218

> From: Sharma, Dhruv [EMAIL PROTECTED]
> Date: 19 October 2008 10:58:53 pm GMT+02:00
> To: r-help@r-project.org
> Subject: [R] pairs plots in R
>
> Hi, is there a way to take a data frame with 100+ columns and large data
> set to do efficient exploratory analysis in R with pairs?
>
> I find using pairs on the whole matrix is slow and the resulting matrix
> is tiny. Also the variable of interest for me is a binary var Y or N.
>
> Is there an efficient way to graphically view many variable relationships
> that does not look teeny? I could do pairs 10 at a time but this seems
> too brute force.
>
> thanks
> Dhruv
Re: [R] Using interactive plots to get information about data points
> I have been experimenting with interactive packages such as iplots and
> playwith. Consider the following sample dataset:
>
> A B C D 1 5 5 9 3 2 8 4 1 7 3 0 7 2 2 6
>
> Let's say I make a plot of variable A. I would like to be able to click
> on a data point (e.g. 3) and have a pop-up window tell me the
> corresponding value for variable D (e.g. 4).

You're right that iplots can't do that (it's on the wishlist), but it offers alternatives. As a multiwindowing package, it is natural to have graphics displays open for all variables of current interest. This means that selecting a point highlights it in all displays and you can see or query the corresponding values.

Antony Unwin
Professor of Computer-Oriented Statistics and Data Analysis,
Mathematics Institute, University of Augsburg, 86135 Augsburg, Germany
Tel: + 49 821 5982218
[EMAIL PROTECTED]
http://stats.math.uni-augsburg.de/
Re: [R] History pruning
JGR's Copy Commands command works well for me (even if it is both fascinating and embarrassing how little is sometimes left over). It retains only commands that worked, so it is still not the minimum possible.

Antony Unwin
Professor of Computer-Oriented Statistics and Data Analysis,
Mathematics Institute, University of Augsburg, 86135 Augsburg, Germany
Tel: + 49 821 5982218
[EMAIL PROTECTED]
http://stats.math.uni-augsburg.de/
Re: [R] Datasets in R
Carlos,

There are many sources of real datasets (in R itself, on the web); you just need to look a little. For teaching purposes, I think it is always better to use real datasets than to use simulated ones.

One thing bothers me, though. You imply that in all the examples you have, the data are well fit with linear models, the residuals are normal and there is no sign of heteroscedasticity. That sounds like a very unusual set of examples!

Best
Antony

From: Roland Rau [EMAIL PROTECTED]
Date: 30 May 2008 12:23:17 AM GMT+02:00
To: Carlos López [EMAIL PROTECTED]
Cc: r-help@r-project.org
Subject: Re: [R] Datasets in R

Carlos López wrote:
> I'm trying to find datasets that will give me residuals, after applying
> the lm function, with no normality, non linearity, and heteroscedasticity
> so I can try to exemplify those cases in the linear regression model.
> Can you give any advice on what datasets would be appropriate? I can't
> use the ones in the alr3 package because those have already been seen in
> class. Thank you very much :-)
> natorro

If you know what you are looking for (or not looking for), wouldn't it be the easiest and fastest thing to do to simulate such a dataset yourself?

Best,
Roland
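Roland's quoted suggestion of simulating such data can be sketched as follows. This is one assumed recipe among many, and it is no substitute for the real datasets Antony recommends: the response is given a non-linear mean and a multiplicative error, so a straight-line lm() fit leaves curved residuals whose spread grows with the fitted values. All names and parameter values are made up for illustration.

```r
# Simulated example (assumed recipe): curved mean plus multiplicative noise
set.seed(2008)
n <- 200
x <- runif(n, 1, 10)
y <- exp(0.3 * x) * exp(rnorm(n, sd = 0.4))  # spread increases with the mean

fit <- lm(y ~ x)       # deliberately misspecified straight-line model
res <- resid(fit)

# Rough numerical signs of trouble:
spread_trend <- cor(abs(res), fitted(fit))  # positive if spread grows with fit
normality    <- shapiro.test(res)           # formal check of residual normality

# Standard diagnostic plots: residuals vs fitted, normal Q-Q, scale-location
if (interactive()) {
  op <- par(mfrow = c(1, 3))
  plot(fit, which = 1:3)
  par(op)
}
```

Fitting `lm(log(y) ~ x)` to the same data is a useful contrast for a class, since the log scale matches how the data were generated.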
Re: [R] Response to R across the university
On 18 Apr 2008, at 6:42 pm, Peter Dalgaard wrote:

> Antony Unwin wrote:
>> ... The course itself went very well. We encouraged people to bring
>> their laptops and work in groups. Using JGR as the interface to R helped
>> a lot, as it was easier for people to load their own data and use the
>> help. Of course, JGR is compulsory in Augsburg.
>
> Speaking of JGR... What are the appropriate channels to complain and/or
> contribute?

This will do fine, though [EMAIL PROTECTED] would be the official route and Markus Helbig ([EMAIL PROTECTED]) is the key person.

> I had looked into it at an earlier point (on Fedora Linux) and got stuck
> on some fairly simple usability issues, like font choice and color
> scheme. Things like
>
> - if you select a bigger font, the window size remains the same. Changes
>   to window size do not survive to subsequent invocations.
> - output is quite unreadable in proportional fonts, so why make them
>   available?
> - some fonts have poor contrast, but there seems to be no way to select
>   boldface versions.
> - the latest version has turned to a blue-on-gray scheme, which doesn't
>   help with the contrast either
>
> This is all pretty trivial stuff, but the bottom line is that all the
> really exciting stuff isn't really of much use if students cannot read it
> in the back rows.

Your points should certainly be looked into. Having the font big enough for students to read in the back row has not been a problem for me.

> A couple other maybe not all that trivial things to do is to improve the
> data import (it is losing out on most of the things that I tried)

Now what would Brian say to a comment like that? Please insert your favourite put-down here:

And then perhaps you would be kind enough to let us know in a little more detail what hasn't worked for you.

> and to get the wires connected between the DataTable and the edit()
> command.

Thanks for your comments.

Antony
[R] Response to R across the university
This email isn't asking for assistance, but I thought R-help readers would find it interesting.

This week we offered a half-day introduction to R for researchers at Augsburg University. The response was astonishing. Although Augsburg has no medical faculty and no engineers, there was far too much demand, with interest from every faculty (barring theology, for one small village of indomitable Gauls still holds out against the R invaders --- perhaps that should be obdurate rather than indomitable) and we had participants from computer science, geography, physics, law, linguistics, education, sociology, marketing, psychology, finance, ...

The course itself went very well. We encouraged people to bring their laptops and work in groups. Using JGR as the interface to R helped a lot, as it was easier for people to load their own data and use the help. Of course, JGR is compulsory in Augsburg. Giving everyone a Butterbreze (a local delicacy) halfway through may have contributed to the good humour of the course as well!

Statistics doesn't always have a positive image. I can recommend running an R course as one way of making a good impression.

Antony Unwin
Professor of Computer-Oriented Statistics and Data Analysis,
Mathematics Institute, University of Augsburg, 86135 Augsburg, Germany
Tel: + 49 821 5982218
http://stats.math.uni-augsburg.de/
Re: [R] Spidergram
A parallel coordinate plot would do fine. Load the package iplots and then use the command

    ipcp(x1, x2, ...)

Antony Unwin
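For readers who cannot run iplots (it needs Java and an interactive session), a static parallel coordinate plot can be drawn with parcoord() from the MASS package. This is a swapped-in alternative to ipcp(), not the interactive plot recommended above; the helper draw_pcp() and the use of the built-in iris data are illustrative assumptions.

```r
# Static parallel coordinate plot via MASS::parcoord (stand-in for ipcp)
draw_pcp <- function(df, groups = NULL) {
  # One colour per observation; a single grey if no grouping is supplied
  cols <- if (is.null(groups)) "grey30" else as.integer(factor(groups)) + 1L
  MASS::parcoord(df, col = cols, var.label = TRUE)
  invisible(cols)
}

num <- iris[, 1:4]  # numeric columns only; each becomes one parallel axis
if (interactive()) draw_pcp(num, iris$Species)
```

Each axis is rescaled to its own range, so the plot compares profiles across variables rather than absolute values.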
[R] Raw histogram plots
Why not use the interactive histogram in iplots?

    ihist(x)

Then you can vary the binwidth interactively and get a very quick idea of the structure of your data by looking at a range of plots with different binwidths. Relying on a single plot to reveal everything about a variable's distribution is not a good idea.

A couple of people suggested estimating the density. That may miss roundings, discretisation or other odd structures. We should never underestimate what Peter Huber called the rawness of raw data.

Antony Unwin
Professor of Computer-Oriented Statistics and Data Analysis,
Mathematics Institute, University of Augsburg, Germany
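The point about looking at several binwidths can be tried without iplots by redrawing a base R histogram with different breaks. The sketch below uses made-up data in which roughly a third of the values were rounded to the nearest 5 (an assumption chosen to mimic the kind of rounding mentioned above); the heaping is invisible at wide bins and obvious at narrow ones.

```r
# Hypothetical data: normal measurements, about 30% recorded to the nearest 5
set.seed(7)
x <- rnorm(1000, mean = 50, sd = 10)
rounded <- runif(1000) < 0.3
x[rounded] <- 5 * round(x[rounded] / 5)

# Draw the same variable with several binwidths, aligned to multiples of w
hist_by_binwidth <- function(x, widths = c(10, 5, 1, 0.5)) {
  op <- par(mfrow = c(2, 2))
  on.exit(par(op))
  invisible(lapply(widths, function(w) {
    breaks <- seq(floor(min(x) / w) * w, ceiling(max(x) / w) * w, by = w)
    hist(x, breaks = breaks, main = paste("binwidth", w), xlab = "x")
  }))
}

if (interactive()) hist_by_binwidth(x)
```

A kernel density estimate of the same x would smooth the spikes at multiples of 5 away, which is the risk described in the post.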
Re: [R] Scatterplot Showing All Points
On 18 Dec 2007, at 2:42 pm, Duncan Murdoch wrote:

>> (I must admit to being very surprised that jittering and sunflower plots
>> have been suggested for a dataset of 5000 points. Do those who mentioned
>> these methods have examples on that scale where they are effective?)
>
> Sure. The original post said there were about 50-60 unique locations.
> This plot:
>
>     x <- rbinom(5000, 20, 0.15)
>     y <- rbinom(5000, 20, 0.15)
>     plot(x, y)
>
> has a few more unique locations; tune those probabilities if you want it
> closer. Due to the overlap, the distribution is very unclear. But this
> plot
>
>     plot(jitter(x), jitter(y))
>
> makes the distribution quite clear.

No it doesn't! It makes it moderately clearer than the plot without jittering. One good alternative here is the fluctuation diagram variant of a mosaic plot:

    xx <- as.factor(x)
    yy <- as.factor(y)
    imosaic(xx, yy, type="f")

Using jittering for categorical data is really not to be recommended and will certainly degrade in performance as the dataset gets bigger.

Antony Unwin
Professor of Computer-Oriented Statistics and Data Analysis,
University of Augsburg, Germany
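A fluctuation diagram can also be approximated in base graphics, without iplots, by tabulating the pairs and drawing one square per non-empty cell with area proportional to its count. This is only a rough static sketch of the idea, not the imosaic() implementation; the helper name fluctuation() is made up here.

```r
# Same simulated data as in the quoted example
set.seed(1)
x <- rbinom(5000, 20, 0.15)
y <- rbinom(5000, 20, 0.15)
tab <- table(x, y)  # counts for every observed (x, y) combination

# One square per non-empty cell; side = sqrt(relative count), so area ~ count
fluctuation <- function(tab) {
  d <- as.data.frame(tab, stringsAsFactors = FALSE)
  d <- d[d$Freq > 0, ]
  xv <- as.numeric(d[[1]])
  yv <- as.numeric(d[[2]])
  side <- sqrt(d$Freq / max(d$Freq))  # largest cell fills one grid unit
  symbols(xv, yv, squares = side, inches = FALSE, bg = "grey40",
          xlab = names(d)[1], ylab = names(d)[2])
  invisible(d)
}

if (interactive()) fluctuation(tab)
```

Because every point is aggregated into its cell first, the display does not get messier as the number of observations grows, which is where jittering struggles.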
Re: [R] Scatterplot Showing All Points
On 18 Dec 2007, at 4:49 pm, Duncan Murdoch wrote:

>> One good alternative here is the fluctuation diagram variant of a mosaic
>> plot:
>>
>>     xx <- as.factor(x)
>>     yy <- as.factor(y)
>>     imosaic(xx, yy, type="f")
>
> That plot is better than jittering, but there's the problem in the mosaic
> plot of understanding the scale of the rectangles: is it area or diameter
> that encodes the count?

Area is used.

> With a jittered plot, you lose resolution when the number of points gets
> too high because you just see a mess of ink, but at least you only
> require the viewer to count in order to get a close numerical reading
> from the plot.

If someone needs a count, they should be given a table. Graphics are for qualitative conclusions, not details. Anyway, counting will only work for really small datasets.

> I could also claim that while imperfect, at least jittering is widely
> applicable. For example, if the data were not on a regular grid, perhaps
> because they had been generated like this:
>
>     xloc <- rnorm(50)
>     yloc <- rnorm(50)
>     index <- sample(1:50, 5000, rep=TRUE, prob = abs(xloc))
>     x <- xloc[index]
>     y <- yloc[index]
>
> then jittering still works as well (or as poorly), but the imosaic would
> not work at all.

That's right and that's (almost) the sort of example I was thinking of. For a limited number of locations like this a bubble plot would be best (which has already been suggested in this thread, I think). For many locations and few replications I would still go for varying pointsize and transparency.

Incidentally, to check your suggestion I ran your code and discovered that the transparency in iplot does not seem to like replications. Very strange, we'll have to check why. I then looked closely at the numbers of replications generated and discovered that case 25 was picked 325 times and case 40 only once. Rather too extreme for my liking! Running it again gave very similar results, though not exactly the same: this time it was 325 times for case 25 and case 40 was not picked at all. Other numbers varied slightly. This is not what I expected, any ideas?

> P.S. iplots 1.1-1 may have an init problem in Windows: in my first
> attempt, the plot made the boxes too large to fit in their cells, but it
> fixed itself when I resized the window, and the bug doesn't seem to be
> repeatable.

Thanks. This happens occasionally on the Mac too. Refreshing solves it in practice, but we need to find out why it can happen (and stop it happening!).

Antony Unwin
Professor of Computer-Oriented Statistics and Data Analysis,
University of Augsburg, Germany
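One possible reading of the uneven replication counts, offered as an assumption rather than a confirmed diagnosis: sample() normalises the prob weights, so with prob = abs(xloc) each case is expected to be drawn about 5000 * abs(xloc[i]) / sum(abs(xloc)) times. Cases with xloc close to zero are then almost never picked, and if only the sample() line was rerun with the same xloc, the expected counts would not change between runs. The sketch below compares observed and expected counts and draws the bubble plot mentioned in the reply; the seed and the helper bubble_plot() are illustrative.

```r
set.seed(123)
xloc <- rnorm(50)
yloc <- rnorm(50)

# Expected number of draws per case: weights are normalised by sample()
expected <- 5000 * abs(xloc) / sum(abs(xloc))

index    <- sample(1:50, 5000, replace = TRUE, prob = abs(xloc))
observed <- tabulate(index, nbins = 50)  # replications per case, zeros included

# Side-by-side comparison, smallest expected counts first
comparison <- data.frame(case = 1:50, expected = round(expected, 1),
                         observed = observed)
comparison <- comparison[order(comparison$expected), ]

# Bubble plot: circle area proportional to the number of replications
bubble_plot <- function(xloc, yloc, counts) {
  keep <- counts > 0
  symbols(xloc[keep], yloc[keep], circles = sqrt(counts[keep]),
          inches = 0.15, bg = adjustcolor("blue", alpha.f = 0.3),
          xlab = "xloc", ylab = "yloc")
  invisible(sum(keep))
}

if (interactive()) bubble_plot(xloc, yloc, observed)
```

Using `head(comparison)` shows the cases least likely to be drawn; under this reading, a case picked once or not at all simply has an xloc very near zero.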
Re: [R] Packages - a great resource, but hard to find the right one
Johannes Hüsing wrote:

>> Above all there are lots of packages. As the software editor of the
>> Journal of Statistical Software I suggested we should review R packages.
>
> You mean: prior to submission?

No.

>> No one has shown any enthusiasm for this suggestion, but I think it
>> would help. Any volunteers?
>
> Thing is, I may like to volunteer, but not in the "here's a package for
> you to review by week 32" way. Rather in the way that I search a package
> which fits my problem.

That's what I was hoping for.

> One package lets me down and I'd like other users and the maintainer to
> know about it. The other one works black magic and I'd like to drop a
> raving review about it. This needs an infrastructure with a low barrier
> to entry. A wiki is not the worst idea if the initial infrastructure is
> geared at addressing problems rather than packages.

We should differentiate between rave reviews of features that just happened to be very useful to someone and reviews of a package as a whole. Both have their place and at the moment we don't have either.

If you are willing to review an R package or aspects of R for JSS please let me know.

Antony Unwin
Professor of Computer-Oriented Statistics and Data Analysis,
Mathematics Institute, University of Augsburg, Germany
Re: [R] Packages - a great resource, but hard to find the right one
On 23 Nov 2007, at 4:51 pm, hadley wickham wrote: There are two common types of review. When reviewing a paper, you are helping the author to make a better paper (and it's initiated by the author). When reviewing a book, you are providing advise on whether someone should make an expensive purchase (and it's initiated by an third party). Reviewing an R package seems somewhat in between. How would you deal with new version of an R package? It seems like there is the potential for reviews to become stale very quickly. This is a strange argument. A good package will get a good review, which may help it to become better. A review of a weak package can point out how it can be fixed. Reviews will not become stale, just because packages are frequently updated by their authors (like some that could be mentioned). These are generally smaller changes. A constructive review will not just be concerned with details, but more with the overall aims of the package and how they are achieved (or not achieved). Another model to look at would be that of an encyclopedia, something like the existing task views. To me, it would be of more benefit if JSS provided support, peer review, and regular review, for these. Why should JSS, one of the few journals for statistical software, review texts? Task views are a good idea, but are general. They give only a brief and subjective overview (and can hardly be expected to do more). Entries would be more of a survey, and could provide links to the literature, much like a chapter of MASS. If you were not an enthusiastic author of many R packages I would start to think that you are afraid of being reviewed, Hadley! What have you against someone studying a package, a group of packages or some other aspect of R in detail? Maybe I had better start reviewers on your packages first... Thanks to several people who have contacted me independently and offered to review packages, I'll keep the list informed about how that goes. 
Apologies for JSS's webpage being down today; Jan de Leeuw tells me it's something to do with Thanksgiving weekend.

Antony Unwin
Professor of Computer-Oriented Statistics and Data Analysis,
Mathematics Institute, University of Augsburg, Germany
Re: [R] Packages - a great resource, but hard to find the right one
There have been several constructive responses to John Sorkin's comment, but none of them are fully satisfactory. Of course, if you know the name of the function you are looking for, there are lots of ways to search, provided that everyone calls the function by a name that matches your search. If you think there might be a function, but you don't know the name, then you have to be lucky in how you search. R is a language, and the suggestions so far seem to me like dictionary suggestions, whereas maybe what John is looking for is something more like a thesaurus.

R packages are a strange collection, as befits a growing language. There are large packages, small packages, good packages (and not so good packages), personal mixtures of tools in packages, packages to accompany books, superseded packages, unusual packages, everything. Above all there are lots of packages.

As the software editor of the Journal of Statistical Software I suggested we should review R packages. No one has shown any enthusiasm for this suggestion, but I think it would help. Any volunteers?

Antony Unwin
Professor of Computer-Oriented Statistics and Data Analysis,
Mathematics Institute, University of Augsburg, Germany
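To make the dictionary versus thesaurus distinction concrete, here is a minimal sketch using search tools that ship with base R. The search terms are only illustrative assumptions, and both calls see nothing beyond what is attached or installed on the local machine, which is part of the problem described above.

```r
# Name-based lookup ("dictionary"): only helps if the function name
# matches your guess. Searches objects on the current search path.
name_hits <- apropos("median")

# Topic-based lookup (a step towards a "thesaurus"): searches help-page
# aliases, concepts and titles, but only in locally installed packages.
topic_hits <- help.search("parallel coordinate")
nrow(topic_hits$matches)   # number of matching help entries on this machine

# RSiteSearch("parallel coordinate plot")  # web search; opens a browser
```

Neither call says anything about which of the matching packages is any good, which is the gap that reviews would fill.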
Re: [R] Tart charts
Michael,

Try this alternative:

    # from http://research.microsoft.com/users/lamport/pubs/hair.pdf
    hairsex <- matrix(c(46, 45, 13, 12, 1, 101, 0, 20), 2, 4, byrow=TRUE)
    dimnames(hairsex) <- list(Gender=c("Female", "Male"), "Hair color"=c("Blond", "Brown", "Red", "Other"))
    library(vcd)
    mosaic(hairsex, shade=TRUE)

There are uses for pie charts, but this isn't one of the better ones.

There are many kinds of mosaic plots, but this isn't one of the better ones. A multiple barchart looks good here. I did like your idea of using colours; it emphasised the number of women with dark blue hair.

Antony
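A multiple barchart of the same table could be drawn along these lines. This is a sketch using base graphics rather than anything posted in the thread; the axis labels and legend placement are assumptions chosen for illustration.

```r
# Counts as given in the code quoted in the message
# (rows: gender, columns: hair colour).
hairsex <- matrix(c(46, 45, 13, 12,
                    1, 101, 0, 20),
                  nrow = 2, ncol = 4, byrow = TRUE,
                  dimnames = list(Gender = c("Female", "Male"),
                                  "Hair color" = c("Blond", "Brown", "Red", "Other")))

# One cluster of bars per hair colour, one bar per gender within a cluster.
# barplot() returns the bar midpoints, here as a 2 x 4 matrix.
bar_mids <- barplot(hairsex, beside = TRUE, legend.text = TRUE,
                    args.legend = list(x = "topright", title = "Gender"),
                    xlab = "Hair color", ylab = "Count")
```

With the bars side by side, the counts for the two genders can be compared directly within each hair colour.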
Re: [R] sprucing up the R homepage
It's a good idea to spruce up the graphics on R's webpage, but before we get too excited about improving how they are drawn, shouldn't we think about improving what has been drawn?

The original graphic showed off a wide variety of graphics which can be drawn with R, all applied to the swiss fertility dataset. Are these the kinds of graphics we would want to draw in a real analysis? I think a single parallel coordinate plot is more informative than this collection and would be easier to explain. If you want to try it for yourself, load the package iplots, then run data(swiss) followed by ipcp(swiss).

So maybe someone should suggest graphics from another dataset to adorn the webpage and demonstrate R's graphics capabilities.

Antony Unwin
Professor of Computer-Oriented Statistics and Data Analysis,
Mathematics Institute, University of Augsburg
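For readers without iplots (which is interactive and needs Java), a static parallel coordinate plot of the same data can be sketched with parcoord() from the MASS package. This is a substitute for the ipcp() call suggested above, not the same tool, and colouring by Catholic share is an assumption added purely for illustration.

```r
library(MASS)   # recommended package, shipped with standard R installations
data(swiss)     # 47 French-speaking Swiss provinces, 6 variables

# Illustrative choice: highlight provinces with a Catholic majority.
line_cols <- ifelse(swiss$Catholic > 50, "red", "grey40")

# One line per province, one vertical axis per variable,
# each axis scaled to that variable's own range.
parcoord(swiss, col = line_cols, var.label = TRUE)
```

Unlike ipcp(), this plot cannot be queried or reordered interactively, so it shows only one fixed view of the data.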