RE: [R] how R parses expression?
I'm not really sure exactly what you are getting at. Much of R's functionality is built using the R language itself. If you are looking to interface with other languages, you might want to start with the Writing R Extensions manual, which has a section on "The R API: entry points for C code". If you are using the Windows version, this manual is available from the Help menu (assuming you installed the documentation).

Tom Mulholland

R 2.0 4-Oct-2004
Windows XP

-----Original Message-----
From: xudongyuan [mailto:[EMAIL PROTECTED]
Sent: Tuesday, 7 December 2004 11:40 AM
To: [EMAIL PROTECTED]
Subject: [R] how R parses expression?
Importance: High

Hi all and R developers,

Since I am a beginner with R, I had some questions when I studied the source code. I wonder if anyone has time to help me?

My question is how R expressions are turned into C code. That is, when I input an expression in the GUI or from a file, how does R convert the expression to the parse tree in C code (maybe it is a C function; if so, which one?), and then carry out the subsequent processing?

thanks
dongyuan xu

__
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
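The parse tree asked about above can be inspected from within R itself, which may be a gentler starting point than the C sources. The following is a minimal sketch using only base R (`parse`, `as.list`, `eval`); the expression `x + 2 * 3` and the variable names are illustrative assumptions, and the sketch does not show the C-level parser.

```r
# Parse source text into an unevaluated expression (the parse tree).
exprs <- parse(text = "x + 2 * 3")
call1 <- exprs[[1]]          # the first (and only) top-level call

class(call1)                 # a "call" object
as.list(call1)               # the function `+`, followed by its two arguments

# The right-hand argument is itself a call: 2 * 3
inner <- call1[[3]]

# Evaluate the tree, supplying a value for x
result <- eval(call1, list(x = 1))
```

Walking `call1` recursively in this way shows the same nested structure that the evaluator receives after parsing.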
RE: [R] tree class in R?
Have you had a look at the rpart package? If you haven't installed it, it may be worth doing so. Then you can type

    require(rpart)
    ?rpart.object

Tom

-----Original Message-----
From: DFARRAR [mailto:[EMAIL PROTECTED]
Sent: Monday, 6 December 2004 11:09 AM
To: [EMAIL PROTECTED]
Subject: [R] tree class in R?

I am trying to store a couple numbers for each partition, in a subset of the partitions of my data set. Of course, one can accomplish this using a binary tree. (The first split is on inclusion/exclusion of the first object, and so on.) I can probably simulate a tree using vectors. (One vector gives the index of the left child node, another the index of the right child node.)

However, it seems like there must be a useful class associated with the clustering or recursive partitioning procedures, perhaps not out there for everyone to see. I poked around on the R page and didn't see anything that clearly met my needs very directly. I hope I would not have to learn recursive partitioning in R to find what I need.
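The "simulate a tree using vectors" idea from the question can be sketched as follows. This is only an illustration under assumed names: the columns `left`, `right`, `value1`, `value2` and the helper `leaves_below` are hypothetical and are not taken from rpart or any other package.

```r
# Hypothetical layout: one row per node, children stored by index (NA = leaf).
tree <- data.frame(
  left   = c(2, 4, NA, NA, NA),        # index of the left child
  right  = c(3, 5, NA, NA, NA),        # index of the right child
  value1 = c(10, 4, 6, 1, 3),          # first number stored per node
  value2 = c(0.5, 0.2, 0.3, 0.05, 0.15)
)

# Recursively collect the indices of all leaves below a node.
leaves_below <- function(tree, node = 1) {
  if (is.na(tree$left[node])) return(node)
  c(leaves_below(tree, tree$left[node]),
    leaves_below(tree, tree$right[node]))
}

leaf_ids <- leaves_below(tree)
```

Keeping the per-node numbers as extra columns means the whole structure stays an ordinary data frame that can be subset and printed as usual.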
RE: [R] factor matrix
I'm not sure, but is this what you want?

    matrix(as.numeric(factor(c(T,F,F,T))), 2,2)

Tom Mulholland

-----Original Message-----
From: Adrian Baddeley [mailto:[EMAIL PROTECTED]
Sent: Friday, 3 December 2004 1:45 PM
To: [EMAIL PROTECTED]
Subject: [R] factor matrix

Sorry if this is a FAQ.

Is there a good reason why a factor has to be a one-dimensional vector and cannot be a matrix? I want to construct matrices of categorical values.

Vain attempts like

    matrix(factor(c(T,F,F,T)), 2,2)

yield a matrix of character strings representing the factor levels, not the levels themselves, while

    factor(matrix(c(T,F,F,T), 2,2))

converts the matrix to a logical vector of length 4, then converts the vector to a factor.

Tia
---
Adrian Baddeley
RE: [R] factor matrix
This is my playing about. The last one is a matrix of factors of dimension 2 by 2. It's just that it does not look that way.

> tt <- matrix(as.numeric(factor(c(T,F,F,T))), 2,2)
> str(tt)
 num [1:2, 1:2] 2 1 1 2
> tt <- matrix((factor(c(T,F,F,T))), 2,2)
> str(tt)
 chr [1:2, 1:2] "TRUE" "FALSE" "FALSE" "TRUE"
> tt <- c(T,F,F,T)
> str(tt)
 logi [1:4] TRUE FALSE FALSE TRUE
> dim(tt) <- c(2,2)
> str(tt)
 logi [1:2, 1:2] TRUE FALSE FALSE TRUE
> tt <- factor(c(T,F,F,T))
> dim(tt) <- c(2,2)
> str(tt)
 factor [1:2, 1:2] TRUE FALSE FALSE TRUE
 - attr(*, "levels")= chr [1:2] "FALSE" "TRUE"
 - attr(*, "class")= chr "factor"
> tt
[1] TRUE  FALSE FALSE TRUE
Levels: FALSE TRUE

-----Original Message-----
From: Adrian Baddeley [mailto:[EMAIL PROTECTED]
Sent: Friday, 3 December 2004 2:44 PM
To: Mulholland, Tom
Cc: [EMAIL PROTECTED]
Subject: RE: [R] factor matrix

> matrix(as.numeric(factor(c(T,F,F,T))), 2,2)

No, this produces a matrix with numeric values, not categorical values.
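The last experiment above can be condensed into a short sketch. The first part repeats the dim-on-a-factor approach from the message; the second part (an integer code matrix plus a label lookup) is an alternative suggested here as an assumption about what might be wanted, not something proposed in the thread.

```r
# A factor is an integer vector of codes plus a "levels" attribute,
# so a dim attribute can be attached to it as to any other vector.
f <- factor(c(TRUE, FALSE, FALSE, TRUE))
dim(f) <- c(2, 2)

is.factor(f)   # still a factor
dim(f)         # 2 2

# Alternative: keep the integer codes in an ordinary matrix and
# look the labels up only when they are needed.
codes  <- matrix(as.integer(factor(c(TRUE, FALSE, FALSE, TRUE))), 2, 2)
labels <- levels(factor(c(TRUE, FALSE, FALSE, TRUE)))
label_matrix <- matrix(labels[codes], nrow(codes), ncol(codes))
```

The second form prints as a matrix, at the cost of carrying the level labels separately.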
RE: [R] Protocol for answering basic questions
I would support the notion that there is no defined point after which you do not need to ask basic questions. I would not like the list to be split. There is no need to change anything fundamental.

I do not believe that it is rude to expect people to put effort into ensuring that they are not needlessly using other people's time because of their lack of skill. If the responses are at times a little bit curt, it is an exceptionally small price to pay for the aid given by the list. Imho, people who follow the posting guide do not receive inappropriate replies.

Tom Mulholland

-----Original Message-----
From: Richard A. O'Keefe [mailto:[EMAIL PROTECTED]
Sent: Thursday, 2 December 2004 11:50 AM
To: [EMAIL PROTECTED]
Subject: RE: [R] Protocol for answering basic questions

...

As for arbitrary thresholds like 2 years, I have been using R since 1996 or 1997, and I would still find it necessary to be on the 'nonexpert' mailing list. I beg the keepers of the flame: DON'T split the list.
RE: Reasons not to answer very basic questions in a straightforward way; was: Re: [R] creating a sequence of object names
Your statement seems innocent enough on the face of it, but there are two facets that I think are worthy of note.

The first is that of time, and more specifically whose time. As a user of other lists I can say that this is the best list in terms of getting the answer to my problem, albeit sometimes obliquely. I intermittently respond to questions, generally of the type you refer to. I say intermittently because I don't have the time to do more than that. Why do I respond to these questions? Well, I made some of the same basic errors. As a much more knowledgeable user, I think twice (well, more like six times) before I post because I understand the amount of time it takes to create a response that is worthwhile. I'll get to the reason for not creating simple answers in the next point. If I had to pay for the quality of support that I get on this list, there is no way that I could afford it. I take what I get and I am grateful for the time given by so many. To assume that my time is more important than that of those who will give me the answer is disrespectful.

The second is a process referred to as crowding out. With reference to the list, there is a danger that it would cease to be a source of wisdom and start being a repetitive FAQ. As the list stands now, I learn much more from other people's questions than I do from my own. I read about different ways of approaching various tasks, and while I barely comprehend some of the more difficult questions, they provoke my curiosity. I can read an FAQ anytime; I can read all of the manuals; they won't go away. At the moment the list is full of variation, with the odd thread like this which sparks more philosophical content. If 90% of the list was full of questions that are tiresome because of dullness, or more succinctly tedious, why would I continue to either ask questions of it or respond to them? In essence, what I find useful on the list would be crowded out by repetitive questions.
Experience has shown me that where you have a demand for quick solutions from people busy getting on with their lives, it can overwhelm your own life. One such experience happened the last time I was in London. I happened to be standing next to one of those little currency exchange booths, waiting for a friend. I heard some people having trouble working out where the British Museum was. I gave them some help. It was only after a while that I thought to start counting how many requests I received (well, I was on Tottenham Court Rd), but eventually I counted 35. One can maintain that sort of help for a while, but I couldn't stand there all day. I was abused by a couple for eventually leaving and not answering their question.

I know there are users of R who will not use the mailing list because they are intimidated by the manner of the list, but the users I have talked to acknowledge that they are looking for an easy solution and are not interested in contributing to the list. They have also pointed out that they can see why the list does what it does.

I get the feeling that a lot of subscribers to this list would understand where you are coming from, even though they may not look at the list the same way that you do. The bottom line is that I have had a reply to every question that I have put on the list, and those replies have always helped me to solve my problem.

"Show that you've put some effort in and people will match that effort and more"*. Your note had effort and consequently was treated as meritorious, although the answers may not have been what you wished.

Tom Mulholland

* K9, Dr Who, BBC Television

-----Original Message-----

...

I know very well that it is basic manners to read those materials before asking questions here, but you should also understand that people sometimes get stuck with very simple problems if they are driven by stress or run down. They can save a lot of time and concentrate on and develop their primary jobs instead.
And I don't think you should be worried about 900 silly questions out of 1000 messages posted because they are at least well-educated people who know what reading basic materials before posting questions means.

...

I beg your pardon if this message is not relevant to this help list.

With kind regards,

John

--- Uwe Ligges [EMAIL PROTECTED] wrote:

> John wrote:
>
> > Thank you, Uwe. I've found a way to do the job by reading the FAQ 7.21 although it is not giving a precise explanation to a novice or casual user at first reading.
>
> But the corresponding help files do so, for sure, and the FAQ 7.21 points you to ?assign and ?get.
>
> > For example, if you type the first two lines in the FAQ, you get an error as you do not have the variable, a, initially.
> >
> > I am sure that more and more people get interested in and serious about using R if advanced users are kind enough to answer simple and silly questions as well which are already explained in basic documentations. Or is this
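For readers following the original subject of this thread (creating a sequence of object names), the two functions the FAQ entry points to, `assign` and `get`, can be sketched as below. The object names `a1`, `a2`, `a3` and the squared values are illustrative assumptions; the list-based variant at the end is a suggestion of this sketch rather than something stated in the thread.

```r
# Create objects a1, a2, a3 by name, as ?assign describes.
for (i in 1:3) {
  assign(paste("a", i, sep = ""), i^2)
}

# Retrieve one of them by its name, as ?get describes.
a2_value <- get("a2")

# A named list is often easier to work with than many separate objects.
values <- lapply(1:3, function(i) i^2)
names(values) <- paste("a", 1:3, sep = "")
```

With the list form, `values[["a2"]]` or `values$a2` replaces the `get("a2")` call.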
[R] RE: Adding a line in the graph of 'plot()'
The general nature of your question means there are a multitude of answers.

If you type '?abline' it will give you an example of a line drawn over a plot.

Type ?segments, ?lines, ?points, ?text and go through the examples and try them out.

The help pages for the base package 'grid' include an introduction to using the various drawing mechanisms.

See ?par for the 'new' argument:

    logical, defaulting to 'FALSE'. If set to 'TRUE', the next high-level plotting command (actually 'plot.new') should _not clean_ the frame before drawing as if it was on a *_new_* device.

Alternatively, the 'lattice' package can produce a wide range of plots.

Then try the various contributed documents that exist on the CRAN mirrors to further your knowledge. It's all there.

Tom Mulholland

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: Monday, 29 November 2004 9:35 AM
To: [EMAIL PROTECTED]
Subject: [R]: Adding a line in the graph of 'plot()'

Hello.

I'm looking for a way to add a line to a plot of points that should lie along a particular line. Can I add a line to the 'plot()' function, maybe using 'abline()', so that the line is visible in the graph of 'plot()'? How? More generally, can I overlay plots over one another?

Thanks.

Dean Vrecko
Simon Fraser University
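As a concrete starting point for the question above, here is a minimal sketch using base graphics only. The data are made up for illustration (points scattered about the assumed line y = 1 + 2x); `abline`, `lines`, and `points` all draw on top of the existing plot.

```r
# Illustrative data: points scattered about the line y = 1 + 2x.
x <- 1:10
y <- 1 + 2 * x + c(0.3, -0.2, 0.1, -0.4, 0.2, 0.0, -0.1, 0.4, -0.3, 0.1)

plot(x, y, main = "Points with reference lines")

# A line with known intercept (a) and slope (b)
abline(a = 1, b = 2, lty = 2)

# A least-squares line fitted to the same points
fit <- lm(y ~ x)
abline(fit, col = "blue")

# Further points or lines can be overlaid on the existing plot
lines(x, 1.1 * y, col = "red")
points(5, 11, pch = 19)
```

Each call after `plot()` adds to the current figure, which is usually simpler than overlaying whole plots with `par(new = TRUE)`.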
RE: [R] Response Surface
Type ?wireframe rather than wireframe().

Tom Mulholland

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: Friday, 26 November 2004 12:36 PM
To: [EMAIL PROTECTED]
Subject: [R] Response Surface

Hi.

I'm a student at Simon Fraser University in British Columbia, Canada. I can't for the life of me figure out how to plot a 3D surface (a 3D response surface, to be more specific) in R. I found your email address on a web board, and saw someone mention wireframe(), but using the help in R yielded no results. Any suggestions?

Thanks.

Dean Vrecko
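A minimal sketch of a response surface with `wireframe()` follows, assuming the lattice package is installed. The quadratic surface and the names `x1`, `x2`, `response` are invented for illustration; `persp()` from base graphics is shown as an alternative.

```r
library(lattice)

# Hypothetical response surface: a quadratic in two factors x1 and x2.
grid <- expand.grid(x1 = seq(-1, 1, length.out = 25),
                    x2 = seq(-1, 1, length.out = 25))
grid$response <- 10 - 3 * grid$x1^2 - 2 * grid$x2^2 + grid$x1 * grid$x2

# wireframe() returns a "trellis" object; printing it draws the plot.
surface <- wireframe(response ~ x1 * x2, data = grid,
                     drape = TRUE, colorkey = TRUE,
                     screen = list(z = 30, x = -60))
print(surface)

# Base graphics alternative: persp() takes a matrix of heights.
z <- matrix(grid$response, nrow = 25)
persp(unique(grid$x1), unique(grid$x2), z, theta = 30, phi = 25)
```

In practice the `response` column would come from predictions of a fitted model over the grid rather than from a formula written by hand.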
RE: [R] barplot(2?) with CI from a zero reference line
I didn't know how to do this, but I knew it had to have been asked about. Try

    getS3method("barplot2", "default")

Make sure you've loaded gplots.

I guessed "default", but I wonder how you would find out the class if it had been something else. I guess that's something to work on when I'm next twiddling my thumbs.

Tom Mulholland

-----Original Message-----
From: Jean-Louis Abitbol [mailto:[EMAIL PROTECTED]
Sent: Friday, 26 November 2004 2:56 PM
To: [EMAIL PROTECTED]
Subject: [R] barplot(2?) with CI from a zero reference line

Dear R Users, (and dear Marc)

First of all, many thanks for the answers to my previous questions.

I would like to barplot the mean percent change of a variate with its CI. Bars should start from the zero reference line to height (in barplot2).

Is there a way to tweak barplot2, for example, to do that?

I have tried to see what the function was, but unlike other functions I was not able to list it by typing barplot2. Is it because it is called through UseMethod?

Thanks for any help.

Jean-Louis
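On the open question of how to find the class when it is not "default": `methods()` lists the methods registered for a generic. The sketch below uses `barplot` from the graphics package so that it runs without gplots; applying the same calls to `barplot2` is assumed to work the same way once gplots is loaded.

```r
# barplot() in the graphics package is an S3 generic, like barplot2().
barplot                     # the printed body shows a call to UseMethod("barplot")

# List the methods registered for the generic; the part after the dot
# is the class name to pass to getS3method().
methods("barplot")

# Retrieve the default method even though it is not exported by name.
default_method <- getS3method("barplot", "default")
head(deparse(args(default_method)))
```

Typing the name of the retrieved function at the prompt then lists its full source.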
RE: [R] How to correct this
This raises the question of best practice. My answer was predicated on the fact that Jin Li had been attempting to use grid.circle in the first place without success. I rashly made the assumption that there was already a move to try and use some of the more sophisticated techniques within R.

This is a good example of the comments in the "hidden costs" thread, where the pathways to learning R came under some scrutiny. It is also similar to the thread "[R] How to insert one element into a vector?", where it is noted that append can be used to insert the element. That is, the function appears to have been originally written for one purpose, but it is evident that it has a broader application that is not immediately recognizable from the function name.

When you are new to R it can seem confusing that you use rect for rectangles but symbols for circles, or segments for lines and lines for "not lines" (but they really are lines). I am not yet proficient enough to always know which is the best approach. That's even with defining best as quickest, most easily maintained, most readable, etc.

Now to the point. I have formed a collection of graphics that I have prepared over the last two years, which I use to remind myself of the little idiosyncrasies of the various techniques. These of course have evolved as I have. They mostly use data that I cannot make available. I thought it might be a good idea to produce reproducible code that shows the bewildering variety of ways to skin the proverbial animal. That is, to produce code that can create a PDF flipbook of plots.

One of the first things that I do when I load a package is to run the examples that produce graphical output. I tend to work backwards and understand processes better when I know what the final output looks like. I am mathematically challenged, but can often appreciate what is happening once I see the plot. Ideally the code would include all the bells and whistles.
I say this because I have spent hours trying to figure out just exactly what something is supposed to do before finally figuring out that it was really much simpler than I had thought. The bells and whistles should also show how you sometimes have to use par outside of the function (or remember that the ... is there for a reason) to get the effect that you want. For example, when I load the vcd package to do mosaicplots, I think I have to use par(xpd = TRUE) to get my multi-line labels not to be clipped.

As an evolving beast, I see this as a way of demonstrating the techniques that are generally regarded as being best practice in a comprehensive manner.

In short, I am volunteering. What for? I am not quite sure, but it includes example plots using data that helps in clarifying how the plot should be used. The last point means that I am not capable of producing some plots (and the examples in some packages already do this well), as I have no idea what they mean even when I have plotted the example.

Tom Mulholland

-----Original Message-----
From: Paul Murrell [mailto:[EMAIL PROTECTED]
Sent: Tuesday, 23 November 2004 3:05 AM
To: Mulholland, Tom
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: Re: [R] How to correct this

Hi

Mulholland, Tom wrote:
> Taking note of the first post, this is what I assume you wish. Note Paul's caveat in the help file: "If you resize the device, all bets are off!"
>
> require(gridBase)
> x <- seq(0,1,0.2)
> y <- x
> pred <- matrix(c(0.5, 0.5, 0.5, 0.5, 0.5, 0.5,
>                  0.5, 0.7, 0.7, 0.7, 0.7, 0.5,
>                  0.5, 0.7, 0.9, 0.9, 0.7, 0.5,
>                  0.5, 0.7, 0.9, 0.9, 0.7, 0.5,
>                  0.5, 0.7, 0.7, 0.7, 0.7, 0.5,
>                  0.5, 0.5, 0.5, 0.5, 0.5, 0.5), 6, 6)
> image(x, y, pred, col = gray(20:100/100), asp='s', axes=F, xlab=" ", ylab="")
> points(0.5, 0.5, col = 5) # the centre of the image

In this case, using grid (or gridBase) is probably overkill. The symbols() function should do what you want. For example, ...
    symbols(rep(0.5, 4), rep(0.5, 4), circles=1:4, add=TRUE)

Paul

> vps <- baseViewports()
> pushViewport(vps$plot)
> grid.circle(x=0.5, y=0.5, r=0.1, draw=TRUE, gp=gpar(col=5))
> grid.circle(x=0.5, y=0.5, r=0.3, draw=TRUE, gp=gpar(col=5))
> grid.circle(x=0.5, y=0.5, r=0.5, draw=TRUE, gp=gpar(col=5))
>
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
> Sent: Monday, 22 November 2004 1:21 PM
> To: [EMAIL PROTECTED]
> Subject: RE: [R] How to correct this
>
> Hi there,
>
> I would like to add a few circles to the following image:
>
> x <- seq(0,1,0.2)
> y <- x
> pred <- matrix(c(0.5, 0.5, 0.5, 0.5, 0.5, 0.5,
>                  0.5, 0.7, 0.7, 0.7, 0.7, 0.5,
>                  0.5, 0.7, 0.9, 0.9, 0.7, 0.5,
>                  0.5, 0.7, 0.9, 0.9, 0.7, 0.5,
>                  0.5, 0.7, 0.7, 0.7, 0.7, 0.5,
>                  0.5, 0.5, 0.5, 0.5, 0.5, 0.5), 6, 6)
> image(x, y, pred, col = gray(20:100/100), asp='s', axes=F, xlab=" ", ylab="")
> points(0.5, 0.5, col = 5) # the centre of the image
>
> The centre of these circles needs to coincide with the centre of the image. Any help is greatly appreciated.
>
> Regards,
> Jin
>
> -----Original Message-----
> From: Mulholland, Tom
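Paul's `symbols()` suggestion, combined with the image from the thread, might look like the sketch below. Two choices here are assumptions of this sketch rather than anything stated in the thread: `asp = 1` is used so that circles are drawn round, and `inches = FALSE` is used so that the radii are read in x-axis units.

```r
x <- seq(0, 1, 0.2)
y <- x
pred <- matrix(c(0.5, 0.5, 0.5, 0.5, 0.5, 0.5,
                 0.5, 0.7, 0.7, 0.7, 0.7, 0.5,
                 0.5, 0.7, 0.9, 0.9, 0.7, 0.5,
                 0.5, 0.7, 0.9, 0.9, 0.7, 0.5,
                 0.5, 0.7, 0.7, 0.7, 0.7, 0.5,
                 0.5, 0.5, 0.5, 0.5, 0.5, 0.5), 6, 6)

image(x, y, pred, col = gray(20:100 / 100), asp = 1,
      axes = FALSE, xlab = "", ylab = "")
points(0.5, 0.5, col = 5)             # the centre of the image

# Three circles centred on (0.5, 0.5), radii given in user coordinates
radii <- c(0.1, 0.3, 0.5)
symbols(rep(0.5, 3), rep(0.5, 3), circles = radii,
        inches = FALSE, add = TRUE, fg = 5)
```

Because everything is drawn with base graphics in user coordinates, the circles stay centred when the device is resized.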
RE: [R] Running R from CD?
I have noticed that R 2.0 did run slower than I thought it should. It's only now that you've raised the issue that I realise how much slower. However, since I only use the CD when I am working on other people's machines, I can't really say if there are other factors impacting upon the performance.

I'll dig up the old disk, make some comparisons and forward the results. The bottom line is that it is not a big issue for me.

Tom Mulholland

-----Original Message-----
From: Prof Brian Ripley [mailto:[EMAIL PROTECTED]
...
Subject: Re: [R] Running R from CD?

...

BTW, I believe running R 2.0.x from a CD will be a lot slower than 1.9.1 because of lazy loading and frequent file accesses: that's a theoretical issue we intend to address for 2.1.0, but not one anyone has yet commented that it is a problem.
RE: [R] How to correct this
Taking note of the first post, this is what I assume you wish. Note Paul's caveat in the help file: "If you resize the device, all bets are off!"

    require(gridBase)
    x <- seq(0,1,0.2)
    y <- x
    pred <- matrix(c(0.5, 0.5, 0.5, 0.5, 0.5, 0.5,
                     0.5, 0.7, 0.7, 0.7, 0.7, 0.5,
                     0.5, 0.7, 0.9, 0.9, 0.7, 0.5,
                     0.5, 0.7, 0.9, 0.9, 0.7, 0.5,
                     0.5, 0.7, 0.7, 0.7, 0.7, 0.5,
                     0.5, 0.5, 0.5, 0.5, 0.5, 0.5), 6, 6)
    image(x, y, pred, col = gray(20:100/100), asp='s', axes=F, xlab=" ", ylab="")
    points(0.5, 0.5, col = 5) # the centre of the image
    vps <- baseViewports()
    pushViewport(vps$plot)
    grid.circle(x=0.5, y=0.5, r=0.1, draw=TRUE, gp=gpar(col=5))
    grid.circle(x=0.5, y=0.5, r=0.3, draw=TRUE, gp=gpar(col=5))
    grid.circle(x=0.5, y=0.5, r=0.5, draw=TRUE, gp=gpar(col=5))

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: Monday, 22 November 2004 1:21 PM
To: [EMAIL PROTECTED]
Subject: RE: [R] How to correct this

Hi there,

I would like to add a few circles to the following image:

    x <- seq(0,1,0.2)
    y <- x
    pred <- matrix(c(0.5, 0.5, 0.5, 0.5, 0.5, 0.5,
                     0.5, 0.7, 0.7, 0.7, 0.7, 0.5,
                     0.5, 0.7, 0.9, 0.9, 0.7, 0.5,
                     0.5, 0.7, 0.9, 0.9, 0.7, 0.5,
                     0.5, 0.7, 0.7, 0.7, 0.7, 0.5,
                     0.5, 0.5, 0.5, 0.5, 0.5, 0.5), 6, 6)
    image(x, y, pred, col = gray(20:100/100), asp='s', axes=F, xlab=" ", ylab="")
    points(0.5, 0.5, col = 5) # the centre of the image

The centre of these circles needs to coincide with the centre of the image. Any help is greatly appreciated.

Regards,
Jin

-----Original Message-----
From: Mulholland, Tom [mailto:[EMAIL PROTECTED]
Sent: Monday, 22 November 2004 12:29 P
To: Li, Jin (CSE, Atherton)
Subject: RE: [R] How to correct this

I think you need to create a complete set of code that can be replicated by anyone trying to help. I ran the three grid.circle commands on my current plot and it did what I expected it to do. It plotted three circles centred in the current viewport. See the jpeg.

The last command, using points, makes me think that you need to understand about units and the setting up of viewports.
I have not played around with this much, but I think the newsletter had an article which may be of use (although it uses old code, I think the differences are minor).

Ciao, Tom

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: Monday, 22 November 2004 10:07 AM
To: [EMAIL PROTECTED]
Subject: [R] How to correct this

Hi there,

I tried to add a few circles on an existing figure using the following code

    grid.circle(x=0.5, y=0.5, r=0.1, draw=TRUE, gp=gpar(col=5))
    grid.circle(x=0.5, y=0.5, r=0.3, draw=TRUE, gp=gpar(col=5))
    grid.circle(x=0.5, y=0.5, r=0.5, draw=TRUE, gp=gpar(col=5))
    points(0.5, 0.5, col = 5) # centre of the circle

but all circles moved away from the centre. Could we do any corrections to this? Thanks.

Regards,
Jin

==
Jin Li, PhD
Climate Impacts Modeller
CSIRO Sustainable Ecosystems
Atherton, QLD 4883
Australia
Ph: 61 7 4091 8802
Email: [EMAIL PROTECTED] mailto:[EMAIL PROTECTED]
==
RE: [R] How to plot this
If this is a quick and dirty process you want, rather than learning all the capabilities that are in R, then I would copy the density curve (or the bits you like) into your favourite image editor and use its capabilities to pretty it up.

However, there are a number of options. Firstly, you have chosen to plot density(y). When I looked at the help for density, it gives the values returned by density. If you want a custom plot, maybe you should try

    dcurve <- density(y)

You could then directly access the $x and $y components as you would in any plot. For instance, plot(density(y)) gives you the grey line. However,

    plot(dcurve$x, dcurve$y, type = "l")

gives you a different type of plot.

As for arrowheads, one could create an appropriate polygon to stick at each end, which for a one-off might be a bit of overkill. Sometime in all of this you'll also probably encounter clipping, in which case par(xpd = TRUE) will often help. Just remember to turn it off or you may find unwanted graphics appearing later on.

For putting the labels where you want, you could use mtext. This gives you control over where you want to place the text.

A word of caution. If you are going to start prettying up your plots to very specific standards, make sure that you are working on the final device from which you wish to take the final copy. Each of the devices has its own capabilities, which are often not related to R but rather to its own environment. That is, you can't get a plot looking perfect in a window and assume that the same code sent to a postscript device will produce identical results.

R can give you very good graphics, often straight out of the box, but like any publishing process it can be a bit fiddly.

Tom Mulholland
Senior Demographer
Department for Planning and Infrastructure
Perth, WA, Australia.
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: Wednesday, 17 November 2004 2:39 PM
To: [EMAIL PROTECTED]
Subject: [R] How to plot this

Hi there,

I produced a plot using the following code:

    y <- rnorm(1000, 2, 0)
    x0 <- c(0, 0)
    y0 <- c(0, 0)
    y1 <- c(0, 1)
    x1 <- c(0, 4)
    plot(density(y), ylab="Abundance of species", xlab="Environmental gradient", main=" ", xlim=c(0, 4), ylim=c(0, 1), lty=2, col=4, xaxt="n", yaxt="n", frame.plot=F)
    lines(x0, y1) # add an axis
    lines(x1, y0) # add an axis
    arrows(3.95, 0, 4, 0, angle = 15, length = 0.1)
    arrows(0, 0.98, 0, 1, angle = 15, length = 0.1)

Please help me to remove the grey horizontal line and put the axis labels closer to the axes. I would also appreciate any suggestions on how to make those arrows look nicer, e.g. a small filled arrow for each axis, like what you get from points(0, 1, pch=17), but slightly narrower. Thanks.

Regards,
Jin Li

Jin Li, PhD
Climate Impacts Modeller
CSIRO Sustainable Ecosystems
Atherton, QLD 4883, Australia
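The suggestions in the reply (plot the `$x` and `$y` components directly, build arrowheads from polygons, use `par(xpd = TRUE)` and `mtext`) can be put together as a sketch. The standard deviation of 0.5, the seed, and the arrowhead coordinates are assumptions chosen only so that the example draws something sensible; they are not from the thread.

```r
set.seed(42)
y <- rnorm(1000, mean = 2, sd = 0.5)   # sd = 0.5 is an assumption of this sketch

dcurve <- density(y)                   # a list with components x and y

old_par <- par(xpd = TRUE)             # allow drawing outside the plot region

# Plot the curve alone: no box, no axes, no grey baseline
plot(dcurve$x, dcurve$y, type = "l", lty = 2, col = 4,
     xlim = c(0, 4), ylim = c(0, 1), axes = FALSE, xlab = "", ylab = "")

# Hand-drawn axes with small filled arrowheads made from polygons
lines(c(0, 4), c(0, 0))
lines(c(0, 0), c(0, 1))
polygon(c(4, 3.9, 3.9), c(0, 0.015, -0.015), col = "black")
polygon(c(0, -0.03, 0.03), c(1, 0.96, 0.96), col = "black")

# Labels placed close to the axes
mtext("Environmental gradient", side = 1, line = 0.5)
mtext("Abundance of species", side = 2, line = 0.5)

par(old_par)                           # turn clipping back on
```

The arrowhead sizes are in user coordinates, so they would need adjusting for a different axis range or device shape.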
RE: [R] How to updating R to the newest version conveniently
I've seen various answers to this question and there does not seem to be a single best way.

I use a separate library for downloaded packages. In Windows I set the R_LIBS environment variable. See the usual suspects, such as the appropriate FAQ and the r-admin pdf file.

On the existing installation I run code like this:

    myPackages <- .packages(all.available = TRUE, lib.loc = "c:/progs/mylib")
    save(myPackages, file = "f:\\backup\\settings\\myPackages.rdata", compress = T)

This stores a list of all the packages in mylib, so that when a new install comes I can just retrieve my backup and do a new install. When everything is working well, a new version can be downloaded in the old directory (having cleaned it out first) and the "update from CRAN" option in Windows can be used. However, with R 2.0 there was a need to recompile packages, so those that did not have a new version did not update, but didn't work with the new version.

    load("f:\\backup\\settings\\myPackages.rdata")
    install.packages(myPackages, lib = "c:/progs/mylib", CRAN = "http://cran.au.r-project.org/")

I can't guarantee the code, as I have just put it together from what I recall (this is how I did it at home). I don't have that sort of access to the work PC, so I have to get a tech support person to do it all for me, and they have to do it manually because they don't understand the process.

Ciao, Tom

-----Original Message-----
From: Yong Wang [mailto:[EMAIL PROTECTED]
Sent: Friday, 12 November 2004 10:25 AM
To: [EMAIL PROTECTED]
Subject: [R] How to updating R to the newest version conveniently

Dear R users,

I have been using R for a while. However, I don't know a convenient way to update R to the newest version while keeping all the packages I previously downloaded and installed from CRAN. If all those packages could be updated at the same time, even better. For the time being, I reinstall all those packages every time after updating the version. Thank you.
best regards
yong
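The save-then-reinstall approach described in the reply can be sketched with functions available in current versions of R. The file location (a temporary directory) and the variable names are illustrative assumptions, and the calls that would actually install or update anything are left commented out so that running the sketch changes nothing.

```r
# Where the saved list lives; a temporary file here, purely for illustration.
backup_file <- file.path(tempdir(), "myPackages.rds")

# Before upgrading: record the names of all installed packages.
my_packages <- rownames(installed.packages())
saveRDS(my_packages, backup_file)

# After upgrading: read the list back and work out what is missing.
wanted <- readRDS(backup_file)
missing_pkgs <- setdiff(wanted, rownames(installed.packages()))

# Reinstall only what is missing (commented out so the sketch does not
# touch the network or the library when run).
# if (length(missing_pkgs) > 0) install.packages(missing_pkgs)

# Packages that are present but were built under an older R can be refreshed:
# update.packages(ask = FALSE, checkBuilt = TRUE)
```

To restrict this to a separate library, as the reply does, `installed.packages()` accepts a `lib.loc` argument.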
RE: [R] Lattices: Cloud: Background
When I first started using lattice I found the colour schemes a bit confusing. So eventually I came up with the colours I wanted. The code below was one of those attempts. One thing that happened, however, was that I kept shutting down the graphics window that pops up, and the colours would revert to their default. So if you run all of the code, the first window will pop up correctly; the system will pause for 5 seconds, close the window and run the code again. When the code runs the second time, it reverts to the grey background.

Keep persevering, because when it all comes together you can produce some very good looking graphics.

Note: Not all the colours on the plot are set using lset. The text in the key is set directly within the xyplot call.

    require(lattice)
    SetAltColBlue <- function(x=NULL) {
      lset(list(background = list(col = "transparent"),
                add.text = list(col="yellow", cex=1.3),
                add.line = list(col="navy", cex=1.3),
                bar.fill = list(col = "transparent"),
                box.rectangle = list(col = "grey"),
                box.umbrella = list(col = "grey"),
                box.dot = list(col="grey"),
                dot.line = list(col = "grey"),
                dot.symbol = list(col = "grey"),
                plot.line = list(col = "grey"),
                plot.symbol = list(col = "grey"),
                regions = list(col = heat.colors(100)),
                strip.shingle = list(col = c("steelblue1")),
                strip.background = list(col = c("navy")),
                reference.line = list(col = "navy"),
                axis.text = list(col="navy", cex=0.8),
                axis.line = list(col="grey50"),
                superpose.line = list(col = c("navy", "navy", "navy", "navy", "navy", "navy", "navy"),
                                      lty = 1:7, lwd=c(1.5,1.5,1.5,1,1,1,1)),
                superpose.symbol = list(col=c("steelblue1","navy","blue","black")),
                par.xlab.text = list(col="navy", cex=0.9),
                par.ylab.text = list(col="navy", cex=0.9),
                par.main.text = list(col="navy", cex=2),
                par.sub.text = list(col="navy", cex=0.8),
                box.3d = list(col="grey")))
    }
    SetAltColBlue()
    data(iris)
    xyplot(Sepal.Length + Sepal.Width ~ Petal.Length + Petal.Width | Species,
           data = iris, allow.multiple = TRUE, scales = "free", layout = c(2, 2),
           main="Title", sub="sub text",
           auto.key = list(col="steelblue4", x = .6, y = .7, corner = c(0, 0)))
    bringToTop()
    Sys.sleep(5)
    dev.off()
    data(iris)
    xyplot(Sepal.Length + Sepal.Width ~ Petal.Length + Petal.Width | Species,
           data = iris, allow.multiple = TRUE, scales = "free", layout = c(2, 2),
           auto.key = list(x = .6, y = .7, corner = c(0, 0)))

Ciao, Tom

_
Tom Mulholland
Senior Policy Officer
WA Country Health Service
Tel: (08) 9222 4062

The contents of this e-mail transmission are confidential and may be protected by professional privilege. The contents are intended only for the named recipients of this e-mail. If you are not the intended recipient, you are hereby notified that any use, reproduction, disclosure or distribution of the information contained in this e-mail is prohibited. Please notify the sender immediately.

-----Original Message-----
From: Mueller, Adrienne [mailto:[EMAIL PROTECTED]
Sent: Friday, 16 January 2004 3:05 AM
To: [EMAIL PROTECTED]
Subject: [R] Lattices: Cloud: Background

Hi,

There's probably some simple way of doing this, but I'm just not seeing it. How do I get the background to be white instead of grey when I have a cloud plot (using the lattices package)? par(bg="white") isn't working. I'm assuming par commands won't work on lattice plots. What should I use instead?

Thanks,
Adrienne
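For the specific question of a white background on a `cloud()` plot, a shorter route than a full colour scheme may be enough. The sketch below assumes a reasonably recent lattice, where high-level functions accept a `par.settings` argument and `trellis.par.set()` accepts a `theme` list; the formula and grouping are illustrative.

```r
library(lattice)

# Settings list in the format trellis.par.set() expects.
white_bg <- list(background = list(col = "white"))

# Option 1: attach the settings to a single plot.
p <- cloud(Sepal.Length ~ Petal.Length * Petal.Width, data = iris,
           groups = Species, par.settings = white_bg)
print(p)

# Option 2: change the setting for the current device.
trellis.par.set(theme = white_bg)
current_bg <- trellis.par.get("background")$col
```

Option 1 travels with the plot object, so it is not lost when a graphics window is closed and a new one is opened, which is the reverting behaviour described in the message above.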
[R] Is there an R or S implementation of PAMSIL or PAMMEDSIL
I have some data that is dwarfed by one large cluster. I came across a paper titled "A New Partitioning Around Medoids Algorithm" (van der Laan, Pollard & Bryan, 2002), http://www.bepress.com/ucbbiostat/paper105/, that describes PAMSIL and PAMMEDSIL, which look as though they might be more appropriate for the data I have. There does not appear to be much out there describing itself by these names, so any help would be appreciated.

Tom

_
Tom Mulholland
Senior Policy Officer
WA Country Health Service
Tel: (08) 9222 4062

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
RE: [R] draft of posting guide
I think there will always be disagreement when commenting about the appropriateness of social behaviour, so I think we will do well to understand the purpose of any proposed posting guide. It is not clear to me where the list is going with regard to this topic.

If the aim is to produce a comprehensive posting guide to sit with other R documents, I wish the list well and will check on progress some time in the future. I can't see some points being reconciled quickly. If we are talking about something else (I have previously suggested a short monthly reminder), then it may be possible to make some progress.

Frank Harrell noted that with some prompting, new users can ask questions better. If we focus on the mechanics of question asking rather than on the social aspects we may find it easier to produce something. I guess I'm asking the question "What are the prompts?"

If I were to make a checklist it would be:

Before asking the question
  - Have you read the FAQs?
  - If you use Windows, have you read the Windows FAQ?
  - Have you searched the R-help archives?
  - Have you read the online help for relevant functions?
  - Have you checked to see if the answer is in one of the reference manuals, supplementary documents or Newsletters?
  - Do you have the latest version of R?
  - Is this an R question?

Once you need to ask the question
  - Do you need to include a workable example so people understand your problem?
  - Do you need to include details about your operating system?
  - Do you need to include which version of R you are using?

This obviously would need something else, as some of the questions beg questions themselves. It is however moving towards what I had in my mind when I first suggested the monthly reminder.

Tom

_
Tom Mulholland
Senior Policy Officer
WA Country Health Service
Tel: (08) 9222 4062

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
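The last two checklist items (operating system and R version) can be answered from within R itself. A small sketch of one way to collect them for pasting into a question; the name `report` is only illustrative, and later versions of R also offer `sessionInfo()` for a fuller summary.

```r
# Collect the details a reader of the question is likely to need.
report <- c(
  r_version = R.version.string,   # e.g. "R version x.y.z (date)"
  platform  = R.version$platform, # build platform, e.g. i386-pc-mingw32
  os_type   = .Platform$OS.type   # "windows" or "unix"
)
print(report)
```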
RE: [R] mailing list for basic questions - preliminary sum up
The fact that I've been using R for quite a while now and did not know about this document is supporting evidence of the need to get this sort of information out there. However, that big list is going to daunt some people; it would have daunted me at the beginning. At a time when you are digesting a whole new universe (wonderful though it is) some short and pithy help is welcome.

I expect that there would be a consensus on which topics are essential to improve the quality of questions. It is this select group of comments that could be put in a standard email sent out once a month. I would assume that the link to Eric Raymond's "How To Ask Questions The Smart Way" [1] would be part of the advice. That is, if you're going to spoon feed, then spoon feed the advice that helps the list most and encourage new list members to take the time to read the longer document.

I think this special treatment is warranted because the issue is not about R per se, it is about this list, so it makes more sense to have it coming out of this list rather than as an entry in the R-FAQ. Although I can't see a reason for not doing both.

Hmmm. Before I post this I had better go and see what Eric has to say about this sort of message.

Ciao, Tom

[1] An assumption on my part is that there is fundamental agreement that the document is the best source for advice on how to ask questions of this list.

_
Tom Mulholland
Senior Policy Officer
WA Country Health Service
Tel: (08) 9222 4062
-----Original Message-----
From: Liaw, Andy [mailto:[EMAIL PROTECTED]
Sent: Thursday, 18 December 2003 8:26 AM
To: 'Tom Mulholland'
Cc: [EMAIL PROTECTED]
Subject: RE: [R] mailing list for basic questions - preliminary sum up

> From: Tom Mulholland
>
> I have empathy for lots of the points already made, more often on the "life is not always easy and you have to work at it" flavour, because that's where you make the real gains.
>
> One particular message early in the piece cited an example of what a good request might look like. Other lists sometimes send out regular messages (although they tend to be about the rules of the list) that are intended to make sure that important pieces of information are regularly repeated.
>
> I know that there is more than enough talent on this list to put together suggestions for getting quick responses that could be sent out regularly. The sorts of things that might be in it would be when you should attach details of operating system, version etc. (or if they should always be there) as well as comments like those by Spencer Graves, and it could include the checklist that someone mentioned (I think that was Frank Harrell).
>
> It would almost be a pro-forma for messages and while people don't have to use it, it may help those who do think before they post (we'll never stop some people, because that's just the way they are).
>
> Tom Mulholland
> Tom Mulholland Associates

Please see Eric Raymond's "How To Ask Questions The Smart Way" (http://www.catb.org/~esr/faqs/smart-questions.html).

(One of these days I shall take up Martin's suggestion and write an entry for R-FAQ pointing to it. The problem is getting people to actually read the FAQ, let alone links in the FAQ...)

Best,
Andy

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
RE: [R] read.spss question warning compression bias
> So it would appear that if the above is correct, there is no user adjustment to the bias value. The only scenario that I can envision is if the user SAVE's the .sav file in an uncompressed format, where the bias value **might** be set to 0.
>
> Perhaps a r-help reader with access to current SPSS manuals can confirm the above.

The Windows version 11.5.0 appears the same (I assume the negative sign on -99 was somehow dropped):

    COMPRESSED and UNCOMPRESSED Subcommands

    COMPRESSED saves the file in compressed form. UNCOMPRESSED saves the file in uncompressed form. In a compressed file, small integers (from 99 to 155) are stored in one byte instead of the eight bytes used in an uncompressed file.

    The only specification is the keyword COMPRESSED or UNCOMPRESSED. There are no additional specifications.

    Compressed data files occupy less disk space than do uncompressed data files.

    Compressed data files take longer to read than do uncompressed data files.

    The GET command, which reads SPSS-format data files, does not need to specify whether the files it reads are compressed or uncompressed.

    Only one of the subcommands COMPRESSED or UNCOMPRESSED can be specified per SAVE command. COMPRESSED is usually the default, though UNCOMPRESSED may be the default on some systems.

Ciao, Tom

_
Tom Mulholland
Senior Policy Officer
WA Country Health Service
Tel: (08) 9222 4062
-----Original Message-----
From: Marc Schwartz [mailto:[EMAIL PROTECTED]
Sent: Friday, 12 December 2003 3:56 AM
To: Thomas Lumley
Cc: [EMAIL PROTECTED]
Subject: Re: [R] read.spss question warning compression bias

On Thu, 2003-12-11 at 12:32, Thomas Lumley wrote:
> On Thu, 11 Dec 2003, Marc Schwartz wrote:
>
> > An additional question might be, if the file is not compressed, what is the default bias value set by SPSS? If it is 0, then the check is meaningless. On the other hand, if the default value is 100, whether or not the file is compressed, then the warning message would serve a purpose in flagging the possibility of other issues. Reasonably, that setting may be SPSS version specific.
>
> I think the issue is that the format is not documented, so the author of the code (Ben Pfaff) didn't know what a change in the value would imply. If the file is apparently read correctly it seems that it doesn't imply anything.
>
> -thomas

Thanks for the clarification Thomas.

I did some searching of the PSPP site and found the following:

http://www.gnu.org/software/pspp/manual/pspp_18.html#SEC170

The compression bias is defined as:

    flt64 bias;
        Compression bias. Always set to 100. The significance of this value is that only numbers between (1 - bias) and (251 - bias) can be compressed.

So it would seem to potentially impact aspects of the file compression data structure, when compression is used.

I am not sure if the "Always set to 100" is unique to PSPP in how Ben elected to do things. Presumably if that is always the case, even with SPSS, one might reasonably wonder: why have it, if it does not vary? It leaves things unclear as to under what circumstances this value would change.

I did some Googling and found the following text snippet from a presumably dated SPSS manual for the syntax of the SAVE command:

    SAVE OUTFILE=file
     [/VERSION={3**}]
               {2  }
     [/UNSELECTED=[{RETAIN}]
                   {DELETE}
     [/KEEP={ALL**  }] [/DROP=varlist]
            {varlist}
     [/RENAME=(old varlist=new varlist)...]
     [/MAP] [/{COMPRESSED  }]
              {UNCOMPRESSED}

    **Default if the subcommand is omitted.

    COMPRESSED and UNCOMPRESSED Subcommands

    COMPRESSED saves the file in compressed form. UNCOMPRESSED saves the file in uncompressed form. In a compressed file, small integers (from 99 to 155) are stored in one byte instead of the eight bytes used in an uncompressed file.

    The only specification is the keyword COMPRESSED or UNCOMPRESSED. There are no additional specifications.

    Compressed data files occupy less disk space than do uncompressed data files.

    Compressed data files take longer to read than do uncompressed data files.

    The GET command, which reads SPSS-format data files, does not need to specify whether the files it reads are compressed or uncompressed.

    Only one of the subcommands COMPRESSED or UNCOMPRESSED can be specified per SAVE command. COMPRESSED is usually the default, though UNCOMPRESSED may be the default on some systems.

So it would appear that if the above is correct, there is no user adjustment to the bias value. The only scenario that I can envision is if
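The PSPP description quoted in this thread gives the compressible range as (1 - bias) to (251 - bias). The arithmetic can be checked with a few lines of R; this is a sketch of that formula only, the function name `compressible_range` is made up for illustration, and nothing here reads an actual .sav file.

```r
# Range of values that can be stored in one byte for a given compression
# bias, following the PSPP description: (1 - bias) to (251 - bias).
compressible_range <- function(bias = 100) {
  c(lower = 1 - bias, upper = 251 - bias)
}

compressible_range()    # lower = -99, upper = 151
compressible_range(0)   # lower = 1,   upper = 251
```

With the documented bias of 100 the lower bound is -99, which fits the guess in this thread that the manual's "99" lost a minus sign. The upper bound from this formula is 151 rather than the 155 printed in the manual excerpt; the thread does not say which is right.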
RE: [R] hdf library for windows
The question puzzled me at first because of your use of "library". It looks as if the hdf5 R package uses the Windows HDF5 library binary. My reading is that you will have to compile the package yourself after you have downloaded the HDF Windows DLL from hdf.ncsa.uiuc.edu. The instructions are in win.readme.txt in the package source, which you can download at planetmirror or aarnet. I think the use of the HDF DLL is the reason a Windows binary cannot be made available for direct download.

Ciao, Tom

_
Tom Mulholland
Senior Policy Officer
WA Country Health Service
Tel: (08) 9222 4062

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
RE: [R] A suggestion regarding multiple replies
As with most of the replies so far, I enjoy the way the list works. A couple of observations, however: it is evident that off-list replies already happen and, more importantly imho, initially quite straightforward queries can turn into something much more interesting. I find this type of query to be among the most helpful, partly because they tend to deal with issues that I think I have already got covered.

An example of this was the use of asp=1 in a plot to keep the aspect ratio correct. One might argue that having to go to plot.default to find this reference, rather than finding it in plot, was the problem, but what it did for me was to ensure that I follow through deeper and deeper into the workings of R. There are times when it is only after you have found the answer that you realise why the answer had to be where it was (as with plot.default), and that's when the real learning begins.

I use the list as a way of exploring different aspects of R (often those that I have no direct need of at the time).

Tom Mulholland
Senior Policy Officer
WA Country Health Service
Tel: (08) 9222 4062

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: Saturday, 15 November 2003 6:02 AM
To: [EMAIL PROTECTED]
Subject: [R] A suggestion regarding multiple replies

Please don't take this the wrong way. There are a lot of extremely helpful people who subscribe to r-help. I was wondering if it is time to adopt a strategy a la S-PLUS help, whereby people reply to the author and the author summarizes all the replies? Just a thought, and have a good weekend.

Partha

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
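The asp=1 example mentioned in the reply can be tried in a few lines. A sketch, assuming base graphics and writing to a temporary PDF so that it runs without a screen device; the circle data and file name are illustrative only.

```r
# Draw a unit circle on a deliberately non-square device. With asp = 1 one
# unit on the x axis covers the same physical distance as one unit on the
# y axis, so the circle is not stretched into an ellipse.
theta <- seq(0, 2 * pi, length.out = 200)
out_file <- tempfile(fileext = ".pdf")
pdf(out_file, width = 7, height = 4)
plot(cos(theta), sin(theta), type = "l", asp = 1)
usr <- par("usr")   # axis limits in data units
pin <- par("pin")   # size of the plot region in inches
units_per_inch <- c(x = (usr[2] - usr[1]) / pin[1],
                    y = (usr[4] - usr[3]) / pin[2])
dev.off()
```

The two entries of `units_per_inch` should agree, which is one way of confirming that the aspect ratio was honoured.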
RE: [R] upgrading R
This has been discussed with previous upgrades. One such discussion was "How to update installed packages to a new version of R?":

http://finzi.psych.upenn.edu/R/Rhelp02/archive/8316.html

There are a variety of methods to choose from. You may also learn about updating packages directly from the net, which can take care of the issue of having to reinstall packages.

_
Tom Mulholland
Senior Policy Officer
WA Country Health Service
Tel: (08) 9222 4062

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: Friday, 10 October 2003 12:16 AM
To: Weiming Zhang
Cc: [EMAIL PROTECTED]
Subject: Re: [R] upgrading R

I am not sure if this is a proper way, but here is what I did recently when installing consecutive alpha and beta releases of 1.8.0 on a Win2000 machine:

(1) I uninstalled the previous version using the uninstall provided with R. This leaves all the additional packages in the library folder.
(2) I reinstalled the new version into the same folder structure.

The problem might be that some 1.7.1 packages might be different from 1.8.0, so it might be safer to reinstall them from CRAN.

Andy

__
Andy Jaworski
Engineering Systems Technology Center
3M Center, 518-1-01
St. Paul, MN 55144-1000
E-mail: [EMAIL PROTECTED]
Tel: (651) 733-6092
Fax: (651) 736-3122

    From: Weiming Zhang [EMAIL PROTECTED]
    Sent by: [EMAIL PROTECTED]
    Date: 10/09/2003 10:38
    To: [EMAIL PROTECTED]
    cc:
    Subject: [R] upgrading R

Hi,

I have installed a lot of extra packages for R 1.7.1. If I install R 1.8.0, will I have to reinstall all those packages? Is there a way that I can upgrade R without losing old packages?

Thank you.
wz

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
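One of the "variety of methods" can be sketched with functions that ship with R: list the add-on packages under the old version, then reinstall or update them under the new one. The helper name `user_packages` is hypothetical, and the two install calls are left commented out because they need a CRAN mirror.

```r
# Names of add-on packages (neither base nor recommended) in a library.
user_packages <- function(lib.loc = NULL) {
  ip <- installed.packages(lib.loc = lib.loc)
  unname(ip[is.na(ip[, "Priority"]), "Package"])
}

# Run under the old version of R and keep the result, e.g. with dput().
old_pkgs <- user_packages()

# Under the new version of R:
# install.packages(setdiff(old_pkgs, user_packages()))  # anything missing
# update.packages(checkBuilt = TRUE, ask = FALSE)       # rebuild stale ones
```

If the old library folder is reused, as in the reply quoted above, the `checkBuilt = TRUE` step is the part that addresses packages built under the earlier version.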
RE: [R] Re: diamond graphs, patents and rootograms
Talking about Excel, you can produce excellent graphs in Excel. Yes, you have to work at it, but you can get there. The problem is that they are not the default. My gut feeling is that R will make more of an impact in the presentation of graphics than any implementation in Excel.

So even if a patent were granted and it made its way into a mainstream package, would it change the world, or would the people who currently use three dee (3D) graphs think that they looked a bit square? Or would the implementation allow us to change the colour of each diamond, or maybe we could put a picture of our daughter as a background? Does change happen from the tool or the user?

So I guess my reason for saying R will make more of an impact is that the average R user cares about what they are doing when compared to the average Excel user.

Also,

On Thu, 28 Aug 2003, David Scott wrote:
> What is a hanging rootogram? ;-D

Now I can't work out what the wink means, but they're implemented in the vcd package. See also http://www.math.yorku.ca/SCS/vcd/vcdstory.pdf for more info.

_
Tom Mulholland
Senior Policy Officer
WA Country Health Service
Tel: (08) 9222 4062

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
RE: [R] R tools for large files
As some of the conversation has used the 30 second mark as an arbitrary benchmark, I would also chime in that there is an assumption that any non-R issues that affect how usefully R can be used should be ignored. In the real world we can't always control everything about our environment. So if there are improvements that can be made that help mitigate the reality of the world, I would welcome them.

As a little test I broke the rules of my organisation and actually put a dataset on my C: drive. Not unexpectedly, the performance vastly improved. What would normally (at home) be a 10 second load becomes a 40 second load in a corporate environment.

I have found the conversation helpful and it would appear that there are opportunities for improvement that I would find helpful in my production environment.

The other aside is that I have no Unix-like tools, not because they don't exist, but because the environment I work in does not allow me to use them. This is not sufficient reason for me to bleat about it. It just is. By and large, I just get on with it.

My point is that while I accept that these issues are peripheral to R, they do impact upon the usability of R. I'm sure that there are people working with large databases in R. (The SPSS datasets that I regularly interact with vary between 97MB and 200MB.) It could be finger trouble on my part, but I find I have to subset them before I can read them into R. If I thought I could usefully convert these datasets into something that R could pick and choose from without hitting the out-of-memory problem, I would be very happy. In the meantime my lack of expertise has left me with a workable, albeit clumsy, process.

I will continue to champion R in my organisation, but the present score is SPSS-50, SAS-149, R-1. But all the really creative charts only come from one engine in this place.
    system.time(load("P:/.../0203Mapdata.rdata"))
    [1]  9.79  0.97 37.45    NA    NA

    system.time(load("C:/TEMP/0203Mapdata.rdata"))
    [1] 10.07  0.18 10.49    NA    NA

    version
             _
    platform i386-pc-mingw32
    arch     i386
    os       mingw32
    system   i386, mingw32
    status
    major    1
    minor    7.1
    year     2003
    month    06
    day      16
    language R

_
Tom Mulholland
Senior Policy Officer
WA Country Health Service
Tel: (08) 9222 4062

-----Original Message-----
From: Murray Jorgensen [mailto:[EMAIL PROTECTED]
Sent: Monday, 25 August 2003 5:16 PM
To: Prof Brian Ripley
Cc: R-help
Subject: Re: [R] R tools for large files

At 08:12 25/08/2003 +0100, Prof Brian Ripley wrote:
> I think that is only a medium-sized file.

"Large" for my purposes means "more than I really want to read into memory", which in turn means "takes more than 30s". I'm at home now and the file isn't, so I'm not sure if the file is large or not.

More responses interspersed below. BTW, I forgot to mention that I'm using Windows and so do not have nice unix tools readily available.

> On Mon, 25 Aug 2003, Murray Jorgensen wrote:
>
> > I'm wondering if anyone has written some functions or code for handling very large files in R. I am working with a data file that is 41 variables times who knows how many observations making up 27MB altogether.
> >
> > The sort of thing that I am thinking of having R do is
> >
> > - count the number of lines in a file
>
> You can do that without reading the file into memory: use system(paste("wc -l", filename))

Don't think that I can do that in Windows XL.

> or read in blocks of lines via a connection

But that does sound promising!

> > - form a data frame by selecting all cases whose line numbers are in a supplied vector (which could be used to extract random subfiles of particular sizes)
>
> R should handle that easily in today's memory sizes. Buy some more RAM if you don't already have 1/2Gb. As others have said, for a real large file, use a RDBMS to do the selection for you.

It's just that R is so good in reading in initial segments of a file that I can't believe that it can't be effective in reading more general (pre-specified) subsets.

Murray

> --
> Brian D. Ripley, [EMAIL PROTECTED]
> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel: +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272866 (PA)
> Oxford OX1 3TG, UK Fax: +44 1865 272595

Dr Murray Jorgensen http://www.stats.waikato.ac.nz/Staff/maj.html Department of
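The suggestion to "read in blocks of lines via a connection" can be sketched as below; it covers both wishes in the thread (counting lines, and pulling out lines by number) without holding the whole file in memory and without any Unix tools. The function names, the chunk size and the small demonstration file are all assumptions made for illustration, not code from the thread.

```r
# Count lines by reading the file in blocks through an open connection.
count_lines <- function(path, chunk = 10000L) {
  con <- file(path, open = "r")
  on.exit(close(con))
  n <- 0L
  repeat {
    block <- readLines(con, n = chunk)
    if (length(block) == 0L) break
    n <- n + length(block)
  }
  n
}

# Return only the lines whose line numbers are in `wanted`.
read_selected_lines <- function(path, wanted, chunk = 10000L) {
  wanted <- sort(unique(wanted))
  con <- file(path, open = "r")
  on.exit(close(con))
  kept <- character(0)
  offset <- 0L                     # number of lines read so far
  repeat {
    block <- readLines(con, n = chunk)
    if (length(block) == 0L) break
    in_block <- wanted[wanted > offset & wanted <= offset + length(block)]
    kept <- c(kept, block[in_block - offset])
    offset <- offset + length(block)
  }
  kept
}

# Small demonstration file standing in for the real data.
demo_path <- tempfile(fileext = ".txt")
writeLines(paste("line", 1:25), demo_path)
n_lines <- count_lines(demo_path, chunk = 10L)
picked  <- read_selected_lines(demo_path, c(25, 3, 12), chunk = 10L)
```

For a delimited data file the selected lines could then be handed to `read.table()` through a `textConnection()`, and `sample(n_lines, k)` would give the random subfile mentioned in the original question.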
RE: [R] Boosting,bagging and bumping. Questions about R tools and predictions.
http://www.boosting.org/publications.html

I found some of the papers on this page useful in understanding the concepts you refer to. I will leave it to the better informed members of the group to talk about the packages that relate to this field.

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: Wednesday, 23 July 2003 8:10 AM
To: [EMAIL PROTECTED]
Subject: [R] Boosting,bagging and bumping. Questions about R tools and predictions.

I'm interested in further understanding the differences in using many classification trees to improve classification rates. I'm also interested in finding out what I can do in R and which methods will allow prediction. Can anybody point me to a citation or discussion?

_
Tom Mulholland
Senior Policy Officer
WA Country Health Service
189 Royal St, East Perth, WA, 6004
Tel: (08) 9222 4062
e-mail: [EMAIL PROTECTED]

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
[R] Recode from 2 variables
I am trying to create a new variable which uses the suburb names if HR and HRRES are the same but which uses HRRES if they are different. Any assistance would be appreciated as my brain has just packed up. I'm not sure I can teach myself any more new tricks this afternoon.

          HR         HRRES           SUBURB        What I am trying to get
    954   Wheatbelt  Great Southern  ALBANY        Great Southern
    3177  Wheatbelt  Wheatbelt       ARDATH        Ardath
    3564  Wheatbelt  Metro           ARMADALE      Metro
    3825  Wheatbelt  Wheatbelt       ARTHUR RIVER  Arthur River
    5049  Wheatbelt  South West      AUSTRALIND    South West
    5445  Wheatbelt  Wheatbelt       BABAKIN       Babakin
    5769  Wheatbelt  Wheatbelt       BADGINGARRA   Badgingarra
    6093  Wheatbelt  Wheatbelt       BAKERS HILL   Bakers Hill
    7065  Wheatbelt  Wheatbelt       BALLIDU       Ballidu
    9396  Wheatbelt  Metro           BAYSWATER     Metro
    9657  Wheatbelt  Wheatbelt       BEACON        Beacon
    12492 Wheatbelt  Wheatbelt       BENCUBBIN     Bencubbin
    13122 Metro      Metro           BENTLEY       Bentley
    13788 Metro      Metro           BEVERLEY      Beverley
    14436 Metro      Metro           BINDI BINDI   Bindi Bindi
    14517 Metro      Metro           BINDOON       Bindoon
    16218 Metro      Wheatbelt       BODALLIN      Wheatbelt

_
Tom Mulholland
Senior Policy Officer
WA Country Health Service
189 Royal St, East Perth, WA, 6004
Tel: (08) 9222 4062
e-mail: [EMAIL PROTECTED]

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
RE: [R] Recode from 2 variables
Thank you. When I received the reply it dawned on me why my attempts had been unsuccessful: HR, HRRES and SUBURB are all factors. It's time to go home. I was at this solution an hour ago and somehow missed it. I've now got it working.

-----Original Message-----
From: Petr Pikal [mailto:[EMAIL PROTECTED]
Sent: Thursday, 17 July 2003 4:27 PM
To: Mulholland, Tom
Cc: [EMAIL PROTECTED]
Subject: Re: [R] Recode from 2 variables

Hi

On 17 Jul 2003 at 16:17, Mulholland, Tom wrote:
> I am trying to create a new variable which uses the suburb names if HR and HRRES are the same but which uses HRRES if they are different. Any assistance would be appreciated as my brain has just packed up. I'm not sure I can teach myself any more new tricks this afternoon.

Something like

    ifelse(HR==HRRES,suburb,HRRES)

should help.

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
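The thread does not show the version that finally worked, so the following is only a sketch of one way the factor issue might be handled: convert the factors to character before comparing and before passing them to ifelse(). The five rows are copied from the posted table; `title_case` is a hypothetical helper (its pattern is the one given in the examples of the `gsub` help page) added to get "Ardath" from "ARDATH".

```r
region <- data.frame(
  HR     = factor(c("Wheatbelt", "Wheatbelt", "Wheatbelt", "Metro", "Metro")),
  HRRES  = factor(c("Great Southern", "Wheatbelt", "Metro", "Metro", "Wheatbelt")),
  SUBURB = factor(c("ALBANY", "ARDATH", "ARMADALE", "BENTLEY", "BODALLIN"))
)

# Capitalise the first letter of each word and lower-case the rest.
title_case <- function(x) gsub("(\\w)(\\w*)", "\\U\\1\\L\\2", x, perl = TRUE)

# Work on character vectors: factors with different level sets cannot be
# compared with ==, and ifelse() on factors hands back integer codes.
hr     <- as.character(region$HR)
hrres  <- as.character(region$HRRES)
suburb <- as.character(region$SUBURB)

region$NEW <- ifelse(hr == hrres, title_case(suburb), hrres)
```

`region$NEW` then holds the suburb name where the two regions agree and the HRRES value where they differ, matching the "What I am trying to get" column for these rows.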
RE: [R] postscript/eps label clipping
I guess I was wrong there. However it does seem that it will come down to fontsize 9 without clipping (or if it does clip, I find it hard to see).

-----Original Message-----
From: Mulholland, Tom
Sent: Friday, 11 July 2003 1:38 PM
To: David Forrest; [EMAIL PROTECTED]
Subject: RE: [R] postscript/eps label clipping

Never having used postscript as an output method, I looked to see what you were talking about. I noted that ps.options needs to be called before calling postscript. ps.options does have pointsize within it and, silly though it may seem, it's what I would do next.

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
RE: [R] postscript/eps label clipping
Never having used postscript as an output method, I looked to see what you were talking about. I noted that ps.options needs to be called before calling postscript. ps.options does have pointsize within it and, silly though it may seem, it's what I would do next.

_
Tom Mulholland
Senior Policy Officer
WA Country Health Service
189 Royal St, East Perth, WA, 6004
Tel: (08) 9222 4062
e-mail: [EMAIL PROTECTED]

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: Friday, 11 July 2003 1:17 PM
To: [EMAIL PROTECTED]
Subject: [R] postscript/eps label clipping

The following code produces an eps file with the tops of each of the ylabs clipped off.

    par(mfrow=c(2,2))
    plot(runif(10), ylab="Function(Lengthy Expression)", xlab="Prediction")
    plot(runif(10), ylab=expression(Delta * Beta^2), xlab="Prediction")
    plot(runif(10), ylab="Function(Lengthy Expression)", xlab="Prediction")
    plot(runif(10), ylab=expression(Delta * Beta^2), xlab="Prediction")
    dev.print(postscript, file="foo.eps", horizontal=FALSE, onefile=FALSE,
              paper="special", pointsize=7, width=5, height=4)

?postscript seems to indicate paper="special", width=, height=, and pointsize= are the recommended way to produce nice latex graphics. If I don't set a pointsize, the letters aren't clipped, but the graphs are tiny with respect to the x/y labels. Is there something else I should be adjusting instead?

Thanks for your time,
Dave
--
Dave Forrest (434)924-3954w(111B) (804)642-0662h (804)695-2026p
[EMAIL PROTECTED] http://mug.sys.virginia.edu/~drf5n/

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
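One way to experiment with the settings discussed in this thread is to open the postscript device directly and draw into it, rather than copying a screen plot with dev.print. This is a sketch only: pointsize 9 is taken from the follow-up message, while the margin values and the temporary file name are assumptions, and the thread does not establish what actually causes the clipping.

```r
eps_file <- tempfile(fileext = ".eps")

# Same device arguments as in the question, but with pointsize = 9.
postscript(eps_file, horizontal = FALSE, onefile = FALSE, paper = "special",
           width = 5, height = 4, pointsize = 9)

# A slightly wider left margin gives the y labels more room (assumed values).
par(mfrow = c(2, 2), mar = c(4, 5, 1, 1))
for (i in 1:2) {
  plot(runif(10), ylab = "Function(Lengthy Expression)", xlab = "Prediction")
  plot(runif(10), ylab = expression(Delta * Beta^2), xlab = "Prediction")
}
dev.off()
```

Changing `pointsize` and `mar` one at a time and viewing the resulting file should show which of the two matters for the labels.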
RE: [R] Generating a vector for breaks in a histogram
My gut feeling is that stacked dotplots would have given you the same insight. In general terms it's about getting the right tool for the right job. My comment was about the order of choosing rather than about ignoring histograms totally.

If I recall correctly, the article about dot plots was about old fashioned hand drawn dot plots where dots were either stacked above each other or, if more appropriate, next to each other as near as possible to where they should be located on the axis. This results in a pattern that looks very similar to the histogram. The argument being made, if I recall correctly, is that if you choose the wrong bins for a histogram you may well end up with the same type of result that you had with the density plot.

My practical way of looking at this is to look at what happens to the overall shape of the histogram when you change the bins. The issue is how quickly and reliably you get to the truth using the various techniques. As you've noted, the density plot doesn't seem to deal with some types of data as well as it does others. So when I am looking at data I use a variety of methods, and histograms come later than rugplots or density plots, but I tend to do both of those together.

I'm just learning and welcome guidance in a field that I do not claim expertise in.

_
Tom Mulholland
Senior Policy Officer
WA Country Health Service
189 Royal St, East Perth, WA, 6004
Tel: (08) 9222 4062
e-mail: [EMAIL PROTECTED]

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
RE: [R] Generating a vector for breaks in a histogram
One of my discoveries while learning the art of R is that time has moved on since I did my basic statistics in school (although to my dismay the teaching of statistics in school appears not to have noticed the movement). I have seen a few cases where people want to pie chart something and the advice is to find a better way.

I've been reading some of the ash work (see the package of the same name and loads of papers on the web), and also some interesting work on dot plots as an alternative to histograms. They make me feel that unless the data that you have in both histograms accidentally works well with the same set of bins, you may not get the comparative assessment that you think you are getting.

I am beginning to form the opinion that in most cases (if not all) there are better alternatives to histograms.

_
Tom Mulholland
Senior Policy Officer
WA Country Health Service
189 Royal St, East Perth, WA, 6004
Tel: (08) 9222 4062
e-mail: [EMAIL PROTECTED]

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
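On the thread's subject of generating a breaks vector: if two histograms are to be compared, one approach is to build a single set of bin edges from the combined range and pass it to both. A minimal sketch with simulated data; the samples and the choice of roughly 12 bins are assumptions for illustration.

```r
set.seed(1)
a <- rnorm(200, mean = 10, sd = 2)
b <- rnorm(150, mean = 12, sd = 3)

# One vector of bin edges that spans both samples.
breaks <- pretty(range(c(a, b)), n = 12)

# plot = FALSE returns the counts without opening a graphics device.
ha <- hist(a, breaks = breaks, plot = FALSE)
hb <- hist(b, breaks = breaks, plot = FALSE)

# Alternatives mentioned in this thread, for interactive use:
# stripchart(round(a), method = "stack")   # stacked dot plot
# plot(density(a)); rug(a)                 # density estimate with a rug
```

Re-running with a different `n` is a quick way to see how much the apparent shape depends on the bins, which is the concern raised above.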
[R] Strip location and grid colour in Lattice
I am probably missing something quite obvious, but any help would be appreciated.

I am continually getting people misreading the lattice plots because they are expecting the strip (with the factor names in it) to be below the graph. Is there any way of achieving this?

Secondly, on a more personal note, I find the grid formed by the axes to be a bit overpowering and would like to make it a little less bold by changing it to a grey of some kind. I can't see that the scales options have anything in there that I could use. I can change the label colours and tick marks, but then I draw a blank.

While I'm on a roll, I find that quite often I have to resort to the at and labels sections of the scales argument to get my tickmarks looking OK. This seems to be when I am producing line graphs with one of the scales being a date (POSIXct). What is not clear to me is whether all POSIXct variables are the same. The xyplot doco indicates that the at co-ordinates should be native co-ordinates. Can anyone point me to where in the voluminous documentation one looks to understand what this means? I have found that on some occasions the co-ordinates are in seconds (as the documentation on POSIXct states), but this afternoon I found that the values seemed to be in years. Which wasn't a problem, other than that I wish I could understand what was actually happening.

For the years example, when the data is originally imported the years came in as integers.

    str(rbd)
    `data.frame':   541 obs. of  6 variables:
     $ Year    : int  1993 1994 1995 1996 1997 1998 1999 2000 2001 1993 ...
     $ Hosp    : Factor w/ 75 levels "ALBANY HOSP..",..: 23 23 23 23 23 23 23 23 23 28 ...
     $ Beddays : int  2431 2507 2201 2985 2702 2461 2535 2970 3271 1246 ...
     $ HD      : Factor w/ 21 levels "Avon HD.","Bunb..",..: 10 10 10 10 10 10 10 10 10 10 ...
     $ HR      : Factor w/ 6 levels "Goldfields-..",..: 3 3 3 3 3 3 3 3 3 3 ...
     $ HospCode: int  127 127 127 127 127 127 127 127 127 128 ...
Thinking that I needed a date, I promptly put

    rbd$Year <- as.POSIXct(ISOdate(rbd$Year,6,30))

then onwards and forwards:

    for (h in levels(rbd$HR)){
      HRData <- subset(rbd,HR==h)
      # drop the factor levels that are unused after subsetting
      HRData$CommnDesc <- HRData$CommnDesc[,drop=T]
      # FormatLabels is my own helper; reformat each level for the strip
      temp <- c((FormatLabels(levels(HRData$CommnDesc)[1],20)))
      for (j in 2:length(levels(HRData$CommnDesc))){
        temp <- c(temp,FormatLabels(levels(HRData$CommnDesc)[j],20))
      }
      levels(HRData$CommnDesc) <- temp
      p1 <- bwplot(Beddays~Year |CommnDesc,HRData,
                   panel = panel.linejoin,
                   horizontal=F,bty="n",
                   as.table=T,
                   par.strip.text=list(lines=3.5,cex=0.8,style=1),
                   main=paste(h,"Inpatient Beddays"),
                   scales=list(x=list(cex=0.8,rot=90,
                                      at=c(2,6,9),
                                      labels=c("94","98","01"),col="navy"))
      )
      print(p1)
      savePlot(file=paste(OutputPath,"Inpatient beddays -(lattice) by region",h," ",j,sep=""),type="wmf")
    }

Of course there are a few things in here that are probably not the right way to do things, but I tend to be more interested in the output than in whether or not my programming is up to speed. Dropping unused factor levels when subsetting the data has, however, been a little bugbear of mine. I've noticed subset options as I've been going through assorted bits and pieces, but there never seems to be enough time to follow up.
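A small sketch of the two points raised here, using made-up toy data in place of rbd (the data frame d and its values are assumptions for illustration). First, subset() keeps unused factor levels, and re-creating the factor drops them. Second, as far as I can tell bwplot with horizontal=F treats the x variable as categorical, in which case its native co-ordinates are the level positions 1, 2, 3, ... rather than seconds or years; that would explain why at=c(2,6,9) lines up with 1994, 1998 and 2001 when the levels run from 1993 to 2001.

```r
# Toy data standing in for rbd; the values are made up for illustration
d <- data.frame(HR = factor(c("A", "A", "B", "C")),
                Year = c(1993L, 1994L, 1993L, 1994L))

# subset() keeps all the original levels, used or not
sub <- subset(d, HR == "A")
n_before <- nlevels(sub$HR)

# Re-creating the factor drops the unused levels
sub$HR <- factor(sub$HR)
n_after <- nlevels(sub$HR)

# A categorical variable is positioned by its level index
yearf <- factor(d$Year)
pos <- as.integer(yearf)

# Two-digit labels built from the level names
lab <- substr(levels(yearf), 3, 4)
```

Later versions of R also provide droplevels(), which does the same job for every factor in a data frame at once.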
This is in striking contrast to a previous attempt (most of that code, however, is at home, not here), but the functions that I worked out for the at and labels arguments were

    # label values: a sequence of dates from the first to the last, in steps of jumpnum days
    ProcLab <- function(DateData,breakNum){
      maxplot <- round(as.numeric(max(DateData)),digits=0)
      minplot <- round(as.numeric(min(DateData)),digits=0)
      maxplotnum <- round(((maxplot-minplot)/86400)+1,digits=0)   # span in days
      jumpnum <- (maxplotnum/((breakNum)-1))*.98
      lablist <- seq(min(DateData),max(DateData),jumpnum*86400)
    }

    # tick positions: day offsets from zero, in steps of jumpnum days
    ProcAt <- function(DateData,breakNum){
      maxplot <- round(as.numeric(max(DateData)),digits=0)
      minplot <- round(as.numeric(min(DateData)),digits=0)
      maxplotnum <- round(((maxplot-minplot)/86400)+1,digits=0)   # span in days
      jumpnum <- (maxplotnum/((breakNum)-1))*.98
      atlist <- seq(0,maxplotnum,jumpnum)
    }

The kludges (the +1 and the .98) were in because without them the whole thing fell over, presumably because I would have needed to set the limits as well.

Tom Mulholland
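One possible way to get evenly spaced date breaks without the scaling kludge, offered only as a sketch: seq() on POSIXct values accepts length.out, so the tick positions and their labels can come from the same vector. The names dates, break_num, at_list and lab_list are made up for this example, and the dates are invented to match the 1993 to 2001 range above.

```r
# Invented dates covering the same range of years as the data
dates <- ISOdate(1993:2001, 6, 30)

# Number of breaks wanted on the axis
break_num <- 4

# Evenly spaced POSIXct values from the first date to the last
at_list <- seq(min(dates), max(dates), length.out = break_num)

# Two-digit year labels taken from the same positions
lab_list <- format(at_list, "%y")
```

Because the positions are genuine POSIXct values, they are in the same units as a POSIXct axis variable and should be usable directly as at, with lab_list as labels.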