[R] Learn R in a Day - new ebook
Dear all,

I'd like to make you aware of my new ebook - Learn R in a Day - which provides the reader with key programming skills through an examples-oriented approach and is ideally suited for academics, scientists, mathematicians and engineers.

Amazon.com: http://www.amazon.com/Learn-R-Day-Steven-Murray-ebook/dp/B00GC2LKOK/ref=sr_1_1?s=digital-text&ie=UTF8&qid=1393005750&sr=1-1&keywords=learn+r+in+a+day
Amazon UK: http://www.amazon.co.uk/Learn-R-Day-Steven-Murray-ebook/dp/B00GC2LKOK

The book assumes no prior knowledge of computer programming and progressively covers all the essential steps needed to become confident and proficient in using R within a day. Topics include how to input, manipulate, format, iterate (loop), query, perform basic statistics on, and plot data, via a step-by-step technique and demonstrations using in-built datasets which the reader is encouraged to replicate on their computer. Each chapter also includes exercises (with solutions) to practice key skills and empower the reader to build on the essentials gained during this introductory course.

Steve Murray

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
[R] Summing data frame columns on identical data
Dear all,

I have 9 data frames, and I'm simply trying to sum the values of column 3 (on a row-by-row basis). However, there is a slightly different number of rows in each data frame, so I'm receiving the following error:

"Error in Ops.data.frame(mrunoff_207101[3], mrunoff_207102[3]) : + only defined for equally-sized data frames".

Here is what I'm attempting to do:

> arunoff_2071 <- cbind(mrunoff_207101[1:2], (mrunoff_207101[3] +
    mrunoff_207102[3] + mrunoff_207103[3] + mrunoff_207104[3] + mrunoff_207105[3]
    + mrunoff_207106[3] + mrunoff_207107[3] + mrunoff_207108[3] +
    mrunoff_207109[3]))

Is there an easy way of summing based on congruent values in columns 1 and 2? The only way I can think of would be to use merge, but this would involve doing this for every pair of data frames.

The data for each data frame look like this:

> head(mrunoff_207101)
  Latitude Longitude          FPC
1     5.75      0.25 0.0112384744
2     6.25      0.25 0.0019959067
3     6.75      0.25 0.0003245941
4     7.25      0.25 0.0011973676
5     7.75      0.25 0.0001062602
6     8.25      0.25 0.0451578423

Any suggestions on how to achieve this easily will be very welcome.

Many thanks,

Steve
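One way to avoid merging every pair of data frames is to stack them and sum FPC within each Latitude/Longitude pair using aggregate(). The sketch below is only an illustration: the three small data frames are hypothetical stand-ins for mrunoff_207101 to mrunoff_207109, and it assumes that a cell missing from some data frames should still be summed over the data frames in which it does appear.

```r
# Hypothetical stand-ins for the nine monthly data frames (unequal row counts)
df1 <- data.frame(Latitude = c(5.75, 6.25, 6.75), Longitude = 0.25, FPC = c(1, 2, 3))
df2 <- data.frame(Latitude = c(5.75, 6.25), Longitude = 0.25, FPC = c(10, 20))
df3 <- data.frame(Latitude = c(6.25, 6.75, 7.25), Longitude = 0.25, FPC = c(100, 200, 300))

# Stack all data frames on top of each other
stacked <- do.call(rbind, list(df1, df2, df3))

# Sum FPC for every distinct Latitude/Longitude combination
totals <- aggregate(FPC ~ Latitude + Longitude, data = stacked, FUN = sum)
totals <- totals[order(totals$Latitude, totals$Longitude), ]
print(totals)
```

If only the cells present in every data frame are wanted instead, Reduce() with merge(..., by = c("Latitude", "Longitude")) would be the alternative to explore.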
[R] Grouped bars in barplot
Dear all,

I am trying to make a barplot with clustered pairs of bars, using class=numeric data and the following command:

barplot(c(bline_precip[10,9], bline_runoff[10,9], cccma_precip[10,9],
    cccma_runoff[10,9], csiro_precip[10,9], csiro_runoff[10,9],
    ipsl_precip[10,9], ipsl_runoff[10,9], mpi_precip[10,9], mpi_runoff[10,9],
    ncar_precip[10,9], ncar_runoff[10,9], ukmo_precip[10,9], ukmo_runoff[10,9]),
    beside=TRUE, space=c(0,2))

This results in all bars being packed tightly together, with no gap between each pair. I suspect the problem is something to do with the data not being a matrix, but I've tried using as.matrix for each data element and this doesn't seem to work. If anyone has any suggestions I'd be very grateful to hear them.

Also, I'm hoping to put a label beneath each pair of bars on the x-axis, in the centre. At present I can only get labels to appear directly underneath a single bar, as opposed to the centre of the pair of bars. Does anyone have any suggestions for solving this?

Many thanks for any help offered.

Steve
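A possible sketch of the grouped layout: barplot() only groups bars when it receives a matrix with one column per group, so the fourteen values can be arranged as a 2 x 7 matrix rather than a single vector. The numbers below are hypothetical placeholders for the bline_precip[10,9], bline_runoff[10,9], ... values, and the row and column names are made up for illustration.

```r
# Hypothetical values; each column is one model, rows are precip and runoff
vals <- matrix(
  c(10, 4, 12, 5, 9, 3, 11, 6, 8, 2, 13, 7, 10, 5),
  nrow = 2,
  dimnames = list(c("precip", "runoff"),
                  c("bline", "cccma", "csiro", "ipsl", "mpi", "ncar", "ukmo"))
)

# beside = TRUE draws the two rows of each column side by side;
# space = c(0, 2) means no gap inside a pair and two bar widths between pairs
mids <- barplot(vals, beside = TRUE, space = c(0, 2), axisnames = FALSE)

# barplot() returns the bar midpoints as a 2 x 7 matrix, so the centre of
# each pair is the mean of its column
group_centres <- colMeans(mids)
axis(1, at = group_centres, labels = colnames(vals), tick = FALSE)
```

With a matrix input, leaving axisnames at its default would also place the column names under the centre of each group; the explicit axis() call is shown only to make the positions visible.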
[R] Alignment of lines within barplot bars
Dear all,

I have a barplot upon which I hope to superimpose horizontal lines extending across the width of each bar. I am able to partly achieve this through the following set of commands:

positions <- barplot(bar_values, col="grey")
par(new=TRUE)
plot(positions, horiz_values, col="red", pch="_", ylim=c(min(bar_values), max(bar_values)))

...however this results in small, off-centred lines, which don't extend across the width of each bar. I've tried using 'cex' to increase the width, but of course this also increases the height of the line and results in it spanning a large range of y-axis values.

I'm sure this shouldn't be too tricky to achieve, nor that uncommon a problem! It may be that I'm taking the wrong approach.

Any help offered would be gratefully received.

Many thanks,

Steve
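One alternative to overlaying a second plot is to draw the lines in the coordinate system of the existing barplot with segments(), using the midpoints that barplot() returns. This is a minimal sketch with hypothetical bar_values and horiz_values; it assumes the default bar width of 1, so each line spans half a unit either side of the midpoint.

```r
# Hypothetical data
bar_values   <- c(5, 8, 3, 6)
horiz_values <- c(4, 6, 2, 5)

# barplot() returns the x positions of the bar centres
positions <- barplot(bar_values, col = "grey")

# Default bar width is 1, so a bar runs from centre - 0.5 to centre + 0.5
half_width <- 0.5
segments(x0 = positions - half_width, y0 = horiz_values,
         x1 = positions + half_width, y1 = horiz_values,
         col = "red", lwd = 2)
```

Because no second plot() call is involved, the lines share the y-axis of the bars and par(new=TRUE) is not needed.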
Re: [R] Unwanted boxes in legend
Ok thanks Peter! That should solve it.

Thanks again for looking into this.

Steve

> Date: Wed, 21 Apr 2010 05:04:33 -0600
> From: ehl...@ucalgary.ca
> To: smurray...@hotmail.com
> CC: r-help@r-project.org; tgstew...@gmail.com
> Subject: Re: [R] Unwanted boxes in legend
>
> On 2010-04-21 4:35, Peter Ehlers wrote:
>> The 'border' argument was added in 2.1.10.
>
> Egad! Did I really type that?
> I meant 'in R 2.10.0'.
>
> -Peter Ehlers
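For reference, the suggestion from this thread can be sketched as follows on a version of R that has the 'border' argument (R 2.10.0 or later, according to Peter's correction). The empty plot and the "topleft" position are assumptions made only so that the example is self-contained; the original call positioned the legend at (10, par("usr")[4]).

```r
# Empty plot purely to have something to attach the legend to
plot(1:10, type = "n")

# fill = NA leaves an entry's box unfilled and border = NA suppresses its
# outline, so only the third entry ("C") gets a visible box
leg <- legend("topleft",
              legend = c("A", "B", "C", "D"),
              fill   = c(NA, NA, "grey28", NA),
              border = c(NA, NA, "black", NA),
              pch    = c(16, 4, NA, 18),
              col    = c("red", "blue", "grey28", "yellow"),
              bty    = "n")
```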
Re: [R] Unwanted boxes in legend
Thanks Peter,

I'm using version 2.8.0 (2008-10-20). This version should be recent enough to pick up fundamentals such as this, right? I guess the obvious thing to do is update and try again...

Cheers,

Steve

> Date: Tue, 20 Apr 2010 11:55:20 -0600
> From: ehl...@ucalgary.ca
> To: tgstew...@gmail.com
> CC: smurray...@hotmail.com; r-help@r-project.org
> Subject: Re: [R] Unwanted boxes in legend
>
> On 2010-04-19 8:11, Thomas Stewart wrote:
>> Try border=c(0,0,1,0).
>> -tgs
>
> If the 'border' argument is not recognized, then this won't work.
>
> Steve:
> What version of R are you using? I have no problems with the
> suggestions I gave you in R 2.10.1 or R 2.11.0 alpha.
>
> -Peter Ehlers
Re: [R] Unwanted boxes in legend
Dear all,

Thanks for the response, however I'm getting the following error message when I execute the legend command using the 'border' argument:

Error in legend(10, par("usr")[4], c("A", "B", :
  unused argument(s) (border = FALSE)

Is anyone aware of any alternative means of switching off boxes around all but one of the elements in a legend?

Many thanks for any input,

Steve

> Date: Thu, 15 Apr 2010 12:13:40 -0600
> From: ehl...@ucalgary.ca
> To: smurray...@hotmail.com
> CC: r-help@r-project.org
> Subject: Re: [R] Unwanted boxes in legend
>
> On 2010-04-15 11:10, Steve Murray wrote:
>> What am I doing wrong above and how do I correct it?
>
> Add the 'border' argument:
>
> either
>
> border = FALSE # in which case no box is drawn for any element
>
> or
>
> border = c(NA, NA, "black", NA)
>
> -Peter Ehlers
[R] Unwanted boxes in legend
Dear all,

I am using the following code to generate a legend in my plot (consisting of both bars and points), but end up with boxes around my points:

legend(10, par("usr")[4], c("A", "B", "C", "D"), fill=c(NA,NA, "grey28", NA),
    pch=c(16,4,NA,18), col=c("red","blue","grey28","yellow"), lty=FALSE,
    bty="n", horiz=FALSE)

I want a box around the third element of the legend (to represent the bar 'fill' colour), but not for the others, where points are shown instead.

What am I doing wrong above and how do I correct it?

Many thanks,

Steve
[R] Alignment of x-axis labels
Dear all,

I'm having trouble getting the correct spacing between x-axis labels on a barplot. This is the command I'm using to generate the plot:

temp <- barplot(precip, beside=TRUE, xaxt="n", las=1, xpd=FALSE, col="grey28", ylim=c(0, max(precip)))

Here is the structure of temp:

> str(temp)
 num [1:96, 1] 0.7 1.9 3.1 4.3 5.5 6.7 7.9 9.1 10.3 11.5 ...

And here is the structure of the data being plotted:

> str(precip)
 num [1:96] 1841 2871 9254 22335 30682 ...
> length(precip)
[1] 96

These are monthly data points for 8 years (8 * 12 = 96), but I only want to have labels for each year (1978 to 1985), rather than every month. So I tried using the following command, but this results in the labels not being far enough apart, and therefore they don't fill the length of the x-axis (and don't align properly with the corresponding first bar of every year):

axis(1, at=seq(1,96,12), 1978:1985)

This one has stumped me somewhat, so I'd be grateful to receive any suggestions as to how I might resolve this.

Many thanks,

Steve
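The str(temp) output above shows that the bar centres are 1.2 units apart (0.7, 1.9, 3.1, ...), so the positions 1, 13, 25, ... used in axis() are bar indices rather than x coordinates. A sketch of one fix is to index the midpoints returned by barplot() instead. The precip values here are randomly generated placeholders, not the real data.

```r
# Hypothetical monthly values: 8 years x 12 months
set.seed(1)
precip <- runif(96, min = 1000, max = 30000)

temp <- barplot(precip, beside = TRUE, xaxt = "n", las = 1, xpd = FALSE,
                col = "grey28", ylim = c(0, max(precip)))

# x coordinate of the first bar of each year (bars 1, 13, 25, ...)
year_starts <- temp[seq(1, 96, by = 12)]
axis(1, at = year_starts, labels = 1978:1985)
```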
Re: [R] Summing data based on certain conditions
Dear all,

Thanks for the contributions so far. I've had a look at these and the closest I've come to solving it is the following:

> data_ave <- ave(data$rammday, by=c(data$month, data$year))
Warning messages:
1: In split.default(x, g) :
  data length is not a multiple of split variable
2: In split.default(seq_along(x), f, drop = drop, ...) :
  data length is not a multiple of split variable

I'm slightly confused by the warning message, as the data lengths do appear the same:

> dim(data)
[1] 1073 6
> length(data$year)
[1] 1073
> length(data$month)
[1] 1073

Maybe the approach I'm taking is wrong. Any suggestions would be gratefully received.

Many thanks,

Steve

> Date: Wed, 31 Mar 2010 23:31:25 +0200
> From: stephan.kola...@gmx.de
> To: smurray...@hotmail.com
> CC: r-help@r-project.org
> Subject: Re: [R] Summing data based on certain conditions
>
> ?by may also be helpful.
>
> Stephan
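A likely explanation for the warning: ave() has the form ave(x, ..., FUN = mean), with the grouping variables passed separately through '...'. It has no 'by' argument, so c(data$month, data$year) is treated as one grouping vector of length 2146, which does not match the 1073 values in x. The sketch below passes the two grouping variables separately; the small data frame (named rain here) is a hypothetical subset of the rows shown in the original post.

```r
# Hypothetical subset: March and April 1988 from the posted data
rain <- data.frame(
  year    = rep(1988, 9),
  month   = c(3, 3, 3, 3, 4, 4, 4, 4, 4),
  rammday = c(1.43, 2.86, 5.06, 18.76, 4.49, 8.57, 31.18, 19.67, 3.14)
)

# Grouping variables go in as separate arguments, not concatenated
rain$month_mean <- ave(rain$rammday, rain$year, rain$month)
print(rain)
```

Note that ave() returns one value per original row (the group mean repeated), so a further step such as aggregate() is still needed to get one row per month.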
[R] Summing data based on certain conditions
Dear all,

I have a dataset of 1073 rows, the first 15 of which look as follows:

> data[1:15,]
        date year month day rammday thmmday
1   3/8/1988 1988     3   8    1.43    0.94
2  3/15/1988 1988     3  15    2.86    0.66
3  3/22/1988 1988     3  22    5.06    3.43
4  3/29/1988 1988     3  29   18.76   10.93
5   4/5/1988 1988     4   5    4.49    2.70
6  4/12/1988 1988     4  12    8.57    4.59
7  4/16/1988 1988     4  16   31.18   22.18
8  4/19/1988 1988     4  19   19.67   12.33
9  4/26/1988 1988     4  26    3.14    1.79
10  5/3/1988 1988     5   3   11.51    6.33
11 5/10/1988 1988     5  10    5.64    2.89
12 5/17/1988 1988     5  17   37.46   20.89
13 5/24/1988 1988     5  24    9.86    9.81
14 5/31/1988 1988     5  31   13.00    8.63
15  6/7/1988 1988     6   7    0.43    0.00

I am looking for a way by which I can create monthly totals of rammday (rainfall in mm/day; column 5) by doing the following:

For each case where the month value and the year are the same (e.g. 3 and 1988, in the first four rows), find the mean of the corresponding rammday values and then multiply by the number of days in that month (i.e. 31 in this case).

Note however that the number of month values in each case isn't always the same (e.g. in this subset of data, there are 4 values for month 3, 5 for month 4 and 5 for month 5). Also the months will of course recycle for the following years, so it's not simply a case of finding a monthly total for *all* the 3s in the whole dataset, just those associated with each year in turn.

How would I go about doing this in R?

Any help will be gratefully received.

Many thanks,

Steve
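The calculation described above (mean of rammday per year and month, multiplied by the number of days in that month) could be sketched with aggregate(). The data frame below reproduces the year, month and rammday columns of the 15 posted rows; the names rain, days_in_month and total_mm are introduced here for illustration only.

```r
rain <- data.frame(
  year    = rep(1988, 15),
  month   = c(3, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 6),
  rammday = c(1.43, 2.86, 5.06, 18.76, 4.49, 8.57, 31.18, 19.67, 3.14,
              11.51, 5.64, 37.46, 9.86, 13.00, 0.43)
)

# Days in a month = first day of the next month minus first day of this month
# (this handles leap years without a lookup table)
days_in_month <- function(year, month) {
  year  <- as.integer(year)
  month <- as.integer(month)
  first      <- as.Date(sprintf("%d-%02d-01", year, month))
  next_year  <- ifelse(month == 12L, year + 1L, year)
  next_month <- ifelse(month == 12L, 1L, month + 1L)
  next_first <- as.Date(sprintf("%d-%02d-01", next_year, next_month))
  as.numeric(next_first - first)
}

# Mean daily rainfall for each year/month combination
monthly <- aggregate(rammday ~ year + month, data = rain, FUN = mean)
monthly <- monthly[order(monthly$year, monthly$month), ]

# Scale the mean daily value up to a monthly total
monthly$total_mm <- monthly$rammday * days_in_month(monthly$year, monthly$month)
print(monthly)
```

Grouping on both year and month keeps, for example, March 1988 separate from March 1989.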
[R] Legends for line and point data
Dear all,

I have a plot which contains 4 data series displayed using solid, dashed and dotted lines, and also points.

How do I use lty and pch together to signify that the first legend item is a solid line, the second is point data (pch=16), the third is dashed and the fourth is dotted?

Many thanks,

Steve
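A minimal sketch of one way to mix the two: give legend() an lty vector and a pch vector of the same length, and use NA wherever an entry should have no line or no symbol. The series names and the empty plot are placeholders.

```r
# Empty plot purely to have something to attach the legend to
plot(1:10, type = "n")

leg <- legend("topleft",
              legend = c("Series 1", "Series 2", "Series 3", "Series 4"),
              lty = c(1, NA, 2, 3),     # solid, none, dashed, dotted
              pch = c(NA, 16, NA, NA))  # only the second entry gets a point
```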
[R] Creation of a new data frame
Dear all,

I have a data frame of 18556 rows and 19 columns and wish to create a new grid from these data of dimensions 360 rows and 720 columns.

The existing data frame is set up so that every 38 rows makes up one row of the new data frame, with 2 NA values at the end of each 'block' that should be removed before being inserted into the new grid.

So to be clear, row 1 of the new data frame consists of rows 1:38 of the existing data frame, with the 2 NA values on the end removed. Row 2 is 39:76 and so on.

How do I go about creating this new 360 x 720 grid from existing data?

Many thanks,

Steve
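The block structure described above (38 rows x 19 columns = 722 values per block, of which the last 2 are NA padding, leaving 720) can be sketched on a scaled-down example. Everything below is hypothetical: blocks of 2 rows x 3 columns with 2 trailing NAs stand in for the real 38 x 19 blocks, and the variable names are made up.

```r
block_rows <- 2   # 38 in the real data
n_cols     <- 3   # 19 in the real data
pad        <- 2   # trailing NA values to drop from each block

# Two blocks of two rows each; the last two values of each block are NA
df <- data.frame(V1 = c(1, 4, 5, 8),
                 V2 = c(2, NA, 6, NA),
                 V3 = c(3, NA, 7, NA))

# Read the data frame row by row into one long vector
vals <- as.vector(t(as.matrix(df)))

# One matrix row per block, then cut off the padding columns
blocks <- matrix(vals, ncol = block_rows * n_cols, byrow = TRUE)
grid   <- blocks[, seq_len(block_rows * n_cols - pad), drop = FALSE]
print(grid)
```

Trimming by position rather than with is.na() keeps any genuine NA values that occur inside a block.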
Re: [R] Filling a grid based on existing data
Sorry, that was a poorly worded question. You're right in that the gaps are in fact NAs and I would be proposing to remove these entirely. So it's really a case of filling up a 720 x 360 grid (by row) based on the data in the 18556 rows x 19 columns data frame.

I've tried doing:

data_mat <- matrix(data, nrow=360, ncol=720, byrow=TRUE)
Warning message:
In matrix(river, nrow = 360, ncol = 720, byrow = TRUE) :
  data length [19] is not a sub-multiple or multiple of the number of rows [360]

But this results in a mess!

head(data_mat)
     [,1]          [,2]          [,3]          [,4]          [,5]
[1,] Numeric,18556 Numeric,18556 Numeric,18556 Numeric,18556 Numeric,18556
[2,] Numeric,18556 Numeric,18556 Numeric,18556 Numeric,18556 Numeric,18556
[3,] Numeric,18556 Numeric,18556 Numeric,18556 Numeric,18556 Numeric,18556
[4,] Numeric,18556 Numeric,18556 Numeric,18556 Numeric,18556 Numeric,18556
[5,] Numeric,18556 Numeric,18556 Numeric,18556 Numeric,18556 Numeric,18556
[6,] Numeric,18556 Numeric,18556 Numeric,18556 Numeric,18556 Numeric,18556

If it's of any use, the original data are structured as follows:

str(data)
'data.frame': 18556 obs. of 19 variables:
 $ V1 : num 0 0 0 0 0 0 0 0 0 0 ...
 $ V2 : num 0 0 0 0 0 0 0 0 0 0 ...
 $ V3 : num 0 0 0 0 0 0 0 0 0 0 ...
 $ V4 : num 0 0 0 0 0 0 0 0 0 0 ...
 $ V5 : num 0 0 0 0 0 0 0 0 0 0 ...
 $ V6 : num 0 0 0 0 0 0 0 0 0 0 ...
 $ V7 : num 0 0 0 0 0 0 0 0 0 0 ...
 $ V8 : num 0 0 0 0 0 0 0 0 0 0 ...
 $ V9 : num 0 0 0 0 0 0 0 0 0 0 ...
 $ V10: num 0 0 0 0 0 0 0 0 0 0 ...
 $ V11: num 0 0 0 0 0 0 0 0 0 0 ...
 $ V12: num 0 0 0 0 0 0 0 0 0 0 ...
 $ V13: num 0 0 0 0 0 0 0 0 0 0 ...
 $ V14: num 0 0 0 0 0 0 0 0 0 0 ...
 $ V15: num 0 0 0 0 0 0 0 0 0 0 ...
 $ V16: num 0 0 0 0 0 0 0 0 0 0 ...
 $ V17: num 0 0 0 0 0 0 0 0 0 0 ...
 $ V18: num 0 0 0 0 0 0 0 0 0 0 ...
 $ V19: num 0 0 0 0 0 0 0 0 0 0 ...

Don't worry about all the zeros - there are plenty of greater values later on!

Any help would be gratefully received.

Thanks,

Steve

> From: dwinsem...@comcast.net
> To: smurray...@hotmail.com
> Subject: Re: [R] Filling a grid based on existing data
> Date: Wed, 24 Mar 2010 20:50:40 -0400
>
> On Mar 24, 2010, at 2:34 PM, Steve Murray wrote:
>
>> The problem in this case is that not all rows in the
>> initial data frame are complete (there are gaps).
>
> Perhaps it could become clearer. Dataframes in R do not have "gaps".
> They may have NA's but all the columns and rows have _something_.
> And ... with what were you proposing to Fill these "gaps".
>
> --
> David
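The "Numeric,18556" cells are probably a consequence of matrix() treating the data frame as a list of 19 column vectors, so each matrix cell ends up holding a whole column. A sketch of one way round this is to convert to a numeric matrix first, read it row by row, drop the NAs and refill. The tiny data frame and target dimensions below are hypothetical; with the real data the target would be nrow = 360, ncol = 720, and this only works if the number of non-NA values matches the grid size exactly.

```r
# Hypothetical data frame with one NA "gap" at the end
df <- data.frame(V1 = c(1, 4, 7),
                 V2 = c(2, 5, 8),
                 V3 = c(3, 6, NA))

# as.matrix() gives a plain numeric matrix; t() + as.vector() reads it by row
vals <- as.vector(t(as.matrix(df)))

# Remove the NA padding entirely
vals <- vals[!is.na(vals)]

# Refill the target grid row by row
grid <- matrix(vals, nrow = 2, ncol = 4, byrow = TRUE)
print(grid)
```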
[R] Filling a grid based on existing data
Dear all,

I currently have a data frame of dimensions 18556 rows by 19 columns. I want to convert this into a grid of dimensions 720 rows by 360 columns. The problem in this case is that not all rows in the initial data frame are complete (there are gaps).

Therefore I am perhaps looking for a way of filling a 720 x 360 grid by reading in all values in each row until one is encountered which does not have 19 columns. In these cases, the row in the new grid should be filled (as the gaps occur every 720 values), and filling should re-start on the next row of the new grid. I hope this is reasonably clear!

Many thanks for any help,

Steve
[R] Calculating mean for a number of columns
Dear all,

I am attempting to perform what should be a relatively simple calculation on a number of data frame columns. I am hoping to find the average on a per-row basis for each of the 50 columns. If on a particular row a 'NA' value is encountered, then this should be ignored and the mean formed on the basis of the other rows.

However, I'm finding that the end result for each row is identical (when it shouldn't be). I suspect I'm making a syntax error, but can't seem to spot it...

> PDSI_Jan <- cbind(pdsi_195101[,1:2], mean(c(pdsi_195101[,3],
    pdsi_195201[,3], pdsi_195301[,3], pdsi_195401[,3], pdsi_195501[,3],
    pdsi_195601[,3], pdsi_195701[,3], pdsi_195801[,3], pdsi_195801[,3],
    pdsi_195901[,3], pdsi_196001[,3], pdsi_196101[,3], pdsi_196201[,3],
    pdsi_196301[,3], pdsi_196401[,3], pdsi_196501[,3], pdsi_196601[,3],
    pdsi_196701[,3], pdsi_196801[,3], pdsi_196901[,3], pdsi_197001[,3],
    pdsi_197101[,3], pdsi_197201[,3], pdsi_197301[,3], pdsi_197401[,3],
    pdsi_197501[,3], pdsi_197601[,3], pdsi_197701[,3], pdsi_197801[,3],
    pdsi_197901[,3], pdsi_198001[,3], pdsi_198101[,3], pdsi_198201[,3],
    pdsi_198301[,3], pdsi_198401[,3], pdsi_198501[,3], pdsi_198601[,3],
    pdsi_198701[,3], pdsi_198801[,3], pdsi_198901[,3], pdsi_199001[,3],
    pdsi_199101[,3], pdsi_199201[,3], pdsi_199301[,3], pdsi_199401[,3],
    pdsi_199501[,3], pdsi_199601[,3], pdsi_199701[,3], pdsi_199801[,3],
    pdsi_199901[,3], pdsi_21[,3])), na.rm=TRUE)

The object structure for each of the data frames being used is as follows:

> str(pdsi_195101)
'data.frame': 2756 obs. of 3 variables:
 $ Latitude : chr "-48.75" "-51.25" "-53.75" "-48.75" ...
 $ Longitude: Factor w/ 144 levels "-178.75","-176.25",..: 1 1 1 2 3 3 4 6 6 6 ...
 $ PDSI : num 4.7 -1.94 -1.29 -0.68 -0.66 -0.49 -0.51 2.52 3.68 4.17 ...

Many thanks for any help offered,

Steve
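Two things in the call above would explain identical values on every row: mean(c(...)) pools all the columns into one long vector and returns a single grand mean, and na.rm=TRUE sits outside the mean() call, so it is passed to cbind() instead. A sketch of a per-row alternative uses rowMeans() on a matrix with one column per year. The three small data frames are hypothetical stand-ins for the pdsi_* objects, and the sketch assumes every data frame has the same rows in the same order.

```r
# Hypothetical stand-ins for three of the yearly January data frames
pdsi_a <- data.frame(Latitude = c("-48.75", "-51.25"),
                     Longitude = c("-178.75", "-178.75"),
                     PDSI = c(4.7, -1.94))
pdsi_b <- transform(pdsi_a, PDSI = c(2.3, NA))
pdsi_c <- transform(pdsi_a, PDSI = c(NA, 0.06))

# Collect the PDSI column of every data frame into one matrix (rows x years)
frames    <- list(pdsi_a, pdsi_b, pdsi_c)
pdsi_cols <- sapply(frames, function(d) d$PDSI)

# Row-wise mean, ignoring NA values within each row
PDSI_Jan <- cbind(pdsi_a[, 1:2],
                  PDSI_mean = rowMeans(pdsi_cols, na.rm = TRUE))
print(PDSI_Jan)
```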
Re: [R] Resampling a grid to coarsen its resolution
That sounds like a sensible way of dealing with the - values...

...but doesn't solve the more important question of how to perform the resampling. Are there any functions in R which have been designed to achieve this? Or is there a standard way of going about this?

Many thanks for any advice,

Steve
[R] Resampling a grid to coarsen its resolution
Dear all,

I have a grid (data frame) dataset at 0.5 x 0.5 degrees spatial resolution (720 columns x 360 rows; regular spacing) and wish to coarsen this to a resolution of 2.5 x 2.5 degrees. A simple calculation which takes the mean of a block of points to form the regridded values would do the trick. Values which should be excluded from the calculation are - (unless all points within a block are -, in which case - should be returned as the 'new' cell).

How would I go about achieving this in R? Any help or guidelines would be very much appreciated.

Many thanks,

Steve
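Going from 0.5 to 2.5 degrees means averaging 5 x 5 blocks, which turns a 360 x 720 grid into 72 x 144. One way to sketch this in base R is to label every cell with its block row and block column and take the mean per label with tapply(). The missing-value code is not legible in the post, so the sketch assumes the excluded values have already been recoded to NA; the function name coarsen and the 4 x 4 test matrix are made up for illustration.

```r
# Average f x f blocks of a numeric matrix; NA values are ignored unless a
# whole block is NA, in which case the coarse cell is NA
coarsen <- function(m, f) {
  stopifnot(nrow(m) %% f == 0, ncol(m) %% f == 0)
  row_block <- (row(m) - 1) %/% f   # block index of each cell's row
  col_block <- (col(m) - 1) %/% f   # block index of each cell's column
  out <- tapply(m, list(row_block, col_block), function(v) {
    if (all(is.na(v))) NA_real_ else mean(v, na.rm = TRUE)
  })
  unname(out)
}

# Small hypothetical grid: 4 x 4 coarsened by a factor of 2
m <- matrix(as.numeric(1:16), nrow = 4)
m[1, 1] <- NA                      # one missing cell in the first block
coarse <- coarsen(m, 2)
print(coarse)
```

For the grid in the post the call would be coarsen(as.matrix(grid_df), 5), where grid_df is a placeholder for the 360 x 720 data frame.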
Re: [R] Counting by rows based on multiple criteria
Spot on, thanks very much indeed.

Steve
[R] Counting by rows based on multiple criteria
Dear all, I have a data frame of 6 columns and ~6 rows which I hope to perform the following calculation on. For each row, I wish to determine whether there are a greater number of positive or negative numbers. Then, if there are more positive numbers in the row, count how many occur - but if there are more negative numbers in the row, count them instead and insert a minus symbol before (e.g. -4, to denote that this was determined on negative numbers). If the numbers of positive and negative values in a given row are equal, then return a zero for that row. Finally, if all values in a row are zero (i.e. neither positive nor negative) then return -9. I've had a go at this myself for some considerable time, trying the use of an 'apply' (by rows) function, but seem unable to successfully achieve the above. Any help would be very gratefully received. Many thanks, Steve
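The rules above can be sketched as a small function applied by row. This is only an illustration: `sign_count` and the data frame `example` are invented names and values, not taken from the thread.

```r
# Sketch: signed count of the majority sign in each row.
# `sign_count` and `example` are hypothetical.
sign_count <- function(x) {
  n_pos <- sum(x > 0, na.rm = TRUE)
  n_neg <- sum(x < 0, na.rm = TRUE)
  if (n_pos == 0 && n_neg == 0) return(-9)   # every value is zero
  if (n_pos > n_neg) return(n_pos)           # more positives: plain count
  if (n_neg > n_pos) return(-n_neg)          # more negatives: negated count
  0                                          # equal numbers of each
}

example <- data.frame(a = c(1, -1, 0,  2),
                      b = c(2, -2, 0, -2),
                      c = c(-3, -3, 0, 0),
                      d = c(4,  4, 0,  0))
result <- apply(example, 1, sign_count)
```

Checking the all-zero case first matters, because a row of zeros also has equal (zero) counts of positives and negatives and would otherwise be reported as a tie.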
[R] ifelse on a series of rows for multiple criteria
Dear all,

I am attempting to perform a calculation which counts the number of positive (or negative) values based on the sample mean (on a per-row basis). If the mean is > 0 then only positive values should be counted, and if the mean is < 0 then only negative values should be counted. In cases where the mean is equal to zero, the value -9 should be returned. The following is an example of the data frame I'm working on (all values are of class 'numeric').

> head(combdframe)
         V1       V2        V3         V4       V5        V6
1 -328.0999 3404.038  791.7602  211.23932 513.0479 -1178.079
2 -383.4249 3207.306  808.7268  141.20352 424.2388 -1164.402
3 -295.9050 2930.754  918.1146   11.74804 464.2448 -1133.109
4 -326.8606 2703.638 1052.2824 -104.17344 246.2851 -1103.887
5 -296.7194 2663.987 1202.7648  -87.15331 255.1338 -1090.147
6 -227.1772 2619.096 1343.1529  -75.89626 381.6089 -1064.733

The mean of the first row is 571 and therefore a count of the positive values should be returned for this row (=4). *If* the mean was -571, then a count of the negative values would be returned (=2). If the 7th row was composed of values 1.5, -1.5, 2.5, -2.5, 0 and 0 (i.e. the mean = 0), then -9 should be returned for this row.

I've attempted to construct this code as follows:

direction_func <- function(combdframe) {
  ifelse(mean(i> 0), sum(i> 0), ifelse(mean(i < 0), sum(i < 0), -9))
}

for (i in nrow(combdframe)) {
  direction <- apply(combdframe[i,],1, direction_func)
}

...but this, and variants on this, result in a bit of a mess! Any guidance on how to perform this (whether it be a correction of the above or a whole new approach) would be very much appreciated.

Many thanks,
Steve
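A minimal sketch of one way to get the behaviour described above. The function works on its own argument (one row at a time) rather than on the loop variable, and `apply` already visits every row, so no outer `for` loop is needed. The three-row data frame is a stand-in: row 1 is the first row from the post, row 2 is the same row with the signs flipped, and row 3 is the zero-mean example.

```r
# Sketch: count positives if the row mean is > 0, negatives if it is < 0,
# and return -9 when the mean is exactly zero.
direction_func <- function(x) {
  m <- mean(x)
  if (m > 0) sum(x > 0) else if (m < 0) sum(x < 0) else -9
}

combdframe <- data.frame(
  V1 = c(-328.0999,   328.0999,  1.5),
  V2 = c(3404.038,  -3404.038,  -1.5),
  V3 = c( 791.7602,  -791.7602,  2.5),
  V4 = c( 211.23932, -211.23932, -2.5),
  V5 = c( 513.0479,  -513.0479,  0),
  V6 = c(-1178.079,  1178.079,   0))

direction <- apply(combdframe, 1, direction_func)
```

Testing a floating-point mean for exact equality with zero is fragile in general; with real data a small tolerance might be safer, though that is a judgment call the post does not address.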
[R] Finding ranges on a per-row basis from several objects
Dear all,

I'm attempting to find the overall range of values from column 5 of a series of data frames, on a per-row basis, and assign the results to a new object. At present, I'm only able to receive the overall range of all values, whereas I'm intending to get the results of the range for each corresponding row. This is what I have so far:

for (i in seq(nrow(dframe[1]))) range_models <- rbind(range(c(dframe1[i,5]), range(dframe2[i,5]), range(dframe3[i,5]), range(dframe4[i,5]), range(dframe5[i,5]), range(dframe6[i,5]))) }

If it's of any help, the structure of each of the data frames is:

> str(dframe1)
'data.frame': 61538 obs. of 6 variables:
 $ Latitude    : num -0.25 -0.25 -0.25 -0.25 -0.25 -0.25 -0.25 -0.25 -0.25 -0.25 ...
 $ Longitude   : num -48.8 -49.2 -49.8 -50.2 -50.8 ...
 $ Runoff_fut  : num 1549 1335 1254 1066 ...
 $ Runoff_bline: num 1877 1719 1550 1438 1362 ...
 $ Difference  : num -328 -383 -296 -327 -297 ...
 $ PerctDiff   : num -17.5 -22.3 -19.1 -22.7 -21.8 ...

So, just to clarify, I'm trying to find the range of values for row 1 based on dframe1 through to dframe6, then row 2, row 3 etc. and put the range results for each row into a new object with the same number of rows as the dframe objects.

Many thanks for any help offered,
Steve
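One way to approach the per-row range, sketched with three tiny stand-in data frames instead of the six real ones. The helper `make_frame` and the list `frames` are hypothetical; the idea is to bind column 5 of every data frame side by side and then take the minimum and maximum across each row.

```r
# Sketch: per-row range of column 5 across several data frames.
# `make_frame` and `frames` are hypothetical; values are made up.
make_frame <- function(diff) {
  data.frame(Latitude = -0.25, Longitude = c(-48.8, -49.2),
             Runoff_fut = 0, Runoff_bline = 0, Difference = diff)
}
frames <- list(make_frame(c(-328, -383)),
               make_frame(c(10, -400)),
               make_frame(c(-50, 5)))

# One column per data frame, one row per grid point
col5 <- do.call(cbind, lapply(frames, function(d) d[[5]]))

range_models <- data.frame(min = apply(col5, 1, min),
                           max = apply(col5, 1, max))
```

This assumes the rows of all data frames refer to the same locations in the same order; if they do not, the frames would need to be aligned first.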
Re: [R] Isolating a column within a loop
Thanks! I've learnt something there! Steve
[R] Isolating a column within a loop
Dear all,

I am trying to calculate the mean of one column for many data frames. The code I am using is as follows:

> for (i in seq(nrow(index))) {
    assign(paste(model, "_mean_",index$year[i], index$month[i], sep=''), mean(get(paste(model, index$year[i], index$month[i], "[,3]", sep=''))))
  }
Error in get(paste(model, index$year[i], index$month[i], "[,3]", sep = "")) :
  object 'cccma207101[,3]' not found

The error message I'm getting is strange, because this object does exist:

> str(cccma207101[,3])
 num [1:61538] 0.687 2.661 0 0 0 ...

If I leave the "[,3]" out of the loop, the code seems to work fine - so I'm isolating this as the cause of the problem for now. My question is therefore, how can I use the above code to extract column three from each of the (360) data frames?

Many thanks for any help offered,
Steve
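The error arises because `get()` looks up an object by its exact name, and no object is literally called "cccma207101[,3]". A sketch of the usual workaround is to fetch the data frame first and index the result afterwards. The two small data frames and the two-row `index` below are stand-ins for the 360 real ones.

```r
# Sketch: look the object up by name, then take column 3 of the result.
# The data frames and `index` here are small hypothetical stand-ins.
model <- "cccma"
cccma207101 <- data.frame(a = 1:3, b = 4:6, c = c(0.5, 1.5, 2.5))
cccma207102 <- data.frame(a = 1:3, b = 4:6, c = c(2, 4, 6))
index <- data.frame(year = c(2071, 2071), month = c("01", "02"),
                    stringsAsFactors = FALSE)

means <- numeric(nrow(index))
for (i in seq_len(nrow(index))) {
  name <- paste(model, index$year[i], index$month[i], sep = "")
  means[i] <- mean(get(name)[, 3])   # get() returns the data frame; [, 3] indexes it
}
names(means) <- paste(model, "_mean_", index$year, index$month, sep = "")
```

Collecting the results in one named vector, rather than creating 360 separate objects with `assign`, is a design choice made here for convenience; `assign(name, value)` would work in the same place if separate objects are really wanted.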
Re: [R] R Output and ArcGIS
Dear all,

Just to let you know that thanks to your help, I've managed to solve it. For future reference, if anyone's interested (!), if you're having problems reading R-generated data from a Mac into ArcMap on a PC, then ensure that you're using eol="\r\n" in the write.table command and that you don't have factor or character data when they're really meant to be numeric! To overcome the latter, I did:

mrunoff$Longitude <- as.numeric(levels(mrunoff$Longitude))[mrunoff$Longitude]

Hope this is of use to someone, somewhere, someday!

Thanks again for your advice,
Steve
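The two fixes described above can be put together roughly as follows. The two-row `mrunoff` data frame mirrors the column types reported elsewhere in the thread; `out_file` is a temporary path, and `row.names = FALSE` and `quote = FALSE` are assumptions of this sketch rather than settings confirmed in the post.

```r
# Sketch: convert text/factor coordinates to numeric, then write a
# tab-delimited file with Windows line endings.
mrunoff <- data.frame(
  Latitude  = c("5.75", "6.25"),            # character, as in the thread
  Longitude = factor(c("0.25", "0.25")),    # factor, as in the thread
  Runoff    = c(0.687, 2.661),
  stringsAsFactors = FALSE)

mrunoff$Latitude  <- as.numeric(mrunoff$Latitude)
# Convert via the levels: as.numeric() directly on a factor would
# return the internal integer codes instead of the values
mrunoff$Longitude <- as.numeric(levels(mrunoff$Longitude))[mrunoff$Longitude]

out_file <- tempfile(fileext = ".txt")      # hypothetical output path
write.table(mrunoff, out_file, sep = "\t", eol = "\r\n",
            row.names = FALSE, quote = FALSE)   # assumed settings
```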
Re: [R] R Output and ArcGIS
Dear all,

Thanks for the replies so far. Just to emphasise, I'm not using Excel in any way. I have many many files to output, so it'd take considerable time to export from R, reprocess in Excel, then load into Arc! On a PC I'm able to go directly from R to ArcMap (9.3) without having to go via Excel. I've simply been viewing the data in Notepad, which was fine for observing the removal of the end-of-line characters and general format of the data (3 columns).

My data are structured as follows:

> str(mrunoff)
'data.frame': 61538 obs. of 3 variables:
 $ Latitude : chr "5.75" "6.25" "6.75" "7.25" ...
 $ Longitude: Factor w/ 720 levels "0.25","0.75",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ Runoff   : num 0.687 2.661 0 0 0 ...

I can't use col.names=NA, as I do have column names! These are also required by Arc as identifiers. Also, as you can see, there are no complications in the variable names which, as you rightly say, can cause problems in Arc.

If anyone has any further suggestions regarding how to overcome this problem of generating data from R on a Mac for input to ArcGIS on a PC, then I'd be very grateful to hear them.

Many thanks again,
Steve
[R] R Output and ArcGIS
Dear all, I've been using R on a Mac to process some data for export to ArcMap GIS (which only runs on Windows). ArcMap seems to require tab-delimited data (my data are in 3 columns), so I've been using the sep="\t" argument. However, this resulted in strange end-of-line characters when displayed on a PC. I looked in the write.table help file to find that eol="\r\n" can be used to get around this problem, and it does indeed prevent these unwanted characters from appearing. However, the data still aren't properly recognised by ArcMap when created on a Mac (the exact same code works fine from a PC). In ArcMap, when I select 'Display XY Data', the X (Longitude) and Y (Latitude) columns aren't available to select. It's as if Arc isn't correctly interpreting the output from R - this is despite me using col.names=TRUE in the write.table command. Any light shed on this will be very gratefully received. Many thanks for your help, Steve
[R] Looping multiple dimensions
Dear all,

I have 30 arrays, each with dimensions 720,360,12. The naming format for each of these 30 objects is: mrunoff_5221, mrunoff_5222... mrunoff_5250. For example:

> str(mrunoff_5221)
 num [1:720, 1:360, 1:12] NA NA NA NA NA NA NA NA NA NA ...

(the initial NA's are nothing to worry about)

I am looking for a way by which I can extract each slice along the third dimension of these grids (1:12) in turn, along with the first and second dimensions, to create new objects in the following style:

#2071
mrunoff_207101 <- mrunoff_5221[,,1]
mrunoff_207102 <- mrunoff_5221[,,2]
mrunoff_207103 <- mrunoff_5221[,,3]
mrunoff_207104 <- mrunoff_5221[,,4]
mrunoff_207105 <- mrunoff_5221[,,5]
...(etc. - up to [,,12])

#2072
mrunoff_207201 <- mrunoff_5222[,,1]
mrunoff_207202 <- mrunoff_5222[,,2]
mrunoff_207203 <- mrunoff_5222[,,3]
mrunoff_207204 <- mrunoff_5222[,,4]
mrunoff_207205 <- mrunoff_5222[,,5]
...(etc. - up to [,,12])

and mrunoff_ continues to 2100 and 5250 respectively.

Clearly, this is a cumbersome and non-sustainable way to proceed! There will be 360 new objects in total, and I imagine that there must be a more effective way of achieving this, either via a loop or, possibly, one of the 'apply' functions. Yet my attempts to date have so far resulted in... well, a complete mess!

If anyone has any suggestions as to a more efficient means of achieving this, then I'd be very grateful to hear them.

Many thanks,
Steve
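A nested loop is one way to generate the names and slices described above. The sketch uses two small stand-in arrays (2 x 3 x 12) rather than thirty 720 x 360 x 12 ones, and stores the slices in a named list called `monthly`, which is a hypothetical name.

```r
# Sketch: split each yearly array into 12 named monthly slices.
# Stand-in arrays; in the post these are 720 x 360 x 12.
mrunoff_5221 <- array(seq_len(72), dim = c(2, 3, 12))
mrunoff_5222 <- mrunoff_5221 + 100

src_ids <- 5221:5222      # 5221:5250 in the post
years   <- 2071:2072      # 2071:2100 in the post

monthly <- list()
for (k in seq_along(src_ids)) {
  arr <- get(paste("mrunoff_", src_ids[k], sep = ""))
  for (m in 1:12) {
    name <- sprintf("mrunoff_%d%02d", years[k], m)   # e.g. mrunoff_207101
    monthly[[name]] <- arr[, , m]
    # assign(name, arr[, , m]) would create a separate object instead
  }
}
```

Keeping the 360 grids in one list tends to make later steps easier, since functions such as `lapply` can then be run over all of them at once.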
[R] Testing the significance of gradients in trends
Dear all, I want to determine if the slopes of the trends I have in my plot are significantly different from each other (I have 2 time-series trends). What statistical test is most suitable for this purpose and is it available in the R base package? Many thanks, Steve
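One commonly used approach, offered here only as a sketch, is to fit a single linear model with an interaction between time and a series indicator: the interaction coefficient estimates the difference between the two slopes, and its t-test asks whether that difference is distinguishable from zero. The data below are simulated purely for illustration, and the sketch ignores autocorrelation, which is often present in time series and can make the p-value too optimistic.

```r
# Sketch: test whether two trend lines have different slopes using an
# interaction term. All data are simulated; names are hypothetical.
set.seed(1)
t <- 1:30
d <- data.frame(
  time   = rep(t, 2),
  series = factor(rep(c("A", "B"), each = 30)),
  value  = c(2.0 * t + rnorm(30), 0.5 * t + rnorm(30)))

fit <- lm(value ~ time * series, data = d)

# Row "time:seriesB" holds the slope difference (B minus A), its
# standard error, t statistic and p-value
slope_test <- summary(fit)$coefficients["time:seriesB", ]
```

`lm` is part of the stats package that ships with R, so no additional packages are needed for this.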
[R] Testing for strength of fit using R
Dear all,

I am trying to validate a model by comparing simulated output values against observed values. I have produced a simple x-y scatter plot with a 1:1 line, so that the closer the points fall to this line, the better the 'fit' between the modelled data and the observation data.

I am now attempting to quantify the strength of this fit by using a statistical test in R. I am no statistics guru, but from my limited understanding, I suspect that I need to use the Chi Squared test (I am more than happy to be corrected on this though!). However, this results in the following:

> chisq.test(data$Simulation,data$Observation)

        Pearson's Chi-squared test

data:  data$Simulation and data$Observation
X-squared = 567, df = 550, p-value = 0.2989

Warning message:
In chisq.test(data$Simulation, data$Observation) :
  Chi-squared approximation may be incorrect

The ?chisq.test document suggests that the objects should be of vector or matrix format, so I tried the following, but still receive a warning message (and different results):

> chisq.test(as.matrix(data[,4:5]))

        Pearson's Chi-squared test

data:  as.matrix(data[, 4:5])
X-squared = 130.8284, df = 26, p-value = 6.095e-16

Warning message:
In chisq.test(as.matrix(data[, 4:5])) :
  Chi-squared approximation may be incorrect

What am I doing wrong and how can I successfully measure how well the simulated values fit the observed values?

If it's of any help, here is how my data are structured - note that I am only using columns 4 and 5 (Observation and Simulation).

> str(data)
'data.frame': 27 obs. of 5 variables:
 $ Location        : Factor w/ 27 levels "Australia","Brazil",..: 8 2 13 19 22 14 16 23 6 7 ...
 $ Vegetation      : Factor w/ 21 levels "Beech","Broadleaf evergreen laurel",..: 17 21 2 16 15 16 9 16 3 4 ...
 $ Vegetation.Class: Factor w/ 4 levels "Boreal and Temperate Evergreen",..: 3 3 4 1 1 1 4 1 4 1 ...
 $ Observation     : num 24 8.9 14.7 26.7 42.4 31.7 30.8 7.5 14 22 ...
 $ Simulation      : num 33.9 7.8 9.74 7.6 11.8 10.7 12 28.1 1.7 1.7 ...

I hope someone is able to point me in the right direction.

Many thanks,
Steve
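Pearson's chi-squared test is designed for count data in contingency tables, which is probably why it behaves oddly on continuous measurements like these. As a hedged alternative, agreement between simulated and observed values is often summarised with measures such as the root-mean-square error, the mean bias and the correlation. The sketch below uses only the first ten value pairs visible in the `str(data)` output above, so the numbers it produces describe that subset and not the full 27 observations.

```r
# Sketch: simple goodness-of-fit summaries for simulated vs observed
# values, using the ten pairs shown in the str() output.
obs <- c(24, 8.9, 14.7, 26.7, 42.4, 31.7, 30.8, 7.5, 14, 22)
sim <- c(33.9, 7.8, 9.74, 7.6, 11.8, 10.7, 12, 28.1, 1.7, 1.7)

rmse <- sqrt(mean((sim - obs)^2))   # typical size of the error
bias <- mean(sim - obs)             # negative means under-prediction
r    <- cor(obs, sim)               # linear association only

# cor.test(obs, sim) gives a significance test for the correlation
```

A high correlation alone does not show that points lie near the 1:1 line, since a model can be strongly correlated with the observations and still be biased, which is why the RMSE and bias are reported alongside it.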
Re: [R] Subset returning unexpected result
Bill,

It seems to be 'character' - odd...!

> str(int1901$Latitude)
 chr [1:61537] "5.75" "6.25" "6.75" "7.25" "7.75" "8.25" ...

Thanks again,
Steve

> What does str(int1901) show to be the type for Latitude? (I'm guessing
> it's a factor.)
>
> --
> David Winsemius, MD
> Heritage Laboratories
> West Hartford, CT
Re: [R] Subset returning unexpected result
Dear Bill and all, Yep you were right - for some strange reason (I'm not sure how...), the latitude data were of class 'character' instead of 'numeric'. I've put that right and that seems to have fixed it! Many thanks for your help, Steve
[R] Subset returning unexpected result
Dear all,

I am attempting to subset a data frame based on a range of latitude values. I want to extract the values of 'interception' where latitude ranges between 50 and 60. I am doing this using the following code, yet it doesn't return the results I expected:

> test <- subset(int1901, Latitude>=50 & Latitude <60, select=c(Latitude, Interception))
> head(test)
   Latitude Interception
2      6.25   0.04725863
3      6.75  67.02455139
82    50.75  51.74784088
83    51.25  57.04327774
84    51.75  51.51020432
85    52.25  53.30662537

As you can see, latitude values outside the 50 to 60 range have been retained (e.g. the top two rows of 'test'). Why is this, and how can I ensure that I subset the data as initially intended?

Many thanks for any help offered,
Steve
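The behaviour above is what tends to happen when the Latitude column holds text rather than numbers: the comparison is then done character by character, so a string such as "6.25" can satisfy both conditions because it starts with "6". A sketch of the fix, converting the column before subsetting, is given below. The six-row data frame is a stand-in, and the first and last Interception values are made up.

```r
# Sketch: convert a character latitude column to numeric, then subset.
# Stand-in data; the first and last Interception values are invented.
int1901 <- data.frame(
  Latitude     = c("5.75", "6.25", "6.75", "50.75", "51.25", "60.25"),
  Interception = c(0.1, 0.04725863, 67.02455139,
                   51.74784088, 57.04327774, 12.0),
  stringsAsFactors = FALSE)

int1901$Latitude <- as.numeric(int1901$Latitude)

test <- subset(int1901, Latitude >= 50 & Latitude < 60,
               select = c(Latitude, Interception))
```

Running `str(int1901)` before subsetting is a quick way to spot a column that has been read in as `chr` or `Factor` when it should be `num`.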
Re: [R] Trendline for a subset of data
Thanks Mark, the reg.line trick seemed to work really well. David - hopefully the hex-text will have gone now - if not, please accept my apologies as, this is, as far as I know, the first time this has happened. If it hasn't gone, then I'm afraid I'm a little clueless as to how to remove it! Thanks again both of you, Steve
Re: [R] Trendline for a subset of data
Thanks for flagging up the 'segments' command. However, I'm having trouble getting it to work - this is probably due to me misunderstanding the documentation for this command.

The plot section of my script appears as follows:

test3 <- 36:45
plot(data_means, type="b", pch=4, ylab="", xlab="", xaxt="n", yaxt="n", col=3, ylim=c(250,380))
abline(lm(data_means[36:45] ~ test3), lty=2)

As you can see, I'm only plotting points 36 to 45 from the object 'data_means'. This intentionally results in much white space along the x-axis until point 36 is reached. However, when inserting the trendline, this runs along the entire length of the x-axis instead of just through points 36 to 45. Ideally, a line which slightly overshoots the data subset would look the best, but one constrained to the extents of the 36:45 would also do the job just fine.

So, my question is, how do I use 'segments' (or otherwise) to create a linear trendline which only extends through points 36:45? (and overshoots at either end very slightly, if possible, rather than running along the entire length of the x-axis).

Many thanks again,
Steve

> CC: r-help@r-project.org
> From: dwinsem...@comcast.net
> To: smurray...@hotmail.com
> Subject: Re: [R] Trendline for a subset of data
> Date: Fri, 9 Oct 2009 09:27:43 -0400
>
> On Oct 9, 2009, at 5:50 AM, Steve Murray wrote:
>
>> Dear all,
>>
>> I am using abline(lm ...) to insert a linear trendline through a
>> portion of my data (e.g. dataset[,36:45]). However, I am finding
>> that whilst the trendline is correctly displayed and representative
>> of the data portion I've chosen, the line continues to run beyond
>> this data segment and continues until it intersects the vertical
>> axes at each side of the plot.
>>
>> How do I display the line so that it only runs between point 36 and
>> 45 (as shown in the example above) and doesn't continue to display a
>> line throughout the rest of the plot space?
>
> ?segments
>
>> Many thanks,
>>
>> Steve
>
> David Winsemius, MD
> Heritage Laboratories
> West Hartford, CT
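A sketch of how `segments` could be used for this: fit the regression on the subset, compute the fitted line's height at two x positions just outside the subset, and draw a straight segment between those two points. The `data_means` vector here is simulated, and the overshoot of half a unit at each end is an arbitrary choice.

```r
# Sketch: draw a regression line over a limited x range with segments().
# `data_means` is simulated; the 0.5 overshoot is arbitrary.
set.seed(42)
data_means <- 300 + cumsum(rnorm(45))
test3 <- 36:45

fit <- lm(data_means[test3] ~ test3)

# End points of the line, slightly beyond the fitted range
x_ends <- c(min(test3) - 0.5, max(test3) + 0.5)
y_ends <- coef(fit)[1] + coef(fit)[2] * x_ends

plot(data_means, type = "b", pch = 4, ylab = "", xlab = "", col = 3)
segments(x_ends[1], y_ends[1], x_ends[2], y_ends[2], lty = 2)
```

Unlike `abline`, which always spans the whole plotting region, `segments(x0, y0, x1, y1)` draws only between the two points it is given.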
[R] Trendline for a subset of data
Dear all, I am using abline(lm ...) to insert a linear trendline through a portion of my data (e.g. dataset[,36:45]). However, I am finding that whilst the trendline is correctly displayed and representative of the data portion I've chosen, the line continues to run beyond this data segment and continues until it intersects the vertical axes at each side of the plot. How do I display the line so that it only runs between point 36 and 45 (as shown in the example above) and doesn't continue to display a line throughout the rest of the plot space? Many thanks, Steve
Re: [R] Is there any "month" object like "LETTERS" ?
month.abb should do the trick
[R] Using 'unlist' (incorrectly?!) to collate a series of objects
Dear R Users,

I am attempting to write a new netCDF file (using the ncdf package), using 120 grids I've created and which are held in R's memory. I am reaching the point where I try to put the data into the newly created file, but receive the following error:

> put.var.ncdf(evap_file, evap_dims, unlist(noquote(file_list)))
Error in put.var.ncdf(evap_file, evap_dims, unlist(noquote(file_list))) :
  put.var.ncdf: error: you asked to write 31104000 values, but the passed data array only has 120 entries!

I think I understand why this is: the 120 grids contain 31104000 values in total, however, it seems that only the names of the 120 objects are being passed to the file. Earlier on in the script, I generated the file names using the following code:

> for (i in seq(nrow(index))) {
    file_list[[i]] <- paste(index$month[i], index$year[i], sep='')
    print(file_list[i])
  }

I was hoping therefore, that when I do put.var.ncdf and use the 'unlist' function (see original section of code), that since the data associated with the names of the grids are held in memory, both the names *and data* would be passed to the newly created file. However, it seems that only the names are being recognised.

My question is therefore, is there an easy way of passing all 120 grids, using the naming convention held in file_list, to an object, which can subsequently be used in the put.var.ncdf statement?

Many thanks for any help,
Steve
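`unlist(file_list)` only flattens the list of names, so what reaches `put.var.ncdf` is 120 character strings. One possible remedy, sketched here with two tiny stand-in grids, is to use `mget` to turn the names into the objects they refer to and then combine the values into a single array. The object `stacked` is a hypothetical name, and the sketch stops short of calling the ncdf package.

```r
# Sketch: turn a list of object names into the data they refer to.
# Two small stand-in grids; the post has 120 grids of 720 x 360.
Jan86 <- matrix(1, nrow = 2, ncol = 3)
Feb86 <- matrix(2, nrow = 2, ncol = 3)
file_list <- list("Jan86", "Feb86")

# mget() looks up each name and returns a named list of the objects
grids <- mget(unlist(file_list), envir = environment())

# Flatten in list order and rebuild as a lon x lat x month array
all_values <- unlist(grids, use.names = FALSE)
stacked <- array(all_values, dim = c(dim(Jan86), length(grids)))

# `stacked` (or `all_values`) is the kind of object that would then be
# passed as the data argument in place of the vector of names
```

This relies on every grid having the same dimensions and on `file_list` being in the intended time order.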
[R] Creating a list of combinations
Dear R Users,

I have 120 objects stored in R's memory and I want to pass the names of these many objects to be held as just one single object. The naming convention is month, year in sequence for all months between January 1986 and December 1995 (e.g. Jan86, Feb86, Mar86... through to Dec95). I hope to pass all these names (and their data I guess) to an object called file_list, however, I'm experiencing some problems whereby only the first (and possibly last) names seem to make the list, with the remainder recorded as 'NA' values.

Here is my code as it stands:

index <- expand.grid(month=month.abb, year=seq(from=86,to=95, by=1))
for (i in seq(nrow(index))) {
  file_list <- paste(index$month[i], index$year[i], sep='')
  print(file_list[i])
}

Output is as follows:

[1] "Jan86"
[1] NA
[1] NA
[1] NA
#[continues to row 120 as NA]

> file_list; file_list[i]
[1] "Dec95"
[1] NA

> head(index) # this seems to be working fine
  month year
1   Jan   86
2   Feb   86
3   Mar   86
4   Apr   86
5   May   86
6   Jun   86

Any help on how I can populate file_list correctly with all 120 combinations of month + year (without NAs!) would be gratefully received.

Thanks,
Steve
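In the loop above, `file_list` is overwritten with a single string on every pass, so `file_list[i]` is NA for any `i` greater than 1. Since `paste` is vectorised, a sketch of a loop-free alternative is:

```r
# Sketch: build all 120 month-year names in one vectorised call.
index <- expand.grid(month = month.abb,
                     year  = seq(from = 86, to = 95, by = 1))

# paste() works element by element down the two columns
file_list <- paste(index$month, index$year, sep = "")
```

If a loop is preferred, assigning to `file_list[i]` rather than to `file_list` would also keep every element, provided `file_list` is created before the loop starts.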
[R] Replacing NA values in one column of a data.frame
Dear all, I'm trying to replace NA values with - in one column of a data frame. I've tried using is.na and the testdata[testdata$onecolumn==NA] <- approach, but whilst neither generate errors, neither result in -s appearing - the NAs remain there! I'd be grateful for any advice on what I'm doing wrong or any other suitable approaches. Many thanks, Steve
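Comparing with `== NA` never returns TRUE, because any comparison involving NA is itself NA, which is why that approach silently does nothing. A sketch using `is.na` on the single column follows. The replacement value is not legible in the post, so `fill_value` below is a hypothetical placeholder.

```r
# Sketch: replace NAs in one column only.
# fill_value is a hypothetical placeholder; the value used in the
# post is not shown.
fill_value <- -9999

testdata <- data.frame(onecolumn = c(1.2, NA, 3.4),
                       other     = c(NA, 5, 6))

# Index the column itself, then assign into the NA positions
testdata$onecolumn[is.na(testdata$onecolumn)] <- fill_value
```

NAs in other columns are left untouched, because only `onecolumn` is indexed.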
[R] Reshape package: Casting data to form a grid
Dear R Users,

I'm trying to use the 'cast' function in the 'reshape' package to convert column-format data to gridded-format data. A sample of my dataset is as follows:

> head(finalframe)
  Latitude Longitude Temperature OrigLat  p-value Blaney
1      -90    -38.75          NA  -87.75 17.10167     NA
2      -90    135.75          NA  -87.75 17.10167     NA
3      -90     80.25          NA  -87.75 17.10167     NA
4      -90     95.75          NA  -87.75 17.10167     NA
5      -90     66.75          NA  -87.75 17.10167     NA
6      -90     75.75          NA  -87.75 17.10167     NA

I'm attempting to form a grid based on the OrigLat, Longitude and Blaney columns, to form the rows, columns and values of the new grid respectively. The command I've been using is:

> cast_test <- cast(finalframe, finalframe$OrigLat~variable, finalframe$Longitude~variable, finalframe$Blaney~variable)
Error: Casting formula contains variables not found in molten data: finalframe$OrigLat, variable

And I've tried removing the ~variable suffixes:

> cast_test <- cast(finalframe, finalframe$OrigLat, finalframe$Longitude, finalframe$Blaney)
Error: Casting formula contains variables not found in molten data: -87.75-87.75-87.75-87.75-87.75-87.75-87.75-87.75-87.75-87.75-87.75-87.75-87.75-87.75 [etc etc]

I'm not sure how to get round this error, nor what the 'molten data' is that the error is referring to. I'm assuming it means the data frame presented above, yet the variables are clearly present!

Any help or advice on this would be most welcomed.

Many thanks,
Steve
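The error message suggests that `cast` expects a formula written with bare column names and data that have already been through `melt` (the "molten" form). As an alternative that avoids the reshape package altogether, the same rows-by-columns grid can be sketched in base R with `tapply`. The four-row `finalframe` below is a stand-in, and its Blaney values are made up for illustration.

```r
# Sketch: build a latitude x longitude grid of Blaney values with
# base R. Stand-in data; the Blaney values are invented.
finalframe <- data.frame(
  OrigLat   = c(-87.75, -87.75, -87.25, -87.25),
  Longitude = c(-38.75, 135.75, -38.75, 135.75),
  Blaney    = c(1.1, 2.2, 3.3, 4.4))

# Rows are latitudes, columns are longitudes; mean() only matters if
# a latitude/longitude pair occurs more than once
grid <- tapply(finalframe$Blaney,
               list(finalframe$OrigLat, finalframe$Longitude),
               mean)
```

Combinations of latitude and longitude that never occur in the data come out as NA in the resulting matrix.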
[R] Rounding to the nearest 5
Dear all, A hopefully simple question: how do I round a series of values (held in an object) to the nearest 5? I've checked out trunc, round, floor and ceiling, but these appear to be more tailored towards rounding decimal places. Thanks, Steve
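A common trick, sketched below, is to divide by 5, round to the nearest whole number, and multiply back. The helper name `round_to` is hypothetical.

```r
# Sketch: round to the nearest multiple of `base`.
# `round_to` is a hypothetical helper name.
round_to <- function(x, base = 5) base * round(x / base)

x <- c(1, 3, 7, 12, 18, 22.6)
rounded <- round_to(x)
```

Values that sit exactly halfway between two multiples (such as 2.5 or 12.5) follow the behaviour of R's `round`, which may not always round such ties upwards.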
Re: [R] Assigning values based on a separate reference (lookup) table
Thanks to you both for your responses. I think these approaches will *nearly* do the trick, however, the problem is that the reference/lookup table is based on 'bins' of latitude values, e.g. >61, 60-56, 55-51, 50-46 etc. whereas the actual data (in my 720 x 360 data frame) are not binned, e.g. 89.75, 89.25, 88.75, 88.25, 87.75 etc. - instead they 'increment' by -0.5 each time, and therefore many of the 259200 values which are in the data frame will have latitude values falling into the same 'reference' bin. It's for this reason that I think the 'merge' approach might fall down, unless there's a way of telling 'merge' that latitudes can still be considered to match if they fall within a range. For example, if my 720 x 360 data frame has values whose corresponding latitude (row name) values are, say, 56.3, 55.9, 58.2, 56.8 and 57.3, then the original value in the grid needs to be assigned a 'p' value which corresponds with what is read off of the reference table from the bin 56-60. Hope this makes sense! If not, please feel free to ask for clarification. Many thanks again, Steve
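Matching unbinned latitudes to a binned lookup table can be sketched with `findInterval`, which reports which bin each value falls into given the sorted lower edges of the bins. Everything in the `lookup` table below is hypothetical: the bin edges are simplified and the `p` values are invented, so the real figures from the reference table would need to be substituted.

```r
# Sketch: assign a 'p' value to each latitude from a binned table.
# The bin edges and p values are hypothetical.
lookup <- data.frame(lower = c(45, 50, 55, 60),      # lower edge of each bin
                     p     = c(0.10, 0.20, 0.30, 0.40))

lat <- c(56.3, 55.9, 58.2, 56.8, 57.3, 47.1)

# findInterval() returns i where lower[i] <= lat < lower[i + 1]
bin <- findInterval(lat, lookup$lower)
p_for_lat <- lookup$p[bin]
```

Latitudes below the lowest edge get bin 0 and would be dropped by the indexing, so the table needs to cover the full range of the data (for southern latitudes, either extend the table or look up `abs(lat)` if the reference values are symmetric, which is an assumption to verify).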
[R] Assigning values based on a separate reference (lookup) table
Dear R Users, I have a data frame of 360 rows by 720 columns (259200 values). I am hoping to apply an equation to each value in this grid to generate a new grid. One of the parts of the equation (called 'p') relies on reading from a separate reference table. This is Table 4 at: http://www.fao.org/docrep/s2022e/s2022e07.htm#3.1.3%20blaney%20criddle%20method (scroll down a little). Therefore, 'p' relies on the latitude of the values in the initial 360 x 720 data frame. The row names of the data frame contain the latitude values and these range from 89.75 to -89.75 (the latter being south of the Equator). My question is, how do I go about forming a loop to read each of the 259200 values and assign it a 'p' value (from the associated reference table), based on its latitude? My thinking was to do a series of 'if' statements, but this soon got very, very messy - any ideas which get the job done (and aren't a riddle to follow) would be most welcome. Many thanks for any advice, Steve
[R] ncdf package problem - put.var.ncdf
Dear all,
I am attempting to convert 10 NetCDF files into a single NetCDF file, due to the data input requirements of a model I hope to use. I am using the ncdf package, version 1.6. The data are global-scale water values, on a monthly basis for 10 years (ie. 120 months of data in total; at present the data are separated by year, with 12 months of data in each file - mrunoff_1986 through to mrunoff_1995). For each month there are 720 longitude x 360 latitude values, each with a corresponding runoff value (although some of these may be NAs). My problem is that I'm getting an error with the put.var.ncdf command, as shown below. Here is my code so far:

# READ IN NetCDF FILES FROM DISK
library(ncdf)
year <- 1986:1995
file_list <- paste("mrunoff-",year,".nc", sep="")
file_list
start <- 1986
for (i in file_list) {
assign(paste("netcdf_",start,"_temp", sep=""),open.ncdf(i))
start = start+1
}

# Start converting 10 files into 1
latitude <- get.var.ncdf(netcdf_1986_temp, "Lat")
longitude <- get.var.ncdf(netcdf_1986_temp, "Lon")
month <- get.var.ncdf(netcdf_1986_temp, "Mon")
year <- 1986:1995
mrunoff_1986 <- get.var.ncdf(netcdf_1986_temp, "mrunoff")
mrunoff_1987 <- get.var.ncdf(netcdf_1987_temp, "mrunoff")
mrunoff_1988 <- get.var.ncdf(netcdf_1988_temp, "mrunoff")
mrunoff_1989 <- get.var.ncdf(netcdf_1989_temp, "mrunoff")
mrunoff_1990 <- get.var.ncdf(netcdf_1990_temp, "mrunoff")
mrunoff_1991 <- get.var.ncdf(netcdf_1991_temp, "mrunoff")
mrunoff_1992 <- get.var.ncdf(netcdf_1992_temp, "mrunoff")
mrunoff_1993 <- get.var.ncdf(netcdf_1993_temp, "mrunoff")
mrunoff_1994 <- get.var.ncdf(netcdf_1994_temp, "mrunoff")
mrunoff_1995 <- get.var.ncdf(netcdf_1995_temp, "mrunoff")

# Define variable dimensions
dimx <- dim.def.ncdf("Lon", "deg E", as.double(longitude))
dimy <- dim.def.ncdf("Lat", "deg N", as.double(latitude))
month <- dim.def.ncdf("Mon", "Months: Jan 86=1, Dec 95=120)", 1:120)
year <- dim.def.ncdf("Year", "year", year)

# Assign data: extract mrunoff from each of the 10 files and put into one place
mrunoff_data <- dim.def.ncdf("mrunoff", "mm/month", c(mrunoff_1986, mrunoff_1987, mrunoff_1988, mrunoff_1989, mrunoff_1990, mrunoff_1991, mrunoff_1992, mrunoff_1993, mrunoff_1994, mrunoff_1995))

# Define runoff variable
mrunoff_dims <- var.def.ncdf("mrunoff_out", "mm/month", list(dimx, dimy, month), -.0, "Global Monthly Runoff for 1986-1995", "double")

# Create file
mrunoff_file <- create.ncdf("mrunoff.nc", mrunoff_dims)

# Put mrunoff data into the file
put.var.ncdf(mrunoff_file, mrunoff_dims, mrunoff_data)

# Write to disk
# close.ncdf(mrunoff_file)

However, when I run the code, I get the following error message:

> put.var.ncdf(mrunoff_file, mrunoff_dims, mrunoff_data)
Error in put.var.ncdf(mrunoff_file, mrunoff_dims, mrunoff_data) : put.var.ncdf: error: you asked to write 31104000 values, but the passed data array only has 8 entries!

I can understand where the 31104000 comes from ((720*360)*12)*10, but am confused as to why only 8 values are being passed to put.var.ncdf. I therefore tried doing a couple of tests to shed some light on this:

> length(dimx); length(dimy); length(mrunoff_dims); length(mrunoff_data)
[1] 8
[1] 8
[1] 9
[1] 8
> str(dimx); str(dimy)
List of 8
 $ name : chr "Lon"
 $ units: chr "deg E"
 $ vals : num [1:720] 0.25 0.75 1.25 1.75 2.25 2.75 3.25 3.75 4.25 4.75 ...
 $ len : int 720
 $ id : num -1
 $ unlim: logi FALSE
 $ dimvarid : num -1
 $ create_dimvar: logi TRUE
 - attr(*, "class")= chr "dim.ncdf"
List of 8
 $ name : chr "Lat"
 $ units: chr "deg N"
 $ vals : num [1:360] -89.8 -89.2 -88.8 -88.2 -87.8 ...
 $ len : int 360
 $ id : num -1
 $ unlim: logi FALSE
 $ dimvarid : num -1
 $ create_dimvar: logi TRUE
 - attr(*, "class")= chr "dim.ncdf"
> str(mrunoff_dims); str(mrunoff_data)
List of 9
 $ name: chr "mrunoff_out"
 $ units : chr "mm/month"
 $ missval : num -
 $ longname: chr "Global Monthly Runoff for 1986-1995"
 $ id : num -1
 $ prec: chr "double"
 $ dim :List of 3
 ..$ :List of 8
 .. ..$ name : chr "Lon"
 .. ..$ units: chr "deg E"
 .. ..$ vals : num [1:720] 0.25 0.75 1.25 1.75 2.25 2.75 3.25 3.75 4.25 4.75 ...
 .. ..$ len : int 720
 .. ..$ id : num -1
 .. ..$ unlim: logi FALSE
 .. ..$ dimvarid : num -1
 .. ..$ create_dimvar: logi TRUE
 .. ..- attr(*, "class")= chr "dim.ncdf"
 ..$ :List of 8
 .. ..$ name : chr "Lat"
 .. ..$ units: chr "deg N"
 .. ..$ vals : num [1:360] -89.8 -89.2 -88.8 -88.2 -87.8 ...
 .. ..$ len : int 360
 .. ..$ id : num -1
 .. ..$ unlim: logi FALSE
 .. ..$ dimvarid : num -1
 .. ..$ create_dimvar: logi TRUE
 .. ..- attr(*, "class")= chr "dim.ncdf"
 ..$ :List of 8
 .. ..$ name : chr "Mon"
 .. ..
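A possible reading of the error above: mrunoff_data was built with dim.def.ncdf(), so it is a dimension definition (a list of 8 components, as the length() test shows) rather than an array of runoff values. A minimal sketch of stacking the yearly arrays into one lon x lat x month array, using base R only and small made-up stand-in arrays (the names n_lon, n_lat, yearly and mrunoff_all are assumptions for illustration, not part of the original code):

```r
# Small stand-ins for the yearly arrays returned by get.var.ncdf():
# 4 longitudes x 3 latitudes x 12 months (the real grids are 720 x 360 x 12).
n_lon <- 4
n_lat <- 3
n_mon <- 12
years <- 1986:1995

# Each stand-in array is filled with its own year so the stacking can be checked.
yearly <- lapply(years, function(y) array(y, dim = c(n_lon, n_lat, n_mon)))

# c() flattens each array in column-major order, so concatenating the years
# and re-imposing dimensions stacks them along the third (month) axis.
mrunoff_all <- array(do.call(c, yearly),
                     dim = c(n_lon, n_lat, n_mon * length(years)))

dim(mrunoff_all)
length(mrunoff_all)
```

An array of this shape (rather than the dim.ncdf object) is presumably what the third argument of put.var.ncdf() expects, since its length matches the product of the variable's dimensions.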
Re: [R] New line operator in mtext
Thanks again for a very useful comment. That seems to have separated the text and put it onto separate lines. However, whilst this results in the text being centralised in relation to the axis, it means that the lower line is left-justified in relation to the upper line, rather than being centralised. How do I go about centralising the lower line in relation to the upper line, whilst keeping it central to the axis? Many thanks again, Steve
Re: [R] New line operator in mtext
Thanks for the response. However, whilst this stops the 'new line' character from appearing, it doesn't actually cause a new line to be printed! Instead, the last few characters of the first character string overlap with the first few characters of the next. How best should I change the code to produce a new line? Thanks again!
[R] New line operator in mtext
Dear R Users,
I'm finding that when I execute the following bit of code, the new line argument is actually displayed as text in the graphics device. How do I avoid this happening?

mtext(side=2, line=5.5, expression(paste("Monthly Summed Runoff (mm/month)", "/n", "and Summed Monthly Precipitation (mm x ",10^2,"/month)")))

I suspect that I've done, or omitted, something fairly obvious, but as yet cannot see it!
Thanks for your help, Steve
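Two things may be at play here: "/n" uses a forward slash, so it is ordinary text rather than the newline escape "\n", and the plotmath help page notes that control characters such as "\n" are not interpreted inside expressions in any case. A minimal sketch of one workaround, drawing each line with its own mtext() call so that both are centred on the axis (the margin size and the second line offset of 4.3 are assumptions chosen for illustration):

```r
# Widen the left margin so that text placed at line 5.5 fits on the device.
op <- par(mar = c(5, 8, 4, 2))
plot(1:12, type = "n", xlab = "Month", ylab = "")

# First line: plain text needs no plotmath.
mtext("Monthly Summed Runoff (mm/month)", side = 2, line = 5.5)

# Second line: plotmath is only needed for the superscript.
line2 <- expression(paste("and Summed Monthly Precipitation (mm x ",
                          10^2, "/month)"))
mtext(line2, side = 2, line = 4.3)

par(op)
```

Because each call is centred independently, the lower line is centred relative to both the axis and the upper line.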
[R] Superscript in y-axis of plot
Dear all,
I've been trying to superscript the '2' in the following command (I don't want the '^' displayed), but as yet haven't had much luck. I've tried both the paste and expression commands, but neither has brought me any joy!

mtext(side=2, line=5.5, "Monthly Precipitation (mm x 10^2/month)", font=2, cex=1.1)

Any advice would be much appreciated, Steve
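A minimal sketch of one way to get the superscript: keep 10^2 outside the quotes so that plotmath treats it as an expression rather than as literal text. The object name lab and the margin setting are assumptions for illustration.

```r
# Inside expression(), quoted pieces are drawn as text and 10^2 is typeset
# with the 2 raised; paste() simply joins the pieces side by side.
lab <- expression(paste("Monthly Precipitation (mm x ", 10^2, "/month)"))

op <- par(mar = c(5, 8, 4, 2))
plot(1:12, type = "n", xlab = "Month", ylab = "")
mtext(lab, side = 2, line = 5.5, cex = 1.1)
par(op)
```

For bold text, plotmath provides bold(), which can be wrapped around the quoted pieces inside the expression.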
Re: [R] Aligning axis values when plotting more than one graph on same axes
Thanks for the replies. This is a simple example which demonstrates exactly the problems I'm facing. As you can see, neither the x nor the y axes line up consistently.

> barplot(1:12, names.arg=substr(month.abb, 1,1))
> par(new=TRUE)
> plot(1:12, 1:12, type="b", lwd=2, col="red", xaxt="n")

I'd be surprised if there isn't a solution to this, seeing as they're two essential graph types in the base package. Any pointers would be most welcome!
Thanks again, Steve
[R] Aligning axis values when plotting more than one graph on same axes
Dear R Users,
I am trying to plot a barchart with a line graph superimposed (using par(new=TRUE)). There are 12 bars and 12 corresponding points for the line graph. This is fine, except that I'm encountering two problems:
1) The points of the line graph are not centred horizontally on the middle of each corresponding bar. In fact, whilst the first point is located on the left-most side of the first bar, subsequent points drift further towards the right of the bars, until the final point is at the right-most position on the final bar. I have used names.arg=substr(month.abb, 1, 1) to represent the first letter of each month for the barplot and xaxt="n" for the overplotting line graph. Is there a way of properly aligning the x-axis values so that the points are centred horizontally on the bars?
2) Similarly, for both plots, I have set ylim=c(1, 85000) so that both y-axes are in proportion with one another. However, when I allow both graphs to plot their y-axis values, it becomes apparent that there is again a slight offset in where the axis labels are positioned. For example, 8 is positioned notably higher up the y-axis for the barchart than for the line graph. (N.B. I have used xpd=FALSE for the barchart to prevent bars falling outside of the plot area). How do I ensure that the y-axis labels are correctly aligned so that they are common to both graphs?
Many thanks for your help, Steve
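A minimal sketch of one common way round both problems: barplot() returns the x positions of the bar midpoints, so the line can be added to the same plot with lines() instead of starting a second plot with par(new=TRUE). Both series then share one coordinate system and one y-axis. The values in runoff and precip are made up for illustration.

```r
runoff <- c(12, 18, 25, 31, 40, 52, 61, 55, 43, 30, 21, 15)  # made-up bar heights
precip <- c(10, 15, 22, 35, 44, 50, 58, 57, 40, 28, 19, 12)  # made-up line values

# barplot() returns the midpoint of each bar on the x-axis.
mids <- barplot(runoff, names.arg = substr(month.abb, 1, 1), ylim = c(0, 70))

# Plotting against those midpoints centres every point on its bar.
lines(mids, precip, type = "b", lwd = 2, col = "red")
```

With the default bar width of 1 and spacing of 0.2, the midpoints fall at 0.7, 1.9, 3.1 and so on, which is why points plotted at 1:12 drift relative to the bars.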
[R] Memory problems when using ifelse
Dear R Users,
I have a data frame of 4 columns and ~58000 rows, the top of which looks like this:

> head(max_out)
  Latitude Longitude Model Obs
1    -0.25    -49.25     4   4
2    -0.25    -50.25     4   5
3    -0.25    -50.75     4   4
4    -0.25    -51.25     3  11
5    -0.25    -51.75     6   4
6    -0.25    -52.25    12   5

The above shows, for each coordinate point, the month (1 = January, 12 = December) in which the maximum value occurs for the variable I'm testing. I'm hoping to add an extra column onto this data frame to show the difference between the model and the observations, in the form of: max_out[3] - max_out[4]. This is fine for the simple cases, but row 6 is an example where this approach fails. I know from the data that the model is predicting the maximum value too early (i.e. -5 months) rather than 7 months too late (as implied by the simple max_out[3] - max_out[4] calculation). In order to get round such cases, if the result of the simple calculation is > 6, then I want to add 12 to the offending value (!) in the 'Obs' column in order to get the 'correct' value. For example, in row 6, as the current calculation results in a difference of 7, I instead want to do 12-(5+12) (corresponding to Model - (Obs+12)) = -5. Similarly, in row 4, I'd like to do the reverse (i.e. -12 for where the calculation results in a value <- cbind(max_out[1:2], ifelse(max_out[3] - max_out[4] > 6, max_out[3] - (max_out[4]+12), max_out[3] - max_out[4])) # If diff is not > 6, then output original difference value

Error: cannot allocate vector of size 221 Kb
In addition: Warning messages:
1: In data.frame(..., check.names = FALSE) : Reached total allocation of 999Mb: see help(memory.size)
2: In data.frame(..., check.names = FALSE) : Reached total allocation of 999Mb: see help(memory.size)

I suspect I'm doing something wrong, as the calculation shouldn't be that computationally intense! Any help or advice to help get me on the right tracks would be much appreciated.
Many thanks, Steve
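A minimal sketch of the wrap-around rule described above, applied to plain numeric vectors (max_out$Model, max_out$Obs) rather than to one-column data frames such as max_out[3]. Whether that alone cures the memory error is an assumption, but it keeps ifelse() working on vectors, which is what it is designed for. The column name Diff is hypothetical.

```r
# The six rows shown in the post.
max_out <- data.frame(
  Latitude  = rep(-0.25, 6),
  Longitude = c(-49.25, -50.25, -50.75, -51.25, -51.75, -52.25),
  Model     = c(4, 4, 4, 3, 6, 12),
  Obs       = c(4, 5, 4, 11, 4, 5)
)

# Simple difference in months, computed on vectors.
d <- max_out$Model - max_out$Obs

# Differences beyond +/- 6 months are wrapped round the year:
# more than 6 months late is treated as early, and vice versa.
max_out$Diff <- ifelse(d > 6, d - 12, ifelse(d < -6, d + 12, d))

max_out
```

Row 6 becomes -5 (12 - (5 + 12)) and row 4 becomes 4 ((3 + 12) - 11), matching the two cases described in the post.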
[R] Changing a legend title to bold font
Dear all, I have seen a response from Duncan Murdoch which comes close to solving this one, but I can't quite seem to tailor it to fit my needs! I am trying to make just the title in my legend bold, with the legend 'items' in normal typeface. I've tried setting par(font=2) outside the legend command, but this makes everything bold. I've also tried font.main=2 in the legend code, but I didn't have any luck with this. Any suggestions would be most welcome, Steve
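A minimal sketch of one option: the title argument of legend() accepts an expression, so plotmath's bold() can be applied to the title alone while the items stay in the normal typeface. The title text "Runoff" and the plot itself are made up for illustration.

```r
plot(1:10, type = "n", xlab = "", ylab = "")

# Only the title is wrapped in bold(); the legend items are plain strings.
lg <- legend("topright",
             legend = c("Simulation", "Observation"),
             fill   = 2:3,
             title  = expression(bold("Runoff")),
             bty    = "n")
```

legend() invisibly returns the position of the legend box and its text, which is captured in lg here in case the coordinates are useful for further annotation.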
[R] Read.table problems
Dear all,
I have a file which I've converted from NetCDF (.nc) to text (.txt) using ncdump in Unix (as I had problems using the ncdf package to do this). The first few rows (as copied and pasted from the Unix console) of the file appear as follows:

_, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,
_, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,
_, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,
_, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,
_, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,
_, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,
_, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,
_, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,
_, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,
_, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,
_, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,
_, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,

As you can see, there are a lot of NA values before the actual numeric values start further down the dataset. My problem is that I'm having trouble reading this file into R. I think the problem lies with the sep= argument, although I may be wrong. I tried the following command at first, as the data appear to be comma separated:

> read.table("test86.txt", skip=43, na.strings="-", header=FALSE, sep=",") -> test86 # skip =43 due to meta-data information being held in the initial rows
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : line 29 did not have 25 elements

I then tried sep=" ", followed by sep="" but received a similar-type error message (although line 29 doesn't appear to be especially different from the rest). I subsequently tried using sep=\t and then sep=\n. These both result in the data being read in without an error message being displayed, although the data are formatted as follows:

> head(test86)
V1
1 _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,
2 _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,
3 _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,
4 _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,
5 _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,
6 _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,
> dim(test86)
[1] 179899 1

Instead of one column, I'd expect there to be 720. I think I'm getting something wrong relating to the sep= argument (or possibly mis-using na.strings?). If anyone has any solutions to this then I'd be very grateful to hear them.
Many thanks for any advice, Steve
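Two details in the commands above may matter: the file marks missing values with an underscore ("_") whereas na.strings was set to a hyphen ("-"), and every line ends with a trailing comma, which read.table() treats as one extra empty field. A minimal sketch on a made-up two-line sample (txt stands in for the real file, and dropping the last column is an assumption about how the trailing comma should be handled):

```r
# Made-up sample in the same layout: underscores for missing values,
# a space after each comma, and a trailing comma on every line.
txt <- "_, _, 1.5, 2,\n_, 3, _, 4.25,"

dat <- read.table(text = txt, sep = ",", header = FALSE,
                  na.strings  = "_",    # the missing-value marker used in the file
                  strip.white = TRUE)   # strip the space that follows each comma

# The trailing comma produces a final, entirely empty column; drop it.
dat <- dat[-ncol(dat)]
dat
```

The real ncdump output wraps each 720-value row over many lines of 24 values, so some reshaping would presumably still be needed after reading it in.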
Re: [R] Simple plotting errors
Many thanks once more for helping me to solve this. Gabor - I wasn't even aware of month.abb, so thanks for bringing this useful trick to my attention! Steve
Re: [R] Simple plotting errors
Thanks for all the useful information; use of 'c(...)' did the trick, although in future I'll try to hold the data in a more user-friendly setup. I've now got a plot, but have two issues that I can't seem to resolve:
1. The ylab is overlapping the y-axis tick mark values. I've tried using oma and mar to adjust the outer and plot margins respectively, but this doesn't seem to 'detach' the overlapping text.
2. The x-axis currently has tick mark values of 2 to 12. How do I change this to single-letter month labels? So far I've tried xlim=c("J","F","M","A","M"...) and names.arg=c("J","F","M"...), but these result in errors.
Any suggestions would be much appreciated. Thanks again, Steve
[R] Simple plotting errors
Dear R Users,
I have 12 data frames, each of 12 rows and 2 columns, e.g.

FeketeJAN
            MEAN        SUM_
AMAZON      144.4997874 68348.4
NILE          5.4701955  1394.9
CONGO        71.3670036 21196.0
MISSISSIPPI  18.9273250  6511.0
AMUR          1.8426874   466.2
PARANA       58.3835497 13486.6
YENISEI       1.4668313   592.6
OB            1.4239179   559.6
LENA          0.9342164   387.7
NIGER         4.7245709   826.8
ZAMBEZI      76.6893794  8665.9
YANGTZE      10.6759257  1729.5

I want to do a line plot of the value of Amazon 'Sum' (in this case, 68348.4) for each of the 12 data frames. I've tried doing this as follows:

plot(FeketeJAN[1,2], FeketeFEB[1,2], FeketeMAR[1,2], *through to December* type="l")

but receive:

Error in strsplit(log, NULL) : non-character argument

I've also tried:

plot(FeketeJAN$AMAZON[,2], FeketeFEB$AMAZON[,2], *through to December* type="l")

but receive:

Error in plot.window(...) : need finite 'xlim' values
In addition: Warning messages:
1: In min(x) : no non-missing arguments to min; returning Inf
2: In max(x) : no non-missing arguments to max; returning -Inf
3: In min(x) : no non-missing arguments to min; returning Inf
4: In max(x) : no non-missing arguments to max; returning -Inf

What is it that I'm doing wrong?!
Many thanks for any advice, Steve
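plot() takes its first two arguments as x and y and matches further unnamed arguments to later parameters (the third one being 'log', hence the strsplit(log, NULL) message), so the twelve values need to be collected into a single vector first. A minimal sketch with three stand-in data frames held in a list; only the January figures come from the post, the February and March figures are made up, and the list name Fekete is hypothetical.

```r
# Stand-ins for FeketeJAN, FeketeFEB, FeketeMAR, kept together in one list.
rivers <- c("AMAZON", "NILE")
Fekete <- list(
  JAN = data.frame(MEAN = c(144.4997874, 5.4701955),
                   SUM_ = c(68348.4, 1394.9), row.names = rivers),
  FEB = data.frame(MEAN = c(150.1, 5.2),            # made-up values
                   SUM_ = c(70100.0, 1320.5), row.names = rivers),
  MAR = data.frame(MEAN = c(161.7, 4.9),            # made-up values
                   SUM_ = c(75900.3, 1250.1), row.names = rivers)
)

# Pull the Amazon sum out of every data frame into one numeric vector.
amazon_sum <- sapply(Fekete, function(d) d["AMAZON", "SUM_"])

# One x vector, one y vector; month labels are added with axis().
plot(seq_along(amazon_sum), amazon_sum, type = "l",
     xaxt = "n", xlab = "Month", ylab = "Amazon sum")
axis(1, at = seq_along(amazon_sum), labels = names(amazon_sum))
```

With twelve separate data frames, c(FeketeJAN[1,2], FeketeFEB[1,2], ...) builds the same vector by hand.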
[R] Positioning a legend via X and Y coordinates
Dear R Users,
I'm able to display a legend using the following code:

> legend("topright", c("Simulation", "Observation"), fill=2:3, bty="n")

However, this causes the legend to be positioned too close to the bars in my barplot. I'd like to move the legend up slightly. I have been trying to determine the necessary values by trial and error to do this manually (by entering a coordinate) - however, I can't seem to get the legend to display! I must be off-limits, although I have tried a range of values. I can do a temporary fix by doing locator(1), but I'd ideally like to learn how to determine the range of coordinate values 'available' within the graphics console in order to state the x and y values for legend positioning in the legend command. Is there an easy way to do this?
Thanks, Steve
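The coordinate range of the current plot region is available from par("usr"), which returns c(x1, x2, y1, y2) in user coordinates. A minimal sketch that leaves headroom above the bars via ylim and then anchors the legend at the top-right corner of that range (the bar heights and the ylim value are made up):

```r
# Extending ylim beyond the tallest bar leaves room for the legend.
barplot(c(3, 5, 4), ylim = c(0, 8), col = 2)

# usr[1:2] is the x range and usr[3:4] the y range of the plot region.
usr <- par("usr")
usr

# xjust = 1 and yjust = 1 make (x, y) the top-right corner of the legend box.
legend(x = usr[2], y = usr[4],
       legend = c("Simulation", "Observation"),
       fill = 2:3, bty = "n", xjust = 1, yjust = 1)
```

Coordinates outside the usr range fall outside the plot region, which would explain a legend that never appears.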
Re: [R] X and Y axis labels on a barplot
Thanks Jim, that's great. Based on the information in the previous messages, is it possible to change the y-axis as I'd hoped? Thanks again, Steve
Re: [R] X and Y axis labels on a barplot
> dput(total_sums)
structure(c(17722202.6898231, 15276602.215475, 16875888.5155229,
14086271.625756, 18626581.9628846, 15387747.481166, 18428414.8535184,
15560882.404998, 17611181.5207881, 14905453.195546, 17290016.3934661,
14939493.120707, 16819288.8227961, 13979000.614402, 17657959.3656573,
14814672.426469, 17803561.0042762, 15003711.075902, 18016143.3425573,
14464426.596292), .Dim = c(2L, 10L), .Dimnames = list(c("Sim_1986",
"X1986"), c("sums86", "sums87", "sums88", "sums89", "sums90", "sums91",
"sums92", "sums93", "sums94", "sums95")))

Wasn't aware of dput - y'learn something new every day! Hope this helps, Steve
Re: [R] X and Y axis labels on a barplot
A very reasonable request! Sorry for not doing this initially, but please find below the data I am trying to plot:

> total_sums
           sums86   sums87   sums88   sums89   sums90   sums91   sums92
Sim_1986 17722203 16875889 18626582 18428415 17611182 17290016 16819289
X1986    15276602 14086272 15387747 15560882 14905453 14939493 13979001
           sums93   sums94   sums95
Sim_1986 17657959 17803561 18016143
X1986    14814672 15003711 14464427

This is the command I've successfully managed to execute:

> barplot(total_sums, beside=TRUE, col=(2:3), las=2)

...however, I'm running into problems when trying to change the default x-axis tick marks from sumsXX to 1986:1995. I'm also trying to change the format of the y-axis values as mentioned in my previous post.
Many thanks again, Steve
[R] X and Y axis labels on a barplot
Dear all,
I have produced a barplot and wish to alter the axes a little. In place of the variable names appearing on the x-axis, I'd like to have the numbers 1986 to 1995. I have tried using the argument xlim=c(1986,1995) in the barplot command but receive: "Error in plot.window(xlim, ylim, log = log, ...) : invalid 'xlim' value"
Also, on the y-axis, the values are currently displayed in scientific format, e.g. 1.5e+07 - how do I go about converting such values into normal notation, e.g. 15000000?
Many thanks, Steve
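A minimal sketch of one way to handle both axes: names.arg sets the labels under each group of bars (xlim only sets the numeric plotting range, hence the error), and the y-axis can be drawn by hand with labels formatted in fixed notation. The matrix uses the first three columns of the total_sums data posted elsewhere in this thread, rounded; the margin setting is an assumption.

```r
total_sums <- matrix(c(17722203, 15276602,
                       16875889, 14086272,
                       18626582, 15387747),
                     nrow = 2,
                     dimnames = list(c("Sim_1986", "X1986"),
                                     c("sums86", "sums87", "sums88")))

op <- par(mar = c(5, 7, 4, 2))          # room for the long y-axis labels

# axes = FALSE suppresses the default y-axis so it can be drawn by hand.
mids <- barplot(total_sums, beside = TRUE, col = 2:3, las = 2,
                names.arg = 1986:1988, axes = FALSE)

# Draw the y-axis at the default tick positions, without scientific notation.
ticks <- axTicks(2)
axis(2, at = ticks,
     labels = format(ticks, scientific = FALSE, big.mark = ","), las = 2)

par(op)
```

For the full data set, names.arg = 1986:1995 gives one label per pair of bars.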
Re: [R] Plotting pairs of bars
Jim and all,
Thanks for the suggestion. However, I get the following error:

> barplot(t(combine86[,1:2], beside = TRUE, las = 1))
Error in t(combine86[, 1:2], beside = TRUE, las = 1) : unused argument(s) (beside = TRUE, las = 1)

I've looked up ?t and cannot see any extra arguments that I should be including, and the code executes without the 't'. What is it that I've omitted that I need to successfully execute the code (as previously described)?
Many thanks again, Steve
Re: [R] Plotting pairs of bars
Thanks for the reply - the 'beside' argument certainly looks useful, although I'm still not getting the output I'd hoped for. By doing:

barplot(combine86[,1:2], beside = TRUE, las = 1, xlab=rownames(combine86))

...I get all the bars for the 'Sim Mean' column plotted on the left side of the graphics device and all the bars for the 'Obs Mean' clustered on the right side. Ideally I'd like bar 1 to be from 'Sim Mean' for Amazon and then bar 2 for 'Obs Mean' for Amazon. Then there would be a small gap separating the Amazon from the next pair of bars of the next river (Nile). Then it would be the 'Sim Mean' value for Nile, followed by the 'Obs Mean' value for Nile, then a gap, then onto the next river and so on.
Thanks for any help, Steve
[R] Plotting pairs of bars
Dear all,
I have a matrix called combine86 which looks as follows:

> combine86
             Sim Mean   Obs Mean Sim Sum Obs Sum
AMAZON      1172.0424 1394.44604  553204  659573
NILE         262.4440  164.23921   67973   41881
CONGO        682.8007  722.63971  205523  214624
MISSISSIPPI  363.0758  142.59883  124535   49054
AMUR         143.5857   89.30434   36040   22594
PARANA       702.3793  388.03030  162952   89635
YENISEI      208.1396  174.52722   83464   70509
OB           197.0399  162.82697   79013   63991
LENA         118.1100   77.49638   48307   32161
NIGER        374.8258  212.25714   66719   37145
ZAMBEZI      500.      485.87610   57000   54904
YANGTZE      358.4172  256.80246   58422   41602

For each of the rivers (which are the row names of this matrix), I wish to plot a bar for Simulated Mean and another for the Observed Mean. So far I've only been able to get R to stack the bars (using 'barplot') on top of one another, which isn't really what I want! I was hoping more for a pairing of bars (one 'Sim' and one 'Obs') followed by a gap, then the next pair of bars for the next river, a gap, and so on. Is this possible to do in R? If so, how?!
Many thanks, Steve
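With beside = TRUE, barplot() draws one group of bars per matrix column, so transposing the matrix with t() makes each river a group holding its 'Sim Mean' and 'Obs Mean' bars. The closing parenthesis of t() has to come before the barplot arguments. A minimal sketch using the first three rivers from the post:

```r
combine86 <- matrix(c(1172.0424, 262.4440, 682.8007,      # Sim Mean
                      1394.44604, 164.23921, 722.63971),  # Obs Mean
                    ncol = 2,
                    dimnames = list(c("AMAZON", "NILE", "CONGO"),
                                    c("Sim Mean", "Obs Mean")))

# t() turns rivers into columns; beside, las etc. belong to barplot(), not t().
mids <- barplot(t(combine86[, 1:2]), beside = TRUE, las = 1,
                col = 2:3, legend.text = TRUE)

# One column of bar midpoints per river, one row per series.
mids
```

The default spacing for grouped bars leaves a one-bar-wide gap between rivers.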
[R] Producing a legend successfully
Dear all,
I'm attempting to insert a legend into a line graph. I've sorted out the positioning, but I'm unable to display the sample line and associated colour to go within the legend box. Instead, under the variable names, the numbers 1, 2, 2, 3 are displayed in a column (with '2' repeated twice). This is the code I'm using:

legend(80,1150, c("Simulation", "Observation", lty=1:2, col=2:3)

How do I go about displaying a red solid line next to 'Simulation' and a green dashed line next to 'Observation' (and if necessary, remove the numbers that are currently displayed)?
Many thanks for any help offered, Steve
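The call as posted has no closing parenthesis after "Observation", so lty and col end up inside c() and are coerced to text, which would account for the column 1, 2, 2, 3. A minimal sketch showing that coercion and the call with the parenthesis moved (the empty plot is made up so that the coordinates 80 and 1150 fall inside the plot region):

```r
# What c() produces when lty and col are swallowed into it:
labels_as_posted <- c("Simulation", "Observation", lty = 1:2, col = 2:3)
unname(labels_as_posted)

# With c() closed after the two labels, lty and col reach legend() itself.
plot(1:100, seq(0, 1200, length.out = 100), type = "n",
     xlab = "", ylab = "")
legend(80, 1150, legend = c("Simulation", "Observation"),
       lty = 1:2, col = 2:3)
```

In the default palette, colour 2 is red and colour 3 is green, so this gives a solid red line for 'Simulation' and a dashed green line for 'Observation'.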
[R] Reversing axis label order
Dear R Users,
I am executing the following command to produce a line graph:

matplot(aggregate_1986[,1], aggregate_1986[,2:3], type="l", col=2:3)

On the x-axis I have values of Latitude (in column 1) ranging from -60 to +80 (left to right on the x-axis). However, I wish to have these values shown in reverse on the x-axis, going from +80 to -60 (ie. North to South in terms of Latitude). I have tried doing this by altering the command as follows:

matplot(-aggregate_1986[,1], aggregate_1986[,2:3], type="l", col=2:3)

...but this produces the inverse sign of the latitude values along the axis - ie. it goes from -80 to +60. How do I reverse the display of the axis labels correctly and of course, maintain the associated data values correctly?
Many thanks, Steve
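A minimal sketch of one way to flip the axis without touching the data: pass the x range in descending order via xlim. The latitude and runoff values below are made up for illustration.

```r
lat <- seq(-60, 80, by = 20)                  # made-up latitude bands
runoff <- cbind(sim = 500 - abs(lat) * 4,     # made-up simulated values
                obs = 450 - abs(lat) * 3)     # made-up observed values

# A decreasing xlim draws the axis from +80 on the left to -60 on the right.
matplot(lat, runoff, type = "l", col = 2:3, xlim = rev(range(lat)),
        xlab = "Latitude", ylab = "Runoff")

# The left edge of the plot region now has the larger x coordinate.
usr <- par("usr")
usr[1:2]
```

The labels keep their true sign, and each value is still plotted at its own latitude.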
[R] Problems producing a simple plot
Dear R Users,
I have a data frame of the nature:

> head(aggregate_1986)
  Latitude Mean Annual Simulated Runoff per 1° Latitudinal Band
1      -55                                            574.09287
2      -54                                            247.23078
3      -53                                            103.40756
4      -52                                             86.1
5      -51                                             45.21980
6      -50                                             55.92928
  Mean Annual Observed Runoff per 1° Latitudinal Band
1                                            491.9525
2                                            319.6592
3                                            237.8222
4                                            179.5391
5                                            105.2968
6                                            136.6124

I am hoping to plot columns 2 and 3 against Latitude. I understand that you have to do this by plotting one column at a time, so I have been starting by attempting the following, but receiving errors:

> plot(aggregate_1986[1] ~ aggregate_1986[2], type="l")
Error in model.frame.default(formula = aggregate_1986[1] ~ aggregate_1986[2], : invalid type (list) for variable 'aggregate_1986[1]'

Or...

> plot(aggregate_1986[1],aggregate_1986[2], type="l")
Error in stripchart.default(x1, ...) : invalid plotting method

I'm obviously doing something fundamentally wrong (!). I'd be grateful for any suggestions as to how I should go about this.
Many thanks, Steve
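Single-bracket indexing such as aggregate_1986[1] returns a one-column data frame (a list), which is what the "invalid type (list)" message points to. Double brackets, or [, 1], return the numeric vector that plot() needs. A minimal sketch using the six rows from the post, with shortened column names (Sim and Obs are assumptions to keep the example compact):

```r
aggregate_1986 <- data.frame(
  Latitude = -55:-50,
  Sim = c(574.09287, 247.23078, 103.40756, 86.1, 45.21980, 55.92928),
  Obs = c(491.9525, 319.6592, 237.8222, 179.5391, 105.2968, 136.6124)
)

is.data.frame(aggregate_1986[1])   # single brackets keep the data frame
is.numeric(aggregate_1986[[1]])    # double brackets extract the vector

# Latitude on the x-axis, one line per runoff column.
plot(aggregate_1986[[1]], aggregate_1986[[2]], type = "l", col = 2,
     xlab = "Latitude", ylab = "Runoff")
lines(aggregate_1986[[1]], aggregate_1986[[3]], col = 3)
```

matplot(aggregate_1986[, 1], aggregate_1986[, 2:3], type = "l") draws both columns in one call.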
[R] Problems with 'valid columns' when using merge
Dear all,
I am trying to use 'merge' within a loop, however, I receive an error relating to the 'by' argument of the command, as follows:

> merge_year <- 1986
>
> for (i in 1:10) { # Number of file pairs
+ assign(paste("merged_arunfek_", merge_year, sep=""), merge(x=paste("arunoff_",start_arunoff, sep=""), y=paste("fekete_", start_fekete, sep=""), by=c("Latitude", "Longitude"), sort=FALSE))
+ attach(paste("merged_arunfek_", merge_year))
+ merge_year = merge_year+1
+ }
Error in fix.by(by.x, x) : 'by' must specify valid column(s)

However, as far as I can tell, the column names (as stated in the above code) appear to be valid:

> colnames(arun_1986)
[1] "Latitude" "Longitude" "Sim_1986"
> colnames(fekete_1986)
[1] "Latitude" "Longitude" "X1986"

I'm trying to merge based on both the Latitude and Longitude column and have used by=c("name_x", "name_y") before without too many problems. Any suggestions would be gratefully received.
Many thanks, Steve
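In the loop above, merge() receives the character strings built by paste() (for example "arunoff_1986") rather than the data frames of those names, and a string has no Latitude or Longitude column. A minimal sketch that looks the objects up with get() and collects the results in a list; the data values are made up and the list name merged is hypothetical.

```r
# Made-up stand-ins for two years of each data set.
arunoff_1986 <- data.frame(Latitude = c(5.75, 6.25), Longitude = c(0.25, 0.25),
                           Sim_1986 = c(1.1, 2.2))
fekete_1986  <- data.frame(Latitude = c(5.75, 6.25), Longitude = c(0.25, 0.25),
                           X1986 = c(1.0, 2.5))
arunoff_1987 <- data.frame(Latitude = c(5.75, 6.25), Longitude = c(0.25, 0.25),
                           Sim_1987 = c(1.3, 2.0))
fekete_1987  <- data.frame(Latitude = c(5.75, 6.25), Longitude = c(0.25, 0.25),
                           X1987 = c(1.2, 2.4))

merged <- list()
for (yr in 1986:1987) {
  # get() turns the constructed name into the data frame it refers to.
  x <- get(paste("arunoff_", yr, sep = ""))
  y <- get(paste("fekete_", yr, sep = ""))
  merged[[as.character(yr)]] <- merge(x, y, by = c("Latitude", "Longitude"),
                                      sort = FALSE)
}

merged[["1986"]]
```

Keeping the merged results in one list also avoids the attach() call inside the loop.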
Re: [R] Using 'aggregate' when dependent on row value increments
Many thanks for the very useful responses in such a short time. I'm not a former SAS user - more a naive R user who didn't realise that a sort wasn't necessary! Jim, your solution worked really well - thanks. Thanks again for the great solutions. Steve
[R] Using 'aggregate' when dependent on row value increments
Dear all, I have a data frame of three columns, which I have sorted by Latitude as follows:

> test2[60:80,]
      Latitude Longitude  Sim_1986
61948    85.25    -29.25  2.175345
61957    85.25    -28.75  8.750486
61967    85.25    -28.25 33.569305
61977    85.25    -27.75 23.702572
61988    85.25    -27.25 26.488602
62000    85.25    -26.75 23.915724
62012    85.25    -26.25 25.055082
62027    85.25    -25.75 26.609823
62047    85.25    -25.25 28.813068
62066    84.25    -24.75 25.069952
52341    84.75    -82.25 34.940380
52434    84.75    -81.75 56.192116
52531    84.75    -81.25 41.409431
52616    83.75    -80.75 56.717590
52701    83.75    -80.25 68.887123
52781    83.75    -79.75 74.133286
52865    83.75    -79.25 41.309422
52951    82.25    -78.75 69.863419
53052    82.25    -78.25 21.480116
53164    82.25    -77.75 58.799141
55979    82.25    -68.75 70.028358

What I am hoping to do is to use the aggregate command to calculate the mean of 'Sim_1986' per 1-degree increment of Latitude. So, using the above subset of the data frame as an example, a mean would be produced based on the Sim_1986 values within each 1-degree Latitude band (85, 84, 83, 82). The maximum latitude in the dataset as a whole is 83.75 and the minimum is -55.75. Is it possible to also output corresponding latitude values for each 'grouped mean', so that I can easily associate each mean value with its latitudinal band? Many thanks for any help offered, Steve
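A minimal sketch of one way to get a mean per 1-degree band, assuming each band is labelled by floor(Latitude), so that 85.25 falls in band 85 and -55.75 in band -56. The data are a handful of rows taken from the subset in the post; the column name LatBand is made up for this sketch.

```r
# A few rows from the posted subset of test2.
test2 <- data.frame(
  Latitude  = c(85.25, 85.25, 84.25, 84.75, 83.75, 83.75, 82.25),
  Longitude = c(-29.25, -28.75, -24.75, -82.25, -80.75, -80.25, -78.75),
  Sim_1986  = c(2.175345, 8.750486, 25.069952, 34.940380,
                56.717590, 68.887123, 69.863419)
)

# Group by the integer band; aggregate() does not need the rows sorted first.
band_means <- aggregate(test2["Sim_1986"],
                        by = list(LatBand = floor(test2$Latitude)),
                        FUN = mean)
band_means
```

The result has one row per band, with the band label in the first column and the mean in the second, so each mean stays attached to its latitude.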
Re: [R] Finding rows common to two datasets
Thanks for the reply; however, when I run the following command, I receive the message 'data frame with 0 columns and 0 rows'. I've checked again though, and there should be several thousand rows where the Latitude and Longitude pairs are the same.

> common <- intersect(data_frame_x[c("Latitude", "Longitude")], data_frame_y[c("Latitude","Longitude")])
> common
data frame with 0 columns and 0 rows

Is there an obvious solution to this? Should I be using 'unique' instead, and if so, how would I get the above to correspond to this command? Thanks, Steve

> Date: Tue, 28 Apr 2009 13:36:51 +0530
> Subject: Re: [R] Finding rows common to two datasets
> From: umesh.sriniva...@gmail.com
> To: smurray...@hotmail.com
> CC: r-help@r-project.org
>
> Dear Steve,
>
> Try
>
> ? intersect
>
> and see if that might help.
>
> Cheers,
> Umesh
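Since intersect() is designed for vectors rather than rows of a data frame, one alternative is an inner join with merge() on the two coordinate columns. This is a minimal sketch with small made-up tables (the extra columns a and b are hypothetical), and it assumes the coordinates are stored identically in both tables, because merge() matches on exact values.

```r
# Hypothetical stand-ins for the two 14-column data frames.
data_frame_x <- data.frame(Latitude  = c(1.25, 1.75, 2.25),
                           Longitude = c(10.25, 10.25, 10.75),
                           a = 1:3)
data_frame_y <- data.frame(Latitude  = c(1.75, 2.25, 3.25),
                           Longitude = c(10.25, 10.75, 10.75),
                           b = 4:6)

# By default merge() keeps only rows whose key columns occur in both inputs.
common <- merge(data_frame_x, data_frame_y, by = c("Latitude", "Longitude"))
common
```

If only the coordinate pairs are wanted, common[c("Latitude", "Longitude")] drops the remaining columns.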
[R] Finding rows common to two datasets
Dear all, I have 2 data frames, both with 14 columns of data and differing numbers of rows. The first two columns are 'Latitude' and 'Longitude'. I want to find the pairs of Latitude and Longitude coordinates which are common to both datasets, and output a new data frame which is composed of these coincident rows. I tried using the 'unique' command, but had difficulties interpreting the help file. Many thanks for any help offered, Steve
Re: [R] Reshape - strange outputs
Genius! Thanks very much Hadley - that was surprisingly easier to solve than I was anticipating! As a way of offering something back in return, I don't know if you plan to release a new version of the reshape package, but here's a suggestion to consider, just in case you do. On the basis of what I've learned from this experience, how about including a simple line of code in the next release, along the lines of is.data.frame(object_name) - if TRUE then proceed as normal, if FALSE then issue a warning/error message to inform the user that the procedure may not execute as intended and suggest coercing to a data frame. Apologies if I've overlooked something fundamental which prevents the need or feasibility of this idea - just thought I'd bring it up in case it is useful! Thanks again and all the best, Steve
[R] Reshape - strange outputs
Dear R Users, I am using the reshape package to reformat gridded data into column format using the code shown below. However, when I display the resulting object, a single column is formed (instead of three) and all the latitude values (which should be in either column one or two) are collected at the bottom. Also, the NA values aren't removed, despite this being requested in the code.

Code:

# NetCDF file has been read in and is being processed...
arunoff_1986_temp <- get.var.ncdf(netcdf_1036_temp, "arunoff")

# Assign row and column names
columnnames <- sprintf("%.2f", seq(from = -89.75, to = 89.75, length = 360))
rnames <- sprintf("%.2f", seq(from = -179.75, to = 179.75, length = 720))
colnames(arunoff_1986_temp) <- columnnames
rownames(arunoff_1986_temp) <- rnames

# Melt into columnar format
arunoff_1986$Longitude <- rownames(arunoff_1986)
# Note: If I do: arunoff_1986$Latitude <- rownames(arunoff_1986) (i.e. change it to 'Latitude'), I get the following:
# Warning message: In arunoff_1986$Latitude <- rownames(arunoff_1986) : Coercing LHS to a list
# I therefore proceed using 'Longitude', where no warning is apparent.
arunoff_long_1986 <- melt(arunoff_1986, id.var="Longitude", na.rm=TRUE)

> dim(arunoff_long_1986)
[1] 259560 2

> head(arunoff_long_1986, n=10)
# This displays what looks like one single column, but is in fact two: the column entitled "L1" is empty until the end of the file, as shown by the 'tail' command below.
   value L1
1
2
3
4
5
6
7
8
9
10

> tail(arunoff_long_1986, n=10)
             value L1
Latitude.351 85.25 Latitude
Latitude.352 85.75 Latitude
Latitude.353 86.25 Latitude
Latitude.354 86.75 Latitude
Latitude.355 87.25 Latitude
Latitude.356 87.75 Latitude
Latitude.357 88.25 Latitude
Latitude.358 88.75 Latitude
Latitude.359 89.25 Latitude
Latitude.360 89.75 Latitude

I'd be very grateful indeed if anyone is able to offer assistance by way of pointing out what I've done wrong. I've spent a long time working on this and trying various options, but am still none the wiser! I'm aiming for three columns: Latitude, Longitude, 'value'. Many thanks for any help offered, Steve
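For comparison, the three-column layout described above can also be produced in base R, without the reshape package. The sketch below is an alternative technique rather than the melt() approach from the post. It assumes a small made-up matrix m with longitudes as row names and latitudes as column names, standing in for the 720 x 360 grid.

```r
# Hypothetical 2 x 3 grid: rows are longitudes, columns are latitudes.
m <- matrix(c(1, NA, 3, 4, 5, 6), nrow = 2,
            dimnames = list(Longitude = c("-179.75", "-179.25"),
                            Latitude  = c("-89.75", "-89.25", "-88.75")))

# as.table() keeps the dimnames; as.data.frame() then emits one row per cell.
long <- as.data.frame(as.table(m), responseName = "value",
                      stringsAsFactors = FALSE)

# Drop empty cells and turn the coordinate labels back into numbers.
long <- long[!is.na(long$value), ]
long$Longitude <- as.numeric(long$Longitude)
long$Latitude  <- as.numeric(long$Latitude)
long
```

Keeping the grid as a matrix until this step avoids the "Coercing LHS to a list" warning, which arises from using $ on something that is not a data frame.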
[R] Grouping of data frames to calculate means
Dear R Users, I have 120 data frames of the format table_198601, table_198602... table_198612, table_198701, table_198702... table_198712 through to table_199512 (i.e. the first 4 digits are years which vary from 1986 to 1995, and the final two digits are months which vary from 01 to 12 for each year). I simply hope to find the means of column 3 of each of the 120 tables without having to type out mean(table_198601[3]) etc. each time. How would I go about doing this? And how would I go about finding the mean of all the January months (01) from, say, 1986 to 1990? Finally, I hope to be able to plot (as a scatter graph) the values of column 1 against the mean of those from column 3 for all the months in the period 1989 to 1990 and then 1991 to 1995. Any help offered would be very much appreciated. Thanks, Steve
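A minimal sketch of one approach: collect the tables into a named list with mget() and apply mean() to column 3 of each. The 24 tiny tables built here are made up so that the example runs on its own; in practice the table_* objects would already exist.

```r
# Build hypothetical tables for 1986-1987; month varies fastest in the grid.
index <- expand.grid(month = sprintf("%02d", 1:12), year = 1986:1987)
table_names <- paste("table_", index$year, index$month, sep = "")
for (i in seq_along(table_names)) {
  assign(table_names[i], data.frame(x = 1:3, y = 4:6, z = i + c(0, 1, 2)))
}

# Gather the objects by name into one list, then take the mean of column 3.
tables <- mget(table_names)
col3_means <- sapply(tables, function(tab) mean(tab[[3]]))

# Mean over all Januaries in 1986-1990: select on the year and month digits.
yrs <- substr(names(col3_means), 7, 10)
mon <- substr(names(col3_means), 11, 12)
jan_mean <- mean(col3_means[mon == "01" & yrs %in% as.character(1986:1990)])
```

Note the double brackets in tab[[3]]: mean() needs a numeric vector, and tab[3] would pass a one-column data frame instead.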
[R] Reshape: 'melt' numerous objects
Dear R Users, I'm trying to use the reshape package to 'melt' my gridded data into column format. I've done this before on individual files, but this time I'm trying to do it on a directory of files (with variable file names) - therefore I have to also use the 'assign' command. I have come up against a couple of problems however and am therefore seeking advice... > assign(paste("Fekete_table_temp", index$year[i], index$month[i], > sep='')$Latitude,rownames(Fekete_198601)) # Using the row names of a given > file Error in paste("Fekete_table_temp", index$year[i], index$month[i], sep = "")$Latitude : $ operator is invalid for atomic vectors To get round this, I did: assign(paste("Fekete_table_temp", index$year[i], index$month[i], sep='')["Latitude"],rownames(Fekete_198601)) And for the actual loop in which the files are melted, I tried: for (i in 1:120) { Fekete_table_temp <- get(paste("Fekete_", index$year[i], index$month[i], sep='')) Fekete_table_long <- melt(Fekete_table_temp, id.var="Latitude", na.rm=TRUE) assign(paste("Fekete_long_", index$year[i], index$month[i], sep=''), Fekete_table_long) names(paste("Fekete_long_", index$year[i], index$month[i], sep=''), c("Latitude", "Longitude", paste("Obs",index$year[i], index$month[i], sep='')) } However, this results in: Error: id variables not found in data: Latitude ...despite me having (supposedly) told it where 'Latitude' is, in 'Fekete_table_temp'. What have I done wrong here?! And more importantly, how do I put it right?! Many thanks for any help, Steve _ [[elided Hotmail spam]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Row/columns names within 'assign' command
Dear all, I am attempting to add row and column names to a series of tables (120 in total) which have 2 variable parts to their name. These tables are created as follows: # Create table indexes index <- expand.grid(year = sprintf("%04d", seq(1986, 1995)), month = sprintf("%02d", 1:12)) # Read in and assign file names to individual objects with variable name components for (i in seq(nrow(index))) { assign(paste("Fekete_",index$year[i], index$month[i], sep=''), read.table(file=paste("C:\\Data\\comp_runoff_hd_", index$year[i], index$month[i], ".asc", sep=""), header=FALSE, sep="")) # Create index of file names files <- print(ls()[1:120], quote=FALSE) # This is the best way I could manage to successfully attribute all the table names to a single list - I realise it's horrible coding (especially as it relies on the first 120 objects stored in the memory actually being the objects I want to use)... files [1] "Fekete_198601" "Fekete_198602" "Fekete_198603" "Fekete_198604" [5] "Fekete_198605" "Fekete_198606" "Fekete_198607" "Fekete_198608" [9] "Fekete_198609" "Fekete_198610" "Fekete_198611" "Fekete_198612" [13] "Fekete_198701" "Fekete_198702" "Fekete_198703" "Fekete_198704" [17] "Fekete_198705" "Fekete_198706" "Fekete_198707" "Fekete_198708" ...[truncated - there are 120 in total] # Provide column and row names according to lat/longs. rnames <- sprintf("%.2f", seq(from = -89.75, to = 89.75, length = 360)) columnnames <- sprintf("%.2f", seq(from = -179.75, to = 179.75, length = 720)) for (i in 1:120) { Fekete_table <- get(paste("Fekete_", index$year[i], index$month[i], sep='')) colnames(Fekete_table) <- columnnames rownames(Fekete_table) <- rnames assign(paste("Fekete_",index$year[i], index$month[i], sep=''), colnames(Fekete_table)) } As you can see, I'm in a bit of a muddle during the column/row name assignments. In fact, this loop simply writes over the existing data in the tables and replaces it with all the column name values (whilst colnames remains NULL). 
The problem I've been having is that I can't seem to tell R to assign these column/row heading values to the colnames/rownames within an assign command - it seems to result in errors even when I try breaking this assignment process down into steps. How do I go about assigning rows and columns in this way, and how do I create a better way of indexing the file names? Many thanks for any help offered, Steve
Re: [R] Column name assignment problem
Dear Peter, Jim and all, Thanks for the information regarding how to structure 'assign' commands. I've had a go at doing this, based on your advice, and although I feel I'm a lot closer now, I can't quite get it to work: rnames <- sprintf("%.2f", seq(from = -89.75, to = 89.75, length = 360)) columnnames <- sprintf("%.2f", seq(from = -179.75, to = 179.75, length = 720)) for (i in 1:120) { Fekete_table <- get(paste("Fekete_", index$year[i], index$month[i], sep='')) colnames(Fekete_table) <- columnnames rownames(Fekete_table) <- rnames assign(paste("Fekete_",index$year[i], index$month[i], sep=''), colnames(Fekete_table)) } This assigns the column headings to each table, so that each table doesn't contain data any longer, but simply the column values. I tried inserting assign(colnames(paste("Fekete_"...) but this resulted in the type of error that was mentioned in the previous message. I've run dry of ideas as to how I should restructure the commands, so would be grateful for any pointers. Many thanks, Steve _ [[elided Hotmail spam]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
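In the loop above, the last line stores colnames(Fekete_table), a character vector, under the table's name, which is why the data disappear. A minimal sketch of the get / modify / assign pattern follows, assuming two small made-up tables (2 rows by 3 columns) in place of the 360 x 720 grids:

```r
# Hypothetical miniature tables.
Fekete_198601 <- as.data.frame(matrix(1:6, nrow = 2))
Fekete_198602 <- as.data.frame(matrix(7:12, nrow = 2))

rnames      <- sprintf("%.2f", seq(from = -89.75,  by = 0.5, length.out = 2))
columnnames <- sprintf("%.2f", seq(from = -179.75, by = 0.5, length.out = 3))

for (nm in c("Fekete_198601", "Fekete_198602")) {
  tab <- get(nm)               # fetch a copy of the table
  colnames(tab) <- columnnames # rename on the copy
  rownames(tab) <- rnames
  assign(nm, tab)              # write the whole modified table back
}
```

The key difference is that assign() receives tab itself, so the values survive and only the names change.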
Re: [R] Column name assignment problem
Dear all, Apologies for yet another question (!). Hopefully it won't be too tricky to solve. I am attempting to add row and column names (these are in fact numbers) to each of the tables created by the code (120 in total). # Create index of file names files <- print(ls()[1:120], quote=FALSE) # This is the best way I could manage to successfully attribute all the table names to a single list - I realise it's horrible coding (especially as it relies on the first 120 objects stored in the memory actually being the objects I want to use)... files [1] "Fekete_198601" "Fekete_198602" "Fekete_198603" "Fekete_198604" [5] "Fekete_198605" "Fekete_198606" "Fekete_198607" "Fekete_198608" [9] "Fekete_198609" "Fekete_198610" "Fekete_198611" "Fekete_198612" [13] "Fekete_198701" "Fekete_198702" "Fekete_198703" "Fekete_198704" [17] "Fekete_198705" "Fekete_198706" "Fekete_198707" "Fekete_198708" ...[truncated - there are 120 in total] # Provide column and row names according to lat/longs. rnames <- sprintf("%.2f", seq(from = -89.75, to = 89.75, length = 360)) columnnames <- sprintf("%.2f", seq(from = -179.75, to = 179.75, length = 720)) for (i in files) { assign(colnames((paste(Fekete_",index$year[i], index$month[i])", sep='')), columnnames) assign(rownames(paste("rownames(Fekete_",index$year[i], index$month[i],")", sep=''), rnames)) } Error: unexpected string constant in: "for (i in files) { assign(colnames((paste(Fekete_",index$year[i], index$month[i])"" > assign(rownames(paste("rownames(Fekete_",index$year[i], > index$month[i],")", sep=''), rnames)) Error in if (do.NULL) NULL else if (nr> 0) paste(prefix, seq_len(nr), : argument is not interpretable as logical In addition: Warning message: In if (do.NULL) NULL else if (nr> 0) paste(prefix, seq_len(nr), : the condition has length> 1 and only the first element will be used > } Error: unexpected '}' in "}" Is there a more elegant way of creating a list of file names in this case (remember that there are 2 variable parts to each 
name) which would facilitate the assigning of column and row names to each table? (And make life easier when doing other things with the data, e.g. plotting...!). Many thanks once again - the help offered really is appreciated. Steve
Re: [R] Column name assignment problem
Jim and all, Thanks - I managed to get it working based on your helpful advice. I'm now trying to do something very similar which simply involves changing the names of the variables in column 1 to make them more succinct. I'm trying to do this via the 'levels' command as I figured that I might be able to apply the character strings in a similar way to how you recommended when dealing with 'colnames'. # Refine names of rivers to make more succinct riv_names <- get(paste("arunoff_",table_year, sep=''))[,1] levels(riv_names) <- c("AMAZON", "AMUR", "CONGO", "LENA", "MISSISSIPPI", "NIGER", "NILE", "OB", "PARANA", "YANGTZE", "YENISEI", "ZAMBEZI") assign(get(paste("arunoff_",table_year, sep='')[,1], levels(riv_names))) Error in paste("arunoff_", table_year, sep = "")[, 1] : incorrect number of dimensions My thinking was to assign the levels of riv_names to column 1 of the table... Many thanks again for any advice offered, Steve __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
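The error above arises because paste(...)[, 1] indexes the character string produced by paste(), not the table. A minimal sketch of the same get / modify / assign idea applied to factor levels, using a made-up two-river table (the original river labels are hypothetical):

```r
# Hypothetical table; column 1 is a factor of long river names.
arunoff_1951 <- data.frame(River = factor(c("Amur River", "Amazon River")),
                           flow  = c(10, 20))

table_year <- 1951
nm  <- paste("arunoff_", table_year, sep = "")
tab <- get(nm)

# Levels are stored in sorted order ("Amazon River", "Amur River"),
# so the replacements must be listed in that same order.
levels(tab[[1]]) <- c("AMAZON", "AMUR")
assign(nm, tab)
```

It is worth printing levels(tab[[1]]) before replacing them, since a mismatch in order would silently relabel the wrong rivers.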
Re: [R] Column name assignment problem
Dear all, I've been trying to implement the advice given to me, but without much success so far. I thought I'd provide the code in full in the hope that it might make more sense. Just to reiterate, I'm attempting to change the header of the 4th column of every table to "COUNT". year<- 1951:2000 filelist <- paste("C:\\Data\\arunoff_",year,".txt", sep="") filelist # Assign file names to individual objects table_year=1951 for (i in filelist) { assign(paste("arunoff_",table_year,"_temp", sep=""),read.table(file=i, header=TRUE, sep=",")) print(c("LOADED FILE: ","arunoff_",table_year,"_temp"), quote=FALSE) table_year = table_year+1 } # RE-FORMAT DATA # Change names of particular column headings colnames(assign(paste("arunoff_",table_year, sep=""))[4],"COUNT") Any help would be very much appreciated. Thanks as ever, Steve _ [[elided Hotmail spam]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Assignment to variables fails to loop
Dear all, I think I'm nearly there in writing R code which will read in files with two variable parts to the file name and then assigning these file names to objects, which also have two variable parts. I have got the code running without encountering errors, however, I receive 50+ of the same warnings: 1: In assign(paste("Fekete_", index$year, index$month, sep = ""), ... : only the first element is used as variable name And it's true, when I do ls() only Fekete198601 has been assigned. I've attempted to rectify this, but have only come up against further errors. The code as it stands, is as follows: # READ IN FILES FROM DISK # File names have two variable parts: creating a 'file index' is a two-step process index <- expand.grid(year = sprintf("%04d", seq(1986, 1995)), month = sprintf("%02d", 1:12)) filelist <- paste("C:\\Documents and Settings\\Data\\comp_runoff_hd_", paste(index$year, index$month, sep=''), '.asc', sep='') filelist # Assign file names to individual objects with variable name components for (i in filelist) { assign(paste("Fekete_",index$year, index$month, sep=''),read.table(file=i, header=FALSE, sep=" ")) update <- substr(i,35,55) # substring - 2nd argument is character at which extraction is to begin, 3rd argument is where extraction ends. print(c("LOADED FILE:",update), quote=FALSE) } > ls() [1] "Fekete_198601" "filelist" "i" "index" [5] "update" Why is it that only Fekete_198601 has had data assigned to it (there should be 120 such objects in total) and how do I go about solving this? Many thanks again for any help offered, Steve __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
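The warning above suggests that paste() is being given the whole index$year and index$month columns, producing 120 names of which assign() uses only the first. A minimal sketch that picks out one row of the index per iteration follows. It assumes a reduced index (2 years, 3 months) and writes small dummy files to a temporary directory so that it runs anywhere; the real path and files would replace these.

```r
index <- expand.grid(year  = sprintf("%04d", 1986:1987),
                     month = sprintf("%02d", 1:3))

# Dummy input files in a temporary directory (an assumption for this sketch).
filelist <- file.path(tempdir(),
                      paste("comp_runoff_hd_", index$year, index$month,
                            ".asc", sep = ""))
for (f in filelist) {
  write.table(matrix(1:4, nrow = 2), f, row.names = FALSE, col.names = FALSE)
}

# Loop over row numbers so that the name and the file stay in step.
for (i in seq_len(nrow(index))) {
  object_name <- paste("Fekete_", index$year[i], index$month[i], sep = "")
  assign(object_name, read.table(file = filelist[i], header = FALSE))
  print(paste("LOADED FILE:", object_name), quote = FALSE)
}
```

Looping over i rather than over the file names means index$year[i], index$month[i] and filelist[i] always refer to the same file.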
Re: [R] Reading in files with variable parts to names
Thanks, that's great - just what I was looking for.
Re: [R] Reading in files with variable parts to names
Dear all, Thanks for the help in the previous posts. I've considered each one and have nearly managed to get it working. The structure of the filelist being produced is correct, except for a single space which I can't seem to eradicate! This is my amended code, followed by the first twelve rows of the output (it really goes up to 120 rows). >filelist <- paste("C:\\Documents and >Settings\\Data\\comp_runoff_hd_",do.call(paste, expand.grid(year = >sprintf("%04d", seq(1986,1995)), month = sprintf("%02d",1:12))),".asc", sep="") >filelist [1] "C:\\Documents and Settings\\Data\\comp1986 01.asc" [2] "C:\\Documents and Settings\\Data\\comp1987 01.asc" [3] "C:\\Documents and Settings\\Data\\comp1988 01.asc" [4] "C:\\Documents and Settings\\Data\\comp1989 01.asc" [5] "C:\\Documents and Settings\\Data\\comp1990 01.asc" [6] "C:\\Documents and Settings\\Data\\comp1991 01.asc" [7] "C:\\Documents and Settings\\Data\\comp1992 01.asc" [8] "C:\\Documents and Settings\\Data\\comp1993 01.asc" [9] "C:\\Documents and Settings\\Data\\comp1994 01.asc" [10] "C:\\Documents and Settings\\Data\\comp1995 01.asc" [11] "C:\\Documents and Settings\\Data\\comp1986 02.asc" [12] "C:\\Documents and Settings\\Data\\comp1987 02.asc" I've tried inserting sep="" after the 'month=sprintf("%02d",1:12)' but this doesn't appear to solve the problem - in fact it doesn't change the output at all...! Any help would be much appreciated, Steve __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
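The stray space most likely comes from the inner call: do.call(paste, ...) uses paste's default separator, a single space, and the sep="" at the end applies only to the outer paste(). A minimal sketch that passes sep through do.call() by appending it to the argument list:

```r
grid <- expand.grid(year  = sprintf("%04d", 1986:1995),
                    month = sprintf("%02d", 1:12))

# c(grid, sep = "") is a list holding the two columns plus sep,
# so paste() is called as paste(year, month, sep = "").
stems <- do.call(paste, c(grid, sep = ""))

filelist <- paste("C:\\Documents and Settings\\Data\\comp_runoff_hd_",
                  stems, ".asc", sep = "")
head(stems)
```

With this grid the year varies fastest, so the names run 198601, 198701, ... before moving on to month 02.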
[R] Reading in files with variable parts to names
Dear all, I'm trying to read in a whole directory of files which have two variable parts to the file name: year and month. E.g. comp198604.asc represents April of 1986 - 'comp' is fixed in each case. Years range between 1986 to 1995 and months are between 1 and 12. Just to be clear, there are 12 files associated with each year: e.g. comp198601, comp198602, ... comp198612 through to comp199501, comp199502 ... comp199512. I am trying to automate the reading in of these files, but am struggling to find an adequate way of achieving this. The closest I've got is by doing: year <- 1986:1995 month <- sprintf("%02d", 1:12) # formats numbers to 2 digits (for maintaining leading zeros in file names) filelist <- paste("C:\\Documents and Settings\\Data\\comp",year,month,".asc", sep="") filelist [1] "C:\\Documents and Settings\\Data\\comp198601.asc" [2] "C:\\Documents and Settings\\Data\\comp198702.asc" [3] "C:\\Documents and Settings\\Data\\comp198803.asc" [4] "C:\\Documents and Settings\\Data\\comp198904.asc" [5] "C:\\Documents and Settings\\Data\\comp199005.asc" [6] "C:\\Documents and Settings\\Data\\comp199106.asc" [7] "C:\\Documents and Settings\\Data\\comp199207.asc" [8] "C:\\Documents and Settings\\Data\\comp199308.asc" [9] "C:\\Documents and Settings\\Data\\comp199409.asc" [10] "C:\\Documents and Settings\\Data\\comp199510.asc" [11] "C:\\Documents and Settings\\Data\\comp198611.asc" [12] "C:\\Documents and Settings\\Data\\comp198712.asc" I need 1986 to remain fixed whilst it cycles through 01 to 12, before it moves onto 1987 and cycles again. There should be 120 outputs in total (10 years each with 12 months), but at present it's only reaching 12 outputs. I'd be grateful to learn what I'm doing wrong here so that I can solve this. 
Many thanks as ever, Steve
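The 12 outputs arise because paste() recycles its arguments element by element: the 10 years are paired with the 12 months position by position instead of being crossed. A minimal sketch using expand.grid() to build all 120 combinations, with the month listed first so that it cycles fastest:

```r
# The first factor in expand.grid() varies fastest, so months cycle
# 01..12 within each year.
index <- expand.grid(month = sprintf("%02d", 1:12), year = 1986:1995)

filelist <- paste("C:\\Documents and Settings\\Data\\comp",
                  index$year, index$month, ".asc", sep = "")
length(filelist)
head(filelist, 3)
```

The order of the columns in expand.grid() only affects the order of the file names, not which names are produced.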
[R] Column name assignment problem
Dear all, I'm trying to assign a name to the fourth column whilst using 'assign', but keep encountering errors. What have I done wrong?!

> assign(colnames(c(paste("arunoff_",table_year, sep="")[4]), "COUNT"))
Error in if (do.NULL) NULL else if (nc> 0) paste(prefix, seq_len(nc), : argument is not interpretable as logical

Hope someone is able to help. Thanks for any pointers, Steve
Re: [R] Manual sort in a for loop
Thanks all - I'm fairly new to R, so I was oblivious to the pros and cons of using a data frame as opposed to a list! The 'get' command also seemed to work successfully. Thanks again, Steve
[R] Manual sort in a for loop
Dear all, I am trying to manually re-sort rows in a number of tables. The rows aren't sorted on any particular values but are simply ordered by user choice (as shown by the row numbers in the code). I have been able to carry out each re-arrangement without the use of the 'for' loop, but cannot seem to successfully execute the statements when incorporated into the loop. The code I have is as follows:

table_year=1951
for (i in (paste("arunoff_",year,"_temp",sep=""))) {
assign(paste("arunoff_",table_year, sep=""),paste("arunoff_",table_year,"_temp")[c(10,7,9,5,4,12,1,3,2,8,11,6),])
table_year = table_year+1
}

The error I get is:

Error in paste("arunoff_", table_year, "_temp")[c(10, 7, 9, 5, 4, 12, : incorrect number of dimensions

...despite this not occurring when I do each table individually, so it can't be a case of there not being enough rows, as

> dim(arunoff_1951_temp)

gives

[1] 12 11

I have a feeling that it may be a syntax error, possibly between 'temp' and the square bracket, but I can't be sure of this. Any solutions or advice offered would be gratefully received. Many thanks, Steve
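The "incorrect number of dimensions" error is consistent with the square brackets being applied to the string returned by paste(), which has no rows or columns, rather than to the table of that name. A minimal sketch using get(), with one made-up 12-row table (the columns river and flow are hypothetical):

```r
# Hypothetical 12-row stand-in for arunoff_1951_temp.
arunoff_1951_temp <- data.frame(river = letters[1:12], flow = 1:12)

row_order <- c(10, 7, 9, 5, 4, 12, 1, 3, 2, 8, 11, 6)

for (table_year in 1951) {
  # sep = "" matters here: without it paste() inserts spaces into the name.
  temp <- get(paste("arunoff_", table_year, "_temp", sep = ""))
  assign(paste("arunoff_", table_year, sep = ""), temp[row_order, ])
}
```

Looping directly over the years (for example 1951:2000 with real data) also removes the need for a separately incremented table_year counter.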
[R] Looping of read.table and assignment
Dear all, I am trying to read in and assign data from 50 tables in an automated fashion. I have the following code, which I created with the help of textbooks and the internet, but it only seems to read in the final data file over and over again. For example, when I type:> table_1951 I get the same values in the table as when I type> table_2000 despite the values in the source tables being different: year <- 1951:2000 filelist <- paste("C:\\Documents and Settings\\Data\\table_",year,".txt", sep="") filelist # Code seems to operate successfully up to this point for (i in filelist) { for (iyear in 1951:2000) { assign(paste("table_",iyear, sep=""),read.table(file=i, header=TRUE, sep=",")) noquote(paste("LOADED FILE:",paste("table_",iyear, sep=""),sep=" ")) } } Can anyone see what I've done wrong here? And just as an aside, as you can see, I've inserted the 'noquote' line so that when the code is running I should be able to see each file being read in - mainly as a 'checker'. Should this work as anticipated, with each line being displayed with its corresponding table number after it's been read in? Many thanks for any help offered, Steve _ [[elided Hotmail spam]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
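With the two nested loops above, every table name is reassigned for every file, so after the outer loop finishes all 50 names hold the contents of the last file. A minimal sketch with a single loop that keeps file and year in step is shown below. It assumes three dummy CSV files written to a temporary directory so the example is self-contained, and it stores the tables in a named list.

```r
years <- 1951:1953

# Dummy comma-separated files in a temporary directory (an assumption).
filelist <- file.path(tempdir(), paste("table_", years, ".txt", sep = ""))
for (k in seq_along(years)) {
  write.csv(data.frame(year = years[k], value = k), filelist[k],
            row.names = FALSE)
}

tables <- list()
for (k in seq_along(filelist)) {
  nm <- paste("table_", years[k], sep = "")
  tables[[nm]] <- read.table(file = filelist[k], header = TRUE, sep = ",")
  # Values are not auto-printed inside a loop, so print() is called explicitly.
  print(paste("LOADED FILE:", nm), quote = FALSE)
}
```

This also addresses the aside about noquote(): its result is only displayed when printed, which happens automatically at the console but not inside a for loop.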
Re: [R] Working with duplicated rows
Thanks - that's great!
[R] Working with duplicated rows
Dear all, I have a dataframe of 3 columns, consisting of 'longitude', 'latitude' and a corresponding 'value'. Where identical 'longitude' and 'latitude' pairs occur more than once, I want their corresponding 'value' to be summed and the 'pair' to only appear once. For example:

long lat value
  10  20     5
   6   2     3
  27  -3     9
  10  20    10
   4  -1     0
   6   2     9

would be converted to something like:

long lat value
  10  20    15
   6   2    12
  27  -3     9
   4  -1     0

...as rows 1 and 4, and rows 2 and 6, are matched with respect to the 'long' and 'lat' columns. Their values in column 3 are then summed and reported as one row in the new dataframe. How would I go about coding this in R? Many thanks, Steve
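A minimal sketch of one way to do this with aggregate(), using the six example rows from the post; the data frame name dat is made up.

```r
dat <- data.frame(long  = c(10, 6, 27, 10, 4, 6),
                  lat   = c(20, 2, -3, 20, -1, 2),
                  value = c(5, 3, 9, 10, 0, 9))

# Sum 'value' within each distinct (long, lat) pair.
summed <- aggregate(value ~ long + lat, data = dat, FUN = sum)
summed
```

The rows of the result may come back in a different order from the input, since aggregate() orders its output by the grouping columns.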
[R] Creating a Legend
Dear all, I'm trying to create a legend for my graph. I hope to have the title as "Land Use Type" and the two elements being "Urban" and "Rural" with a red point and green point respectively. So far I have the following command, but obviously it isn't correct:

> legend("topright", title="Land Use Type", cex=0.75, pch=16, col="red","Urban"&"green","Rural", ncol=2)

As you can see, I'm a bit confused as to how to deal with the point colours and associated text. Also, how would I make the associated text ("Urban" and "Rural") smaller than the title? Many thanks for any suggestions! Steve
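In legend(), the labels and the colours are each passed as a vector, matched up by position, rather than being joined with '&'. A minimal sketch on an empty placeholder plot (the plot itself is an assumption; any existing graph would do):

```r
# Empty placeholder plot so that the legend has something to attach to.
plot(1:10, 1:10, type = "n", xlab = "x", ylab = "y")

legend("topright",
       title  = "Land Use Type",
       legend = c("Urban", "Rural"),  # labels, in order
       col    = c("red", "green"),    # one colour per label
       pch    = 16,
       cex    = 0.75,
       ncol   = 2)
```

This sketch leaves the title and the entries at the same size; the question of making the entry text smaller than the title is not covered here.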
Re: [R] Use of colour in plots
Hi Thierry and all,

Thanks very much for your suggestion. I've given it a go and played around with the transparency values, but I seem to be having a problem: some of the red values are made transparent, even though there are no green values being overplotted! The code I used to display the image was:

ggplot(Jan, aes(x = Jan[,4], y = Jan[,5], colour = factor(Jan$Urban.Rural > 1.25))) + geom_point() + scale_colour_manual(values = c(alpha("red", 1/10), "green"))

Do you have any ideas to put me on the right track with this?

Thanks again for your help,

Steve

> Subject: RE: [R] Use of colour in plots
> Date: Fri, 19 Sep 2008 16:15:39 +0200
> From: [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]; r-help@r-project.org
>
> Steve,
>
> You want something like this:
>
> library(ggplot2)
> n <- 1000
> dataset <- data.frame(x = round(rnorm(n), 2), y = round(rnorm(n), 1), z = rnorm(n))
> ggplot(dataset, aes(x = x, y = y, colour = factor(z > 1))) +
>   geom_point() + scale_colour_manual(values = c(alpha("red", 1/4), "green"))
>
> HTH,
>
> Thierry
>
> ir. Thierry Onkelinx
> Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
> Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance
> Gaverstraat 4
> 9500 Geraardsbergen
> Belgium
> tel. + 32 54/436 185
> [EMAIL PROTECTED]
> www.inbo.be
>
> To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of.
> ~ Sir Ronald Aylmer Fisher
>
> The plural of anecdote is not data.
> ~ Roger Brinner
>
> The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data.
> ~ John Tukey
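A possible explanation, offered as a sketch rather than a definitive diagnosis: with unnamed `values`, scale_colour_manual() assigns colours to the factor levels in order, so in the command above the first entry, alpha("red", 1/10), is applied to every point in the first level (FALSE), whether or not anything overlaps it. Naming the values makes the mapping explicit. The data frame below is an invented stand-in for `Jan`, and the code assumes a reasonably recent ggplot2 is installed.

```r
# Hypothetical stand-in for the 'Jan' data frame in the thread
set.seed(1)
Jan <- data.frame(
  PopDensity = rexp(200, rate = 0.2),
  Average.Burnt.Area.Fraction = runif(200, 0, 1e-4),
  Urban.Rural = runif(200, 1, 1.5)
)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)

  # Names tie each colour to a factor level: only the green
  # (Urban.Rural <= 1.25) points are made transparent here
  point_cols <- c("FALSE" = alpha("green", 1/4), "TRUE" = "red")

  p <- ggplot(Jan, aes(x = PopDensity, y = Average.Burnt.Area.Fraction,
                       colour = factor(Urban.Rural > 1.25))) +
    geom_point() +
    scale_colour_manual(values = point_cols, name = "Urban.Rural > 1.25")
  print(p)
}
```

Referring to columns by name inside aes() (PopDensity rather than Jan[,4]) also keeps the axis labels readable.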
Re: [R] Use of colour in plots
Sorry - I should've maybe also pointed out that the command I've been trying to use is:

alpha(col="green", 1/10)

On its own this results in the following error:

[1] "#00FF001A"

and I haven't been able to successfully incorporate it into the main formula just yet (please see my previous message).

Without wanting to get too far ahead of myself, is there also a way of making the red points transparent too? (within this command - I've tried using '&' but this results in an error).

Many thanks again for any advice you can offer,

Steve
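For what it's worth, a string such as "#00FF001A" is a colour value rather than an error message: it is a hex colour in "#RRGGBBAA" form, where the last two digits encode the transparency. The sketch below uses base R's adjustcolor() instead of ggplot2's alpha() (a deliberate substitution, so that it runs without extra packages) and makes both colours transparent. The data are invented.

```r
# adjustcolor() is base R (grDevices); alpha.f scales the opacity
faint_green <- adjustcolor("green", alpha.f = 0.1)
faint_red   <- adjustcolor("red",   alpha.f = 0.1)
print(faint_green)  # a "#RRGGBBAA" colour string, not an error

# Hypothetical data standing in for the thread's 'Jan' data frame
set.seed(2)
x <- rnorm(500)
y <- rnorm(500)
z <- runif(500, 1, 1.5)

# Both groups are drawn with a transparent colour
plot(x, y, pch = 16, col = ifelse(z > 1.25, faint_red, faint_green))
```

Each colour gets its own transparency; they are combined in a vector (or chosen with ifelse()), not joined with '&'.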
Re: [R] Use of colour in plots
Thierry,

Thanks - I've had a look into using the transparency option, but can't seem to work out where to place it within the command I'm using:

> ggplot(Jan, aes(x = PopDensity, y = Average.Burnt.Area.Fraction, colour = factor(Urban.Rural > 1.25))) + geom_point()

I'm assuming that it has to go in the 'aes' section somewhere, but I seem to be encountering errors wherever I insert it. This doesn't seem to be mentioned in the book, so do you have any tips?!

Also, out of interest, what does the 'geom_point()' command do?

Thanks again,

Steve

> Subject: RE: [R] Use of colour in plots
> Date: Fri, 19 Sep 2008 10:31:58 +0200
> From: [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]; r-help@r-project.org
>
> Steve,
>
> - Use transparency to prevent overplotting: more details on p. 16 of the ggplot2 book: http://had.co.nz/ggplot2/book/
> - You can choose your own colour with scale_manual(): http://had.co.nz/ggplot2/scale_manual.html
> - The background colour can be set with ggopt(background.color = "white"): http://rweb.stat.umn.edu/R/library/ggplot/html/build-options-8a.html
>
> HTH,
>
> Thierry
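As a hedged sketch of where a fixed transparency could go: aes() maps data columns to visual properties, whereas a constant setting such as one alpha value for all points is passed as an argument to geom_point(), outside aes(). geom_point() is the layer that actually draws the points; ggplot() on its own only sets up the data and mappings. The data frame below is an invented stand-in for `Jan`. The code assumes a current ggplot2 release, in which theme_bw() is one way to get a white background (the ggopt() call quoted above comes from an older version of the package).

```r
# Hypothetical stand-in for the 'Jan' data frame in the thread
set.seed(3)
Jan <- data.frame(
  PopDensity = rexp(300, rate = 0.2),
  Average.Burnt.Area.Fraction = runif(300, 0, 1e-4),
  Urban.Rural = runif(300, 1, 1.5)
)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)

  # Data and aesthetic mappings only: no points are drawn yet
  base_plot <- ggplot(Jan, aes(x = PopDensity,
                               y = Average.Burnt.Area.Fraction,
                               colour = factor(Urban.Rural > 1.25)))

  # geom_point() adds the point layer; alpha and size are constants,
  # so they sit outside aes()
  p <- base_plot +
    geom_point(alpha = 0.2, size = 1) +
    theme_bw()   # white panel background
  print(p)
}
```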
Re: [R] Use of colour in plots
Jim,

Thanks for this - I've looked into cluster.overplot in particular which, judging by the help file, sounds quite useful (count.overplot seems less relevant).

I'm finding, however, that when I execute cluster.overplot, it simply returns many values (which total the number in the dataset I'm using, 54041), but doesn't produce or alter my graph! Is this to be expected? If so, what do the output values represent? Not *all* of the values overplot, so I'm confused as to why the number of cluster.overplot output values equals the number of values in my dataset!

Thanks,

Steve

> Date: Fri, 19 Sep 2008 21:57:03 +1000
> From: [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
> Subject: Re: [R] Use of colour in plots
>
> Steve Murray wrote:
>> Greg,
>>
>> One (hopefully final!) question I have is, is there any way of preventing overplotting? I'm finding that many of the red points are being obscured by the greens - I've tried making the point sizes small (cex=0.1) but this doesn't fully solve the problem.
>>
>> Or even, is there a way of changing the order of which the points are plotted?
>>
> Hi again,
> Maybe cluster.overplot or count.overplot in the plotrix package?
>
> Jim
Re: [R] Use of colour in plots
Greg,

Thanks for this - it works really well!

One (hopefully final!) question I have is, is there any way of preventing overplotting? I'm finding that many of the red points are being obscured by the greens - I've tried making the point sizes small (cex=0.1) but this doesn't fully solve the problem.

Or even, is there a way of changing the order in which the points are plotted?

Thanks again,

Steve

> From: [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]
> CC: r-help@r-project.org
> Date: Thu, 18 Sep 2008 10:56:05 -0600
> Subject: RE: [R] Use of colour in plots
>
> Try something like:
>
> > x <- runif(25)
> > y <- rnorm(25)
> > z <- rnorm(25, 3*x)
> > plot(x, y, col=ifelse(z > 1.25, 'red', 'green'))
>
> Does this help,
>
> --
> Gregory (Greg) L. Snow Ph.D.
> Statistical Data Center
> Intermountain Healthcare
> [EMAIL PROTECTED]
> 801.408.8111
>
>> -----Original Message-----
>> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] project.org] On Behalf Of Steve Murray
>> Sent: Thursday, September 18, 2008 8:14 AM
>> To: ONKELINX, Thierry; Petr PIKAL; [EMAIL PROTECTED]; [EMAIL PROTECTED]
>> Cc: r-help@r-project.org
>> Subject: Re: [R] Use of colour in plots
>>
>> Jim and all,
>>
>> Maybe I've misunderstood ?color.scale (apologies if this is so), but I don't think this is what I need. I'm not looking to scale the colours of points, instead I simply want to assign each point a colour (either red or green) based on its value in the Urban.Rural column.
>>
>> To clarify (but please also see my earlier message if this helps):
>>
>> In my dataset (Jan) I have 3 columns of interest: Average Burnt Area Fraction (ABAF), PopDensity and Urban.Rural.
>>
>> I want to plot ABAF against PopDens (which I've had no problems doing) and then, regardless of the values of ABAF and PopDens, I want to assign it a colour. The colour each point is given is based on the corresponding Urban.Rural value on each row. If for each pair of ABAF and PopDens values the Urban.Rural value on that row is >1.25, then the point should be coloured red, whereas if it's =
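On the plotting-order question, one option in base graphics is to reorder the rows before plotting, since plot() draws points in the order they are supplied. The sketch below extends the simulated example quoted above (the larger sample size and the names `cols` and `draw_order` are illustrative choices) so that the red points are drawn last and therefore sit on top of the green ones.

```r
set.seed(4)
x <- runif(1000)
y <- rnorm(1000)
z <- rnorm(1000, 3 * x)

# Same colouring rule as in the quoted example
cols <- ifelse(z > 1.25, "red", "green")

# order() on a logical puts FALSE (green) first and TRUE (red) last,
# so red points are drawn after, and on top of, the green points
draw_order <- order(cols == "red")

plot(x[draw_order], y[draw_order], col = cols[draw_order],
     pch = 16, cex = 0.5)
```

This only changes which colour ends up on top; densely packed points of the same colour will still overlap one another.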
Re: [R] Use of colour in plots
Dear Thierry and all,

I've tried out ggplot from the ggplot2 package and it seems to provide much more favourable results!

Just a few questions I have after consulting the 'help' file for ggplot.

Is there a way of preventing overplotting? Some of the red points are being obscured by the green ones. I've tried changing the size of the points (using size=1) but this doesn't resolve the issue, as there are many points quite densely packed in some parts of the graph.

Also, how would I change the colours if I wished (for future plots of a similar format)? And how do you customise the legend?

Finally, is there a way of changing the grey background of the graph to white?

Sorry for all the questions, it's just that I'm new to the ggplot2 package and can't find the answers in the help file or on the associated website!

Many thanks to anyone who's able to offer any advice.

Best wishes,

Steve

> Subject: RE: [R] Use of colour in plots
> Date: Thu, 18 Sep 2008 14:52:57 +0200
> From: [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]
> CC: r-help@r-project.org
>
> Steve,
>
> Have a look at the ggplot2 package:
>
> library(ggplot2)
> ggplot(Jan, aes(x = PopDensity, y = Average.Burnt.Area.Fraction, colour = factor(Urban.Rural > 1.25))) + geom_point()
>
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On behalf of Steve Murray
> Sent: Thursday 18 September 2008 13:58
> To: Petr PIKAL; [EMAIL PROTECTED]
> CC: r-help@r-project.org
> Subject: Re: [R] Use of colour in plots
>
> Dear all,
>
> I've finally got round to plotting my data and trying to apply colour (had some problems with the data which I needed to rectify first!). I'm having trouble however getting the colour to work as I'd hoped, despite the help offered in previous messages.
>
> Just to recap, and with more specifics this time, I have a data frame as follows:
>
> > head(Jan)
>   Latitude Longitude Urban.Rural Average.Burnt.Area.Fraction PopDensity   GDP
> 1    -0.25    -49.25    1.00                             9e-05  1.8703090 25694
> 2    -0.25    -50.25    1.00                             2e-05  2.5962470 32205
> 3    -0.25    -50.75    1.00                             0e+00  3.5221470 39312
> 4    -0.25    -51.25    1.042432                         5e-06 14.2919000 87685
> 5    -0.25    -51.75    1.00                             1e-05  0.5721315 11376
> 6    -0.25    -52.25    1.00                             4e-05  0.7262031 11083
>   Cropland.Area..km.2.grid.cell.
> 1 0.4260444
> 2 0.3401146
> 3 0.3036076
> 4 0.3147694
> 5 0.2843388
> 6 0.1734099
>
> I hope to plot Average.Burnt.Area.Fraction (ABAF) against PopDensity (which I have done using: > plot(Jan[,3],Jan[,4]) ).
>
> However, the twist is that I'd like these points to be coloured according to the values of Urban.Rural (but don't want this column to actually be plotted). I am looking to do, if Urban.Rural>1.25 then colour the point red, and if it's =
>
>> To: [EMAIL PROTECTED]
>> CC: r-help@r-project.org; [EMAIL PROTECTED]
>> Subject: Re: [R] Use of colour in plots
>> From: [EMAIL PROTECTED]
>> Date: Fri, 5 Sep 2008 16:40:47 +0200
>>
>> Hi
>>
>> [EMAIL PROTECTED] wrote on 05.09.2008 16:24:35:
>>
>>> Here is an example doing the same type of thing.
>>> It should be easy enough to adapt.
>>>
>>> Good luck
>>>
>>> ===
>>> x <- runif(100, 0, 1)
>>> y <- runif(100, 0, 1)
>>> z <- data.frame(x,y)
>>>
>>> plot(subset(z, z$y>=.5), col="red", ylim=c(min(z$y), max(z$y)), pch=16)
>>> points(subset(z, z$y <=.49), col="blue", pch=16)
>>> ===
>>
>> Or
>>
>> third <-