Dear Randall:

I could not find a package (or function) called "*tidyr*". I installed all the other packages, but could not find tidyr.
with many thanks
steve

On Tue, Dec 29, 2015 at 5:43 PM, Randall Pruim <[email protected]> wrote:

> A few more suggestions and an update to my ggplot2 plot.
>
> 1) I recommend using SPACES in your code to make things more readable.
> 2) Coding things with COLOR isn’t really very useful. This is an
> additional variable and should be coded as such.
> 3) I don’t really know what detected means, but I’ve coded it as a
> logical variable. You could use a factor or character vector instead.
> 4) You have used inconsistent date formatting which (without my edits)
> will cause some years to be 0005 and others to be 2005. (This will be
> immediately clear when the plot spans 2000 years; that’s how I detected
> the problem.)
>
> Here’s what my first draft would look like:
>
>
> ### Put data into a data frame -- avoid loose vectors
> library(dplyr); library(lubridate); library(tidyr)
> library(ggplot2)
>
> # recreate your data in a data frame
> MyData <- data_frame(
>   Well1 = c(0.005,0.005,0.004,0.006,0.004,0.009,0.017,0.045,0.05,0.07,0.12,0.10,NA,0.20,0.25),
>   Well2 = c(0.10,0.12,0.125,0.107,0.099,0.11,0.13,0.109,NA,0.10,0.115,0.14,0.17,NA,0.11),
>   dateString = c("2Jan05","7April05","17July05","24Oct05","7Jan06","30March06","28Jun06",
>                  "2Oct06","17Oct06","15Jan07","10April07","9July07","5Oct07","29Oct07","30Dec07"),
>   date = dmy(dateString)
> )
>
> # put the data into "long" format
> MyData2 <-
>   MyData %>%
>   gather(location, concentration, Well1, Well2) %>%
>   mutate(detected = TRUE)
>
> # hand-code your colored values (should be double checked for accuracy)
>
> MyData2$detected[c(1, 2, 5, 15 + 1, 15 + 5, 15 + 10)] <- FALSE
>
> # Create plot using ggplot2
>
> ggplot(data = MyData2 %>% filter(!is.na(concentration)),
>        aes(x = date, y = concentration, colour = location)) +
>   geom_line(alpha = 0.8) +
>   geom_point(aes(shape = detected, group = location), size = 3, alpha = 0.8) +
>   scale_shape_manual(values = c(1, 16)) +
>   theme_minimal()
>
>
> > On Dec 26, 2015, at 6:02 AM, Steven Stoline <[email protected]> wrote:
> >
> > Dear Randall:
> >
> > Thank you very much for the details and for your support and patience.
> >
> > ### This is how the original data look:
> > ### ---------------------------------------------------
> >
> > Well1<-c(0.005,0.005,0.004,0.006,0.004,0.009,0.017,0.045,0.05,0.07,0.12,0.10,NA,0.20,0.25)
> >
> > Well2<-c(0.10,0.12,0.125,0.107,0.099,0.11,0.13,0.109,NA,0.10,0.115,0.14,0.17,NA,0.11)
> >
> > date<-c("2Jan2005","7April05","17July05","24Oct05","7Jan06","30March06","28Jun06","2Oct06","17Oct06","15Jan07","10April07","9July07","5Oct07","29Oct07","30Dec07")
> >
> > The data values in red font are non-detected. So I need to distinguish
> > these non-detected values from the detected ones in the graph.
> >
> > For example, solid circles for the detected ones, and open circles for
> > the non-detected ones (the ones in red font).
> >
> > So, I was trying to use pch for this.
> >
> > Please notice that, now, both data sets Well1 and Well2, and date have
> > the same length of 15, but Well1 has one NA, and Well2 has two NAs.
> >
> > Happy Holidays and Happy Christmas (if you are celebrating)
> >
> > with many thanks
> > steve
> >
> > On Thu, Dec 24, 2015 at 9:31 AM, Randall Pruim <[email protected]> wrote:
> > Steve,
> >
> > This is on the edge of what R-sig-teaching is for (since it isn’t really
> > about teaching). But since I think there are elements of what you are
> > doing that lead students to think that R is terrible, I’ll show you how I
> > might approach things.
> >
> > First a few comments about my solution.
> >
> > 1) I generally avoid loose vectors. I prefer to use data frames to keep
> > related vectors related.
> >
> > 2) I prefer to code dates as dates. I would be very nervous about code
> > that manually sets the axis labels differently from the data. That can
> > lead to all sorts of bad errors down the road if you change the data and
> > forget to change the labels and often indicates you don’t have the data
> > formatted the way you should. (Note: I added day of month values to your
> > dates that had none.) The lubridate package makes it easy to create dates
> > from strings.
> >
> > 3) I rarely use base graphics, so I’ll show you solutions using lattice
> > and ggplot2. There may be nice ways to do this in base graphics as well.
> >
> > 4) I’m ignoring the color choices, title, etc. All that can be easily
> > added, but I’m focusing on getting the data display correct. That’s
> > generally the approach I take to plotting: First get the data display
> > correct, then fancy up titles, colors, fonts, etc. It saves lots of
> > time, because often once I see the plot, I realize it isn’t what I need,
> > so there is no reason to gussy it up.
> >
> > 5) I prefer (and lattice and ggplot2 encourage) keeping the data
> > manipulation in one location and the plotting after that rather than going
> > back and forth between those two types of operations. I find that it makes
> > the code easier to read.
> >
> > 6) One of your series has fewer points than the other. I made the
> > assumption that the missing value was at the end. That should be changed
> > to whatever is correct for your data.
> >
> > 7) I don’t know what you were using pch to indicate, so I created a
> > variable called “group” with values 0 and 15. The variable and its values
> > should ideally be renamed to reflect what they represent. That will make
> > your code easier to read and produce better labeling of the plot.
> >
> > And one note about your code.
> >
> >> 6*0:max_y
> >
> > probably doesn’t do what you expect since the 6 does nothing here
> > (because 6 * 0 = 0). You could do 6 * (0:max_y), but it isn’t clear why
> > you would want the range of the plot to be six times that of the data.
> > Maybe you were thinking something like seq(0, max_y, length.out = 6),
> > but that will give pretty ugly breakpoints. In any case, the plots below
> > do a fine job of setting the axes by default, and each system allows you
> > to tune them if you disagree with the default for a particular plot.
> >
> >
> > With that much preamble, the code is now shorter than the introduction.
> >
> >
> > ### Put data into a data frame -- avoid loose vectors
> > library(dplyr); library(lubridate)
> >
> > # if i knew what you were using pch for, i would name group and its values to match
> > MyData <- data_frame(
> >   Well1 = c(0.005,0.005,0.004,0.006,0.004,0.009,0.017,0.045,0.05,0.07,0.12,0.10,0.20,0.25),
> >   Well2 = c(0.10,0.12,0.125,0.107,0.099,0.11,0.13,0.109,0.10,0.115,0.14,0.17,0.11,NA),
> >   dateString = c("1Jan05","1April05","1Jul05","1Oct05","1Jan06","1March06","1Jun06","2Oct06","17Oct06","1Jan07","1April07","1Jul07","1Oct07","1Dec07"),
> >   date = dmy(dateString),
> >   group = factor(c(0,0,15,15,0,15,15,15,15,15,15,15,15,15))
> > )
> >
> > ## using lattice
> > ## lattice makes plotting two series easy
> > ## but doesn't make it as easy to have different symbols along the same series
> >
> > library(lattice)
> > xyplot(Well1 + Well2 ~ date, data = MyData, type = c("p","l"), auto.key = TRUE)
> > ## better legend
> > xyplot(Well1 + Well2 ~ date, data = MyData, type = c("p","l"),
> >        auto.key = list(points = TRUE, lines = TRUE))
> >
> > ## using ggplot2
> > ## for highly customized plots, i generally find ggplot2 works better
> > ## i would reshape the data with tidyr before plotting (could be done in lattice as well)
> >
> > library(ggplot2); library(tidyr)
> >
> > MyData2 <-
> >   MyData %>%
> >   gather(location, concentration, Well1, Well2)
> >
> > ggplot(data = MyData2, aes(x = date, y = concentration, colour = location)) +
> >   geom_line() +
> >   geom_point(aes(shape = group), size = 2)
> >
> > xyplot(concentration ~ date, data = MyData2, groups = location, type = c("p", "l"),
> >        auto.key = TRUE)
> >
> > ## without reshaping, you can plot the 4 layers manually, but the default labeling isn’t as nice
> >
> > ggplot(data = MyData) +
> >   geom_line(aes(x = date, y = Well1, colour = "Well1")) +
> >   geom_line(aes(x = date, y = Well2, colour = "Well2")) +
> >   geom_point(aes(x = date, y = Well1, colour = "Well1", shape = group)) +
> >   geom_point(aes(x = date, y = Well2, colour = "Well2", shape = group))
> >
> >
> > Happy Holidays. I hope one of these approaches will get you headed in
> > the right direction.
> >
> > --rjp
> >
> >
> >> On Dec 24, 2015, at 7:51 AM, Steven Stoline <[email protected]> wrote:
> >>
> >> Dear All:
> >>
> >> I am trying to plot two series in one graph. But I have some difficulties
> >> setting up the y-axis limits. Also, the second series is not correctly
> >> graphed.
> >>
> >> *Here is what I tried to do:*
> >>
> >>
> >> ### Define 2 vectors
> >>
> >> Well1<-c(0.005,0.005,0.004,0.006,0.004,0.009,0.017,0.045,0.05,0.07,0.12,0.10,0.20,0.25)
> >> Well2<-c(0.10,0.12,0.125,0.107,0.099,0.11,0.13,0.109,0.10,0.115,0.14,0.17,0.11)
> >>
> >> ### Calculate range from 0 to max value of Well1 and Well2
> >> ### g_range <- range(0, Well1, Well2)
> >>
> >> max_y <- max(Well1, Well2)
> >>
> >> ### Graph Groundwater Concentrations using y axis that ranges from 0 to max
> >> ### value in Well1 or Well2 vector. Turn off axes and
> >> ### annotations (axis labels) so we can specify them ourselves
> >>
> >> plot(Well1, type="o", pch=c(0,0,15,15,0,15,15,15,15,15,15,15,15,15),
> >> col="blue", ylim=c(0,max_y), axes=FALSE, ann=FALSE, , lwd=3, cex=1.25) ### axes=FALSE,
> >>
> >> ### Make x axis using Jan 2005 - Dec 2008 labels
> >>
> >> axis(1, at=1:14,
> >> lab=c("Jan05","April05","Jul05","Oct05","Jan06","March06","Jun06","2Oct06","17Oct06","Jan07","April07","Jul07","Oct07","Dec07"))
> >>
> >>
> >> *### Make y axis with horizontal labels. Here is where I have the major
> >> problem*
> >>
> >> ### I want the y-axis to look like: 0, 0.05, 0.10, 0.15, 0.20, 0.25
> >>
> >> axis(2, las=0, at=6*0:max_y) ### max_y
> >>
> >>
> >> ### Create box around plot
> >>
> >> box()
> >>
> >> ### Graph Well2 with red dashed line and square points
> >>
> >> ### lines(Well2, type="o", pch=22, lty=2, col="red", lwd=3, cex=1.0)
> >>
> >> lines(Well2, type="o", pch=c(0,15,15,15,0,15,15,15,0,15,15,15,15), lty=2,
> >> col="red", lwd=3, cex=1.25)
> >>
> >> ### Create a title with a red, bold/italic font
> >>
> >> title(main="Trichloroethene mg/L from Wells 1 and 2 - 2005-2007",
> >> col.main="red", font.main=2)
> >>
> >> ### Label the x and y axes with dark green text
> >>
> >> title(xlab="Time Points", col.lab=rgb(0,0.5,0))
> >>
> >>
> >> title(ylab="Trichloroethene mg/L", col.lab=rgb(0,0.5,0))
> >>
> >> ### Create a legend
> >>
> >> legend(1, g_range[2], c("Well1","Well2"), cex=1.0, col=c("blue","red"),
> >> pch=15:15, lty=1:2);
> >>
> >>
> >> with thanks
> >> steve
> >> -------------------------
> >> Steven M. Stoline
> >> 1123 Forest Avenue
> >> Portland, ME 04112
> >> [email protected]
> >>
> >> [[alternative HTML version deleted]]
> >>
> >> _______________________________________________
> >> [email protected] mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-sig-teaching
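
On the y-axis ticks asked about in the original post: the following is only a minimal sketch, assuming the goal is tick marks every 0.05 from 0 up to the largest observed value (the step of 0.05 is read off the labels requested in the post, not stated anywhere as a requirement). The names `ticks_as_posted` and `ticks_wanted` are hypothetical. Note that in R the `:` operator is evaluated before `*`, so `6*0:max_y` already means `6 * (0:max_y)`; with `max_y` equal to 0.25, `0:max_y` is just 0, which is why only a single tick position results.

```r
# Data as given in the original post (14 and 13 values, no NAs)
Well1 <- c(0.005, 0.005, 0.004, 0.006, 0.004, 0.009, 0.017,
           0.045, 0.05, 0.07, 0.12, 0.10, 0.20, 0.25)
Well2 <- c(0.10, 0.12, 0.125, 0.107, 0.099, 0.11, 0.13,
           0.109, 0.10, 0.115, 0.14, 0.17, 0.11)

max_y <- max(Well1, Well2)

# The expression from the post: 0:max_y is the single value 0,
# so multiplying by 6 still leaves one tick position at 0
ticks_as_posted <- 6 * 0:max_y

# Assumed intent: evenly spaced ticks with a fixed step of 0.05
ticks_wanted <- seq(0, max_y, by = 0.05)

print(ticks_as_posted)
print(ticks_wanted)
```

In the base-graphics version, these positions could be passed to the existing call as `axis(2, las = 1, at = ticks_wanted)`, where `las = 1` draws the tick labels horizontally. The lattice and ggplot2 versions in the thread choose sensible breaks by default, so this step is only needed when building the axis by hand.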
