https://cran.r-project.org/web/packages/tidyr/index.html
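
In case it helps, here is a minimal sketch for getting tidyr from CRAN. It assumes a working internet connection and a configured CRAN mirror; the helper name `ensure_package` is made up for illustration and is not part of any package.

```r
# Hypothetical helper: install a CRAN package only if it is not already available.
ensure_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)  # downloads from the configured CRAN mirror
  }
  # TRUE if the package can now be loaded, FALSE otherwise
  requireNamespace(pkg, quietly = TRUE)
}
```

After that, `ensure_package("tidyr")` followed by `library(tidyr)` should make `gather()` available.
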
On Tue, Dec 29, 2015 at 6:30 PM, Steven Stoline <[email protected]> wrote:
> Dear Randall:
>
> I could not find a package (or function) called "*tidyr*". I installed all
> the other packages, but could not find tidyr.
>
> with many thanks
> steve
>
> On Tue, Dec 29, 2015 at 5:43 PM, Randall Pruim <[email protected]> wrote:
>
>> A few more suggestions and an update to my ggplot2 plot.
>>
>> 1) I recommend using SPACES in your code to make things more readable.
>> 2) Coding things with COLOR isn’t really very useful. This is an additional variable and should be coded as such.
>> 3) I don’t really know what detected means, but I’ve coded it as a logical variable. You could use a factor or character vector instead.
>> 4) You have used inconsistent date formatting which (without my edits) will cause some years to be 0005 and others to be 2005. (This will be immediately clear when the plot spans 2000 years; that’s how I detected the problem.)
>>
>> Here’s what my first draft would look like:
>>
>> ### Put data into a data frame -- avoid loose vectors
>> library(dplyr); library(lubridate); require(tidyr)
>> library(ggplot2)
>>
>> # recreate your data in a data frame
>> MyData <- data_frame(
>>   Well1 = c(0.005,0.005,0.004,0.006,0.004,0.009,0.017,0.045,0.05,0.07,0.12,0.10,NA,0.20,0.25),
>>   Well2 = c(0.10,0.12,0.125,0.107,0.099,0.11,0.13,0.109,NA,0.10,0.115,0.14,0.17,NA,0.11),
>>   dateString = c("2Jan05","7April05","17July05","24Oct05","7Jan06","30March06","28Jun06",
>>                  "2Oct06","17Oct06","15Jan07","10April07","9July07","5Oct07","29Oct07","30Dec07"),
>>   date = dmy(dateString)
>> )
>>
>> # put the data into "long" format
>> MyData2 <-
>>   MyData %>%
>>   gather(location, concentration, Well1, Well2) %>%
>>   mutate(detected = TRUE)
>>
>> # hand-code your colored values (should be double checked for accuracy)
>> MyData2$detected[c(1, 2, 5, 15 + 1, 15 + 5, 15 + 10)] <- FALSE
>>
>> # Create plot using ggplot2
>> ggplot(data = MyData2 %>%
>>          filter(!is.na(concentration)),
>>        aes(x = date, y = concentration, colour = location)) +
>>   geom_line(alpha = 0.8) +
>>   geom_point(aes(shape = detected, group = location), size = 3, alpha = 0.8) +
>>   scale_shape_manual(values = c(1, 16)) +
>>   theme_minimal()
>>
>> > On Dec 26, 2015, at 6:02 AM, Steven Stoline <[email protected]> wrote:
>> >
>> > Dear Randall:
>> >
>> > Thank you very much for the details and for your support and patience.
>> >
>> > ### This is what the original data look like:
>> > ### ---------------------------------------------------
>> >
>> > Well1<-c(0.005,0.005,0.004,0.006,0.004,0.009,0.017,0.045,0.05,0.07,0.12,0.10,NA,0.20,0.25)
>> >
>> > Well2<-c(0.10,0.12,0.125,0.107,0.099,0.11,0.13,0.109,NA,0.10,0.115,0.14,0.17,NA,0.11)
>> >
>> > date<-c("2Jan2005","7April05","17July05","24Oct05","7Jan06","30March06","28Jun06","2Oct06","17Oct06","15Jan07","10April07","9July07","5Oct07","29Oct07","30Dec07")
>> >
>> > The data values in red font are non-detected. So I need to distinguish between these non-detected values and the detected ones in the graph.
>> >
>> > For example, solid circles for the detected ones, and open circles for the non-detected ones (the ones in red font).
>> >
>> > So, I was trying to use pch for this.
>> >
>> > Please notice that Well1, Well2, and date now all have the same length of 15, but Well1 has one NA and Well2 has two NAs.
>> >
>> > Happy Holiday and Happy Christmas (if you are celebrating)
>> >
>> > with many thanks
>> > steve
>> >
>> > On Thu, Dec 24, 2015 at 9:31 AM, Randall Pruim <[email protected]> wrote:
>> > Steve,
>> >
>> > This is on the edge of what R-sig-teaching is for (since it isn’t really about teaching). But since I think there are elements of what you are doing that lead students to think that R is terrible, I’ll show you how I might approach things.
>> >
>> > First a few comments about my solution.
>> >
>> > 1) I generally avoid loose vectors. I prefer to use data frames to keep related vectors related.
>> >
>> > 2) I prefer to code dates as dates. I would be very nervous about code that manually sets the axis labels differently from the data. That can lead to all sorts of bad errors down the road if you change the data and forget to change the labels, and it often indicates you don’t have the data formatted the way you should. (Note: I added day of month values to your dates that had none.) The lubridate package makes it easy to create dates from strings.
>> >
>> > 3) I rarely use base graphics, so I’ll show you solutions using lattice and ggplot2. There may be nice ways to do this in base graphics as well.
>> >
>> > 4) I’m ignoring the color choices, title, etc. All that can be easily added, but I’m focusing on getting the data display correct. That’s generally the approach I take to plotting: First get the data display correct, then fancy up titles, colors, fonts, etc. It saves lots of time, because often once I see the plot, I realize it isn’t what I need, so there is no reason to gussy it up.
>> >
>> > 5) I prefer (and lattice and ggplot2 encourage) keeping the data manipulation in one location and the plotting after that, rather than going back and forth between those two types of operations. I find that it makes the code easier to read.
>> >
>> > 6) One of your series has fewer points than the other. I made the assumption that the missing value was at the end. That should be changed to whatever is correct for your data.
>> >
>> > 7) I don’t know what you were using pch to indicate, so I created a variable called “group” with values 0 and 15. The variable and its values should ideally be renamed to reflect what they represent. That will make your code easier to read and produce better labeling of the plot.
>> >
>> > And one note about your code.
>> >
>> >> 6*0:max_y
>> >
>> > probably doesn’t do what you expect since the 6 does nothing here (because 6 * 0 = 0). You could do 6 * (0:max_y), but it isn’t clear why you would want the range of the plot to be six times that of the data. Maybe you were thinking something like seq(0, max_y, length.out = 6), but that will give pretty ugly breakpoints. In any case, the plots below do a fine job of setting the axes by default, and each system allows you to tune them if you disagree with the default for a particular plot.
>> >
>> > With that much preamble, the code is now shorter than the introduction.
>> >
>> > ### Put data into a data frame -- avoid loose vectors
>> > library(dplyr); library(lubridate)
>> >
>> > # if i knew what you were using pch for, i would name group and its values to match
>> > MyData <- data_frame(
>> >   Well1 = c(0.005,0.005,0.004,0.006,0.004,0.009,0.017,0.045,0.05,0.07,0.12,0.10,0.20,0.25),
>> >   Well2 = c(0.10,0.12,0.125,0.107,0.099,0.11,0.13,0.109,0.10,0.115,0.14,0.17,0.11,NA),
>> >   dateString = c("1Jan05","1April05","1Jul05","1Oct05","1Jan06","1March06","1Jun06","2Oct06","17Oct06","1Jan07","1April07","1Jul07","1Oct07","1Dec07"),
>> >   date = dmy(dateString),
>> >   group = factor(c(0,0,15,15,0,15,15,15,15,15,15,15,15,15))
>> > )
>> >
>> > ## using lattice
>> > ## lattice makes plotting two series easy
>> > ## but doesn't make it as easy to have different symbols along the same series
>> >
>> > library(lattice)
>> > xyplot(Well1 + Well2 ~ date, data = MyData, type = c("p","l"), auto.key = TRUE)
>> > ## better legend
>> > xyplot(Well1 + Well2 ~ date, data = MyData, type = c("p","l"),
>> >        auto.key = list(points = TRUE, lines = TRUE))
>> >
>> > ## using ggplot2
>> > ## for highly customized plots, i generally find ggplot2 works better
>> > ## i would reshape the data with tidyr before plotting (could be done in lattice as well)
>> >
>> > library(ggplot2); library(tidyr)
>> >
>> > MyData2 <-
>> >   MyData %>%
>> >   gather(location, concentration, Well1, Well2)
>> >
>> > ggplot(data = MyData2, aes(x = date, y = concentration, colour = location)) +
>> >   geom_line() +
>> >   geom_point(aes(shape = group), size = 2)
>> >
>> > xyplot(concentration ~ date, data = MyData2, groups = location, type = c("p", "l"),
>> >        auto.key = TRUE)
>> >
>> > ## without reshaping, you can plot 4 layers well manually, but the default labeling isn’t as nice
>> >
>> > ggplot(data = MyData) +
>> >   geom_line(aes(x = date, y = Well1, colour = "Well1")) +
>> >   geom_line(aes(x = date, y = Well2, colour = "Well2")) +
>> >   geom_point(aes(x = date, y = Well1, colour = "Well1", shape = group)) +
>> >   geom_point(aes(x = date, y = Well2, colour = "Well2", shape = group))
>> >
>> > Happy Holidays. I hope one of these approaches will get you headed in the right direction.
>> >
>> > --rjp
>> >
>> >> On Dec 24, 2015, at 7:51 AM, Steven Stoline <[email protected]> wrote:
>> >>
>> >> Dear All:
>> >>
>> >> I am trying to plot two series in one graph, but I am having difficulty setting the y-axis limits. Also, the second series is not correctly graphed.
>> >>
>> >> *Here is what I tried to do:*
>> >>
>> >> ### Define 2 vectors
>> >>
>> >> Well1<-c(0.005,0.005,0.004,0.006,0.004,0.009,0.017,0.045,0.05,0.07,0.12,0.10,0.20,0.25)
>> >> Well2<-c(0.10,0.12,0.125,0.107,0.099,0.11,0.13,0.109,0.10,0.115,0.14,0.17,0.11)
>> >>
>> >> ### Calculate range from 0 to max value of Well1 and Well2
>> >> ### g_range <- range(0, Well1, Well2)
>> >>
>> >> max_y <- max(Well1, Well2)
>> >>
>> >> ### Graph Groundwater Concentrations using y axis that ranges from 0 to max
>> >> ### value in Well1 or Well2 vector.
>> >> ### Turn off axes and annotations (axis labels) so we can specify them ourselves
>> >>
>> >> plot(Well1, type="o", pch=c(0,0,15,15,0,15,15,15,15,15,15,15,15,15),
>> >> col="blue", ylim=c(0,max_y), axes=FALSE, ann=FALSE, lwd=3, cex=1.25) ### axes=FALSE,
>> >>
>> >> ### Make x axis using Jan 2005 - Dec 2007 labels
>> >>
>> >> axis(1, at=1:14,
>> >> lab=c("Jan05","April05","Jul05","Oct05","Jan06","March06","Jun06","2Oct06","17Oct06","Jan07","April07","Jul07","Oct07","Dec07"))
>> >>
>> >> *### Make y axis with horizontal labels. Here is where I have the major problem*
>> >>
>> >> ### I want the y-axis to look like: 0, 0.05, 0.10, 0.15, 0.20, 0.25
>> >>
>> >> axis(2, las=0, at=6*0:max_y) ### max_y
>> >>
>> >> ### Create box around plot
>> >>
>> >> box()
>> >>
>> >> ### Graph Well2 with red dashed line and square points
>> >>
>> >> ### lines(Well2, type="o", pch=22, lty=2, col="red", lwd=3, cex=1.0)
>> >>
>> >> lines(Well2, type="o", pch=c(0,15,15,15,0,15,15,15,0,15,15,15,15), lty=2,
>> >> col="red", lwd=3, cex=1.25)
>> >>
>> >> ### Create a title with a red, bold/italic font
>> >>
>> >> title(main="Trichloroethene mg/L from Wells 1 and 2 - 2005-2007",
>> >> col.main="red", font.main=2)
>> >>
>> >> ### Label the x and y axes with dark green text
>> >>
>> >> title(xlab="Time Points", col.lab=rgb(0,0.5,0))
>> >>
>> >> title(ylab="Trichloroethene mg/L", col.lab=rgb(0,0.5,0))
>> >>
>> >> ### Create a legend
>> >>
>> >> legend(1, g_range[2], c("Well1","Well2"), cex=1.0, col=c("blue","red"),
>> >> pch=15:15, lty=1:2);
>> >>
>> >> with thanks
>> >> steve
>> >> -------------------------
>> >> Steven M. Stoline
>> >> 1123 Forest Avenue
>> >> Portland, ME 04112
>> >> [email protected]
>> >>
>> >> [[alternative HTML version deleted]]
>> >>
>> >> _______________________________________________
>> >> [email protected] mailing list
>> >> https://stat.ethz.ch/mailman/listinfo/r-sig-teaching
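
On the y-axis question in the original post: in R, `:` is evaluated before `*`, so `6*0:max_y` is read as `6 * (0:max_y)`, and because max_y is 0.25 the sequence `0:max_y` contains only 0. Below is a minimal base-graphics sketch, assuming the goal is tick marks every 0.05 from 0 up to the maximum. The names `ticks_as_posted` and `y_ticks` are made up for illustration, and only Well1 is drawn.

```r
# Data from the original post (Well1 only; 14 values).
Well1 <- c(0.005, 0.005, 0.004, 0.006, 0.004, 0.009, 0.017,
           0.045, 0.05, 0.07, 0.12, 0.10, 0.20, 0.25)
max_y <- max(Well1, na.rm = TRUE)

# `:` binds more tightly than `*`, so this is 6 * (0:0.25), i.e. 6 * 0.
ticks_as_posted <- 6 * 0:max_y

# Assumed intent: evenly spaced ticks every 0.05 between 0 and max_y.
y_ticks <- seq(0, max_y, by = 0.05)

# Plot without default axes, then add the y axis at the chosen positions.
plot(Well1, type = "o", ylim = c(0, max_y), axes = FALSE, ann = FALSE)
axis(2, las = 1, at = y_ticks)  # las = 1 draws the tick labels horizontally
box()
```

If the default tick positions are acceptable, dropping `axes = FALSE` and the `axis()` call lets `plot()` choose them.
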
