Re: [R] Obtaining a value of pi in a zero inflated model (fm-zinb2)
Are you referring to the zeroinfl() function in the countreg package? If so, I think predict(fm_zinb2, type = "zero", newdata = some.new.data) will give you pi for each combination of covariate values that you provide in some.new.data, where pi is the probability of observing a zero from the point-mass component. As to your second question, I'm not sure that's possible for any *particular, individual* subject. Others will undoubtedly know better than I.

--Chris Ryan

Sorkin, John wrote:
> I am running a zero inflated regression using the zeroinfl function similar to the model below:
>
> fm_zinb2 <- zeroinfl(art ~ . | ., data = bioChemists, dist = "poisson")
> summary(fm_zinb2)
>
> I have three questions:
>
> 1) How can I obtain a value for the parameter pi, which is the fraction of the population that is in the zero inflated model vs the fraction in the count model?
>
> 2) For any particular subject, how can I determine if the subject is in the portion of the population that contributes a zero count because the subject is in the group of subjects who have structural zero responses vs. the subject being in the portion of the population who can contribute a zero or a non-zero response?
>
> 3) Zero inflated models can be solved using closed form solutions, or using iterative methods. Which method is used by fm_zinb2?
>
> Thank you,
> John
>
> John David Sorkin M.D., Ph.D.
> Professor of Medicine, University of Maryland School of Medicine;
> Associate Director for Biostatistics and Informatics, Baltimore VA Medical Center Geriatrics Research, Education, and Clinical Center;
> PI Biostatistics and Informatics Core, University of Maryland School of Medicine Claude D. Pepper Older Americans Independence Center;
> Senior Statistician, University of Maryland Center for Vascular Research;
> Division of Gerontology and Palliative Care
> 10 North Greene Street
> GRECC (BT/18/GR)
> Baltimore, MD 21201-1524
> Cell phone 443-418-5382

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
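[Editor's note: for question 2, one standard approach (not stated in the thread, so treat this as a sketch): for a subject with an *observed* zero, you can compute the posterior probability that it came from the structural-zero component by Bayes' rule, from pi (what predict(..., type = "zero") returns) and the count-component mean lambda (what predict(..., type = "count") returns, assuming the pscl/countreg predict methods). The numeric values below are invented for illustration.]

```r
# Sketch: posterior probability that an observed zero is a structural zero.
# pi and lambda would come from the fitted model, e.g. (assuming the
# pscl/countreg predict() methods):
#   pi     <- predict(fm_zinb2, type = "zero",  newdata = some.new.data)
#   lambda <- predict(fm_zinb2, type = "count", newdata = some.new.data)
# Illustrative values only:
pi     <- 0.30   # P(structural zero) for this covariate pattern
lambda <- 2.0    # Poisson mean of the count component for this pattern

# By Bayes' rule, for a subject whose observed count is 0:
#   P(structural | y = 0) = pi / (pi + (1 - pi) * P(Y = 0 | lambda))
p_structural <- pi / (pi + (1 - pi) * dpois(0, lambda))
round(p_structural, 3)  # -> 0.76 with these illustrative values
```

Note this yields a posterior probability, not a deterministic classification, which is consistent with Chris's caution that membership cannot be determined with certainty for any particular subject.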
Re: [R] anyone having trouble accessing CRAN?
Ah, thanks all. Guess I missed the message before they started the maintenance.

--Chris

Ivan Krylov wrote:
> On Wed, 15 Nov 2023 14:13:00 -0500
> "Christopher W. Ryan via R-help" wrote:
>
>> Anyone seeing similar?
>
> Same for me.
>
> While it worked, CRAN website had the following message:
>
>>> The CRAN Admin Team will perform system upgrades during the period
>>> Wednesday November 15 until Thursday November 16, 2023. There will
>>> be intermittent outages in service during this time.
>
> Use chooseCRANmirror(local.only = TRUE) (or subset() the return value
> of getCRANmirrors(local.only = TRUE)) to access a mirror that works.
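[Editor's note: Ivan's suggestion, spelled out as a non-interactive sketch. The country filter and the choice of the first matching mirror are invented for illustration; getCRANmirrors(local.only = TRUE) reads the mirror table shipped with R, so no network is needed.]

```r
# List the known CRAN mirrors from the copy shipped with R (no network needed)
mirrors <- getCRANmirrors(local.only = TRUE)

# Pick out, say, the mirrors in a particular country (illustrative filter)
us_mirrors <- subset(mirrors, Country == "USA")

# Point this session at one of them, if any were found
if (nrow(us_mirrors) > 0) {
  options(repos = c(CRAN = us_mirrors$URL[1]))
}
getOption("repos")
```

Interactively, chooseCRANmirror(local.only = TRUE) presents the same table as a menu.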
[R] anyone having trouble accessing CRAN?
At https://cran.r-project.org/ I get this error message:

=
Secure Connection Failed
An error occurred during a connection to cran.r-project.org. PR_END_OF_FILE_ERROR
Error code: PR_END_OF_FILE_ERROR
The page you are trying to view cannot be shown because the authenticity of the received data could not be verified.
===

Three different browsers, two different devices, two different networks. (The text of the error messages varies.) Anyone seeing similar? Thanks.

--Chris Ryan
Re: [R] make a lattice dotplot with symbol size proportional to a variable in the plotted dataframe
Very helpful, Deepayan, and educational. Thank you. What does NSE stand for?

Thanks,
Chris

Deepayan Sarkar wrote:
>
> --Chris Ryan
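[Editor's note: Deepayan's reply was not preserved in this archive; NSE stands for non-standard evaluation, the way lattice evaluates formula and panel arguments. One way to make symbol size proportional to a variable in a lattice dotplot is the `subscripts` mechanism, sketched below; the data frame and the size mapping are invented for illustration.]

```r
library(lattice)

# Invented example data: a score per group, plus a sample size n
# that we want to encode as symbol size
d <- data.frame(
  group = c("A", "B", "C", "D"),
  score = c(2.1, 3.5, 2.8, 4.0),
  n     = c(10, 40, 25, 80)
)

# subscripts gives the row indices of the original data frame that the
# panel is drawing, so we can look up n and map it to cex
p <- dotplot(group ~ score, data = d,
             panel = function(x, y, ..., subscripts) {
               panel.dotplot(x, y, ...,
                             cex = sqrt(d$n[subscripts] / 10))
             })
p  # printing the trellis object draws the plot
```

Scaling by sqrt() keeps symbol *area* roughly proportional to n, which is usually what the eye reads.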
Re: [R] Sum data according to date in sequence
date appears to be a character variable, and R is treating it as such. str(dt1) might give you some insight, or the dplyr equivalent glimpse(dt1). I think R did what you asked, but if you want to be able to order records by date, in temporal order, you need to tell R that it is a date:

library(dplyr)
sums <- dt1 %>%
  mutate(realDate = as.Date(date, format = "%m/%d/%Y")) %>%
  group_by(realDate) %>%
  summarise(EnergykWh = sum(EnergykWh))

--Chris Ryan

roslinazairimah zakaria wrote:
> Dear all,
>
> I have this set of data. I would like to sum the EnergykWh according to date sequence.
>
>> head(dt1, 20)
>    StationName                  date      time  EnergykWh
> 1  PALO ALTO CA / CAMBRIDGE #1  1/14/2016 12:09  4.680496
> 2  PALO ALTO CA / CAMBRIDGE #1  1/14/2016 19:50  6.272414
> 3  PALO ALTO CA / CAMBRIDGE #1  1/14/2016 20:22  1.032782
> 4  PALO ALTO CA / CAMBRIDGE #1  1/15/2016  8:25 11.004884
> 5  PALO ALTO CA / CAMBRIDGE #1  1/15/2016 14:23 10.096824
> 6  PALO ALTO CA / CAMBRIDGE #1  1/15/2016 18:17  6.658797
> 7  PALO ALTO CA / CAMBRIDGE #1  1/15/2016 21:46  4.808874
> 8  PALO ALTO CA / CAMBRIDGE #1  1/16/2016 10:19  1.469384
> 9  PALO ALTO CA / CAMBRIDGE #1  1/16/2016 12:12  2.996239
> 10 PALO ALTO CA / CAMBRIDGE #1  1/16/2016 14:12  0.303222
> 11 PALO ALTO CA / CAMBRIDGE #1  1/16/2016 16:22  4.988339
> 12 PALO ALTO CA / CAMBRIDGE #1  1/16/2016 19:16  8.131804
> 13 PALO ALTO CA / CAMBRIDGE #1  1/16/2016 19:19  0.117156
> 14 PALO ALTO CA / CAMBRIDGE #1  1/16/2016 20:24  3.285669
> 15 PALO ALTO CA / CAMBRIDGE #1  1/17/2016  9:54  1.175608
> 16 PALO ALTO CA / CAMBRIDGE #1  1/17/2016 12:16  3.677487
> 17 PALO ALTO CA / CAMBRIDGE #1  1/17/2016 13:53  1.068393
> 18 PALO ALTO CA / CAMBRIDGE #1  1/17/2016 19:03  8.820755
> 19 PALO ALTO CA / CAMBRIDGE #1  1/17/2016 22:00  8.138583
> 20 PALO ALTO CA / CAMBRIDGE #1  1/18/2016  8:58  9.057500
>
> I have tried this:
> library(dplyr)
> sums <- dt1 %>%
>   group_by(date) %>%
>   summarise(EnergykWh = sum(EnergykWh))
>
> head(sums, 20)
>
> The dates are not in daily sequence but in year sequence.
>
>> head(sums, 20)
> # A tibble: 20 x 2
>    date      EnergykWh
> 1  1/1/2017     25.3
> 2  1/1/2018     61.0
> 3  1/1/2019      0.627
> 4  1/1/2020     10.7
> 5  1/10/2017    69.4
> 6  1/10/2018    54.5
> 7  1/10/2019    49.1
> 8  1/10/2020    45.9
> 9  1/11/2017    73.9
> 10 1/11/2018    53.3
> 11 1/11/2019    93.5
> 12 1/11/2020    66.7
> 13 1/12/2017    78.6
> 14 1/12/2018    42.2
> 15 1/12/2019    22.7
> 16 1/12/2020    80.9
> 17 1/13/2017    85.6
> 18 1/13/2018    46.4
> 19 1/13/2019    40.0
> 20 1/13/2020   121.
>
> Thank you very much for any help given.
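[Editor's note: a self-contained sketch of the fix, with a few invented rows standing in for dt1. Base R's aggregate() is used here so the example runs without dplyr installed; it is the analogue of the group_by() %>% summarise() pipeline in Chris's reply.]

```r
# A few invented rows standing in for dt1
dt1 <- data.frame(
  date      = c("1/14/2016", "1/14/2016", "1/15/2016", "1/2/2016"),
  EnergykWh = c(4.68, 6.27, 11.00, 3.00)
)

# Convert the character column to Date so ordering is temporal,
# not alphabetical
dt1$realDate <- as.Date(dt1$date, format = "%m/%d/%Y")

# Sum per day; aggregate() is the base-R analogue of
# group_by() %>% summarise()
sums <- aggregate(EnergykWh ~ realDate, data = dt1, FUN = sum)
sums <- sums[order(sums$realDate), ]
sums  # Jan 2 now sorts before Jan 14, as it should
```

With the character column, "1/2/2016" sorts after "1/14/2016"; with a real Date it sorts first, which was the original complaint.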
Re: [R] Amazing AI
In clinical medicine, the question the patient asks rarely represents their main concern. Most of what I've done in my career, and most of what I've taught, is about how to have the back-and-forth dynamic dialog with the patient, to help them formulate what's really on their mind, and make sure I understand it before proceeding. Seems to me that statistical consulting, or working IT in an organization where one is serving "internal" customers, is similar. Students need to learn the skills of clarifying, verifying, and paraphrasing a client's needs. Sure, ChatGPT may be able to generate decent code to do what a client *says* they want to do, but is that *really* what they are looking for? So don't retire yet :)

--Chris Ryan

Boris Steipe wrote:
> Technically not a help question. But crucial to be aware of, especially for those of us in academia, or otherwise teaching R. I am not aware of a suitable alternate forum. If this does not interest you, please simply ignore - I already know that this may be somewhat OT.
>
> Thanks.
> --
>
> You very likely have heard of ChatGPT, the conversation interface on top of the GPT-3 large language model, and that it can generate code. I thought it doesn't do R - I was wrong. Here is a little experiment:
>
> MY PROMPT:
> ==
> Please write me some R code that will read a tsv file, remove all rows that have duplicates in the column "names", and store the median values of the remaining rows in a variable called "Medians". Of course, that calculation should only be done with columns that are numeric. Print the last variable in order from largest to smallest.
>
> AI RESPONSE:
>
> Here is some R code that should accomplish what you've described:
>
> (Note: the code is formatted in a little terminal window, complete with a button to "copy code" at the top.)
> # Read the tsv file
> data <- read.table("file.tsv", header = TRUE, sep = "\t")
>
> # Remove rows with duplicates in the "names" column
> data <- data[!duplicated(data$names), ]
>
> # Select only the numeric columns
> num_columns <- sapply(data, is.numeric)
> numeric_data <- data[, num_columns]
>
> # Calculate the medians for each column
> Medians <- apply(numeric_data, 1, median)
>
> # Sort Medians in descending order
> Medians <- sort(Medians, decreasing = TRUE)
>
> # Print Medians
> print(Medians)
>
> This code assumes that the tsv file is called "file.tsv" and is stored in the current working directory. It also assumes that the first row of the file contains the column names.
>
> MY PROMPT:
> ==
> Can you provide me with a bit of sample data that I can use to test this code?
>
> AI RESPONSE:
>
> Certainly! Here is some sample data that you can use to test the code I provided:
>
> name    age  height  weight
> Alice    23     5.6     120
> Bob      32     6.2     180
> Charlie  29     5.9     160
> Alice    23     5.6     120
> David    25     5.8     140
> Eve      27     5.5     110
>
> You can save this data as a tsv file and use it to test the code.
>
> --
>
> Notes:
> 1) The code, as written, does not work with the test data. The reason is that the test data specifies a column "name", but the code assumes "names" (which I specified in the instructions). Once this is fixed in the test data, the code executes.
>
> 2) The instructions are not accurately translated: I had asked for row medians. But changing the margin in the apply() statement from 2 to 1 does the correct thing.
>
> 3) Note that the test data contains both numeric and non-numeric columns. Also, the name "Alice" is duplicated in the test data, which is subtle, and the right thing to do.
>
> --
>
> A similar experiment, run on the beta site of OpenAI's Codex code-completion tool, gives the following code (complete with the comments):
>
> # Read in the data
> data <- read.table("mydata.tsv", sep = "\t", header = TRUE)
>
> # Remove all rows that have duplicates in the column "names"
> unique_names <- unique(data$names)
> data_unique <- data[data$names %in% unique_names,]
>
> # Calculate medians
> Medians <- sapply(data_unique[, sapply(data_unique, is.numeric)], median, na.rm = TRUE)
>
> # Print last variable in order from largest to smallest
> rev(sort(Medians[length(Medians)]))
>
> Note that the strategy is quite different (e.g. using %in%, not duplicated()), and the interpretation of "last variable" is technically correct but not what I had in mind (ChatGPT got that right though).
>
> Changing my prompts slightly resulted in it going for a dplyr solution instead, complete with %>% idioms etc ... again, syntactically correct but not giving me the fully correct results.
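[Editor's note: for comparison, a human-written sketch of one reading of the task (column medians of the numeric columns after dropping duplicated names, keeping the first copy, as the AI code does), sorted largest to smallest. The AI's sample data is inlined here instead of a tsv file so the example is self-contained, with the column deliberately named "names" to match the prompt.]

```r
# The AI's sample data, inlined; the duplicated "Alice" row is kept once
data <- data.frame(
  names  = c("Alice", "Bob", "Charlie", "Alice", "David", "Eve"),
  age    = c(23, 32, 29, 23, 25, 27),
  height = c(5.6, 6.2, 5.9, 5.6, 5.8, 5.5),
  weight = c(120, 180, 160, 120, 140, 110)
)

# Drop rows whose "names" value has appeared before
data <- data[!duplicated(data$names), ]

# Column medians of the numeric columns only
Medians <- sapply(data[sapply(data, is.numeric)], median)

# Largest to smallest
Medians <- sort(Medians, decreasing = TRUE)
Medians  # -> weight = 140, age = 27, height = 5.8
```

Note the prompt is genuinely ambiguous ("remove all rows that have duplicates" could also mean dropping every copy), which is part of Boris's point about translation fidelity.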
Re: [R] Circular Graph Recommendation Request
If the units of analysis are real spatial regions (e.g. states), how about a cartogram? https://gisgeography.com/cartogram-maps/

An R package (I have no experience with it): https://cran.r-project.org/web/packages/cartogram/index.html

The advantage of a cartogram is that it is a single graphic, rather than two like the original post referenced. No need to move the eye back and forth to decode the colors. And it maintains---as much as possible given the distortion, which is the whole point of a cartogram---the relative spatial positions of the areal units (in this case, states). The round figure in the original post has the northern midwestern region in the 7:00 to 8:00-ish position, what might be considered notionally the "southwest." A little counterintuitive.

--Chris Ryan

Bert Gunter wrote:
> Very nice plot. Thanks for sharing.
> Can't help directly, but as the plot is sort of a map with polygonal areas encoding the value of a variable, you might try posting on r-sig-geo instead, where there might be more relevant expertise in such things -- or perhaps suggestions for alternative visualizations that work similarly.
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)
>
> On Sat, May 28, 2022 at 8:39 AM Stephen H. Dawson, DSL via R-help wrote:
>>
>> https://www.visualcapitalist.com/us-goods-exports-by-state/
>> Visualizing U.S. Exports by State
>>
>> Good Morning,
>>
>> https://www.visualcapitalist.com/wp-content/uploads/2022/05/us-exports-by-state-infographic.jpg
>>
>> Saw an impressive graph today. Sharing with the list.
>>
>> The size proportionality of the state segments in a circle graph is catchy.
>>
>> QUESTION
>> Is there a package one could use with R to accomplish this particular circular-style graph?
>>
>> Kindest Regards,
>> --
>> *Stephen Dawson, DSL*
>> /Executive Strategy Consultant/
>> Business & Technology
>> +1 (865) 804-3454
>> http://www.shdawson.com
Re: [R] [External Email] Re: how to rename variables by lopping off first 3 characters
Thanks! This has gotten me on the right track. One of my mistakes was that I was explicitly using the word select, as in rename_at(select(starts_with or rename_at(select(vars(starts_with

--Chris Ryan

On Mon, Mar 14, 2022 at 1:11 PM Bert Gunter wrote:
> If you care to entertain this, one of many simple base R ways to do this is:
>
> dat <- data.frame(
>   i..One = 1:3,
>   i..Two = letters[1:3],
>   ixx = 5:7)
>
> dat
>   i..One i..Two ixx
> 1      1      a   5
> 2      2      b   6
> 3      3      c   7
>
> nm <- names(dat)
> nm <- ifelse(substring(nm, 1, 3) == "i..",
>              substring(nm, 4),
>              nm)
> names(dat) <- nm
>
> dat
>   One Two ixx
> 1   1   a   5
> 2   2   b   6
> 3   3   c   7
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)
>
> On Mon, Mar 14, 2022 at 9:27 AM Christopher W Ryan via R-help wrote:
> >
> > I have data coming to me from another source, in which some of the variable names begin with "i.."
> >
> > As in "i..actual_meaningful_var_name"
> >
> > I would like to remove the first three characters from any variable name if they are "i.."
> >
> > I'm using R on Win 10 and dplyr, so ideally I'm looking for a dplyr solution. Apparently I'm just not understanding how the various select, contains, rename_at, rename_with, and so-on dplyr expressions work. I've tried various arrangements of them, usually resulting in
> >
> > Error: `contains()` must be used within a *selecting* function.
> > i See <https://tidyselect.r-lib.org/reference/faq-selection-context.html>
> >
> > A simple select(contains(foo)) I can do fine, to select a subset of variables. It's combining it with renaming that I am struggling with.
> >
> > Grateful for any advice.
> >
> > Thanks.
> > --Chris Ryan
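[Editor's note: the dplyr idiom Chris was reaching for is rename_with(), which takes a renaming function plus a tidyselect specification directly, with no select() wrapper. A sketch using Bert's example data; the base-R sub() line below does the same rename and is what actually runs here, so the example works without dplyr installed.]

```r
# Bert's example data
dat <- data.frame(
  i..One = 1:3,
  i..Two = letters[1:3],
  ixx    = 5:7
)

# The dplyr idiom (shown as a comment; rename_with() takes a function
# and a tidyselect specification, no select() wrapper):
#   library(dplyr)
#   dat <- rename_with(dat, ~ sub("^i\\.\\.", "", .x), starts_with("i.."))

# The same rename in base R, so this sketch runs as-is; the escaped dots
# mean "ixx" (no literal dots) is left alone
names(dat) <- sub("^i\\.\\.", "", names(dat))
names(dat)  # "One" "Two" "ixx"
```

Anchoring the pattern with "^" and escaping the dots restricts the change to names that literally begin with "i..".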
[R] the opposite of pluck() in purrr
I've just learned about pluck() and chuck() in the purrr package. Very cool! As I understand it, they both will return one element of a list, either by name or by [[]] index, or even "first" or "last". I was hoping to find a way to return all *but* one specified element of a list. Speaking loosely, pluck(-1) or pluck(!1) or !pluck(1), but none of those, of course, work. Thinking of English language, I had hopes for chuck(1), as in "chuck element 1 away, leaving the rest", but that's not how it works. Any tidyverse-centric ways to return all except one specified element of a list? Thanks.

--Chris Ryan
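[Editor's note: a sketch of the usual ways to drop one element of a list. The base-R lines are what run here; the purrr calls are shown as comments because, if I recall correctly, discard_at() only appeared in purrr 1.0.0, so treat the version claim as an assumption.]

```r
x <- list(a = 1, b = 2, c = 3)

# Base R: negative single-bracket indexing drops by position...
x[-1]                      # list(b = 2, c = 3)

# ...and name subsetting drops by name
x[setdiff(names(x), "b")]  # list(a = 1, c = 3)

# Tidyverse-flavoured equivalents in more recent purrr (>= 1.0.0,
# if memory serves) -- shown as comments:
#   purrr::discard_at(x, "b")                 # drop element "b" by name
#   purrr::list_modify(x, b = purrr::zap())   # remove element b
```

Note that single brackets (not the [[]] that pluck() mimics) are what return "the list minus one element" rather than the element itself.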
Re: [R] [External Email] Re: how to make the far right section of a smoother line look different from the rest of that line?
Thanks Bert, that looks promising. panel.smoother() is from latticeExtra: https://rdrr.io/cran/latticeExtra/man/panel.smoother.html

--Chris Ryan

On Tue, Nov 16, 2021 at 2:08 PM Bert Gunter wrote:
> Where did you get panel.smoother()? There is no such panel function in lattice.
>
> Is something like this what you want?
>
> x <- 1:100
> y <- rnorm(100, mean = 5)
> end <- 91  # tentative smooth after this
> xyplot(y ~ x, cutoff = end, col1 = "black", col2 = "red",
>        panel = function(x, y, col1, col2, cutoff) {
>          sqleft <- seq_len(cutoff)
>          sqright <- seq.int(cutoff + 1, length(x))
>          col <- rep(c(col1, col2), times = c(cutoff, length(x) - cutoff))
>          panel.points(x, y, col = col)
>          ylo <- predict(loess(y ~ x))
>          panel.lines(x[sqleft], ylo[sqleft], col = col1, lwd = 2)
>          panel.lines(x[sqright], ylo[sqright], col = col2, lwd = 2, lty = "dotted")
>        })
>
> Notes:
> 1. This works because of loess's default of predicting at the given x's. Modify as required if you change to another smoother or wish to use different points at which to plot the smoother.
> 2. This can almost certainly be done by creating a grouping variable to separate the two plotting regimes, and might be slicker and more robust with that approach.
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)
>
> On Tue, Nov 16, 2021 at 7:45 AM Christopher W Ryan via R-help <r-help@r-project.org> wrote:
>
>> eclrs.3 %>%
>>   mutate(start.week = floor_date(realCollectionDate, unit = "week")) %>%
>>   group_by(start.week, k12) %>%
>>   summarise(n = n(), pctpos = 100 * mean(realResult)) %>%
>>   xyplot(pctpos ~ start.week | k12, col = "red", data = ., layout = c(1,2),
>>          ylab = "percent of test results positive",
>>          xlab = "specimen collection date",
>>          strip = strip.custom(strip.names = c(TRUE, TRUE)),
>>          sub = "The final week shown may not yet be complete so is likely inaccurate",
>>          panel = function(...){
>>            panel.xyplot(..., type = "p", cex = 0.8)
>>            panel.smoother(..., span = 0.3, col.se = "red", alpha.se = 0.08, lwd = 2)})
>>
>> The above takes patient-level data, each record containing a date, and aggregates them by week according to that date, then plots 2 weekly time series, one for k12 and one for not-k12, each with a smoother. Note my disclaimer in the subtitle that "the final week shown may not yet be complete . . . ." since I might run this on any arbitrary day. How might I change the appearance of the smoother lines to emphasize to the viewer that the recent trends are tentative due to daily data still coming in? For example, how might I make the far right-end stretch of the lines, representing the most recent week or two, dotted?
>>
>> Thanks.
>>
>> --Chris Ryan
[R] how to make the far right section of a smoother line look different from the rest of that line?
eclrs.3 %>%
  mutate(start.week = floor_date(realCollectionDate, unit = "week")) %>%
  group_by(start.week, k12) %>%
  summarise(n = n(), pctpos = 100 * mean(realResult)) %>%
  xyplot(pctpos ~ start.week | k12, col = "red", data = ., layout = c(1,2),
         ylab = "percent of test results positive",
         xlab = "specimen collection date",
         strip = strip.custom(strip.names = c(TRUE, TRUE)),
         sub = "The final week shown may not yet be complete so is likely inaccurate",
         panel = function(...){
           panel.xyplot(..., type = "p", cex = 0.8)
           panel.smoother(..., span = 0.3, col.se = "red", alpha.se = 0.08, lwd = 2)})

The above takes patient-level data, each record containing a date, and aggregates them by week according to that date, then plots 2 weekly time series, one for k12 and one for not-k12, each with a smoother. Note my disclaimer in the subtitle that "the final week shown may not yet be complete . . . ." since I might run this on any arbitrary day. How might I change the appearance of the smoother lines to emphasize to the viewer that the recent trends are tentative due to daily data still coming in? For example, how might I make the far right-end stretch of the lines, representing the most recent week or two, dotted?

Thanks.

--Chris Ryan
Re: [R] [External Email] Live Online Training for High School Teachers and Students
Tracy--

I enjoy doing this sort of thing. Over the years I've done two full-day "introduction to R" workshops for high school students. The workshops also inevitably get into software-agnostic, basic issues about how to think about data, and how to measure, record, and store it---which is all pretty cool. They were in-person, pre-pandemic workshops, but I believe could be adapted to a remote, online approach. Feel free to email me.

If you don't already know about it, you might also be interested in the R-sig-teaching list, here: https://stat.ethz.ch/mailman/listinfo/r-sig-teaching

--Chris Ryan

On Thu, Nov 4, 2021 at 3:06 AM Tracy Lenz wrote:
> Hi,
>
> I am looking for live training that can be conducted via Zoom or another online platform to assist high school teachers and students who are working with R. These teachers and students are using R at a very basic level. They've reviewed a variety of beginner-level texts and videos on R, but they continue to encounter issues that could be resolved in a session with someone who is more familiar with R. I'm not looking for a long-term solution such as a Code Academy course; rather, this session would be intended as a brief beginner's introduction to R as well as a Q&A for specific use cases and troubleshooting. I've searched online for such offerings but have not found anything. If anyone has any advice, I'd appreciate it. Thanks!
>
> Tracy Lenz
Re: [R] [External Email] Group by and add a constant value based on a condition dply
Is the grouping beforehand necessary? Could you simply, "for all the dates that are '4.01.2020' and have the 'Value' greater than zero, add 5 to the 'Value'"? I may be missing something.

--Chris Ryan

On Wed, May 26, 2021 at 11:53 AM Elahe chalabi via R-help <r-help@r-project.org> wrote:
> Hi everyone,
>
> I have the following dataframe:
>
> structure(list(Department = c("A", "A", "A", "A", "A", "A", "A",
> "A", "B", "B", "B", "B", "B", "B", "B", "B"), Class = c(1L, 1L,
> 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L), Value = c(0L,
> 100L, 800L, 800L, 0L, 300L, 1200L, 0L, 0L, 0L, 400L, 400L, 200L,
> 800L, 1200L, 0L), Date = c("1.01.2020", "2.01.2020", "3.01.2020",
> "4.01.2020", "1.01.2020", "2.01.2020", "3.01.2020", "4.01.2020",
> "1.01.2020", "2.01.2020", "3.01.2020", "4.01.2020", "1.01.2020",
> "2.01.2020", "3.01.2020", "4.01.2020")), class = "data.frame",
> row.names = c(NA, -16L))
>
> Using dplyr, I need to group by "Department" and "Class" and then, for all the dates that are "4.01.2020" and have the "Value" greater than zero, add 5 to the "Value", meaning the desired dataframe will be (NewValue column):
>
> structure(list(Department = c("A", "A", "A", "A", "A", "A", "A",
> "A", "B", "B", "B", "B", "B", "B", "B", "B"), Class = c(1L, 1L,
> 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L), Value = c(0L,
> 100L, 800L, 800L, 0L, 300L, 1200L, 0L, 0L, 0L, 400L, 400L, 200L,
> 800L, 1200L, 0L), Date = c("1.01.2020", "2.01.2020", "3.01.2020",
> "4.01.2020", "1.01.2020", "2.01.2020", "3.01.2020", "4.01.2020",
> "1.01.2020", "2.01.2020", "3.01.2020", "4.01.2020", "1.01.2020",
> "2.01.2020", "3.01.2020", "4.01.2020"), NewValue = c(0L, 100L,
> 800L, 805L, 0L, 300L, 1200L, 0L, 0L, 0L, 400L, 405L, 200L, 800L,
> 1200L, 0L)), class = "data.frame", row.names = c(NA, -16L))
>
> Thanks a lot for any help!
> Elahe
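[Editor's note: Chris's point made concrete. The condition is row-wise, so no group_by() is needed; a plain conditional reproduces the desired NewValue column. Base R's ifelse() is used here so the example runs without dplyr; the dplyr equivalent would be mutate(NewValue = if_else(Date == "4.01.2020" & Value > 0, Value + 5L, Value)).]

```r
# The data frame from the original post
df <- structure(list(
  Department = c("A","A","A","A","A","A","A","A",
                 "B","B","B","B","B","B","B","B"),
  Class = c(1L,1L,1L,1L,2L,2L,2L,2L,1L,1L,1L,1L,2L,2L,2L,2L),
  Value = c(0L,100L,800L,800L,0L,300L,1200L,0L,
            0L,0L,400L,400L,200L,800L,1200L,0L),
  Date = rep(c("1.01.2020","2.01.2020","3.01.2020","4.01.2020"), 4)),
  class = "data.frame", row.names = c(NA, -16L))

# Add 5 wherever Date is "4.01.2020" and Value > 0 -- the test is
# evaluated row by row, so grouping by Department and Class changes nothing
df$NewValue <- ifelse(df$Date == "4.01.2020" & df$Value > 0,
                      df$Value + 5L, df$Value)

df$NewValue
# 0 100 800 805 0 300 1200 0 0 0 400 405 200 800 1200 0
```

This matches the desired output in the post (rows 4 and 12 become 805 and 405; the zero-valued "4.01.2020" rows stay 0).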