Ouch. I should have known all those points, Rui: my bad. Casual behaviour while just rushing up a little example. Good to be reminded.
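For anyone reading this later, here is a minimal sketch of the cbind() coercion point, using base R only. The variable names (id, label, viaCbind, direct) are illustrative assumptions and are not from the thread.

```r
# Illustrative values only; these names are not from the thread.
id    <- 1:3
label <- c("a", "b", "c")

# cbind() builds a matrix first, and a matrix holds a single type,
# so the integer column does not survive as an integer.
viaCbind <- as.data.frame(cbind(id, label))
class(viaCbind$id)

# data.frame() keeps each column's own class.
direct <- data.frame(id, label)
class(direct$id)
```

The same reasoning should apply to as_tibble(cbind(...)) versus tibble(...), since the coercion happens inside cbind() before the tibble is built.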
group_modify() is clearly exactly what I wanted and I will experiment with it and make sure I understand it properly. I see from the help that it evolves from, or supersedes aspects of, do(), which I think must have been the function I had forgotten. Even more interestingly, I see that it seems to lead me into options and experimental developments in the tidyverse that I didn't know about.

Excellent. Perfect help ... many thanks!

Chris

----- Original Message -----
> From: "Rui Barradas" <ruipbarra...@sapo.pt>
> To: "Chris Evans" <chrish...@psyctc.org>, "R-help" <r-help@r-project.org>
> Sent: Sunday, 5 July, 2020 13:16:19
> Subject: Re: [R] Can I pass the grouped portions of a dataframe/tibble to a
> function in dplyr

> Hello,
>
> I forgot to say I redid the data set setting the RNG seed first.
>
>
> set.seed(2020)
> n <- 50
> x <- 1:n
> y <- sample(1:3, n, replace = TRUE)
> z <- rnorm(n)
> tib <- tibble(x,y,z)
>
>
> Also, don't do
>
> as_tibble(cbind(...))
> as.data.frame(cbind(...))
>
>
> If one of the variables is of a different class (for example, "character")
> all variables are coerced to the least common denominator. It's much
> better to call tibble() or data.frame() directly.
>
> Hope this helps,
>
> Rui Barradas
>
>
> At 12:04 on 05/07/2020, Rui Barradas wrote:
>> Hello,
>>
>> You can pass a grouped tibble to a function with group_modify but the
>> function must return a data.frame (or similar).
>>
>> ## this will also do it
>> #sillyFun <- function(tib){
>> #  tibble(nrow = nrow(tib), ncol = ncol(tib))
>> #}
>>
>>
>> sillyFun <- function(tib){
>>   data.frame(nrow = nrow(tib), ncol = ncol(tib))
>> }
>>
>> tib %>%
>>   group_by(y) %>%
>>   group_modify(~ sillyFun(.))
>> ## A tibble: 3 x 3
>> ## Groups:   y [3]
>> #      y  nrow  ncol
>> #  <dbl> <int> <int>
>> #1     1    17     2
>> #2     2    21     2
>> #3     3    12     2
>>
>>
>> Hope this helps,
>>
>> Rui Barradas
>>
>> At 09:43 on 05/07/2020, Chris Evans wrote:
>>> Apologies if this is a stupid question but searching keeps turning up
>>> things I know and don't need.
>>>
>>> What I want to do is to use the group_by() power of dplyr to run
>>> functions that expect a dataframe/tibble per group but I can't see how
>>> to do it. Here is a reproducible example.
>>>
>>> library(tidyverse)  # tibble, dplyr and tidyr are all used below
>>>
>>> ### create trivial tibble
>>> n <- 50
>>> x <- 1:n
>>> y <- sample(1:3, n, replace = TRUE)
>>> z <- rnorm(n)
>>> tib <- as_tibble(cbind(x,y,z))
>>>
>>> ### create trivial function that expects a tibble/data frame
>>> sillyFun <- function(tib){
>>>   return(list(nrow = nrow(tib),
>>>               ncol = ncol(tib)))
>>> }
>>>
>>> ### works fine on the whole tibble
>>> tib %>%
>>>   summarise(dim = list(sillyFun(.))) %>%
>>>   unnest_wider(dim)
>>>
>>> That gives me:
>>> # A tibble: 1 x 2
>>>    nrow  ncol
>>>   <int> <int>
>>> 1    50     3
>>>
>>>
>>> ### So I try the following hoping to apply the function to the grouped
>>> tibble
>>> tib %>%
>>>   group_by(y) %>%
>>>   summarise(dim = list(sillyFun(.))) %>%
>>>   unnest_wider(dim)
>>>
>>> ### But that gives me:
>>> # A tibble: 3 x 3
>>>       y  nrow  ncol
>>>   <dbl> <int> <int>
>>> 1     1    50     3
>>> 2     2    50     3
>>> 3     3    50     3
>>>
>>> Clearly "." is still passing the whole tibble, not the grouped
>>> subsets. What I can't find is whether there is an alternative to "."
>>> that would pass just the grouped subset of the tibble.
>>>
>>> I have bodged my way around this by writing a function that takes
>>> individual columns and reassembles them into the data frame that the
>>> actual functions I need to use require. But that takes me back to a lot
>>> of clumsiness, both in selecting the variables to pass in the dplyr call
>>> to the function and in putting the reassemble-to-data-frame bit in the
>>> function I call. (The functions I really need are reliability
>>> explorations and can be called on whole dataframes.)
>>>
>>> I know I can do this using base R split and lapply but I feel sure it
>>> must be possible to do this within dplyr/tidyverse. I'm slowly
>>> transferring most of my code to the tidyverse and hitting frustrations,
>>> but also finding that it does really help me program more sensibly,
>>> handle relational data structures more easily, and write code that I
>>> seem better at reading when I come back to it after months on other
>>> things, so I am slowly trying to move all my coding to tidyverse. If I
>>> could see how to do this, it would help.
>>>
>>> Very sorry if the answer should be blindingly obvious to me. I'd also
>>> love to have pointers to guidance on the tidyverse written for people
>>> who aren't professional coders or statisticians and that goes a bit
>>> beyond the obvious basics of tidyverse into issues like this.
>>>
>>> TIA,
>>>
>>> Chris

--
Small contribution in our coronavirus rigours:
https://www.coresystemtrust.org.uk/home/free-options-to-replace-paper-core-forms-during-the-coronavirus-pandemic/
Chris Evans <ch...@psyctc.org>
Visiting Professor, University of Sheffield <chris.ev...@sheffield.ac.uk>
I do some consultation work for the University of Roehampton <chris.ev...@roehampton.ac.uk> and other places
but <ch...@psyctc.org> remains my main Email address.