Re: [R] combine filter() and select()
On Wed, Aug 19, 2020 at 10:03 AM Ivan Calandra wrote:
>
> Dear useRs,
>
> I'm new to the tidyverse world and I need some help on basic things.
>
> I have the following tibble:
> mytbl <- structure(list(files = c("a", "b", "c", "d", "e", "f"), prop =
> 1:6), row.names = c(NA, -6L), class = c("tbl_df", "tbl", "data.frame"))
>
> I want to subset the rows with "a" in the column "files", and keep only
> that column.
>
> So I did:
> myfile <- mytbl %>%
>   filter(grepl("a", files)) %>%
>   select(files)
>
> It works, but I believe there must be an easier way to combine filter()
> and select(), right?

Not in the tidyverse. As others have mentioned, both [ and subset() in
base R allow you to simultaneously subset rows and columns, but there's
no single verb in the tidyverse that does both. This is somewhat
informed by the observation that in data frames, unlike matrices, rows
and columns are not exchangeable, and you typically want to express
subsetting in rather different ways.

Hadley

--
http://hadley.nz

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
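For readers who have not seen the two base R forms mentioned in the reply above, here is a minimal sketch. It is only an illustration: a plain data.frame with the same values stands in for the tibble so that no packages are needed, and the names df, by_bracket and by_subset are made up for this example.

```r
# Assumption: a plain data.frame stands in for the tibble from the original post.
df <- data.frame(files = c("a", "b", "c", "d", "e", "f"),
                 prop = 1:6,
                 stringsAsFactors = FALSE)

# `[` takes a row condition and a column selection in a single call.
# drop = FALSE keeps the result a data frame even though only one column is kept.
by_bracket <- df[grepl("a", df$files), "files", drop = FALSE]

# subset() does the same; `files` is looked up inside the data frame.
by_subset <- subset(df, grepl("a", files), select = files)

print(by_bracket)
print(by_subset)
```

Both calls pick the rows and the column in one step, which is the combination that the filter() and select() pipeline spells out as two verbs.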
Re: [R] combine filter() and select()
A kind of hybrid answer is to use base::subset(), which supports
non-standard evaluation (it searches for unquoted symbols like 'files'
in the code line below in the object that is its first argument; %>%
puts 'mytbl' in that first position) and row (filter) and column
(select) subsets

> mytbl %>% subset(files %in% "a", files)
# A tibble: 1 x 1
  files
1 a

Or subset(grepl("a", files), files) if that was what you meant.

One important idea that the tidyverse implements is, in my opinion,
'endomorphism' -- you get back the same type of object as you put in --
so I wouldn't use a base R idiom that returned a vector unless that were
somehow essential for the next step in the analysis.

There is value in having separate functions for filter() and select(),
and probably there are edge cases where filter(), select(), and subset()
behave differently, but for what it's worth subset() can be used to
perform these operations individually

> mytbl %>% subset(, files)
# A tibble: 6 x 1
  files
1 a
2 b
3 c
4 d
5 e
6 f
> mytbl %>% subset(grepl("a", files), )
# A tibble: 1 x 2
  files  prop
1 a         1

Martin Morgan

On 8/20/20, 2:48 AM, "R-help on behalf of Ivan Calandra" wrote:

    Hi Jeff,

    The code you show is exactly what I usually do, in base R; but I
    wanted to play with tidyverse to learn it (and also understand when
    it makes sense and when it doesn't).

    And yes, of course, in the example I gave, I end up with a 1-cell
    tibble, which could be better extracted as a length-1 vector. But my
    real goal is not to end up with a single value or even a single
    column. I just thought that simplifying my example was the best
    approach to ask for advice.

    But thank you for letting me know that what I'm doing is pointless!

    Ivan
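The "same type out as in" point made in Martin Morgan's message can be checked with a small sketch. As an assumption for illustration, a plain data.frame replaces the tibble so the code runs in base R alone, and the names df, rows_only, cols_only and as_vector are hypothetical.

```r
# Assumption: a plain data.frame stands in for the tibble from the thread.
df <- data.frame(files = c("a", "b", "c", "d", "e", "f"),
                 prop = 1:6,
                 stringsAsFactors = FALSE)

# Row subset only (filter-like): every column is kept.
rows_only <- subset(df, grepl("a", files))

# Column subset only (select-like): every row is kept.
cols_only <- subset(df, select = files)

# Both results are still data frames.
print(class(rows_only))
print(class(cols_only))

# By contrast, `[` on a data.frame drops a single column to a bare vector
# unless drop = FALSE is given. (A tibble does not drop in this case.)
as_vector <- df[, "files"]
print(class(as_vector))
```

The last line is the kind of base R idiom that returns a vector rather than a data frame, which is what the message advises against unless the next step really needs a vector.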
Re: [R] combine filter() and select()
Hi Jeff,

The code you show is exactly what I usually do, in base R; but I wanted
to play with tidyverse to learn it (and also understand when it makes
sense and when it doesn't).

And yes, of course, in the example I gave, I end up with a 1-cell
tibble, which could be better extracted as a length-1 vector. But my
real goal is not to end up with a single value or even a single column.
I just thought that simplifying my example was the best approach to ask
for advice.

But thank you for letting me know that what I'm doing is pointless!

Ivan

--
Dr. Ivan Calandra
TraCEr, laboratory for Traceology and Controlled Experiments
MONREPOS Archaeological Research Centre and
Museum for Human Behavioural Evolution
Schloss Monrepos
56567 Neuwied, Germany
+49 (0) 2631 9772-243
https://www.researchgate.net/profile/Ivan_Calandra

On 19/08/2020 19:27, Jeff Newmiller wrote:
> The whole point of dplyr primitives is to support data frames... that is,
> lists of columns. When you pare your data frame down to one column you are
> almost certainly using the wrong tool for the job.
>
> So, sure, your code works... and it even does what you wanted in the dplyr
> style, but what a pointless exercise.
>
> grep( "a", mytbl$files, value=TRUE )
Re: [R] combine filter() and select()
Dear Chris,

I didn't think about having the assignment at the end as you showed; it
indeed fits the pipe workflow better.

By "easy", I actually meant shorter. As you said, in base R, I usually
do that in 1 line, so I was hoping to do the same in tidyverse. But I'm
glad to hear that I'm using tidyverse the proper way :)

Best regards,
Ivan

--
Dr. Ivan Calandra
TraCEr, laboratory for Traceology and Controlled Experiments
MONREPOS Archaeological Research Centre and
Museum for Human Behavioural Evolution
Schloss Monrepos
56567 Neuwied, Germany
+49 (0) 2631 9772-243
https://www.researchgate.net/profile/Ivan_Calandra

On 19/08/2020 19:21, Chris Evans wrote:
> Inline
>
> ----- Original Message -----
>> From: "Ivan Calandra"
>> To: "R-help"
>> Sent: Wednesday, 19 August, 2020 16:56:32
>> Subject: [R] combine filter() and select()
>
>> Dear useRs,
>>
>> I'm new to the tidyverse world and I need some help on basic things.
>>
>> I have the following tibble:
>> mytbl <- structure(list(files = c("a", "b", "c", "d", "e", "f"), prop =
>> 1:6), row.names = c(NA, -6L), class = c("tbl_df", "tbl", "data.frame"))
>>
>> I want to subset the rows with "a" in the column "files", and keep only
>> that column.
>>
>> So I did:
>> myfile <- mytbl %>%
>>   filter(grepl("a", files)) %>%
>>   select(files)
>>
>> It works, but I believe there must be an easier way to combine filter()
>> and select(), right?
>
> I would write
>
> mytbl %>%
>   filter(grepl("a", files)) %>%
>   select(files) -> myfile
>
> as I like to keep a sort of "top to bottom and left to right" flow when
> writing in the tidyverse dialect of R but that's really not important.
>
> Apart from that I think what you've done is "proper tidyverse". To me another
> difference between the dialects is that classical R often seems to put value
> on doing things with incredibly few characters, and to make that easy. I think
> that, for the people who are brilliant at that sort of coding, and there are
> many on this list, that sort of coding is also easy to read. I know that
> Chinese is easy to read if you grew up on it but to a bear of little brain
> like me, the much more verbose style of tidyverse repays typing time with
> readability when I come back to my code and, though I have little experience
> of this yet, when I read other people's code.
>
> What did you think wasn't "easy" about what you wrote?
>
> Very best (all),
>
> Chris
>
>> Thank you!
>> Ivan
Re: [R] combine filter() and select()
The whole point of dplyr primitives is to support data frames... that
is, lists of columns. When you pare your data frame down to one column
you are almost certainly using the wrong tool for the job.

So, sure, your code works... and it even does what you wanted in the
dplyr style, but what a pointless exercise.

grep( "a", mytbl$files, value=TRUE )

On August 19, 2020 7:56:32 AM PDT, Ivan Calandra wrote:
>Dear useRs,
>
>I'm new to the tidyverse world and I need some help on basic things.
>
>I have the following tibble:
>mytbl <- structure(list(files = c("a", "b", "c", "d", "e", "f"), prop =
>1:6), row.names = c(NA, -6L), class = c("tbl_df", "tbl", "data.frame"))
>
>I want to subset the rows with "a" in the column "files", and keep only
>that column.
>
>So I did:
>myfile <- mytbl %>%
>  filter(grepl("a", files)) %>%
>  select(files)
>
>It works, but I believe there must be an easier way to combine filter()
>and select(), right?
>
>Thank you!
>Ivan

--
Sent from my phone. Please excuse my brevity.
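As a small aside on the grep() call suggested above, the following sketch shows how value = TRUE changes what grep() returns, next to grepl() as used in the original question. The vector files and the names idx, vals and flags are introduced here only for illustration. Note also that a tibble's `$` does not partial-match column names, so with the tibble from the original post the column has to be spelled out in full as mytbl$files.

```r
# Assumption: a bare character vector stands in for the `files` column.
files <- c("a", "b", "c", "d", "e", "f")

# Default: positions of the matching elements.
idx <- grep("a", files)

# value = TRUE: the matching elements themselves, as a character vector.
vals <- grep("a", files, value = TRUE)

# grepl(): one TRUE/FALSE per element, which is what filter() and `[` expect.
flags <- grepl("a", files)

print(idx)
print(vals)
print(flags)
```

The value = TRUE form goes straight to a plain vector, which is the shortcut the message recommends when only a single column is wanted.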
Re: [R] combine filter() and select()
Inline

----- Original Message -----
> From: "Ivan Calandra"
> To: "R-help"
> Sent: Wednesday, 19 August, 2020 16:56:32
> Subject: [R] combine filter() and select()

> Dear useRs,
>
> I'm new to the tidyverse world and I need some help on basic things.
>
> I have the following tibble:
> mytbl <- structure(list(files = c("a", "b", "c", "d", "e", "f"), prop =
> 1:6), row.names = c(NA, -6L), class = c("tbl_df", "tbl", "data.frame"))
>
> I want to subset the rows with "a" in the column "files", and keep only
> that column.
>
> So I did:
> myfile <- mytbl %>%
>   filter(grepl("a", files)) %>%
>   select(files)
>
> It works, but I believe there must be an easier way to combine filter()
> and select(), right?

I would write

mytbl %>%
  filter(grepl("a", files)) %>%
  select(files) -> myfile

as I like to keep a sort of "top to bottom and left to right" flow when
writing in the tidyverse dialect of R but that's really not important.

Apart from that I think what you've done is "proper tidyverse". To me
another difference between the dialects is that classical R often seems
to put value on doing things with incredibly few characters, and to make
that easy. I think that, for the people who are brilliant at that sort
of coding, and there are many on this list, that sort of coding is also
easy to read. I know that Chinese is easy to read if you grew up on it
but to a bear of little brain like me, the much more verbose style of
tidyverse repays typing time with readability when I come back to my
code and, though I have little experience of this yet, when I read other
people's code.

What did you think wasn't "easy" about what you wrote?

Very best (all),

Chris

>
> Thank you!
> Ivan

--
Small contribution in our coronavirus rigours:
https://www.coresystemtrust.org.uk/home/free-options-to-replace-paper-core-forms-during-the-coronavirus-pandemic/

Chris Evans
Visiting Professor, University of Sheffield

I do some consultation work for the University of Roehampton and other
places but remains my main Email address. I have a work web site at:
https://www.psyctc.org/psyctc/
and a site I manage for CORE and CORE system trust at:
http://www.coresystemtrust.org.uk/
I have "semigrated" to France, see:
https://www.psyctc.org/pelerinage2016/semigrating-to-france/
https://www.psyctc.org/pelerinage2016/register-to-get-updates-from-pelerinage2016/

If you want an Emeeting, I am trying to keep them to Thursdays and my
diary is at:
https://www.psyctc.org/pelerinage2016/ceworkdiary/
Beware: French time, generally an hour ahead of UK.
[R] combine filter() and select()
Dear useRs,

I'm new to the tidyverse world and I need some help on basic things.

I have the following tibble:
mytbl <- structure(list(files = c("a", "b", "c", "d", "e", "f"), prop =
1:6), row.names = c(NA, -6L), class = c("tbl_df", "tbl", "data.frame"))

I want to subset the rows with "a" in the column "files", and keep only
that column.

So I did:
myfile <- mytbl %>%
  filter(grepl("a", files)) %>%
  select(files)

It works, but I believe there must be an easier way to combine filter()
and select(), right?

Thank you!
Ivan

--
Dr. Ivan Calandra
TraCEr, laboratory for Traceology and Controlled Experiments
MONREPOS Archaeological Research Centre and
Museum for Human Behavioural Evolution
Schloss Monrepos
56567 Neuwied, Germany
+49 (0) 2631 9772-243
https://www.researchgate.net/profile/Ivan_Calandra