Re: [R] dplyr - counting a number of specific values in each column - for all columns at once

2015-06-16 Thread Clint Bowman
Thanks, Dimitri. Bert is the real wizard here--I'll bet he can conjure up an elegant solution. For me, just reaching a desired endpoint is enough. Clint
Clint Bowman INTERNET: cl...@ecy.wa.gov Air Quality Modeler INTERNET: cl...@math.utah.edu

[R] dplyr - counting a number of specific values in each column - for all columns at once

2015-06-16 Thread Dimitri Liakhovitski
Hello! I have a data frame:
md <- data.frame(a = c(3,5,4,5,3,5), b = c(5,5,5,4,4,1), c = c(1,3,4,3,5,5), device = c(1,1,2,2,3,3))
myvars = c("a", "b", "c")
md[2,3] <- NA
md[4,1] <- NA
md
I want to count the number of 5s in each column - by device. I can do it like this:
library(dplyr)
group_by(md,

Re: [R] dplyr - counting a number of specific values in each column - for all columns at once

2015-06-16 Thread Clint Bowman
Any problem with
colSums(md==5, na.rm=TRUE)
Clint Bowman INTERNET: cl...@ecy.wa.gov Air Quality Modeler INTERNET: cl...@math.utah.edu Department of Ecology VOICE: (360) 407-6815 PO Box 47600 FAX: (360)

Re: [R] dplyr - counting a number of specific values in each column - for all columns at once

2015-06-16 Thread Clint Bowman
It would help if I could see beyond my allergy meds. A start could be:
colSums(subset(md,md$device==1)==5,na.rm=TRUE)
colSums(subset(md,md$device==2)==5,na.rm=TRUE)
colSums(subset(md,md$device==3)==5,na.rm=TRUE)

Re: [R] dplyr - counting a number of specific values in each column - for all columns at once

2015-06-16 Thread Clint Bowman
May want to add headers, but the following provides the device number with each set of sums:
for (dev in unique(md$device)) {cat(colSums(subset(md,md$device==dev)==5,na.rm=TRUE),dev,"\n")}
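One way the loop above might be extended to carry headers is to collect the per-device sums into a data frame instead of printing them with cat(). This is only a sketch: the names `devices`, `rows`, and `result` are made up for illustration, and unlike the loop it restricts the comparison to the columns in `myvars`, so the device column itself is not compared against 5.

```r
# Example data from the original question.
md <- data.frame(a = c(3, 5, 4, 5, 3, 5), b = c(5, 5, 5, 4, 4, 1),
                 c = c(1, 3, 4, 3, 5, 5), device = c(1, 1, 2, 2, 3, 3))
md[2, 3] <- NA
md[4, 1] <- NA
myvars <- c("a", "b", "c")

# One vector of column sums per device (hypothetical helper names).
devices <- unique(md$device)
rows <- lapply(devices, function(dev) {
  colSums(md[md$device == dev, myvars] == 5, na.rm = TRUE)
})

# Bind the vectors into a table; the column names a, b, c act as headers.
result <- data.frame(device = devices, do.call(rbind, rows))
print(result)
```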

Re: [R] dplyr - counting a number of specific values in each column - for all columns at once

2015-06-16 Thread Dimitri Liakhovitski
No problem at all, Clint. I was just trying to figure out if dplyr can do it. On Tue, Jun 16, 2015 at 1:40 PM, Clint Bowman cl...@ecy.wa.gov wrote: Any problem with colSums(md==5, na.rm=TRUE)

Re: [R] dplyr - counting a number of specific values in each column - for all columns at once

2015-06-16 Thread Dimitri Liakhovitski
Thank you, Bert. I'll be honest - I am just learning dplyr and was wondering if one could do it in dplyr. But of course your solution is perfect... On Tue, Jun 16, 2015 at 1:50 PM, Bert Gunter bgunter.4...@gmail.com wrote: Well, dplyr seems a bit of overkill as it's so simple with plain old

Re: [R] dplyr - counting a number of specific values in each column - for all columns at once

2015-06-16 Thread Bert Gunter
Well, dplyr seems a bit of overkill as it's so simple with plain old vapply() in base R:
dat <- data.frame(a=sample(1:5,10,rep=TRUE),
                  b=sample(3:7,10,rep=TRUE),
                  g=sample(7:9,10,rep=TRUE))
vapply(dat,function(x)sum(x==5,na.rm=TRUE),1L)
a b g
5 4 0

Re: [R] dplyr - counting a number of specific values in each column - for all columns at once

2015-06-16 Thread Dimitri Liakhovitski
Except, of course, Bert, that you forgot that it had to be done by device. Your solution ignores the device.
md <- data.frame(a = c(3,5,4,5,3,5), b = c(5,5,5,4,4,1), c = c(1,3,4,3,5,5), device = c(1,1,2,2,3,3))
myvars = c("a", "b", "c")
md[2,3] <- NA
md[4,1] <- NA
md
vapply(md[myvars], function(x)

Re: [R] dplyr - counting a number of specific values in each column - for all columns at once

2015-06-16 Thread Bert Gunter
Yes, indeed. Thanks, David. But if you check, tapply(), aggregate(), by(), etc. are all basically wrappers to lapply(). So it's all a question of what syntax one feels most comfortable with. However, note that data.table, plyr stuff and perhaps others are different in that they re-implement the
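The equivalence described above can be illustrated on the thread's example data. This is a rough sketch for a single column, not a statement about the internals of every wrapper: counting the 5s in column a per device with tapply() gives the same numbers as splitting the column by device and calling lapply() directly. The names `via_tapply`, `groups`, and `via_lapply` are made up for illustration.

```r
# Example data from the original question.
md <- data.frame(a = c(3, 5, 4, 5, 3, 5), b = c(5, 5, 5, 4, 4, 1),
                 c = c(1, 3, 4, 3, 5, 5), device = c(1, 1, 2, 2, 3, 3))
md[2, 3] <- NA
md[4, 1] <- NA

# Wrapper version: tapply() groups the logical vector by device and sums it.
via_tapply <- tapply(md$a == 5, md$device, sum, na.rm = TRUE)

# Hand-rolled version: split by device, then lapply() the same function.
groups <- split(md$a == 5, md$device)
via_lapply <- unlist(lapply(groups, sum, na.rm = TRUE))

print(via_tapply)
print(via_lapply)
```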

Re: [R] dplyr - counting a number of specific values in each column - for all columns at once

2015-06-16 Thread David Winsemius
On Jun 16, 2015, at 11:18 AM, Clint Bowman wrote: Thanks, Dimitri. Bert is the real wizard here--I'll bet he can conjure up an elegant solution.
This would be a base method:
by(md[-4]==5, md[4], colSums)
device: 1
a b c
1 2 0
-
device:

Re: [R] dplyr - counting a number of specific values in each column - for all columns at once

2015-06-16 Thread Hadley Wickham
On Tue, Jun 16, 2015 at 12:24 PM, Dimitri Liakhovitski dimitri.liakhovit...@gmail.com wrote: Hello! I have a data frame:
md <- data.frame(a = c(3,5,4,5,3,5), b = c(5,5,5,4,4,1), c = c(1,3,4,3,5,5), device = c(1,1,2,2,3,3))
myvars = c("a", "b", "c")
md[2,3] <- NA
md[4,1] <- NA
md
I want to

Re: [R] dplyr - counting a number of specific values in each column - for all columns at once

2015-06-16 Thread David L Carlson
Not in base, but in stats:
aggregate(md[,-4]==5, list(device=md$device), sum, na.rm=TRUE)
  device a b c
1      1 1 2 0
2      2 0 1 0
3      3 1 0 2
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352

Re: [R] dplyr - counting a number of specific values in each column - for all columns at once

2015-06-16 Thread Dimitri Liakhovitski
Thank you guys - I've learned a lot: 'summarise_each' and 'funs'. On Tue, Jun 16, 2015 at 3:47 PM, Hadley Wickham h.wick...@gmail.com wrote: On Tue, Jun 16, 2015 at 12:24 PM, Dimitri Liakhovitski dimitri.liakhovit...@gmail.com wrote: Hello! I have a data frame: md <- data.frame(a =
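The dplyr answer itself is cut off in this archive, so the following is not a reconstruction of it. It is a sketch of how the same grouped count might be written in more recent dplyr, assuming version 1.0 or later, where across() is available and summarise_each() and funs() are deprecated. The name `counts` is made up for illustration.

```r
library(dplyr)

# Example data from the original question.
md <- data.frame(a = c(3, 5, 4, 5, 3, 5), b = c(5, 5, 5, 4, 4, 1),
                 c = c(1, 3, 4, 3, 5, 5), device = c(1, 1, 2, 2, 3, 3))
md[2, 3] <- NA
md[4, 1] <- NA
myvars <- c("a", "b", "c")

# Group by device, then count the 5s in every column named in myvars.
counts <- md %>%
  group_by(device) %>%
  summarise(across(all_of(myvars), ~ sum(.x == 5, na.rm = TRUE)))

print(counts)
```

The counts should agree with the aggregate() table posted earlier in the thread.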

Re: [R] dplyr - counting a number of specific values in each column - for all columns at once

2015-06-16 Thread Dimitri Liakhovitski
Thank you, Clint. That's the thing: it's relatively easy to do it in base, but the resulting code is not THAT simple. I thought dplyr would make it easy... On Tue, Jun 16, 2015 at 2:06 PM, Clint Bowman cl...@ecy.wa.gov wrote: May want to add headers but the following provides the device number

Re: [R] dplyr - counting a number of specific values in each column - for all columns at once

2015-06-16 Thread Bert Gunter
... my bad! -- I failed to read carefully. A base syntax version is:
dat <- data.frame(a=sample(1:5,10,rep=TRUE), b=sample(3:7,10,rep=TRUE), g=sample(7:9,10,rep=TRUE))
dev <- sample(1:3,10,rep=TRUE)
sapply(dat,function(x)
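The sapply() call above is truncated in this archive, and how it continued is not known. One plausible way to finish the idea, offered purely as an assumption, is to call tapply() inside the anonymous function so that each column is counted per device. The set.seed() call and the name `counts` are additions for illustration.

```r
set.seed(1)  # added only so the sketch is reproducible
dat <- data.frame(a = sample(1:5, 10, replace = TRUE),
                  b = sample(3:7, 10, replace = TRUE),
                  g = sample(7:9, 10, replace = TRUE))
dev <- sample(1:3, 10, replace = TRUE)

# Assumed completion: for each column, count the 5s within each device.
counts <- sapply(dat, function(x) tapply(x == 5, dev, sum, na.rm = TRUE))
print(counts)
```

Each row of `counts` corresponds to one device and each column to one variable, so the column totals match a plain colSums(dat == 5).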