The problem is that I cannot see how your use of rle and/or seq_along could
possibly lead to the sample result you are giving us. That is why I asked
for a new example.
---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnew...@dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
---------------------------------------------------------------------------
Sent from my phone. Please excuse my brevity.

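
For illustration only: one possible reading of the request in this thread is
"within each Group, number the consecutive runs of rows where sumchild is not
zero, and leave 0 on the rows where it is zero". The sketch below assumes that
reading, which may not be what the poster wants. The helper name run_id and
the toy data are made up here and are not taken from the thread; the sketch
also assumes sumchild contains no NA values.

```r
# Hypothetical helper (not from the thread): label each run of nonzero
# values 1, 2, 3, ... in order of appearance; zero positions get 0.
run_id <- function(x) {
  r <- rle(x != 0)                     # runs of TRUE (nonzero) / FALSE (zero)
  ids <- cumsum(r$values) * r$values   # running count of TRUE runs, 0 for FALSE runs
  rep(ids, times = r$lengths)          # expand back to one value per row
}

# Made-up toy data, only to show the shape of the result
d <- data.frame(
  Group    = c("A", "A", "B", "B", "B", "B", "C", "C", "C"),
  sumchild = c(0, 1, 5, 0, 5, 5, 0, 0, 2)
)

# ave() applies run_id separately to the rows of each Group
d$childseg <- ave(d$sumchild, d$Group, FUN = run_id)
print(d)
# childseg: 0 1 1 0 2 2 0 0 1
```

With a data.table DT holding the same columns, the same helper could
presumably be applied per group as DT[, childseg := run_id(sumchild), by = Group].
Note that the numbering restarts in every Group, which is the point of doing
the rle() call inside the grouping rather than on the whole column.
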
On January 2, 2015 5:11:09 PM PST, Beejai <kate.ignat...@gmail.com> wrote:
> Obviously this is why I need help...
>
> This is a larger data frame. I'm only posting something small here to
> make it simple. There are many Groups which are larger, and I want to
> assign a sequence value to consecutive rows where sumchild is not
> equal to 0. As the data frame I'm working with is much larger, this
> goes up to 100, maybe even 200, and I have many different groups (20K+).
> I would like to do this for every group, not for the whole data frame.
>
> There is no particular science behind this, only data organizing.
>
> So just say we had data like so:
>
>     Dad Mum Child Group sumdad summum sumchild childseg
>  1:  AA  RR    RA     A      2      2        0        0
>  2:  AA  RR    RR     A      2      2        1        1
>  3:  AA  AA    AA     B      4      5        5        1
>  4:  AA  AA    RA     B      4      5        5        0
>  5:  RA  AA    RR     B      0      5        5        2
>  6:  RR  AA    RR     B      4      5        5        2
>  7:  AA  AA    AA     B      4      5        5        2
>  8:  AA  AA    AA     C      3      3        0        1
>  9:  AA  AA    RA     C      3      3        0        0
> 10:  AA  RR    RR     C      3      3        0        2
> 11:  AA  RR    RA     C      2      2        0        0
> 12:  AA  RR    RR     C      2      2        1        3
> 13:  AA  AA    AA     C      4      5        5        3
> 14:  AA  AA    RA     C      4      5        5        0
> 15:  RA  AA    RR     C      0      5        5        4
>
> On Fri, Jan 2, 2015 at 12:29 PM, David Winsemius [via R]
> <ml-node+s789695n4701316...@n4.nabble.com> wrote:
>>
>> On Jan 2, 2015, at 12:07 AM, Kate Ignatius wrote:
>>
>>> Ah, crap. Yep you're right. This is not going too well. Okay - let
>>> me try that again:
>>>
>>> x$childseg<-0
>>> x<-x$sumchild !=0
>>
>> That previous line would appear to overwrite the entire dataframe
>> with the value of one vector.
>>
>>> span<-rle(x)$lengths[rle(x)$values==TRUE]
>>> x$childseg[x]<-rep(seq_along(span), times = span)
>>>
>>> Does this one have any errors?
>>
>> Even assuming that the code from Jeff Newmiller is creating those
>> objects I get
>>
>> > x$childseg[x]<-rep(seq_along(span), times = span)
>> Error in `*tmp*`$childseg : $ operator is invalid for atomic vectors
>>
>> In the last line you are indexing a vector with a dataframe (or
>> perhaps a data.table).
>>
>> If we use Newmiller's object and then change some of the instances of
>> "x" in your code to DT we get:
>>
>> > DT$childseg<-0
>> > x<-DT$sumchild !=0 # Try not to overwrite your data-objects
>> > span<-rle(x)$lengths[rle(x)$values==TRUE]
>> > DT$childseg[x]<-rep(seq_along(span), times = span)
>> > DT
>>     Dad Mum Child Group sumdad summum sumchild childseg
>>  1:  AA  RR    RA     A      2      2        0        0
>>  2:  AA  RR    RR     A      2      2        1        1
>>  3:  AA  AA    AA     B      4      5        5        1
>>  4:  AA  AA    AA     B      4      5        5        1
>>  5:  RA  AA    RR     B      0      5        5        1
>>  6:  RR  AA    RR     B      4      5        5        1
>>  7:  AA  AA    AA     B      4      5        5        1
>>  8:  AA  AA    RA     C      3      3        0        0
>>  9:  AA  AA    RA     C      3      3        0        0
>> 10:  AA  RR    RA     C      3      3        0        0
>>
>> You persist in posting code where you do not explain what you are
>> trying to do with it. You have already been told that your earlier
>> efforts using `rle` did not make any sense. Post a complete example and
>> then explain what you desire as an object. It's often helpful to
>> provide a scientific background for what the data represents.
>>
>> --
>> David.
>>
>>>
>>> On Fri, Jan 2, 2015 at 2:32 AM, David Winsemius <[hidden email]> wrote:
>>>>
>>>>> On Jan 1, 2015, at 5:07 PM, Kate Ignatius <[hidden email]> wrote:
>>>>>
>>>>> Apologies - mix up of syntax all over the place, a habit of mine.
>>>>> The last line was in there because of code beforehand so it really
>>>>> doesn't need to be there. Here is the proper code I hope:
>>>>>
>>>>> childseg<-0
>>>>> x<-sumchild ==0
>>>>> span<-rle(x)$lengths[rle(x)$values==TRUE]
>>>>> childseg[x]<-rep(seq_along(span), times = span)
>>>>>
>>>>
>>>> This remains not reproducible. We have no idea what sumchild might
>>>> be and the code throws an error. My guess is that you are trying to
>>>> get a result such as would be delivered by:
>>>>
>>>> childseg <- sumchild[ sumchild != 0 ]
>>>>
>>>> --
>>>> David.
>>>>
>>>>>
>>>>> On Thu, Jan 1, 2015 at 12:13 PM, Jeff Newmiller
>>>>> <[hidden email]> wrote:
>>>>>> Thank you for attempting to encode what you want using R syntax,
>>>>>> but you are not really succeeding yet (too many errors). Perhaps
>>>>>> another hand-generated result would help? A new input data frame
>>>>>> might or might not be needed to illustrate desired results.
>>>>>>
>>>>>> Your second and third lines are syntactically incorrect, and I
>>>>>> don't understand what you hope to accomplish by assigning an empty
>>>>>> string to a numeric in your last line.
>>>>>>
>>>>>> On January 1, 2015 4:16:52 AM PST, Kate Ignatius <[hidden email]>
>>>>>> wrote:
>>>>>>> Is it possible to add the following code or similar in
>>>>>>> data.table:
>>>>>>>
>>>>>>> childseg<-0
>>>>>>> x:=sumchild <-0
>>>>>>> span<-rle(x)$lengths[rle(x)$values==TRUE
>>>>>>> childseg[x]<-rep(seq_along(span), times = span)
>>>>>>> childseg[childseg == 0]<-''
>>>>>>>
>>>>>>> I was hoping to do this code by Group for mum, dad and
>>>>>>> child. The problem I'm having is with the
>>>>>>> span<-rle(x)$lengths[rle(x)$values==TRUE line which I'm not sure
>>>>>>> can be added to data.table.
>>>>>>>
>>>>>>> [Previous email had incorrect code]
>>>>>>>
>>>>>>> On Wed, Dec 31, 2014 at 3:45 AM, Jeff Newmiller
>>>>>>> <[hidden email]> wrote:
>>>>>>>> I do not understand the value of using the rle function in your
>>>>>>>> description, but the code below appears to produce the table you
>>>>>>>> want.
>>>>>>>>
>>>>>>>> Note that better support for the data.table package might be
>>>>>>>> found at stackexchange as the documentation specifies.
>>>>>>>>
>>>>>>>> x <- read.table( text=
>>>>>>>> "Dad Mum Child Group
>>>>>>>> AA RR RA A
>>>>>>>> AA RR RR A
>>>>>>>> AA AA AA B
>>>>>>>> AA AA AA B
>>>>>>>> RA AA RR B
>>>>>>>> RR AA RR B
>>>>>>>> AA AA AA B
>>>>>>>> AA AA RA C
>>>>>>>> AA AA RA C
>>>>>>>> AA RR RA C
>>>>>>>> ", header=TRUE, stringsAsFactors=FALSE )
>>>>>>>>
>>>>>>>> library(data.table)
>>>>>>>> DT <- data.table( x )
>>>>>>>> DT[ , cdad := as.integer( Dad %in% c( "AA", "RR" ) ) ]
>>>>>>>> DT[ , sumdad := 0L ]
>>>>>>>> DT[ 1==DT$cdad, sumdad := sum( cdad ), by=Group ]
>>>>>>>> DT[ , cdad := NULL ]
>>>>>>>> DT[ , cmum := as.integer( Mum %in% c( "AA", "RR" ) ) ]
>>>>>>>> DT[ , summum := 0L ]
>>>>>>>> DT[ 1==DT$cmum, summum := sum( cmum ), by=Group ]
>>>>>>>> DT[ , cmum := NULL ]
>>>>>>>> DT[ , cchild := as.integer( Child %in% c( "AA", "RR" ) ) ]
>>>>>>>> DT[ , sumchild := 0L ]
>>>>>>>> DT[ 1==DT$cchild, sumchild := sum( cchild ), by=Group ]
>>>>>>>> DT[ , cchild := NULL ]
>>>>>>>>
>>>>>>>> > DT
>>>>>>>>
>>>>>>>>     Dad Mum Child Group sumdad summum sumchild
>>>>>>>>  1:  AA  RR    RA     A      2      2        0
>>>>>>>>  2:  AA  RR    RR     A      2      2        1
>>>>>>>>  3:  AA  AA    AA     B      4      5        5
>>>>>>>>  4:  AA  AA    AA     B      4      5        5
>>>>>>>>  5:  RA  AA    RR     B      0      5        5
>>>>>>>>  6:  RR  AA    RR     B      4      5        5
>>>>>>>>  7:  AA  AA    AA     B      4      5        5
>>>>>>>>  8:  AA  AA    RA     C      3      3        0
>>>>>>>>  9:  AA  AA    RA     C      3      3        0
>>>>>>>> 10:  AA  RR    RA     C      3      3        0
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, 30 Dec 2014, Kate Ignatius wrote:
>>>>>>>>
>>>>>>>>> I'm trying to use both these packages and wondering whether
>>>>>>>>> they are possible...
>>>>>>>>>
>>>>>>>>> To make this simple, my ultimate goal is to determine long
>>>>>>>>> stretches of 1s, but I want to do this within groups (hence
>>>>>>>>> using the data.table as I use the "set key" option). However,
>>>>>>>>> I'm not having much luck making this possible.
>>>>>>>>>
>>>>>>>>> For example, for simplicity's sake, I have the following data:
>>>>>>>>>
>>>>>>>>> Dad Mum Child Group
>>>>>>>>> AA  RR  RA    A
>>>>>>>>> AA  RR  RR    A
>>>>>>>>> AA  AA  AA    B
>>>>>>>>> AA  AA  AA    B
>>>>>>>>> RA  AA  RR    B
>>>>>>>>> RR  AA  RR    B
>>>>>>>>> AA  AA  AA    B
>>>>>>>>> AA  AA  RA    C
>>>>>>>>> AA  AA  RA    C
>>>>>>>>> AA  RR  RA    C
>>>>>>>>>
>>>>>>>>> And the following code which I know works
>>>>>>>>>
>>>>>>>>> hetdad <- as.numeric(x[c(1)]=="AA" | x[c(1)]=="RR")
>>>>>>>>> sumdad <- rle(hetdad)$lengths[rle(hetdad)$values==1]
>>>>>>>>>
>>>>>>>>> hetmum <- as.numeric(x[c(2)]=="AA" | x[c(2)]=="RR")
>>>>>>>>> summum <- rle(hetmum)$lengths[rle(hetmum)$values==1]
>>>>>>>>>
>>>>>>>>> hetchild <- as.numeric(x[c(3)]=="AA" | x[c(3)]=="RR")
>>>>>>>>> sumchild <- rle(hetchild)$lengths[rle(hetchild)$values==1]
>>>>>>>>>
>>>>>>>>> However, I wish to do the above code by Group (though this
>>>>>>>>> file is millions of rows long and groups will be larger but I
>>>>>>>>> just wanted to simplify the example).
>>>>>>>>>
>>>>>>>>> I did something like this but of course I got an error:
>>>>>>>>>
>>>>>>>>> LOH[,hetdad:=as.numeric(x[c(1)]=="AA" | x[c(1)]=="RR")]
>>>>>>>>> LOH[,sumdad:=rle(hetdad)$lengths[rle(hetdad)$values==1],by=Group]
>>>>>>>>> LOH[,hetmum:=as.numeric(x[c(2)]=="AA" | x[c(2)]=="RR")]
>>>>>>>>> LOH[,summum:=rle(hetmum)$lengths[rle(hetmum)$values==1],by=Group]
>>>>>>>>> LOH[,hetchild:=as.numeric(x[c(3)]=="AA" | x[c(3)]=="RR")]
>>>>>>>>> LOH[,sumchild:=rle(hetchild)$lengths[rle(hetchild)$values==1],by=Group]
>>>>>>>>>
>>>>>>>>> The reason is that I want to eventually have something like
>>>>>>>>> this:
>>>>>>>>>
>>>>>>>>> Dad Mum Child Group sumdad summum sumchild
>>>>>>>>> AA  RR  RA    A     2      2      0
>>>>>>>>> AA  RR  RR    A     2      2      1
>>>>>>>>> AA  AA  AA    B     4      5      5
>>>>>>>>> AA  AA  AA    B     4      5      5
>>>>>>>>> RA  AA  RR    B     0      5      5
>>>>>>>>> RR  AA  RR    B     4      5      5
>>>>>>>>> AA  AA  AA    B     4      5      5
>>>>>>>>> AA  AA  RA    C     3      3      0
>>>>>>>>> AA  AA  RA    C     3      3      0
>>>>>>>>> AA  RR  RA    C     3      3      0
>>>>>>>>>
>>>>>>>>> That is, I would like to have the specific counts next to what
>>>>>>>>> I'm consecutively counting per group. So for Group A for dad
>>>>>>>>> there are 2 AAs, there are two RRs for mum but only 1 AA or RR
>>>>>>>>> for the child and that is RR (so the 1 is next to the RR and
>>>>>>>>> not the RA).
>>>>>>>>>
>>>>>>>>> Can this be done?
>>>>>>>>>
>>>>>>>>> K.
>>>>>>>>>
>>>>>>>>> ______________________________________________
>>>>>>>>> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
>>>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>>>>> PLEASE do read the posting guide
>>>>>>>>> http://www.R-project.org/posting-guide.html
>>>>>>>>> and provide commented, minimal, self-contained, reproducible
>>>>>>>>> code.
>>
>> David Winsemius
>> Alameda, CA, USA
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/rle-with-data-table-is-it-possible-tp4701211p4701332.html
> Sent from the R help mailing list archive at Nabble.com.