A potential problem with
ave(dat_2$D, dat_2$S, FUN=order)
is that it can silently give the wrong answer
or stop with an error if dat_2$D is not numeric,
because ave() stores the integer output of order() back into a vector of D's class.
E.g., if D is a Date vector we get
> dat_3 <- dat_2[,1:2]
> dat_3$D <- as.Date(paste0("2015-02-", dat_2$D))
> with(dat_3, ave(D, S, FUN=order))
Error in as.Date.numeric(value) : 'origin' must be supplied
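One possible way around the class problem, offered only as a sketch, is to hand ave() the integer row indices instead of D itself, so that the result is stored in an integer vector and never coerced back to Date. The data below mirror dat_2 with made-up February 2015 dates. The sketch also uses order(order(.)) rather than order() alone, on the assumption that the visit number wanted is the within-group rank; order() returns a permutation that only coincides with the rank for some inputs.

```r
# Illustrative data: same subjects as dat_2, with Date visits (dates are made up).
dat_3 <- data.frame(
  S = factor(c('a', 'c', 'a', 'b', 'c', 'c')),
  D = as.Date(sprintf("2015-02-%02d", c(5, 3, 1, 3, 2, 4)))
)

# ave() operates on the integer indices; FUN looks up the dates of each group
# and returns their within-group rank (ties broken by position).
dat_3$R <- with(dat_3,
  ave(seq_along(D), S, FUN = function(i) order(order(D[i]))))

dat_3$R
# [1] 2 2 1 1 1 3
```

This still calls FUN once per group, so it does nothing for speed.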
Another problem is that it may take much longer than
necessary if your data contain many small groups, since FUN is called once per group.
Both problems are avoided if you sort the entire dataset first
and 'unsort' the results when putting them back into the dataset.
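The sort-then-unsort idea might look something like the sketch below. The function name visit_number is made up for illustration, and using rle() with sequence() to number the rows within each run of identical subjects is just one of several ways to do that step.

```r
# Hypothetical helper (name not from the thread): visit number of each row,
# i.e. the rank of D within its subject S, computed with a single sort.
visit_number <- function(S, D) {
  o <- order(S, D)                          # sort by subject, then by date
  runs <- rle(as.character(S[o]))$lengths   # size of each subject's block
  within <- sequence(runs)                  # 1..n_i inside each block
  R <- integer(length(o))
  R[o] <- within                            # 'unsort': back to original row order
  R
}

dat_2 <- data.frame(S = factor(c('a', 'c', 'a', 'b', 'c', 'c')),
                    D = c(5, 3, 1, 3, 2, 4))
visit_number(dat_2$S, dat_2$D)
# [1] 2 2 1 1 1 3

# Works unchanged when D is a Date vector, since the result is never
# stored back into a Date.
visit_number(dat_2$S, as.Date("2015-02-01") + dat_2$D)
# [1] 2 2 1 1 1 3
```

Only one call to order() is made regardless of the number of groups, and ties in D within a subject are numbered in their original row order.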
Bill Dunlap
TIBCO Software
wdunlap tibco.com
On Wed, Feb 4, 2015 at 12:53 PM, David L Carlson <[email protected]> wrote:
> How about?
>
> > ave(dat$D, dat$S, FUN=order)
> [1] 2 1 1 1 2 3
> > ave(dat_2$D, dat_2$S, FUN=order)
> [1] 2 2 1 1 1 3
>
> Note, your answer for the second example is incorrect since row 2 (c, 3)
> and row 5 (c, 2) are both assigned 2.
>
> -------------------------------------
> David L Carlson
> Department of Anthropology
> Texas A&M University
> College Station, TX 77840-4352
>
> -----Original Message-----
> From: R-help [mailto:[email protected]] On Behalf Of Tom Wright
> Sent: Wednesday, February 4, 2015 2:08 PM
> To: Rui Barradas
> Cc: [email protected]
> Subject: Re: [R] Still trying to avoid loops
>
> Thanks, I was not aware of order().
> I did deliberately mess up the order of S. The following example breaks
> your solution
> dat_2<-data.frame(S=factor(c('a','c','a','b','c','c')),
> D=c(5,3,1,3,2,4))
>
> which should give the answer c(2,2,1,1,2,3)
>
> Your solution does indicate that sorting the data correctly before
> starting might solve the problem.
>
>
> On Wed, 2015-02-04 at 19:49 +0000, Rui Barradas wrote:
> > Hello,
> >
> > Aren't the levels of your example wrong? If the levels are
> > levels=c('a','b','c'), not c('b', 'a', 'c'), then the following will do
> > the job.
> >
> > unname(unlist(tapply(dat$D, dat$S, order)))
> >
> >
> > Hope this helps,
> >
> > Rui Barradas
> >
> > Em 04-02-2015 19:34, Tom Wright escreveu:
> > > Given a dataframe:
> > >
> > > dat<-data.frame(S=factor(c('a','b','a','c','c','c'),levels=c('b','a','c')),
> > > D=c(1,5,3,2,3,4))
> > >
> > > where S is a subject identifier and D a visit (actually a date in my
> > > real dataset). I would like to generate another column giving the visit
> > > number
> > >
> > > R=c(2,1,1,1,2,3)
> > >
> > > My current solution uses nested loops and is slow and ugly. I've looked
> > > at by() but can't see how to keep the order of R correct.
> > >
> > > Thanks,
> > > Tom
> > >
> > > ______________________________________________
> > > [email protected] mailing list -- To UNSUBSCRIBE and more, see
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> > >