Sorry, I should have clarified: my identifiers (in column "item") will always be unique. In other words, a value in column "item" is never repeated, neither in x nor in y.

Dimitri
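For reference, here is a minimal sketch of the replacement described in the original question, written with which.min(). It rests on two assumptions that are not confirmed in the thread: that only one row should be overwritten when several rows tie for the minimum of column a (which.min() returns the first such row, and replacing a single row also keeps "item" unique), and that y always has exactly one row. The helper name replace_min_row is hypothetical and used only for illustration.

```r
# Toy data from the original question (x has about 1000 rows in practice).
x <- data.frame(item = letters[1:5], a = 1:5, b = 11:15,
                stringsAsFactors = FALSE)
y <- data.frame(item = "f", a = 3, b = 10, stringsAsFactors = FALSE)

# Hypothetical helper (name not from the thread).
# Assumption: on ties, only the FIRST minimum row is replaced.
replace_min_row <- function(x, y) {
  i <- which.min(x$a)      # index of the first minimum of column a
  if (y$a > x$a[i]) {      # replace only if y$a exceeds that minimum
    x[i, ] <- y[1, ]       # overwrite the whole row with y's single row
  }
  x                        # return the (possibly unchanged) data frame
}

x <- replace_min_row(x, y)
x
```

With the toy data above, the minimum of a sits in row 1 and y$a is larger, so row 1 is overwritten by y. Calling which.min() once avoids computing min(x$a) twice, as the if/which version does; whether that matters for 1000 rows is something only timing on the real data can show.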
On Wed, Jan 30, 2013 at 1:27 PM, Dimitri Liakhovitski <dimitri.liakhovit...@gmail.com> wrote:

> Thank you, everyone! I'll try to test those different approaches. Really
> appreciate your help!
> Dimitri
>
> On Wed, Jan 30, 2013 at 11:03 AM, arun <smartpink...@yahoo.com> wrote:
>
>> HI,
>>
>> Sorry, my previous solution doesn't work.
>> This should work for your dataset:
>> set.seed(1851)
>> x<- data.frame(item=sample(letters[1:5],20,replace=TRUE),a=sample(1:15,20,replace=TRUE),b=sample(20:30,20,replace=TRUE),stringsAsFactors=F)
>> y<- data.frame(item="f",a=3,b=10,stringsAsFactors=F)
>> x[x$a%in%which.min(x[x$a<y$a,]$a),]<- y #if there are multiple minimum values
>>
>> set.seed(1241)
>> x1<- data.frame(item=sample(letters[1:10],1e4,replace=TRUE),a=sample(1:30,1e4,replace=TRUE),b=sample(1:100,1e4,replace=TRUE),stringsAsFactors=F)
>> y1<- data.frame(item="f",a=3,b=10,stringsAsFactors=F)
>> length(x1$a[x1$a==1])
>> #[1] 330
>> system.time({x1[x1$a%in%which.min(x1[x1$a<y1$a,]$a),]<- y1})
>> #   user  system elapsed
>> #  0.000   0.000   0.001
>> length(x1$a[x1$a==1])
>> #[1] 0
>>
>>
>> #For some reason, it is not working when the multiple number of minimum values > some value
>>
>> set.seed(1241)
>> x1<- data.frame(item=sample(letters[1:10],1e5,replace=TRUE),a=sample(1:30,1e5,replace=TRUE),b=sample(1:100,1e5,replace=TRUE),stringsAsFactors=F)
>> y1<- data.frame(item="f",a=3,b=10,stringsAsFactors=F)
>> length(x1$a[x1$a==1])
>> #[1] 3404
>> x1[x1$a%in%which.min(x1[x1$a<y1$a,]$a),]<- y1
>> length(x1$a[x1$a==1])
>> #[1] 3404 #not getting replaced
>>
>> #However, if I try:
>> set.seed(1241)
>> x1<- data.frame(item=sample(letters[1:10],1e6,replace=TRUE),a=sample(1:5000,1e6,replace=TRUE),b=sample(1:100,1e6,replace=TRUE),stringsAsFactors=F)
>> y1<- data.frame(item="f",a=3,b=10,stringsAsFactors=F)
>> length(x1$a[x1$a==1])
>> #[1] 208
>> system.time(x1[x1$a%in%which.min(x1[x1$a<y1$a,]$a),]<- y1)
>> #   user  system elapsed
>> #  0.124   0.016   0.138
>> length(x1$a[x1$a==1])
>> #[1] 0
>>
>>
>> #Tried Jessica's solution:
>> set.seed(1851)
>> x<- data.frame(item=sample(letters[1:5],20,replace=TRUE),a=sample(1:15,20,replace=TRUE),b=sample(20:30,20,replace=TRUE),stringsAsFactors=F)
>> y<- data.frame(item="f",a=3,b=10,stringsAsFactors=F)
>> x[intersect(which(x$a < y$a),which.min(x$a)),] <- y
>> x
>> #   item  a  b
>> #1     a  8 25
>> #2     a 10 26
>> #3     f  3 10 #replaced
>> #4     e 15 26
>> #5     b 13 20
>> #6     a  5 23
>> #7     d  4 29
>> #8     e  2 24
>> #9     c  7 30
>> #10    e 14 24
>> #11    d  2 20
>> #12    e 10 21
>> #13    c 13 27
>> #14    d 12 23
>> #15    b 11 26
>> #16    e  5 22
>> #17    c  1 26 #it is not replaced
>> #18    a  8 21
>> #19    e 10 26
>> #20    c  2 22
>>
>>
>> A.K.
>>
>>
>> ----- Original Message -----
>> From: Dimitri Liakhovitski <dimitri.liakhovit...@gmail.com>
>> To: r-help <r-help@r-project.org>
>> Cc:
>> Sent: Tuesday, January 29, 2013 4:11 PM
>> Subject: [R] Fastest way to compare a single value with all values in one column of a data frame
>>
>> Hello!
>>
>> I have a large data frame x:
>> x<-data.frame(item=letters[1:5],a=1:5,b=11:15) # in actuality, x has 1000 rows
>> x$item<-as.character(x$item)
>> I also have a small data frame y with just 1 row:
>> y<-data.frame(item="f",a=3,b=10)
>> y$item<-as.character(y$item)
>>
>> I have to decide if y$a is larger than the smallest of all the values in
>> x$a. If it is, I want y to replace the whole row in x that has the lowest
>> value in column a.
>> This is how I'd do it.
>>
>> if(y$a>min(x$a)){
>> whichmin<-which(x$a==min(x$a))
>> x[whichmin,]<-y[1,]
>> }
>>
>>
>> I am wondering if there is a faster way of doing it. What would be the
>> fastest possible way? I'd have to do it, unfortunately, many-many times.
>>
>> Thank you very much!
>>
>> --
>> Dimitri Liakhovitski
>> gfk.com <http://marketfusionanalytics.com/>
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.