I will wait for the next version, 2.9.1, and am presently using Petr's suggestion, i.e., (x[1]*length(x))==sum(x), which significantly reduced the run time.
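
For reference, a minimal sketch of the sum-based check next to a range-based alternative. The function names all_same_sum and all_same_range and the tol argument are hypothetical, introduced only for illustration; they are not from the thread or from base R. The sum comparison is a necessary condition, not a sufficient one: a vector such as c(2, 1, 3) passes it although its elements differ, so the range-based form may be safer when that matters.

```r
# Hypothetical helpers for illustration; not part of base R or the thread.

# Sum-based check as described above: fast, but only a necessary condition.
all_same_sum <- function(x) {
  (x[1] * length(x)) == sum(x)
}

# Range-based check: TRUE when max(x) - min(x) does not exceed 'tol'.
# 'tol' is an assumed parameter; tol = 0 asks for exact equality.
all_same_range <- function(x, tol = 0) {
  r <- range(x)
  (r[2] - r[1]) <= tol
}

x <- c(1, 2, rep(1, 1e5))
all_same_sum(x)     # FALSE
all_same_range(x)   # FALSE

y <- c(2, 1, 3)     # not all equal, yet 2 * 3 == sum(y)
all_same_sum(y)     # TRUE  (false positive)
all_same_range(y)   # FALSE

z <- c(1, 1 + 1e-10, 1)
all_same_range(z)               # FALSE (exact comparison)
all_same_range(z, tol = 1e-8)   # TRUE  (small differences ignored)
```

Neither function stops at the first mismatch; both still make a full pass over the vector.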
The problem now is that there might be only small differences, say of the order of 10^-10, which I want to ignore. So I used:

isTRUE(all.equal((x[1]*length(x)),sum(x)))

as suggested in the documentation of all.equal. But this increased the run time again, to *five times* as much.

1) Is there any faster way of doing the same?
2) Will the function "anyDuplicated" treat almost equal values as duplicated or not?

Actually, I need both options.

Regards
Utkarsh


Prof Brian Ripley wrote:
> On Tue, 16 Jun 2009, Prof Brian Ripley wrote:
>
>> On Tue, 16 Jun 2009, jim holtman wrote:
>>
>>> I think the only way that you are going to get it to stop on the first
>>> mismatch is to write your own function in C if you are concerned about
>>> the time. Matching on character vectors will be even more costly since
>>> it is having to loop to check the equality of each character in each
>>> element. This is one of the places it might pay to convert to factors
>>> and then the comparison only uses the integer values assigned to the
>>> factors.
>>
>> Not so in a recent R: comparison of character vectors is now done by
>> comparing pointers in the first instance so (at least on a 32-bit
>> platform) is as fast as comparing integers. And on x86_64 Linux:
>>
>> > x <- as.character(c(1,2,rep(1,10000000)))
>> > system.time(print(all(x[1] == x)))
>> [1] FALSE
>>    user  system elapsed
>>   0.123   0.019   0.142
>>
>> > system.time(xx <- as.factor(x))
>>    user  system elapsed
>>   9.874   0.284  10.159
>> > system.time(print(all(xx[1] == xx)))
>> [1] FALSE
>>    user  system elapsed
>>   0.511   0.145   0.656
>>
>> Recent pre-release versions of R (e.g.
>> 2.9.1 beta) allow
>>
>> > system.time(anyDuplicated(x))
>>    user  system elapsed
>>   0.034   0.078   0.113
>> > system.time(anyDuplicated(xx))
>>    user  system elapsed
>>   0.037   0.076   0.113
>
> I'm sorry, a line got reverted here: I had edited this to say
>
> 'which is a C-level speedup of the sort the original poster seemed to
> be looking for'
>
>>
>>> On Tue, Jun 16, 2009 at 8:31 AM, utkarshsinghal <
>>> utkarsh.sing...@global-analytics.com> wrote:
>>>
>>>> Hi Jim,
>>>>
>>>> What you are saying is correct. Although, my computer might not have
>>>> same speed and I am getting the following for 10M entries:
>>>>
>>>>    user  system elapsed
>>>>   0.559   0.038   0.607
>>>>
>>>> Moreover, in the case of character vectors, it gets more than double.
>>>>
>>>> In my modeling, which is already highly time consuming, I need to do
>>>> check this for few thousand vectors and the entries can easily be 10M
>>>> in each vector. So I am just looking for any possibilities of time
>>>> saving. I am pretty sure that whenever elements are not all equal, it
>>>> can be concluded from any few entries (most of the times). It will be
>>>> worth if I can find a way which stops checking further the moment it
>>>> find two distinct elements.
>>>>
>>>> Regards
>>>> Utkarsh
>>>>
>>>>
>>>> jim holtman wrote:
>>>>
>>>> Just check that the first (or any other element) is equal to all the
>>>> rest:
>>>>
>>>> > x = c(1,2,rep(1,10000000)) # 10,000,000
>>>> > system.time(print(all(x[1] == x)))
>>>> [1] FALSE
>>>>    user  system elapsed
>>>>    0.18    0.00    0.19
>>>>
>>>> This was for 10M entries.
>>>>
>>>> On Tue, Jun 16, 2009 at 7:42 AM, utkarshsinghal <
>>>> utkarsh.sing...@global-analytics.com> wrote:
>>>>
>>>>> Hi All,
>>>>>
>>>>> There are several replies to the question below, but I think there
>>>>> must exist a better way of doing so.
>>>>> I just want to check whether all the elements of a vector are same.
>>>>> My vector has one million elements and it is highly likely that there
>>>>> are distinct elements in the first few itself. For example:
>>>>>
>>>>> > x = c(1,2,rep(1,100000))
>>>>>
>>>>> I want the answer as FALSE, which is clear from the first two
>>>>> observations itself and we don't need to check for the rest.
>>>>>
>>>>> Does anybody know the most efficient way of doing this?
>>>>>
>>>>> Regards
>>>>> Utkarsh
>>>>>
>>>>>
>>>>> From: Francisco J. Zagmutt <gerifalte28_at_hotmail.com>
>>>>> Date: Tue 30 Aug 2005 - 06:05:20 EST
>>>>>
>>>>> Hi Doran
>>>>>
>>>>> The documentation for isTRUE reads 'isTRUE(x)' is an abbreviation of
>>>>> 'identical(TRUE,x)' so actually Vincent's solutions is "cleaner" than
>>>>> using identical :)
>>>>>
>>>>> Cheers
>>>>>
>>>>> Francisco
>>>>>
>>>>> >From: "Doran, Harold" <hdo...@air.org>
>>>>> >To: <vincent.gou...@act.ulaval.ca>, <r-h...@stat.math.ethz.ch>
>>>>> >Subject: Re: [R] Testing if all elements are equal in a vector/matrix
>>>>> >Date: Mon, 29 Aug 2005 15:49:20 -0400
>>>>> >
>>>>> >See ?identical
>>>>> >
>>>>> >-----Original Message-----
>>>>> >From: r-help-boun...@stat.math.ethz.ch
>>>>> >[mailto:r-help-boun...@stat.math.ethz.ch] On Behalf Of Vincent Goulet
>>>>> >Sent: Monday, August 29, 2005 3:35 PM
>>>>> >To: r-h...@stat.math.ethz.ch
>>>>> >Subject: [R] Testing if all elements are equal in a vector/matrix
>>>>> >
>>>>> >
>>>>> >Is there a canonical way to check if all elements of a vector or
>>>>> >matrix are the same? Solutions below work, but look hackish to me.
>>>>> >
>>>>> > > x <- rep(1, 10)
>>>>> > > all(x == x[1]) # == operator does not provide for small differences
>>>>> >[1] TRUE
>>>>> > > isTRUE(all.equal(x, rep(x[1], length(x)))) # ugly
>>>>> >[1] TRUE
>>>>> >
>>>>> >Best,
>>>>> >
>>>>> >Vincent
>>>>> >--
>>>>> > Vincent Goulet, Associate Professor
>>>>> > École d'actuariat
>>>>> > Université Laval, Québec
>>>>> > Vincent.Goulet_at_act.ulaval.ca http://vgoulet.act.ulaval.ca
>>>>> >
>>>>> >______________________________________________
>>>>> >r-h...@stat.math.ethz.ch mailing list
>>>>> >https://stat.ethz.ch/mailman/listinfo/r-help
>>>>> >PLEASE do read the posting guide!
>>>>> >http://www.R-project.org/posting-guide.html
>>>>>
>>>>
>>>> --
>>>> Jim Holtman
>>>> Cincinnati, OH
>>>> +1 513 646 9390
>>>>
>>>> What is the problem that you are trying to solve?
>>>>
>>>
>>
>> --
>> Brian D. Ripley,                  rip...@stats.ox.ac.uk
>> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
>> University of Oxford,             Tel:  +44 1865 272861 (self)
>> 1 South Parks Road,                     +44 1865 272866 (PA)
>> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
>

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.