Re: [R] Testing if all elements are equal in a vector/matrix

utkarshsinghal Wed, 17 Jun 2009 06:39:04 -0700

I will wait for the next version (2.9.1); for now I am using Petr's
suggestion, i.e.,
(x[1]*length(x))==sum(x)
which significantly reduced the run time.
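
One caveat with this shortcut: equal sums do not guarantee equal elements, so the test can return TRUE for a vector whose values differ. A minimal sketch with a made-up vector (the variable names are only illustrative):

```r
x <- c(2, 1, 3)                               # elements are clearly not all equal
sum_check  <- (x[1] * length(x)) == sum(x)    # compares 2 * 3 with 2 + 1 + 3
elem_check <- all(x == x[1])                  # element-wise comparison
print(c(sum_check = sum_check, elem_check = elem_check))
```

Here the sum-based test and the element-wise test disagree, so a TRUE from the shortcut is not conclusive on its own.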


The problem now is that there might be only small differences, say, of the
order of 10^-10, which I want to ignore.

So I used:
isTRUE(all.equal((x[1]*length(x)),sum(x)))
as suggested in the documentation of all.equal.

But this again increased the run time, by a factor of *five*.

1) Is there any faster way of doing the same?
2) Will the function "anyDuplicated" treat almost equal values as
duplicated or not? Actually, I need both options.
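
On both questions, a rough sketch under some assumptions: comparing the spread of the values against a tolerance avoids the overhead of all.equal, and anyDuplicated compares values exactly, so nearly equal values would need rounding first. The helper name all_nearly_equal, the tolerance 1e-8, and the rounding to 8 digits are choices made only for illustration; none of them come from the thread or from base R.

```r
# Hypothetical helper: TRUE when max and min differ by at most tol
# (an absolute tolerance; the default value is an assumption).
all_nearly_equal <- function(x, tol = 1e-8) {
  if (length(x) < 2L) return(TRUE)   # nothing to compare
  r <- range(x)                      # min and max, computed in C
  (r[2] - r[1]) <= tol
}

x <- c(1, 1 + 1e-10, 1)
all_nearly_equal(x)               # differences of order 1e-10 are ignored
all_nearly_equal(c(1, 2, 1))      # a real difference is detected

# anyDuplicated uses exact comparison: 1 and 1 + 1e-10 are distinct values.
anyDuplicated(c(1, 1 + 1e-10))
# Rounding first makes nearly equal values collide (assumed precision: 8 digits).
anyDuplicated(round(c(1, 1 + 1e-10), 8))
```

Note that all.equal uses a relative tolerance by default, whereas this sketch uses an absolute one, and that rounding can still separate two close values that fall on either side of a rounding boundary.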


Regards
Utkarsh



Prof Brian Ripley wrote:
> On Tue, 16 Jun 2009, Prof Brian Ripley wrote:
>
>> On Tue, 16 Jun 2009, jim holtman wrote:
>>
>>> I think the only way that you are going to get it to stop on the first
>>> mismatch is to write your own function in C if you are concerned about the
>>> time.  Matching on character vectors will be even more costly since it is
>>> having to loop to check the equality of each character in each element.
>>> This is one of the places it might pay to convert to factors and then the
>>> comparison only uses the integer values assigned to the factors.
>>
>> Not so in a recent R: comparison of character vectors is now done by 
>> comparing pointers in the first instance so (at least on a 32-bit 
>> platform) is as fast as comparing integers.  And on x86_64 Linux:
>>
>> > x <- as.character(c(1,2,rep(1,10000000)))
>> > system.time(print(all(x[1] == x)))
>> [1] FALSE
>>   user  system elapsed
>>  0.123   0.019   0.142
>>
>> > system.time(xx <- as.factor(x))
>>   user  system elapsed
>>  9.874   0.284  10.159
>> > system.time(print(all(xx[1] == xx)))
>> [1] FALSE
>>   user  system elapsed
>>  0.511   0.145   0.656
>>
>> Recent pre-release versions of R (e.g. 2.9.1 beta) allow
>>
>> > system.time(anyDuplicated(x))
>>   user  system elapsed
>>  0.034   0.078   0.113
>> > system.time(anyDuplicated(xx))
>>   user  system elapsed
>>  0.037   0.076   0.113
>
> I'm sorry, a line got reverted here: I had edited this to say
>
> 'which is a C-level speedup of the sort the original poster seemed to 
> be looking for'
>
>>
>>>
>>> On Tue, Jun 16, 2009 at 8:31 AM, utkarshsinghal <
>>> utkarsh.sing...@global-analytics.com> wrote:
>>>
>>>> Hi Jim,
>>>>
>>>> What you are saying is correct. However, my computer might not have the
>>>> same speed, and I am getting the following for 10M entries:
>>>>
>>>>    user  system elapsed
>>>>   0.559   0.038   0.607
>>>>
>>>> Moreover, in the case of character vectors, it more than doubles.
>>>>
>>>> In my modeling, which is already highly time consuming, I need to do this
>>>> check for a few thousand vectors, and each vector can easily have 10M
>>>> entries. So I am just looking for any possibilities of time saving. I am
>>>> pretty sure that whenever the elements are not all equal, it can be
>>>> concluded from any few entries (most of the time). It would be worthwhile
>>>> if I could find a way that stops checking further the moment it finds two
>>>> distinct elements.
>>>>
>>>> Regards
>>>> Utkarsh
>>>>
>>>>
>>>>
>>>> jim holtman wrote:
>>>>
>>>> Just check that the first (or any other element) is equal to all the rest:
>>>>
>>>> > x = c(1,2,rep(1,10000000)) # 10,000,000
>>>> > system.time(print(all(x[1] == x)))
>>>> [1] FALSE
>>>>    user  system elapsed
>>>>    0.18    0.00    0.19
>>>>
>>>>>
>>>> This was for 10M entries.
>>>>
>>>> On Tue, Jun 16, 2009 at 7:42 AM, utkarshsinghal <
>>>> utkarsh.sing...@global-analytics.com> wrote:
>>>>
>>>>>
>>>>> Hi All,
>>>>>
>>>>> There are several replies to the question below, but I think there must
>>>>> exist a better way of doing so.
>>>>> I just want to check whether all the elements of a vector are the same. My
>>>>> vector has one million elements and it is highly likely that there are
>>>>> distinct elements among the first few. For example:
>>>>>
>>>>> > x = c(1,2,rep(1,100000))
>>>>>
>>>>> I want the answer to be FALSE, which is clear from the first two
>>>>> observations alone, so we don't need to check the rest.
>>>>>
>>>>> Does anybody know the most efficient way of doing this?
>>>>>
>>>>> Regards
>>>>> Utkarsh
>>>>>
>>>>>
>>>>>
>>>>> From: Francisco J. Zagmutt <gerifalte28_at_hotmail.com>
>>>>> Date: Tue 30 Aug 2005 - 06:05:20 EST
>>>>>
>>>>>
>>>>> Hi Doran
>>>>>
>>>>> The documentation for isTRUE reads 'isTRUE(x)' is an abbreviation of
>>>>> 'identical(TRUE,x)' so actually Vincent's solution is "cleaner" than
>>>>> using identical :)
>>>>>
>>>>> Cheers
>>>>>
>>>>> Francisco
>>>>>
>>>>> > From: "Doran, Harold" <hdo...@air.org>
>>>>> > To: <vincent.gou...@act.ulaval.ca>, <r-h...@stat.math.ethz.ch>
>>>>> > Subject: Re: [R] Testing if all elements are equal in a vector/matrix
>>>>> > Date: Mon, 29 Aug 2005 15:49:20 -0400
>>>>> >
>>>>> > See ?identical
>>>>> >
>>>>> > -----Original Message-----
>>>>> > From: r-help-boun...@stat.math.ethz.ch
>>>>> > [mailto:r-help-boun...@stat.math.ethz.ch] On Behalf Of Vincent Goulet
>>>>> > Sent: Monday, August 29, 2005 3:35 PM
>>>>> > To: r-h...@stat.math.ethz.ch
>>>>> > Subject: [R] Testing if all elements are equal in a vector/matrix
>>>>> >
>>>>> >
>>>>> > Is there a canonical way to check if all elements of a vector or
>>>>> > matrix are the same? Solutions below work, but look hackish to me.
>>>>> >
>>>>> > > x <- rep(1, 10)
>>>>> > > all(x == x[1]) # == operator does not provide for small differences
>>>>> > [1] TRUE
>>>>> > > isTRUE(all.equal(x, rep(x[1], length(x)))) # ugly
>>>>> > [1] TRUE
>>>>> >
>>>>> > Best,
>>>>> >
>>>>> > Vincent
>>>>> > --
>>>>> >  Vincent Goulet, Associate Professor
>>>>> >  École d'actuariat
>>>>> >  Université Laval, Québec
>>>>> >  Vincent.Goulet_at_act.ulaval.ca   http://vgoulet.act.ulaval.ca
>>>>> >
>>>>> > ______________________________________________
>>>>> > r-h...@stat.math.ethz.ch mailing list
>>>>> > https://stat.ethz.ch/mailman/listinfo/r-help
>>>>> > PLEASE do read the posting guide!
>>>>> > http://www.R-project.org/posting-guide.html
>>>>>
>>>>>
>>>>>        [[alternative HTML version deleted]]
>>>>>
>>>>>
>>>>> ______________________________________________
>>>>> R-help@r-project.org mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>> PLEASE do read the posting guide
>>>>> http://www.R-project.org/posting-guide.html
>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>>
>>>>>
>>>>
>>>>
>>>> -- 
>>>> Jim Holtman
>>>> Cincinnati, OH
>>>> +1 513 646 9390
>>>>
>>>> What is the problem that you are trying to solve?
>>>>
>>>>
>>>>
>>>
>>>
>>> -- 
>>> Jim Holtman
>>> Cincinnati, OH
>>> +1 513 646 9390
>>>
>>> What is the problem that you are trying to solve?
>>>
>>>     [[alternative HTML version deleted]]
>>>
>>>
>>
>> -- 
>> Brian D. Ripley,                  rip...@stats.ox.ac.uk
>> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
>> University of Oxford,             Tel:  +44 1865 272861 (self)
>> 1 South Parks Road,                     +44 1865 272866 (PA)
>> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
>>
>>
>


