Re: [R] Near function?

Bart Joosen Sun, 11 Feb 2007 07:49:24 -0800

Dear all,

Wolfgang, that's yet another way of calculating it, and an interesting one.
This afternoon I made a few attempts myself, and this is what I came up with:


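near <- function(x, th = 3) {
    aant <- NROW(x)                  # number of elements
    if (aant < 2) return(x)          # nothing to compare
    y <- order(x)                    # original positions, by increasing value
    x <- x[y]                        # values sorted ascending
    for (i in 1:(aant - 1)) {
        # only the next 'th' sorted neighbours are checked; this is
        # sufficient when the values are distinct integers
        for (j in (i + 1):min(i + th, aant)) {
            # stop when the pair is too far apart or one is already dropped
            if (!((abs(x[i] - x[j]) <= th) && (y[i] != 0) && (y[j] != 0))) break
            # drop whichever of the two came later in the original order
            if (which.max(y[c(i, j)]) == 1) y[i] <- 0 else y[j] <- 0
        }
    }
    y[y == 0] <- NA                  # mark dropped elements
    x[order(y, na.last = NA)]        # survivors, back in original order
}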


It's quite fast (fast enough for my purpose). The only minor thing that 
annoys me is that I write a zero into the y index vector and replace it 
with NA afterwards. I can't put in NA straight away, because you can't 
compare against NA. Besides this, everything works fine (1.08 seconds for an 
integer vector of length 20000, while I'm going to work with one of only about 800).
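
One possible way around the zero placeholder, as a rough sketch only: keep a
separate logical mask next to the index vector, so nothing ever has to be
compared against NA. The names near_mask, ord, xs and keep are made up for
this illustration, and the inner loop still relies on the values being
distinct integers, as in near() above.

```r
# Sketch: same idea as near(), with a logical 'keep' mask instead of
# writing 0 into the index vector. All names here are hypothetical.
near_mask <- function(x, th = 3) {
    n <- length(x)
    if (n < 2) return(x)            # nothing to compare
    ord  <- order(x)                # original positions, by increasing value
    xs   <- x[ord]                  # values sorted ascending
    keep <- rep(TRUE, n)            # keep[k] refers to the k-th sorted value
    for (i in 1:(n - 1)) {
        for (j in (i + 1):min(i + th, n)) {
            # stop when the pair is too far apart or one is already dropped
            if (!(abs(xs[i] - xs[j]) <= th && keep[i] && keep[j])) break
            # drop whichever element came later in the original order
            if (ord[i] > ord[j]) keep[i] <- FALSE else keep[j] <- FALSE
        }
    }
    # survivors, restored to their original order
    x[sort(ord[keep])]
}

near_mask(c(1, 20, 2, 21))   # gives 1 20
```

The result is taken from the original vector through the surviving positions,
so no NA replacement and no second call to order() is needed at the end.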


Thanks to all for your input, it was a great help!


Bart Joosen



----- Original Message ----- 
From: "Wolfgang Huber" <[EMAIL PROTECTED]>
To: "Bart Joosen" <[EMAIL PROTECTED]>
Cc: <[email protected]>
Sent: Sunday, February 11, 2007 4:09 PM
Subject: Re: [R] Near function?


> Dear Bart,
> 
> "hclust" might be useful for this as well:
> 
>   dat = c(1,20,2,21)
> 
>   hc = hclust(dist(dat))
> 
>   thresh = 2
>   ct = cutree(hc, h=thresh)
> 
>   clusteredNumbers = split(dat, ct)
>   firstOne = dat[!duplicated(ct)]
> 
> >  clusteredNumbers
> $`1`
> [1] 1 2
> $`2`
> [1] 20 21
> 
> 
> > firstOne
> [1]  1 20
> 
> 
>  Best wishes
>   Wolfgang
> 
> 
>> 
>> I have an integer vector extracted from a data frame, which is sorted by 
>> another column of the data frame.
>> Now I would like to remove the elements of the vector that are close in 
>> value to others. For example, c(1,20,2,21) should become c(1,20).
>> 
>> I tried to write a function, but for some reason it doesn't work:
>> 
>> x <- 1:20
>> near <- function(x,th) {
>>     nr <- NROW(x)
>>         for (i in 1:(nr-1)){
>>         for (j in (i+1):nr){
>>             if (j > nr) break
>>             t=0
>>             if (abs(x[i] - x[j]) < th) t = 1
>>             if (t== 1) x <- x[-j]
>>             if (t== 1) nr <- nr-1
>>             if (t== 1) j <- (j-1)
>>             cat (" i",i," j",j,"\n")
>>             }} 
>> x
>> }
>> near(x,10)
>> 
>> 
>> This gives you 1  3  7 13 17 while I was expecting 1, 20 as the outcome.
>> If you look at the intermediate results from the cat call, you see 
>> that, after removing an element, the loop skips the next one.
>> 
>> Sorting the vector is not an option; the order is important.
>> I used 1:20 as an example. x <- sample((1:20),20) would be 
>> a bit more representative of our data, but then the output of the 
>> function isn't reproducible.
>> 
>> Is there already an R function that does this? Otherwise, what is 
>> wrong with my code?
>> 
>> 
>> Thanks a lot for your time
>> 
>> 
>> Bart
>> [[alternative HTML version deleted]]
>> 
>> ______________________________________________
>> [email protected] mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
> 
> 
> -- 
> ------------------------------------------------------------------
> Wolfgang Huber  EBI/EMBL  Cambridge UK  http://www.ebi.ac.uk/huber
>