[R] rank function and NA in 2.3.1
Hi. I am using R 2.3.1 on Windows XP, and I am having trouble with the rank function in the presence of numerical NA data. I want the NA's all to get the same rank, but they don't. Here is an example from my session:

    > ct_align_rets_f2$liq[6851:6859]
    [1] 115396 NA 362595 NA 242986 340805 NA 692905 251533
    > rankl=rank(ct_align_rets_f2$liq,na.last=FALSE,ties.method="min")
    > rankl[6851:6859]
    [1] 4392 2424 5535 2425 5037 5451 2426 6625 5082

What am I doing wrong? Is there a way to check whether there's a problem with the data, i.e., whether the NA's somehow have different values? (By the way, I have tried not using na.last, and also different ties.methods, but the NA ranks have never come out equal.)

Thanks!

-- TMK --

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] rank function and NA in 2.3.1
Hi

One workaround could be to change your NA values to some number and do the rank with the average, min or max option.

    > x <- 1:12
    > x[c(5,10)] <- NA
    > rank(x)
    [1] 1 2 3 4 11 5 6 7 8 12 9 10
    > rank(x, ties="average")
    [1] 1 2 3 4 11 5 6 7 8 12 9 10
    > x[which(is.na(x))] <- 999
    > rank(x, ties="average")
    [1] 1.0 2.0 3.0 4.0 11.5 5.0 6.0 7.0 8.0 11.5 9.0 10.0

Or you can go through the source code of rank and change it so that it behaves as you wish.

HTH
Petr

On 11 Jan 2007 at 18:29, Talbot Katz wrote:

> From: Talbot Katz [EMAIL PROTECTED]
> To: r-help@stat.math.ethz.ch
> Date sent: Thu, 11 Jan 2007 18:29:16 -0500
> Subject: [R] rank function and NA in 2.3.1
>
> Hi. I am using R 2.3.1 on Windows XP, and I am having trouble with the rank function in the presence of numerical NA data. I want the NA's all to get the same rank, but they don't. [...]

Petr Pikal
[EMAIL PROTECTED]
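For what it's worth, ?rank documents that NA values are never considered equal: with na.last = TRUE or FALSE they get distinct ranks in the order in which they occur, which matches the 2424, 2425, 2426 in the question. Another possible workaround, as a minimal sketch only: it assumes the na.last = "keep" option is available in your version of R, the names liq, n_na and r are made up for illustration, and the vector is just the nine values shown in the question.

```r
# Nine values copied from the question; 'liq' is an illustrative name
liq <- c(115396, NA, 362595, NA, 242986, 340805, NA, 692905, 251533)

# is.na() is the usual check for which entries are missing
n_na <- sum(is.na(liq))

# Rank only the non-missing values; NAs stay NA in the result
r <- rank(liq, na.last = "keep", ties.method = "min")

# Mimic na.last = FALSE: every NA shares rank 1, the rest follow after them
r <- r + n_na
r[is.na(liq)] <- 1
r   # 4 1 8 1 5 7 1 9 6
```

To put the NAs last instead, leave the non-missing ranks unshifted and assign sum(!is.na(liq)) + 1 to the NA positions.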
Re: [R] Rank Function
    > y <- round(c(0.68,0.95,b,c,d),2)
    > rank(y)
    [1] 3.5 5.0 1.0 2.0 3.5

On 10/10/06, Li Zhang [EMAIL PROTECTED] wrote:

> Does anyone know why the two rank functions give different results? I need to use the rank function in a for loop, so the sequence to be ranked is given values in the form of part (1). How can I use assignment like in part (1) to get correct ranks as in part (2)?
>
> Thank You
>
> Part (1)
>
>     i <- 1.94
>     b <- 0.95-i
>     c <- 1.73-i
>     d <- 2.62-i
>     y <- c(0.68,0.95,b,c,d)
>     y
>     0.68 0.95 -0.99 -0.21 0.68
>     rank(y)
>     3 5 1 2 4
>
> Part (2)
>
>     rank(c(0.68,0.95,-0.99,-0.21,0.68))
>     3.5 5.0 1.0 2.0 3.5

--
David Barron
Said Business School
University of Oxford
Park End Street
Oxford OX1 1HP
Re: [R] Rank Function
Because y[1] and y[5] are not the same in Part1 but are in Part2:

    > # using y from Part1
    > y[5] - y[1]
    [1] 1.110223e-16

You could round your numbers to 2 digits, say:

    > rank(round(100*y))  # y is from Part1
    [1] 3.5 5.0 1.0 2.0 3.5

On 10/10/06, Li Zhang [EMAIL PROTECTED] wrote:

> Does anyone know why the two rank functions give different results? I need to use the rank function in a for loop, so the sequence to be ranked is given values in the form of part (1). [...]
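Since the original question mentions ranking inside a for loop, here is a rough sketch of how the rounding step might sit in such a loop. It assumes two decimal places is the right precision for the data; the second shift value 0.5 and the names shifts and ranks are made up for illustration.

```r
# Hypothetical shift values; 1.94 is the one from the question
shifts <- c(1.94, 0.5)
ranks <- vector("list", length(shifts))

for (k in seq_along(shifts)) {
  i <- shifts[k]
  y <- c(0.68, 0.95, 0.95 - i, 1.73 - i, 2.62 - i)
  # Round to the precision of the inputs so computed values can tie
  ranks[[k]] <- rank(round(y, 2))
}

ranks[[1]]   # 3.5 5.0 1.0 2.0 3.5, as in Part 2
ranks[[2]]   # 2 3 1 4 5
```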
Re: [R] Rank Function
On Tue, 10 Oct 2006, Gabor Grothendieck wrote:

> Because y[1] and y[5] are not the same in Part1 but are in Part2:
>
>     # using y from Part1
>     y[5] - y[1]
>     [1] 1.110223e-16

Yes, this is FAQ 7.31: Why doesn't R think these numbers are equal?

    > i <- 1.94
    > d <- 2.62-i
    > print(0.68, digits=16)
    [1] 0.68
    > print(d, digits=16)
    [1] 0.6800000000000002
    > identical(d, 0.68)
    [1] FALSE
    > all.equal(d, 0.68)
    [1] TRUE

The internal rank function compares the values exactly and makes no allowance for numeric fuzz.

> You could round your numbers to 2 digits, say:
>
>     rank(round(100*y))  # y is from Part1
>     [1] 3.5 5.0 1.0 2.0 3.5
>
> On 10/10/06, Li Zhang [EMAIL PROTECTED] wrote:
> > Does anyone know why the two rank functions give different results? [...]

--
Roger Bivand
Economic Geography Section, Department of Economics,
Norwegian School of Economics and Business Administration,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: [EMAIL PROTECTED]
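To find out whether a vector contains such near-ties before ranking it, one possibility is to look at the gaps between the sorted values. This is only a sketch: the tolerance sqrt(.Machine$double.eps) is an assumption (it is roughly the default scale all.equal works with) and may not suit every data set, and gaps and tol are illustrative names.

```r
i <- 1.94
y <- c(0.68, 0.95, 0.95 - i, 1.73 - i, 2.62 - i)

# Gaps between neighbouring values after sorting
gaps <- diff(sort(y))

# Assumed tolerance, about 1.5e-8
tol <- sqrt(.Machine$double.eps)

# TRUE when two values differ, but by less than the tolerance
near_tie <- any(gaps > 0 & gaps < tol)
near_tie

identical(y[5], y[1])          # exact comparison, what rank effectively sees
isTRUE(all.equal(y[5], y[1]))  # comparison with tolerance
```

If near_tie is TRUE, rounding the vector before calling rank is probably what you want.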
Re: [R] Rank Function
Li Zhang wrote:

> Does anyone know why the two rank functions give different results? [...]
>
> Part (1)
>
>     i <- 1.94
>     b <- 0.95-i
>     c <- 1.73-i
>     d <- 2.62-i
>     y <- c(0.68,0.95,b,c,d)
>     y
>     0.68 0.95 -0.99 -0.21 0.68
>     rank(y)
>     3 5 1 2 4
>
> Part (2)
>
>     rank(c(0.68,0.95,-0.99,-0.21,0.68))
>     3.5 5.0 1.0 2.0 3.5

You have specified the exact numbers in part (2). Try part (1) with the following:

    rank(zapsmall(y))

zapsmall removes tiny floating point errors that are not visible with the default representation of numbers.

Jim
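One thing to keep in mind is that zapsmall rounds relative to the largest absolute value in the vector, so with mixed magnitudes it can also merge values that are genuinely different. A small sketch; the vector z is made up, and digits = 7 is passed explicitly on the assumption that it matches the usual default of getOption("digits").

```r
i <- 1.94
y <- c(0.68, 0.95, 0.95 - i, 1.73 - i, 2.62 - i)

# The floating point error in y[5] is removed, so the two 0.68 values tie
r_y <- rank(zapsmall(y, digits = 7))
r_y                               # 3.5 5.0 1.0 2.0 3.5

# Made-up vector with very different magnitudes
z <- c(1e-10, 2e-10, 1e6)
rank(z)                           # 1 2 3
r_z <- rank(zapsmall(z, digits = 7))
r_z                               # 1.5 1.5 3.0: both small values become 0
```

When the data span several orders of magnitude, rounding to a fixed number of decimals or significant digits may be the safer choice.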
[R] rank function
Hello!

I have a question about a rank function that I'm working on now. Even though my English is not good, I hope you understand what I'm asking. This is a program that I wrote (it must not use the built-in function from R):

    data<-sample(c(1:100),10)
    rank.data <- rep(0,length(data))
    for(i in 1:length(data)){
      for(j in 1:length(data)){
        if(data[i]<data[j]){
          rank.data[j] <- rank.data[j] + 1
        }
      }
    }
    rank.data <- rank.data + 1
    data
    rank.data
    rank(data)

I wrote it again because I wanted to reduce it to 55 iterations for efficiency of calculation:

    data<-sample(c(1:100),10)
    test.data<-data
    n<-length(data)
    min.data<-1000
    for(j in 1:10){
      for(i in 1:n){
        if(data[i]<min.data){
          min.data <- data[i]
        }}
      rank.data[rank.data==min.data]<-j
      data <- data[data!=min.data]
    }
    test.data
    rank.data
    rank(test.data)

This is the output:

    Error in if (data[i] < min.data) { : missing value where TRUE/FALSE needed
    > test.data
    [1] 97 25 90 76 85 32 79 8 39 35
    > rank.data
    [1] 3 9 7 4 6 95 1 1 1 5
    > rank(test.data)
    [1] 10 2 9 6 8 3 7 1 5 4
    >

When I copied the source into R, an error occurred instead of the result that I wanted. How can I get the correct results? And how can I correct the second program?
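The error in the second program comes from data getting shorter inside the outer loop while the inner loop still runs up to the original n, so data[i] eventually becomes NA. In addition, min.data is never reset, and rank.data is compared with data values instead of being indexed by position. One possible correction, as a sketch only: it keeps a vector of positions not yet ranked (remaining and min.pos are names introduced here), assumes the values are distinct, which sample() without replacement guarantees, and calls rank() only to check the result. The set.seed() call is my addition, to make the run repeatable.

```r
set.seed(1)                      # assumption: fixed seed for a repeatable run
data <- sample(1:100, 10)
n <- length(data)
rank.data <- rep(0, n)
remaining <- seq_len(n)          # positions in 'data' that have no rank yet

for (j in seq_len(n)) {
  # Find the position (within 'remaining') of the smallest unranked value
  min.pos <- 1
  for (k in seq_along(remaining)) {
    if (data[remaining[k]] < data[remaining[min.pos]]) {
      min.pos <- k
    }
  }
  rank.data[remaining[min.pos]] <- j   # the j-th smallest value gets rank j
  remaining <- remaining[-min.pos]     # drop it from the unranked positions
}

data
rank.data
all(rank.data == rank(data))     # check against the built-in function
```

The inner loop runs 10, 9, ..., 1 times, which gives the 55 iterations mentioned in the question. With tied values this version would assign distinct ranks in order of occurrence rather than averaged ranks.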