The x's and y's are different sets--210,000 values altogether. That is
really the issue--they can't just be sorted, at least not as far as I
can see.
Sean
On Mar 3, 2005, at 5:38 PM, Huntsinger, Reid wrote:
When you say the 130,000 points are from the empirical distribution,
how did you get them? Is each one really one of the values of y? If you
sorted y first, would you know which one (i.e. which index) each x is?
(Sorting 80,000 elements took essentially no time at all on my
sub-gigahertz Pentium III.)
But maybe that's not an option... more details would help.
Reid Huntsinger
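One way to read the sorting suggestion above, offered as a minimal sketch and not as what either poster actually ran: sort y once, then use base R's findInterval() to count, for every x, how many y values are less than or equal to it. The x values do not have to be members of y for this to work. The simulated data and the helper name rank_pvalue are assumptions made for illustration.

```r
# Hypothetical helper: one-sided empirical p-value, i.e. the fraction of
# y values strictly greater than each element of x.
rank_pvalue <- function(x, y) {
  y_sorted <- sort(y)                 # the only sort needed: y, once
  n_le <- findInterval(x, y_sorted)   # n_le[i] = number of y <= x[i]
  (length(y) - n_le) / length(y)      # same as sum(y > x[i]) / length(y)
}

# Simulated stand-ins for the real data (assumption: plain numeric vectors)
set.seed(1)
y <- rnorm(80000)
x <- rnorm(130000)

p.val <- rank_pvalue(x, y)
```

The cost is one sort of y plus a binary search per x, so the work grows roughly with length(x) * log(length(y)) instead of length(x) * length(y).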
-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED]] On Behalf Of Sean Davis
Sent: Thursday, March 03, 2005 5:22 PM
To: r-help
Subject: [R] Rank-based p-value on large dataset
I have a fairly simple problem--I have about 80,000 values (call them
y) that I am using as an empirical distribution and I want to find the
p-value (never mind the multiple testing issues here, for the time
being) of 130,000 points (call them x) from the empirical distribution.
I typically do that (for one-sided test) something like
    p.val <- numeric(length(x))
    for (i in seq_along(x))
        p.val[i] <- sum(y > x[i]) / length(y)
and repeat for all i. However, length(x) is large here as is
length(y), so this process takes quite a long time. Any suggestions?
Thanks,
Sean
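For comparison with the loop described above, here is a hedged sketch of a vectorised alternative using base R's ecdf(), which returns a function giving the proportion of y less than or equal to its argument. Subtracting that from 1 gives the same one-sided quantity as sum(y > x[i]) / length(y). The simulated vectors are assumptions, and x is kept short here only so the loop finishes quickly.

```r
# Simulated stand-ins for the real data (assumption: plain numeric vectors)
set.seed(42)
y <- rnorm(80000)
x <- rnorm(2000)

# Baseline: the explicit loop from the message
p.loop <- numeric(length(x))
for (i in seq_along(x))
  p.loop[i] <- sum(y > x[i]) / length(y)

# Vectorised: ecdf(y)(x) is the proportion of y <= x, so the complement
# is the proportion of y strictly greater than x
Fy <- ecdf(y)
p.ecdf <- 1 - Fy(x)

# The two agree up to floating-point rounding
agree <- isTRUE(all.equal(p.loop, p.ecdf))
```

The ecdf object is built once from y and can then be evaluated on all of x in a single call, which avoids rescanning y for every element of x.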
______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html