"Robert Dole" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] > "jackson marshmallow" <[EMAIL PROTECTED]> wrote > > > > I need to write a program that will calculate a non-parametric correlation > > between two time series. The series' length is usually about 1000 points. > > > > Let's say for Spearman's rho the minimal cost of calculation equals the > > number of data points N. I will also need to compute the p-value. If I use > > randomization to determine the p-value, and there are P permutations, then > > the cost is N*P operations. > > > > I understand that the more robust statistic is Kendall's tau (or, in this > > case, its variant, Somer's D), but the cost is N-square/2. > > > > The question is this: can I select a limited number of random pairs to > > calculate a valid estimate of Kendall's tau? > > > I wouldn't select a limited number of pairs (e.g. selecting 400 out of > 1000 pairs). >
Well, the problem is that the number of pairs for 1000 points would be
499500...

> Instead, I would select a limited number of permutations of the entire
> set. The total number of possible permutations is, uh, a large number,
> for 1000 pairs. Select some limited, but relatively large, number of
> permutations (e.g. 3000 or so) which would be obtained by leaving
> series 1 as 1, 2, .... 1000 and shuffling series 2 in a random order.
>
> The fact that some of the particular orderings may be repeated is of
> no consequence at all with 1000 points.
>
> Note that this does not give you an exact answer, but one that will
> not vary much from run to run if the number of permutations chosen is
> fairly large.
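
For what it's worth, the shuffling procedure described above could be sketched
roughly as follows. This is only an illustration, not anyone's actual program:
it uses Spearman's rho as the statistic, and the function names (ranks,
pearson, spearman_permutation_test), the two-sided comparison, and the "+1"
correction in the p-value are all my own assumptions rather than anything
stated in the thread. It uses only the Python standard library.

```python
import random


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def pearson(x, y):
    """Plain Pearson correlation; applied to ranks it gives Spearman's rho."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_permutation_test(x, y, n_perm=3000, seed=0):
    """Hypothetical helper: rho plus a two-sided permutation p-value.

    Series 1 is left in place and the ranks of series 2 are shuffled
    n_perm times, as suggested in the quoted advice.
    """
    rng = random.Random(seed)
    rx, ry = ranks(x), ranks(y)  # rank once, then reuse for every shuffle
    observed = pearson(rx, ry)
    shuffled = ry[:]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson(rx, shuffled)) >= abs(observed):
            hits += 1
    # Assumed convention: count the observed ordering as one permutation,
    # so the estimated p-value can never be exactly zero.
    p_value = (hits + 1) / (n_perm + 1)
    return observed, p_value
```

Calling spearman_permutation_test(series1, series2, n_perm=3000) returns the
pair (rho, p). Because the ranks are computed once up front, each shuffle costs
on the order of N operations, so the total is roughly N*P as estimated in the
original question. The p-value will differ slightly between seeds, less so as
n_perm grows.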
