"jackson marshmallow" <[EMAIL PROTECTED]> wrote
>
> I need to write a program that will calculate a non-parametric correlation
> between two time series. The series' length is usually about 1000 points.
>
> Let's say for Spearman's rho the minimal cost of calculation equals the
> number of data points N. I will also need to compute the p-value. If I use
> randomization to determine the p-value, and there are P permutations, then
> the cost is N*P operations.
>
> I understand that the more robust statistic is Kendall's tau (or, in this
> case, its variant, Somer's D), but the cost is N-square/2.
>
> The question is this: can I select a limited number of random pairs to
> calculate a valid estimate of Kendall's tau?
>

I wouldn't select a limited number of pairs (e.g. selecting 400 out of 1000 pairs).
Instead, I would select a limited number of permutations of the entire set.
The total number of possible permutations of 1000 pairs is 1000!, far too
many to enumerate. Choose a limited but relatively large number of
permutations (e.g. 3000 or so), each obtained by leaving series 1 in its
original order 1, 2, ..., 1000 and shuffling series 2 into a random order.
The p-value is then estimated as the fraction of shuffles whose statistic is
at least as extreme as the one observed on the unshuffled data. That some
particular orderings may be repeated is of no consequence at all with 1000
points. Note that this does not give you an exact answer, but one that will
not vary much from run to run if the number of permutations chosen is fairly
large.
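
The shuffling procedure described above can be sketched in a few lines of
Python. This is only an illustration under stated assumptions, not a
definitive implementation: the function names kendall_tau and
permutation_p_value are made up for this example, tau is computed by direct
O(N^2) pair counting and assumes no ties, the test is two-sided, and the
add-one correction in the p-value is one common convention rather than
something prescribed in this thread.

```python
import random


def kendall_tau(x, y):
    """Kendall's tau-a by direct O(N^2) pair counting (assumes no ties)."""
    n = len(x)
    s = 0  # concordant pairs minus discordant pairs
    for i in range(n - 1):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            if prod > 0:
                s += 1
            elif prod < 0:
                s -= 1
    return s / (n * (n - 1) / 2)


def permutation_p_value(x, y, stat=kendall_tau, n_perm=3000, seed=None):
    """Estimate a two-sided p-value by shuffling y while x stays fixed.

    Returns (observed statistic, estimated p-value).
    """
    rng = random.Random(seed)
    observed = stat(x, y)
    shuffled = list(y)  # work on a copy so the caller's data is untouched
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(stat(x, shuffled)) >= abs(observed):
            hits += 1
    # Add-one correction (an assumption of this sketch): it keeps the
    # estimate from being exactly zero when no shuffle is as extreme.
    return observed, (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Small synthetic example: y is x with three adjacent pairs swapped.
    x = list(range(20))
    y = x[:]
    for k in (2, 9, 15):
        y[k], y[k + 1] = y[k + 1], y[k]
    tau, p = permutation_p_value(x, y, n_perm=500, seed=1)
    print(tau, p)
```

With N = 1000 and 3000 shuffles, this pure-Python pair count would be slow
(about N*N/2 comparisons per shuffle). Any cheaper statistic can be passed
in through the stat argument, or a faster tau routine such as
scipy.stats.kendalltau could be substituted.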
