The KS statistic is based on the largest vertical distance between the CDFs of the two samples. It doesn't matter where along the gradient this largest gap occurs, so a gap in the middle of the curves counts just as much as one at the high-stress end.

This paper has helped me a lot in thinking about this type of problem:

Hirzel, A. H., Hausser, J., Chessel, D. and Perrin, N. 2002. Ecological-niche factor analysis: How to compute habitat-suitability maps without absence data? Ecology 83: 2027-2036.

Instead of a KS test, you could compare means AND variances to check whether your insects use the habitat neutrally with respect to a measured variable. If a factor turns out to be limiting, one would expect the mean and/or the variance of the distribution at occupied sites to differ from that of all sites. Of course, one can construct situations in which this approach fails (e.g. a bimodal distribution).
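To make both points concrete, here is a rough Python sketch using only the standard library. Everything in it is an assumption for illustration: the function names ks_statistic and neutral_use_test are made up, the data are synthetic, and the random-subset (permutation) comparison is just one possible way to compare means and variances, not the procedure from the Hirzel et al. paper.

```python
import bisect
import random
import statistics


def ks_statistic(a, b):
    """Return (D, x): the largest gap between the two empirical CDFs
    and the first value x at which that gap is reached."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    best_d, best_x = 0.0, None
    for x in sorted(set(a_sorted) | set(b_sorted)):
        # Empirical CDF = fraction of observations <= x
        f_a = bisect.bisect_right(a_sorted, x) / len(a_sorted)
        f_b = bisect.bisect_right(b_sorted, x) / len(b_sorted)
        gap = abs(f_a - f_b)
        if gap > best_d:
            best_d, best_x = gap, x
    return best_d, best_x


def neutral_use_test(all_sites, occupied, n_draws=2000, seed=0):
    """Compare mean and variance at occupied sites against random
    subsets of all sites of the same size (neutral habitat use).
    Returns approximate two-sided p-values for mean and variance."""
    rng = random.Random(seed)
    n = len(occupied)
    all_mean = statistics.fmean(all_sites)
    all_var = statistics.variance(all_sites)
    obs_mean_dev = abs(statistics.fmean(occupied) - all_mean)
    obs_var_dev = abs(statistics.variance(occupied) - all_var)
    hits_mean = hits_var = 0
    for _ in range(n_draws):
        draw = rng.sample(all_sites, n)  # random subset, without replacement
        if abs(statistics.fmean(draw) - all_mean) >= obs_mean_dev:
            hits_mean += 1
        if abs(statistics.variance(draw) - all_var) >= obs_var_dev:
            hits_var += 1
    # Add-one correction keeps the p-values strictly above zero
    return {"p_mean": (hits_mean + 1) / (n_draws + 1),
            "p_var": (hits_var + 1) / (n_draws + 1)}


# Synthetic example: % agriculture at 500 hypothetical sites; the taxon is
# "found" at every third site except in a mid-gradient band (30-50 %).
all_sites = [i * 0.2 for i in range(500)]
occupied = [x for i, x in enumerate(all_sites)
            if i % 3 == 0 and not (30 <= x < 50)]

d, x_at_d = ks_statistic(all_sites, occupied)
print("KS D =", round(d, 3), "reached at", x_at_d, "% agriculture")
print(neutral_use_test(all_sites, occupied, n_draws=1000))
```

Printing where the largest gap sits shows directly whether the KS result is driven by the middle of the gradient rather than the high-agriculture end. If SciPy is available, scipy.stats.ks_2samp should give the same D together with a p-value.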
HTH
Volker

Ryan Utz wrote:
> I've come across a statistical mystery. Any insight would be much
> appreciated.
>
> I attempted to attach a figure to illustrate the problem, but the listserv
> rejected it. I can send it to anyone who thinks they may help.
>
> The data concerns presence/absence data for aquatic insects across several
> thousand streams. We're looking to find the effect of land use change (i.e.
> urban, agriculture, etc) on individual taxa. In particular, we want to
> determine if a specific land use has the potential to eliminate a taxon once
> development reaches a certain point.
>
> To do this, my first approach went as follows:
>
> 1) Narrow down the data to the appropriate subset of physiographic provinces
> where the specific insect is found
>
> 2) Compare the cumulative frequency distribution of all watersheds (expected
> CDF) to the actual CDF where the insects were collected (observed CDF) based
> on different levels of land cover change
>
> 3) Run a Kolmogorov-Smirnov test between the 2 distributions
>
> This worked fine until I came across the scenario of the insect Baetis in
> relation to the variable agricultural development. The distribution
> where the insect Baetis occurred was very close to the expected. In fact, in
> the extreme right end of the distribution (where the stressor is highest),
> observed vs. expected distributions were almost identical. We visually
> interpreted this to mean agriculture had little effect on whether or not
> you'd see Baetis in a stream.
>
> However...
>
> The KS test stated otherwise; the distributions were highly significantly
> different. I attributed this to the high power of the KS test and the
> sensitivity of the test near the middle of the curve. Other similar tests
> behaved similarly. But as you would see in the figure (if you'd like me to
> send it), Baetis survives at high levels of agriculture, even if variation
> between the curves occurs earlier on (maybe ~40% agricultural development).
>
> Does anyone know of a better approach? I've tried a number of other
> tests/approaches, but nothing gets me anywhere.
>
> Many thanks to anyone who could possibly help.
>
> -Ryan Utz
>
> PhD Student
> University of Maryland
> Appalachian Laboratory
