Re: [MORPHMET] Re: Stability of p-values (physignal and testing for morphological integration)

2015-06-03 Thread Dennis E. Slice



On 6/3/15 6:47 AM, Tsung Fei Khang wrote:

@Aki: Thank you. 1/#iterations is problematic because one could then get
arbitrarily small p-values... should converge to some value (however
small) as the number of iterations exceeds some threshold, which is
dependent on data set.


^ Note, that is NOT what Aki said. What he said was that the lowest 
reportable p-value is a function of your number of iterations. If you do 
9 random draws and compare them with your observed data, the lowest p-value 
you could possibly get is 1/10=0.1 (i.e., you would never see anything 
lower). If you did 99, then the lowest would be 0.01, 999 -> 0.001, etc.
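
A small sketch may make the arithmetic above concrete. It assumes the common convention of counting the observed value as one of the n + 1 values being compared; the function names are made up for illustration and do not come from geomorph or any other package.

```python
def min_p_value(n_random: int) -> float:
    """Smallest p-value that can be reported when the observed value
    is counted together with n_random randomized values."""
    if n_random < 1:
        raise ValueError("need at least one random draw")
    return 1.0 / (n_random + 1)


def permutation_p_value(observed: float, randomized: list) -> float:
    """Proportion of all values (observed included) that are at least
    as large as the observed one."""
    n_extreme = sum(1 for r in randomized if r >= observed)
    return (n_extreme + 1) / (len(randomized) + 1)


for n in (9, 99, 999):
    print(n, "random draws -> lowest possible p =", min_p_value(n))
```

Even when the observed value beats every random draw, permutation_p_value cannot return less than min_p_value for that number of draws.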


Under the null model, on average 10% of the p-values would be below 
0.1, 5% below 0.05, 1% below 0.01, etc. The actual proportion you get 
for your data will vary around some value, but that value is determined 
by where your observed value falls relative to the randomized results. 
The estimate becomes more precise as the number of iterations increases, 
but it does not get lower because of those iterations.
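
The point about precision can be illustrated with a rough calculation. This is a sketch under two assumptions that are not stated in the thread: that each random draw independently lands at or beyond the observed value with a fixed probability p_true, and that the count of such draws is therefore binomial. The value 0.05 is purely illustrative.

```python
import math


def mc_standard_error(p_true: float, n_random: int) -> float:
    """Approximate standard error of a resampling p-value estimate,
    treating the number of extreme draws as Binomial(n_random, p_true)."""
    if not 0.0 <= p_true <= 1.0:
        raise ValueError("p_true must lie in [0, 1]")
    if n_random < 1:
        raise ValueError("need at least one random draw")
    return math.sqrt(p_true * (1.0 - p_true) / n_random)


# The target value stays at 0.05; only the scatter around it shrinks.
for n in (100, 1000, 10000):
    print(n, "iterations -> standard error ~", round(mc_standard_error(0.05, n), 4))
```

Under these assumptions, a hundredfold increase in iterations narrows the scatter tenfold, while the value being estimated stays where it was.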


-ds

--
If a response is important to you, keep trying -> I receive 50-100 msgs/day

--
MORPHMET may be accessed via its webpage at http://www.morphometrics.org



Re: [MORPHMET] Re: Stability of p-values (physignal and testing for morphological integration)

2015-06-03 Thread Novack-Gottshall, Philip M.

There was a similar concern (about arbitrary p-values in simulations) expressed 
in a nice, recent Oikos paper, too:

White, J. W., A. Rassweiler, J. F. Samhouri, A. C. Stier, and C. White. 2014. 
Ecologists should not use statistical significance tests to interpret 
simulation model results. Oikos 123(4):385-388.

I think some of the concerns are also worth heeding for those doing resampling. 
It's always worth doing a quick sensitivity analysis (running the resampling 
algorithm for increasing numbers of replicates) to identify the point at which 
p-values (or other stats you're interested in) become stable. But there's no 
need to run huge numbers of replicates once the values have stabilized. See 
p. 40 in the paper below (focused more on general paleontological applications 
than morphometrics) for discussion and recommendations on this issue, with some 
advice from the literature. The online supplement at 
http://www.paleosoc.org/shortcourse2010/Resampling_KandN-G_Appendices_Oct11.doc 
has some (clunky) sample R code (section 2.8) demonstrating this.

Kowalewski, M., and P. Novack-Gottshall. 2010. Resampling methods in 
paleontology. Pp. 19-54. In J. Alroy, and G. Hunt, eds. Quantitative Methods in 
Paleobiology. Short Courses in Paleontology 16. Paleontological Society and 
Paleontological Research Institute, Ithaca, NY.
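
For readers who want to try the sensitivity analysis described above, here is a minimal sketch in Python. It is not the R code from the supplement. The two-group difference-in-means test is only a stand-in for whatever resampling statistic is of interest, and the measurements are made up for illustration.

```python
import random


def perm_p_value(group_a, group_b, n_perm, rng):
    """One-sided permutation p-value for mean(a) - mean(b),
    counting the observed arrangement as one of the n_perm + 1 values."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    observed = sum(group_a) / n_a - sum(group_b) / n_b
    n_extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b
        if diff >= observed:
            n_extreme += 1
    return (n_extreme + 1) / (n_perm + 1)


def sensitivity_table(group_a, group_b, replicate_counts, n_runs=10, seed=1):
    """For each replicate count, rerun the test n_runs times and
    record the smallest and largest p-value obtained."""
    rng = random.Random(seed)
    table = {}
    for n_perm in replicate_counts:
        p_values = [perm_p_value(group_a, group_b, n_perm, rng)
                    for _ in range(n_runs)]
        table[n_perm] = (min(p_values), max(p_values))
    return table


# Made-up measurements, not data from this thread.
a = [5.1, 4.8, 6.0, 5.5, 5.9, 6.2]
b = [4.2, 4.9, 4.4, 5.0, 4.1, 4.6]
for n_perm, (lo, hi) in sensitivity_table(a, b, (99, 999, 9999)).items():
    print(f"{n_perm:>5} replicates: p ranges from {lo:.4f} to {hi:.4f}")
```

The replicate count at which the range stops narrowing appreciably is a reasonable place to stop for that data set.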

Cheers,
Phil

On 6/2/2015 11:47 PM, Tsung Fei Khang wrote:
Dear community,

Many thanks to everyone who responded with opinions and references. I think 
set.seed solves the reproducibility problem, and for practicality, I would 
just set the seed, make a single run at a high number of replicates such as 
10,000, and then report a reasonable upper bound for the p-value (e.g. p-value 
< 0.01 if I get something like 0.0068).
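
The workflow described above (fix the seed, make one long run, report an upper bound) can be sketched as follows. This is a Python illustration, not R; random.Random(seed) plays the role that set.seed plays in R. The statistic (a sum of cross-products, which orders permutations the same way a correlation does), the data, and both function names are assumptions made for the example.

```python
import random


def shuffled_association_p(x, y, n_perm, seed):
    """Permutation p-value for a positive association between x and y.
    A private, seeded generator makes the result repeatable."""
    rng = random.Random(seed)
    observed = sum(xi * yi for xi, yi in zip(x, y))
    y_perm = list(y)
    n_extreme = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        if sum(xi * yi for xi, yi in zip(x, y_perm)) >= observed:
            n_extreme += 1
    return (n_extreme + 1) / (n_perm + 1)


def report_upper_bound(p, bounds=(0.001, 0.01, 0.05)):
    """Report p as lying below the smallest conventional bound it beats."""
    for bound in sorted(bounds):
        if p < bound:
            return f"p < {bound}"
    return f"p = {p:.3f}"


# Made-up values, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [1.2, 1.9, 3.4, 3.1, 5.2, 5.8, 6.1, 8.3]

first = shuffled_association_p(x, y, n_perm=9999, seed=42)
second = shuffled_association_p(x, y, n_perm=9999, seed=42)
print(first == second)            # same seed, same p-value
print(report_upper_bound(first))
print(report_upper_bound(0.0068))
```

A different seed would generally give a slightly different p-value, so the seed should be reported along with the number of replicates.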

@Aki: Thank you. 1/#iterations is problematic because one could then get 
arbitrarily small p-values... should converge to some value (however small) as 
the number of iterations exceeds some threshold, which is dependent on data set.

On Tuesday, June 2, 2015 at 3:48:37 PM UTC+8, Tsung Fei Khang wrote:
Dear community,

I would like to share my experience with using some (really cool) computational 
tools for phylogenetic signal and morphological integration analysis.

I am using physignal (geomorph R package) and the Phylo.Morphol.PLS function 
provided in the paper by Adams and Felice (2014; PLoS ONE, 9:e94335) in my 
work. I noticed that if the same analysis is rerun for a particular number of 
iterations, the results may vary. Additionally, I observed that increasing the 
number of iterations, up to some critical point, may push down the p-value, 
depending on data set (didn't happen with the plethspecies (9 species) data, 
but happened in my data set - 13 species, not salamanders). I attach runs (10 
times) for both data sets for iterations of 100, 1000, 1 and 10 here 
for Phylo.Morphol.PLS. Note that fairly stable results are attained after 
1000 iterations (the default) for the plethspecies data, but not in my case, 
which needs 1.

I think the notion that p-values returned from a permutation method are 
actually realizations of random variables with a certain mean and variance may 
not be familiar to many biologists, who are accustomed to expecting a 
reproducible p-value when the same data set is rerun using common statistical 
tests. Perhaps in a future version the authors of the code can implement a 
checker within the functions that determines the number of iterations needed 
to attain "convergence", so that a more stable p-value is returned?
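
One way such a checker might look is sketched below in Python. Nothing here is part of geomorph or Phylo.Morphol.PLS. The function adaptive_p_value, its parameters, and the draw_statistic callback are hypothetical, and the stopping rule (stop once the binomial standard error of the running estimate drops below a tolerance) is a deliberately naive choice.

```python
import math
import random


def adaptive_p_value(observed, draw_statistic, rng,
                     se_tol=0.002, batch=1000, max_iter=100000):
    """Add random draws in batches until the estimated standard error
    of the running p-value is at most se_tol, or max_iter is reached.

    draw_statistic(rng) must return one statistic computed from
    randomized data. Returns (p_hat, standard_error, n_draws)."""
    if batch < 1 or max_iter < batch:
        raise ValueError("need 1 <= batch <= max_iter")
    n_extreme = 0
    n_done = 0
    while True:
        for _ in range(batch):
            if draw_statistic(rng) >= observed:
                n_extreme += 1
        n_done += batch
        p_hat = (n_extreme + 1) / (n_done + 1)
        se = math.sqrt(p_hat * (1.0 - p_hat) / n_done)
        if se <= se_tol or n_done + batch > max_iter:
            return p_hat, se, n_done


# Toy null distribution: a standard normal statistic, observed value 2.0.
rng = random.Random(2015)
p_hat, se, n_used = adaptive_p_value(2.0, lambda r: r.gauss(0.0, 1.0), rng)
print(f"p = {p_hat:.4f} (standard error {se:.4f}) after {n_used} draws")
```

A data set with a very small p-value needs fewer draws to reach a given absolute tolerance, so a tolerance relative to the p-value may suit some applications better.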
