Suppose I have a data set
D = {e1, e2, e3, e4, e5, e6, e7, e8, e9, e10}
I would like to divide it up into a testing set and a training set.
What is the usual practice for setting the ratio between them? 70%
training, 30% testing? Or something else?
How do I choose the two sets? I mean, which elements should be
included in the training set and which should be included in the
testing set?
Also, if I pick them randomly, is it enough to try only one random
split? How good is a single random split?
If I try different random splits, I will get several results. Which
one should I choose? And how do I decide which one is better?
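
To make the question concrete, here is a minimal sketch of what I mean
by trying different random splits. It is only an illustration: the
helper name random_split, the 70/30 ratio, and the seeds 0, 1, 2 are my
own assumptions, not an established procedure.

```python
import random

# The ten elements of D from the question, as placeholder labels.
D = ["e1", "e2", "e3", "e4", "e5", "e6", "e7", "e8", "e9", "e10"]


def random_split(data, train_fraction=0.7, seed=None):
    """Shuffle a copy of data and cut it into (training, testing) lists.

    Hypothetical helper for illustration; train_fraction=0.7 is the
    70/30 ratio mentioned in the question, not a recommendation.
    """
    rng = random.Random(seed)      # a fixed seed makes the split repeatable
    shuffled = list(data)          # copy, so the original order is untouched
    rng.shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


# Each seed gives a (possibly) different training/testing split of D.
for seed in range(3):
    train, test = random_split(D, 0.7, seed)
    print(seed, sorted(train), sorted(test))
```

Each seed yields a split with 7 training and 3 testing elements, and
whatever I measure on the testing set may differ from split to split,
which is what prompts the questions above.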
Thank you!