Sampling problem

ABC Fri, 20 Dec 2002 05:54:06 -0800

Suppose I have a data set 

        D = {e1, e2, e3, e4, e5, e6, e7, e8, e9, e10}


I would like to divide it up into a testing set and a training set. 
What is the usual practice in setting the ratio between them? 70%
training, 30% testing? Or something else?
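
To make the 70/30 case concrete, here is a minimal sketch of a single
random split of D. The function name split_dataset, the train_fraction
parameter, and the seed are illustrative assumptions, not part of any
particular package; only the Python standard library is used.

```python
import random

def split_dataset(data, train_fraction=0.7, seed=None):
    """Shuffle a copy of `data` and cut it into (train, test).

    `split_dataset`, `train_fraction` and `seed` are hypothetical names
    chosen for this illustration.
    """
    rng = random.Random(seed)      # private generator, so the split is reproducible
    shuffled = list(data)          # copy: the original order of `data` is untouched
    rng.shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]

# The ten elements from the question, represented as labels.
D = ["e1", "e2", "e3", "e4", "e5", "e6", "e7", "e8", "e9", "e10"]

train, test = split_dataset(D, train_fraction=0.7, seed=0)
print("training set:", train)
print("testing set: ", test)
```

With ten elements and train_fraction=0.7 this gives seven training
elements and three testing elements; a different seed gives a
different pair of sets.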

How do I choose such sets? I mean, which elements should be
included in the training set and which should be included in the
testing set?
Again, if I pick them randomly, is it enough to try only one random
split? How good is a single random split?
If I try different random splits, I will get several results.
Which one should I choose? And how do I decide which one is better?
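
The "several random splits" situation can be sketched as follows. This
is only an illustration under stated assumptions: the numeric values
are made up, the "model" is a toy that predicts the training mean, and
the names split_once, evaluate and repeated_holdout are hypothetical.
Reporting the mean and spread of the scores over many splits, rather
than picking one split, is one common way to summarise such results.

```python
import random
import statistics

def split_once(data, train_fraction, rng):
    """Return one random (train, test) split drawn with generator `rng`."""
    shuffled = list(data)
    rng.shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]

def evaluate(train, test):
    """Toy score: predict the training mean, return mean absolute test error."""
    prediction = statistics.mean(train)
    return statistics.mean([abs(x - prediction) for x in test])

def repeated_holdout(data, n_repeats=20, train_fraction=0.7, seed=0):
    """Score `n_repeats` independent random splits and return all scores."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_repeats):
        train, test = split_once(data, train_fraction, rng)
        scores.append(evaluate(train, test))
    return scores

# Made-up numeric stand-ins for e1..e10 (assumption for illustration only).
values = [2.0, 4.0, 4.5, 5.0, 5.5, 6.0, 7.0, 8.0, 9.0, 12.0]

scores = repeated_holdout(values)
print("scores per split:", [round(s, 2) for s in scores])
print("mean error:      ", round(statistics.mean(scores), 2))
print("std. deviation:  ", round(statistics.stdev(scores), 2))
```

The standard deviation shows how much the result depends on which
random split happened to be drawn, which is the concern behind the
question about trying only one split.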

Thank you!
