Hi all,
My classification problem is very similar to the "20 newsgroups"
example, but I cannot use a large amount of data for training.
I'd like to know what the "minimum" size of training data would be for
the SGD or SVM algorithms to give reasonable results.
My data is of the same kind as the 20 newsgroups example, but each
entry has fewer lines.
The body of each entry is between 40 and 60 words long.
I'd like to try with 10 examples per category (with 2 or 3 categories),
choosing good examples that contain the most frequent keywords, so that
the learning phase is effective.
Can this work with so little data?
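
To make the question concrete, here is a minimal sketch of the
experiment I have in mind, assuming scikit-learn's TfidfVectorizer and
SGDClassifier with stratified cross-validation. The category names, the
word lists and the make_doc / evaluate helpers are placeholders made up
for illustration; the synthetic documents would be replaced by my real
entries.

```python
import random

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline

# Hypothetical vocabularies, used only to generate placeholder documents.
VOCAB = {
    "hardware": ["disk", "cpu", "memory", "driver", "board", "cable", "port", "fan"],
    "sports": ["team", "match", "score", "coach", "league", "goal", "player", "season"],
}
COMMON = ["the", "a", "of", "and", "to", "in", "is", "it"]


def make_doc(category, rng, n_words=50):
    """Build one synthetic entry of about 50 words for the given category."""
    return " ".join(rng.choices(VOCAB[category] + COMMON, k=n_words))


def evaluate(docs, labels, n_splits=5, seed=0):
    """Return the per-fold accuracy of a TF-IDF + linear SGD pipeline."""
    model = make_pipeline(
        TfidfVectorizer(),
        # loss="hinge" gives a linear SVM trained by stochastic gradient descent.
        SGDClassifier(loss="hinge", random_state=seed),
    )
    # Stratification keeps the same number of examples per category in each fold.
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    return cross_val_score(model, docs, labels, cv=cv)


rng = random.Random(0)
docs, labels = [], []
for category in VOCAB:
    for _ in range(10):  # 10 examples per category, as in the question
        docs.append(make_doc(category, rng))
        labels.append(category)

scores = evaluate(docs, labels)
print("accuracy per fold:", scores)
print("mean accuracy: %.2f" % scores.mean())
```

With only 20 documents, each of the 5 folds tests on 4 entries, so the
per-fold accuracy can only move in steps of 0.25. Scores on this
synthetic data say nothing about real text; the point is only the
evaluation procedure.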
Thanks,
Loic