On Jul 17, 2009, at 5:06 AM, Robin Anil wrote:


The reason I used countries was that I couldn't think of another large group
of labels.
Also, Wikipedia has over 100K categories, and a document can have multiple categories too, so finding non-overlapping sets of documents (which would make them easy to differentiate) wasn't easy. The first thing I could think of was countries.

Are you saying that you think docs only have one country assigned to them?

In the little bit of grepping I've done, I think I might try my hand at something like "school subjects", e.g. Math, History, Science. Of course, the multiple categories thing is a bit weird, since we are trying to classify to a single category. For now, in the example, the first one found is the chosen one.
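
To make the "first one found" rule concrete, here is a rough sketch in Python. It is not the code from the actual example; the function name `first_matching_label` and the case-insensitive substring match are assumptions made purely for illustration.

```python
from typing import Iterable, Optional, Sequence


def first_matching_label(
    categories: Iterable[str], labels: Sequence[str]
) -> Optional[str]:
    """Return the first target label found in a document's categories.

    Hypothetical helper: categories are scanned in the order given, and a
    label "matches" when it appears (case-insensitively) anywhere inside a
    category name. Returns None if no category matches any label.
    """
    for category in categories:
        lowered = category.lower()
        for label in labels:
            if label.lower() in lowered:
                return label
    return None


if __name__ == "__main__":
    subjects = ["Math", "History", "Science"]
    doc_categories = ["Ancient history", "Science education"]
    # The document matches two subjects, but only the first one found is kept.
    print(first_matching_label(doc_categories, subjects))
```

Note that the result depends on the order of the categories, so a document that belongs to several subjects gets whichever one happens to come first.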

-Grant
