Sam, the list wouldn't let attachments through.
On Fri, Dec 9, 2011 at 11:57 AM, Sam Cunningham <[email protected]> wrote:
> I really need help. I am working on a project: I have a cron job that collects
> RSS feeds from news sites (Reuters and Associated Press). I need to classify
> these news data based on their content (just like 20news example). The
> categories are business, entertainment, health, politics, scitech, and sports.
> I use half of the data for training and the other half for testing. Attached,
> please find the training, testing and model files in compressed form. As you
> will see when I test the model I get extremely good results for some topics
> (business, sports, and entertainment). I get really bad results (almost %0)
> for other topics (health, scitech, and politics). What's wrong?
>
> What is more interesting is that I get real bad results with "health" topic
> when I test the classifier against the training data which is the dataset in
> creating the model, itself. This is strange.
>
> Please help.
>
> Thank you,
>
> Sam
