Thanks for replying, Robin. I am quoting my earlier conversation with Grant below. Now I want to know how to implement the second point (giving extra weight to specific words).

> To be specific, my problem is to classify a piece of text crawled from the web into
> two classes:
>
> 1. It is +ve feedback
> 2. It is -ve feedback
>
> I can use the two-newsgroup example and create a model with some text
> (maybe a large number of texts) by giving the trainer these two
> labels. Should I leave everything to the trainer completely like this?
>
>
> Yes, that should be fine. The trainer doesn't care about the name of the
> label, it just cares that the two sets are relatively independent. Keep in
> mind, you should set aside some of your data for testing as well.
>
> Or do I have the flexibility to give some other input specific to my problem?
> For example, words like "Problem", "Complaint" etc. are more likely to appear
> in a text containing a grievance.
>
>
> You can provide a Weight, usually TF-IDF, that often does a good job of
> factoring in the importance of words. If you have certain sentiment words
> that you think influence things one way or the other, you could consider a
> weighting process that adds weight to those words, I suppose, but I would
> want to experiment with that a bit.

On Thu, Sep 30, 2010 at 8:55 PM, Robin Anil <[email protected]> wrote:

> It does that by default for all words. What else do you have in mind?
>
> On Thu, Sep 30, 2010 at 8:07 PM, Neil Ghosh <[email protected]> wrote:
>
>> Does anybody have examples/references on how to use TF-IDF weights in Mahout
>> cbayes for particular words and phrases while doing text classification?
>>
>> --
>> Thanks and Regards
>> Neil
>> http://neilghosh.com
>>

--
Thanks and Regards
Neil
http://neilghosh.com
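
The weighting idea from the quoted conversation can be sketched in a few lines. This is plain Python for illustration only, not Mahout's API: it computes a TF-IDF weight per term and then multiplies the weight of hand-picked sentiment words by a boost factor. The word list, the boost value, and the function names are all hypothetical, and, as Grant says, whether such a boost actually helps would need to be checked on held-out test data.

```python
import math
from collections import Counter

# Hypothetical sentiment words and boost factor; both are assumptions to tune by experiment.
BOOST_WORDS = {"problem", "complaint"}
BOOST = 2.0


def tokenize(text):
    # Naive whitespace tokenizer; a real pipeline would also strip punctuation.
    return text.lower().split()


def tfidf_with_boost(docs, boost_words=BOOST_WORDS, boost=BOOST):
    """Return one {term: weight} dict per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    weighted = []
    for toks in tokenized:
        tf = Counter(toks)
        vec = {}
        for term, count in tf.items():
            # Add 1 so that terms appearing in every document keep a nonzero weight.
            idf = math.log(n_docs / df[term]) + 1.0
            weight = (count / len(toks)) * idf
            if term in boost_words:
                weight *= boost  # extra weight for the chosen sentiment words
            vec[term] = weight
        weighted.append(vec)
    return weighted
```

Calling `tfidf_with_boost(["great service no problem", "complaint about a problem"])` returns two dictionaries in which "problem" and "complaint" carry twice the weight they would get from TF-IDF alone, while all other terms are unchanged.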
