Yeah... what Hector says. You can even make the outputs of preliminary
classifiers into features for new classifiers. Or, if you have two
different target variables, you can make a model that predicts one target
a feature in the model that predicts the other. Feature extraction
generally has more potential for performance improvement than any
algorithm change.
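
For the "classifier output as a feature" idea, a minimal sketch (Python
with scikit-learn for brevity rather than Mahout; the toy data and names
are made up for illustration) looks roughly like:

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    # Toy stand-in for a bag-of-words matrix and labels.
    rng = np.random.RandomState(0)
    X = rng.rand(100, 20)
    y = (X[:, 0] + X[:, 1] > 1.0).astype(int)

    # Stage 1: preliminary classifier trained on the raw features.
    stage1 = LogisticRegression().fit(X, y)
    p = stage1.predict_proba(X)[:, 1]          # its predicted P(y = 1)

    # Stage 2: append the stage-1 score as one more feature column.
    X2 = np.hstack([X, p[:, None]])
    stage2 = LogisticRegression().fit(X2, y)

One caveat: fitting stage 2 on the same rows stage 1 was trained on leaks
label information, so in practice you'd want held-out (e.g.
cross-validated) stage-1 predictions.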

On Mon, Jun 27, 2011 at 7:22 PM, Hector Yee <[email protected]> wrote:

> Redacted to pass the overly aggressive spam filter.
>
> On Mon, Jun 27, 2011 at 7:19 PM, Hector Yee <[email protected]> wrote:
>
> > Just make the pattern a feature and feed it into the machine learning.
> >
> > e.g. if it's a spam model and you notice v**gra is a spam term, just
> > make feature 0 = "v**gra count" and the rest your regular bag of words.
> >
> > The only thing you have to be careful of is the relative weights
> > between the feature categories. A typical normalization is to L2-norm
> > each feature category separately before concatenation. Another option
> > is to use a "scale-free" classification algorithm like AdaBoost.
> >
> > On Mon, Jun 27, 2011 at 5:51 PM, Patrick Collins <
> > [email protected]> wrote:
> >
> >> Has anyone got any advice on how to combine heuristics and
> >> classification?
> >>
> >> When preparing my data to build the features to feed into my
> >> classification model, I keep noticing patterns of text which I know
> >> with 99.99% probability imply a certain outcome.
> >>
> >> How would you construct the data/features in order to pre-classify
> >> this data so that the classifier is much more likely to come to the
> >> "correct" conclusion?
> >>
> >> For example, I remember seeing an anti-spam system which used a
> >> combination of fuzzy logic and classification to produce a better
> >> outcome (but he did not detail how it was actually implemented). He
> >> used a whole range of heuristics to determine that a certain sender
> >> is a known spammer rather than just blindly passing the data into
> >> the classifier.
> >>
> >> In my dataset I have a LOT of patterns like this that I can identify
> >> and then use to determine the outcome with very high probability. I
> >> say high probability, but I cannot say absolutely. Ideally, if I
> >> could precompute a lot of this using heuristics, I could feed that
> >> information into the classifier and greatly reduce the number of
> >> features. But the classifiers do not give me a way to assign a
> >> "weight" to a particular feature.
> >>
> >> Other than "just try it and see what works", how do people deal with
> >> this problem? Do they just leave it to the classifier and hope that
> >> it picks up the same patterns?
> >>
> >> I'm a bit new to Mahout and classification algorithms, so I'm just
> >> trying to get some input on how others see this problem and whether
> >> I'm barking up the wrong tree.
> >>
> >> Patrick.
> >
> > --
> > Yee Yang Li Hector
> > http://hectorgon.blogspot.com/ (tech + travel)
> > http://hectorgon.com (book reviews)
>
> --
> Yee Yang Li Hector
> http://hectorgon.blogspot.com/ (tech + travel)
> http://hectorgon.com (book reviews)
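
P.S. To make the normalization suggestion above concrete, here's a rough
numpy sketch (toy values; the second heuristic feature is just an
invented example) of L2-norming each feature category per example before
concatenating:

    # Heuristic pattern counts form one feature category, the regular
    # bag-of-words forms another; normalize each category per row,
    # then concatenate.
    import numpy as np

    def l2_normalize_rows(m):
        # Scale each row to unit L2 norm, leaving all-zero rows alone.
        norms = np.sqrt((m * m).sum(axis=1, keepdims=True))
        norms[norms == 0.0] = 1.0
        return m / norms

    # Two documents: feature 0 = "v**gra count", feature 1 = some other
    # heuristic count; then the usual bag-of-words counts.
    heuristic = np.array([[3.0, 1.0],
                          [0.0, 0.0]])
    bow = np.array([[1.0, 2.0, 0.0],
                    [0.0, 1.0, 4.0]])

    X = np.hstack([l2_normalize_rows(heuristic),
                   l2_normalize_rows(bow)])

That way neither category can dominate just because its raw counts happen
to be bigger.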
