Hello,
I'm working on a classification problem. My datasets are basically text entries.
To find the right class, I know that some words are very important. Is
there a way to tell the classifier that these words should have a greater
weight?
Another very important thing is the position of these words and their
distance to other important words.
Example: I want to classify black and white cars. I know that the
words "car", "sedan" and "limo" are very important, and that their
position relative to the words "white" and "black" is very important too.
The sentence "white sedan with dark windows" should be classified as
white cars, not black cars, even though a word suggesting black ("dark")
is present.
The position of the colours (the black-like word is farther from "sedan"
than "white" is) should help us a lot.
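To make the idea concrete, here is a rough sketch in plain Java of the kind
of distance feature I have in mind. The names ColourDistance, VEHICLES and
distanceToVehicle are made up for illustration and are not part of Mahout;
the tokenisation is a naive whitespace split.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not a Mahout class.
class ColourDistance {
    // The "important" vehicle words from the example above.
    static final List<String> VEHICLES = Arrays.asList("car", "sedan", "limo");

    // Smallest distance, counted in tokens, between `colour` and any
    // vehicle word in `text`. Returns -1 if either one is missing.
    static int distanceToVehicle(String text, String colour) {
        String[] tokens = text.toLowerCase().split("\\s+");
        int best = -1;
        for (int i = 0; i < tokens.length; i++) {
            if (!tokens[i].equals(colour)) {
                continue;
            }
            for (int j = 0; j < tokens.length; j++) {
                if (VEHICLES.contains(tokens[j])) {
                    int d = Math.abs(i - j);
                    if (best < 0 || d < best) {
                        best = d;
                    }
                }
            }
        }
        return best;
    }
}
```

For "white sedan with black windows", "white" is 1 token away from "sedan"
and "black" is 2 tokens away, so the smaller distance points to a white car.
What I don't know is how to feed such a value to the classifier.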
Is there a way to express that with Mahout classifiers (I'm currently testing
with SGD)?
If yes, do you have any ideas or examples of how to do that?
Thanks a lot for your help.
Loic