Hi,

Thanks for your help,

If you have a really good, cheap feature, it is criminal not
to expose that feature to the classifier

Yes, it's cheap because it's a human process that has already been done for most categories. For each one, we have a set of representative keywords.
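
To make that concrete, the keyword sets are basically a map from category to a short list of words, something like this (the category names and keywords below are made-up examples, the real sets come from the manual process I mentioned):

import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CategoryKeywords {
  // Illustrative only: the real keyword sets come from the manual process above.
  static final Map<String, List<String>> KEYWORDS = new HashMap<String, List<String>>();
  static {
    KEYWORDS.put("politics", Arrays.asList("election", "parliament", "minister"));
    KEYWORDS.put("sport", Arrays.asList("match", "goal", "championship"));
    KEYWORDS.put("finance", Arrays.asList("market", "shares", "inflation"));
  }
}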

One thing that you said that worried me is your comment about putting a
really high weight on the feature.  I am surprised that this was required.

Maybe it is because my training data are not good enough...
What would be a "reasonable" order of magnitude for such a weight?

Another thing that made me feel it could be a dirty hack is that the 'keyword' feature added manually during learning cannot be found in the test or production data, because it does not exist there:

My training data are text files, for example news articles.
They have one class, the category: politics, sport, finance...
They have two attributes: title and body.

So the keywords I add with the code I put in the previous mail are completely "virtual". They will never be found in the title or the body. Despite that, I guess that the SGD algorithm looks for the keywords in the other attributes (title and body) to match the right category.

But maybe that is the reason why it only works with a very big weight?
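
For reference, here is roughly what the encoding step from my previous mail looks like. This is only a sketch, not the actual code: the feature count, the KEYWORD_WEIGHT value and the class names are placeholders, and the encoder package may differ depending on the Mahout version.

import org.apache.mahout.math.RandomAccessSparseVector;
import org.apache.mahout.math.Vector;
import org.apache.mahout.vectorizer.encoders.StaticWordValueEncoder;

class ArticleEncoder {

  static final int FEATURES = 10000;          // size of the hashed feature space (placeholder)
  static final double KEYWORD_WEIGHT = 50.0;  // the "very big weight" I mentioned (placeholder)

  // One encoder per attribute, so the same word hashes differently in title, body and keyword.
  static final StaticWordValueEncoder TITLE = new StaticWordValueEncoder("title");
  static final StaticWordValueEncoder BODY = new StaticWordValueEncoder("body");
  static final StaticWordValueEncoder KEYWORD = new StaticWordValueEncoder("keyword");

  static Vector encode(String title, String body, Iterable<String> keywords) {
    Vector v = new RandomAccessSparseVector(FEATURES);
    for (String w : title.toLowerCase().split("\\s+")) {
      TITLE.addToVector(w, 1.0, v);
    }
    for (String w : body.toLowerCase().split("\\s+")) {
      BODY.addToVector(w, 1.0, v);
    }
    // The "virtual" feature: the keywords of the article's category are only added
    // at training time. They never appear in the title or body text itself.
    if (keywords != null) {
      for (String kw : keywords) {
        KEYWORD.addToVector(kw, KEYWORD_WEIGHT, v);
      }
    }
    return v;
  }
}

At test/production time I call encode(title, body, null), so the keyword positions of the vector are simply empty.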

Loic

On 21.09.2011 20:32, Ted Dunning wrote:
This comes under the heading of "inventing good features".

In real data mining projects, easily 90% of the effort goes into this sort
of activity.  If you have a really good, cheap feature, it is criminal not
to expose that feature to the classifier.

If you don't really have the feature because it is expensive or requires a
time machine to derive, then obviously you have to do something different.

One thing that you said that worried me is your comment about putting a
really high weight on the feature.  I am surprised that this was required.

On Wed, Sep 21, 2011 at 8:22 AM, Loic Descotte <[email protected]> wrote:

Has anyone experienced this kind of thing? Do you have any advice? Or is it
just a bad idea?

