On 07/03/2012 02:17 PM, Daniel wrote:
I understand, but I still have trouble seeing what the utility of
creating a custom feature generator is. Could you give me an
example or a link with more information about it?

In my opinion you should always start with the default feature generation.
This gives you a good baseline against which you can compare your modifications.
To do the actual measurement, we added evaluation support directly to the
name finder; you can either evaluate on a test set or do cross validation.

See here for more details:
http://opennlp.apache.org/documentation/1.5.2-incubating/manual/opennlp.html#tools.namefind.eval

If you use the API where you pass in the feature generator, you need to ensure it is put together at runtime, when you process your data, the same way it was during training. To make this easier, we added an XML-based DSL you can use to describe your feature generation.

A sample of that can be found in our documentation:
http://opennlp.apache.org/documentation/1.5.2-incubating/manual/opennlp.html#tools.namefind.training.featuregen

To use it you need to call this train method:
NameFinderME.train(String languageCode, String type,
      ObjectStream<NameSample> samples, TrainingParameters trainParams,
      byte[] featureGeneratorBytes, final Map<String, Object> resources)

The XML descriptor is passed in as a byte array.
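
As a rough sketch of how the byte array could be prepared: the class below holds a descriptor modeled on the sample in the manual linked above (the element names are taken from that sample as I remember it, so treat them as an assumption and check them against the documentation), turns it into bytes, and verifies with the JDK's XML parser that it is well formed. The class and method names are made up for illustration, and the train call appears only as a comment because it needs OpenNLP on the classpath; its arguments ("en", "person", samples, trainParams, resources) are placeholders.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper, not part of OpenNLP.
class FeatureGenDescriptorSketch {

    // Descriptor modeled on the sample in the OpenNLP manual (assumed element names).
    static final String DESCRIPTOR =
        "<generators>\n"
        + "  <cache>\n"
        + "    <generators>\n"
        + "      <window prevLength=\"2\" nextLength=\"2\">\n"
        + "        <tokenclass/>\n"
        + "      </window>\n"
        + "      <window prevLength=\"2\" nextLength=\"2\">\n"
        + "        <token/>\n"
        + "      </window>\n"
        + "      <definition/>\n"
        + "      <prevmap/>\n"
        + "      <bigram/>\n"
        + "      <sentence begin=\"true\" end=\"false\"/>\n"
        + "    </generators>\n"
        + "  </cache>\n"
        + "</generators>\n";

    // The train method expects the descriptor as a byte array.
    static byte[] descriptorBytes() {
        return DESCRIPTOR.getBytes(StandardCharsets.UTF_8);
    }

    // Parses the bytes with the JDK parser and returns the root element name;
    // fails with an unchecked exception if the XML is not well formed.
    static String rootElementName(byte[] xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml));
            return doc.getDocumentElement().getNodeName();
        } catch (Exception e) {
            throw new IllegalStateException("Descriptor is not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        byte[] featureGeneratorBytes = descriptorBytes();
        System.out.println(rootElementName(featureGeneratorBytes));
        // With OpenNLP available, the bytes would be handed to the train method
        // quoted above (all other arguments here are placeholders):
        // NameFinderME.train("en", "person", samples, trainParams,
        //     featureGeneratorBytes, resources);
    }
}
```

Keeping the descriptor in one place like this (or in a file you load at both training and processing time) is one way to make sure the feature generation cannot drift between the two.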

Hope that helps,
Jörn


