[opennlp-dev] TokenNameFinderFactory new features and extension

Rodrigo Agerri Fri, 03 Oct 2014 02:59:39 -0700

Hello,

I have implemented a number of new features for the name finder. These
include Brown cluster features (duplicated per Brown path for each
activated feature that involves a token) and Clark cluster features
(similar to the WordClusterFeatureGenerator currently available), among
other additional local features which interact well with the clustering
ones.
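
To illustrate the Brown cluster idea, here is a minimal self-contained
sketch. It is only one possible reading of "duplicated per Brown path":
each base feature of a token is repeated once per prefix of the token's
Brown bit-string path. The class name, the prefix lengths (4, 6, 10, 20),
the feature string format and the example path are all assumptions made
for illustration; none of this is the actual OpenNLP code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class BrownPrefixFeatures {

    // Assumed prefix lengths; they are a tunable choice, not taken from the email.
    static final int[] PREFIX_LENGTHS = {4, 6, 10, 20};

    // Returns the prefixes of a Brown path at the assumed lengths.
    // A path shorter than a given length contributes itself once, then we stop.
    static List<String> pathPrefixes(String path) {
        List<String> prefixes = new ArrayList<>();
        for (int len : PREFIX_LENGTHS) {
            if (len >= path.length()) {
                prefixes.add(path);
                break;
            }
            prefixes.add(path.substring(0, len));
        }
        return prefixes;
    }

    // Keeps the base features and adds one copy of each per Brown path prefix.
    // Tokens without a cluster entry keep only their base features.
    static List<String> expand(String token, List<String> baseFeatures,
                               Map<String, String> clusters) {
        List<String> features = new ArrayList<>(baseFeatures);
        String path = clusters.get(token);
        if (path == null || path.isEmpty()) {
            return features;
        }
        for (String base : baseFeatures) {
            for (String prefix : pathPrefixes(path)) {
                features.add(base + ",brown=" + prefix);
            }
        }
        return features;
    }

    public static void main(String[] args) {
        // Hypothetical lexicon entry: token -> Brown bit-string path.
        Map<String, String> clusters = new HashMap<>();
        clusters.put("Madrid", "0110100");

        List<String> base = Arrays.asList("w=madrid", "sh=Xx");
        for (String feature : expand("Madrid", base, clusters)) {
            System.out.println(feature);
        }
    }
}
```

Shorter prefixes group a token with broader clusters and longer ones
with finer clusters, which is why the same base feature is emitted at
several depths in this sketch.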


I think it would be nice to include them before the new release. I will
open an issue for each of them. What do you think?

In the meantime, I am testing these new features locally, but I have
run into a number of issues/questions about how to proceed with
extending the TokenNameFinderFactory:

1. I add the new features to the GeneratorFactory.
2. I create a new feature descriptor that uses some of the new features.
3. I extend the TokenNameFinderFactory and instantiate the subclass
via the TokenNameFinderFactory.create(subclassName,
featureGeneratorBytes, resources, sequenceCodec) method.
4. I override the TokenNameFinderFactory.createFeatureGenerators()
method in the extended class.
5. At this point, I do not have access to the featureGeneratorBytes
because the TokenNameFinderFactory does not provide a getter, so I
added one to the TokenNameFinderFactory class.
Should we do this? Or am I extending the TokenNameFinderFactory the
wrong way?
6. *Some* of the new features work. If an element name in the
descriptor has no match in the GeneratorFactory, then
TokenNameFinderFactory.createFeatureGenerators() returns null, and
TokenNameFinderFactory.createContextGenerator() automatically stops
the feature creation and falls back to
NameFinderME.createFeatureGenerator().
Is this the desired behaviour? Perhaps we could add a log message
somewhere to report the back-off to the default features when a
descriptor element does not match?
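
To make steps 4 to 6 concrete, here is a minimal self-contained sketch
of the behaviour described above. The classes are toy stand-ins, not
the OpenNLP ones: the descriptor is modelled as a comma-separated list
of element names rather than XML, feature generators are plain strings,
and the element names, the default feature list and the log message are
all hypothetical. It shows a protected getter for the descriptor bytes,
a subclass that overrides createFeatureGenerators(), and a warning
logged when an unknown element triggers the back-off to the defaults.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.logging.Logger;

public class FactoryFallbackSketch {

    private static final Logger LOG = Logger.getLogger("FactoryFallbackSketch");

    // Hypothetical default features used when the descriptor cannot be resolved.
    static final List<String> DEFAULT_FEATURES = Arrays.asList("window", "token");

    // Returns the element names, or null if any name is unknown
    // (mirrors the "returns null on mismatch" behaviour described in step 6).
    static List<String> resolve(byte[] descriptor, Set<String> known) {
        String text = new String(descriptor, StandardCharsets.UTF_8);
        List<String> names = Arrays.asList(text.split(","));
        return known.containsAll(names) ? names : null;
    }

    /** Toy stand-in for the base factory. */
    static class BaseFactory {
        private final byte[] featureGeneratorBytes;

        BaseFactory(byte[] featureGeneratorBytes) {
            this.featureGeneratorBytes = featureGeneratorBytes;
        }

        // The getter proposed in step 5, so subclasses can read the descriptor.
        protected byte[] getFeatureGeneratorBytes() {
            return featureGeneratorBytes;
        }

        protected List<String> createFeatureGenerators() {
            Set<String> known = new HashSet<>(Arrays.asList("window", "token"));
            return resolve(getFeatureGeneratorBytes(), known);
        }

        // Backs off to the defaults, but logs the back-off instead of staying silent.
        List<String> createContextGenerator() {
            List<String> generators = createFeatureGenerators();
            if (generators == null) {
                LOG.warning("Descriptor contains an unknown element; "
                        + "falling back to default features " + DEFAULT_FEATURES);
                return DEFAULT_FEATURES;
            }
            return generators;
        }
    }

    /** Toy subclass that also understands the new (hypothetical) element names. */
    static class ExtendedFactory extends BaseFactory {
        ExtendedFactory(byte[] featureGeneratorBytes) {
            super(featureGeneratorBytes);
        }

        @Override
        protected List<String> createFeatureGenerators() {
            Set<String> known = new HashSet<>(
                    Arrays.asList("window", "token", "brown", "clark"));
            return resolve(getFeatureGeneratorBytes(), known);
        }
    }

    public static void main(String[] args) {
        byte[] good = "token,brown".getBytes(StandardCharsets.UTF_8);
        byte[] typo = "token,brwn".getBytes(StandardCharsets.UTF_8);

        // All names known to the subclass: the descriptor is used as written.
        System.out.println(new ExtendedFactory(good).createContextGenerator());
        // One misspelled name: a warning is logged and the defaults are returned.
        System.out.println(new ExtendedFactory(typo).createContextGenerator());
    }
}
```

With a warning like this, a misspelled element name in the descriptor
would show up in the logs rather than quietly producing a model trained
on the default features only.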

Any comments appreciated.

Thanks,

Rodrigo
