Hello,

It really depends on what you are trying to achieve.

If you know exactly what you want, I would recommend subclassing
TokenNameFinderFactory; there you can override the method that creates the
feature generators. The default constructor is fine. The name finder
supports different sequence encodings, currently Bio and Bilou. You would
need to pass a reference to one of those classes, or just use the default
(which is Bio).

If you just want the name finder with custom feature generation, I would
suggest defining an XML descriptor for it and using our command line
interface to build the model. The command line interface has the advantage
that you can use all the tools without writing code yourself; evaluation
and cross validation in particular should be interesting for you.

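Regarding the constructor you found:

TokenNameFinderFactory(byte[] featureGeneratorBytes,
                       Map<String,Object> resources,
                       SequenceCodec<String> seqCodec)

The byte[] is supposed to contain the bytes of the feature generator XML
descriptor. The resources map holds any external resources (for example
dictionaries) that the descriptor refers to, and the SequenceCodec is the
Bio or Bilou encoding mentioned above.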
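
As a rough sketch of how such a byte[] could be produced: the descriptor
below is modeled on the default name finder setup shown in the manual, and
the class and method names (FeatureGenBytes, descriptorBytes,
rootElementName) are made up for illustration. The code only uses the JDK,
so the OpenNLP calls appear as comments and are assumptions rather than
tested code.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper, not part of OpenNLP.
final class FeatureGenBytes {

    // Feature generator descriptor, modeled on the default setup in the manual.
    static final String DESCRIPTOR = String.join("\n",
        "<generators>",
        "  <cache>",
        "    <generators>",
        "      <window prevLength=\"2\" nextLength=\"2\">",
        "        <tokenclass/>",
        "      </window>",
        "      <window prevLength=\"2\" nextLength=\"2\">",
        "        <token/>",
        "      </window>",
        "      <definition/>",
        "      <prevmap/>",
        "      <bigram/>",
        "      <sentence begin=\"true\" end=\"false\"/>",
        "    </generators>",
        "  </cache>",
        "</generators>");

    // The bytes that would be handed to the factory constructor.
    static byte[] descriptorBytes() {
        return DESCRIPTOR.getBytes(StandardCharsets.UTF_8);
    }

    // Sanity check: parse the bytes as XML and return the root element name.
    static String rootElementName(byte[] xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml));
            return doc.getDocumentElement().getNodeName();
        } catch (Exception e) {
            throw new IllegalStateException("Descriptor is not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        byte[] featureGeneratorBytes = descriptorBytes();
        System.out.println(rootElementName(featureGeneratorBytes));

        // Assumption: with OpenNLP on the classpath, the bytes would be used like this
        // (sampleStream is the training data stream from the original question):
        //   TokenNameFinderFactory factory = new TokenNameFinderFactory(
        //       featureGeneratorBytes, new HashMap<String, Object>(), new BioCodec());
        //   model = NameFinderME.train("en", "default", sampleStream,
        //       TrainingParameters.defaultParams(), factory);
    }
}
```

In practice you would more likely read the bytes from an XML file on disk;
the inline string just keeps the sketch self-contained.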

HTH,
Jörn


On Wed, Jan 18, 2017 at 1:38 PM, Markus M. Berg <mmb...@web.de> wrote:

> Dear all,
>
> I am trying to train the NameFinderME using a custom set of feature
> generators. However, I am not able to add the feature generators to the
> name finder.
>
> Here is what I do:
> As described in the documentation
> (https://opennlp.apache.org/documentation/1.7.0/manual/opennlp.html#tools.namefind.training.featuregen),
> I used the following code to set up the list of feature generators:
>
>    AdaptiveFeatureGenerator featureGenerator = new CachedFeatureGenerator(
>            new AdaptiveFeatureGenerator[]{
>            new WindowFeatureGenerator(new TokenFeatureGenerator(), 2, 2),
>            new WindowFeatureGenerator(new TokenClassFeatureGenerator(true), 2, 2),
>            new OutcomePriorFeatureGenerator(),
>            new PreviousMapFeatureGenerator(),
>            new BigramNameFeatureGenerator(),
>            new SentenceFeatureGenerator(true, false),
>            new BrownTokenFeatureGenerator(BrownCluster dictResource)
>            });
>
> The documentation then explains that "the
> TokenNameFinderFactory allows to specify a custom feature generator".
> However, I don't know how to do this, since there is no add method and no
> constructor parameter of type AdaptiveFeatureGenerator.
>
>    TokenNameFinderFactory factory = new TokenNameFinderFactory();
>    ... // how to add the FeatureGenerator?
>    model = NameFinderME.train("en", "default", sampleStream,
>        TrainingParameters.defaultParams(), factory);
>
> In an older release of OpenNLP, it was possible to add the
> feature generators via the train method like this:
>
>    train(String languageCode, String type,
>        ObjectStream<NameSample> samples,
>        TrainingParameters trainParams,
>        AdaptiveFeatureGenerator generator,
>        final Map<String, Object> resources)
>
> But this is no longer possible. Can anybody describe the new way to
> implement this? An example would be great!
>
> I only found this:
>
>    public TokenNameFinderFactory(byte[] featureGeneratorBytes,
>                               Map<String,Object> resources,
>                               SequenceCodec<String> seqCodec)
>
> But I don’t know what parameters to pass (why a byte array?
> SequenceCodec?)...
>
> Any help is appreciated,
> Thanks in advance!
>
> Best,
> Markus
>
