On 10/06/2014 04:49 PM, Rodrigo Agerri wrote:
As I said, I have solved issue 717 by adding a getter for the
featureGenerator in the TokenNameFactory and using that getter to
correctly parameterize the creation of the TokenNameFinderModel after
training.

Isn't that how it is implemented today? The feature generators can't be shared,
which is why the TokenNameFinderFactory has the createFeatureGenerators method;
it creates a new feature generator every time one is needed.
That method tries to read the XML descriptor from the model and creates the
feature generators from it.

You say it uses the default feature generation. That can only happen if the
createFeatureGenerators method returns null. Is that true in your case?
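
To make the question concrete, here is a minimal sketch of the behaviour described above: a factory that builds a fresh generator on every call, and a caller that falls back to the defaults only when it gets null back. This is not the OpenNLP source. The classes Generator and Factory, the resolve helper, and the string descriptor are hypothetical stand-ins; only the method name createFeatureGenerators is taken from this thread.

```java
import java.util.ArrayList;
import java.util.List;

class FeatureGenSketch {

    /** Hypothetical stand-in for a stateful feature generator. */
    static class Generator {
        final String name;
        // Per-instance state; this is why one instance should not be shared.
        final List<String> previousOutcomes = new ArrayList<>();

        Generator(String name) {
            this.name = name;
        }
    }

    /** Hypothetical stand-in for the factory; the descriptor may be null. */
    static class Factory {
        private final String descriptor;

        Factory(String descriptor) {
            this.descriptor = descriptor;
        }

        // Builds a new generator on every call; returns null if no descriptor is stored.
        Generator createFeatureGenerators() {
            return descriptor == null ? null : new Generator("custom:" + descriptor);
        }
    }

    // Caller-side fallback: the default is used only when the factory returns null.
    static Generator resolve(Factory factory) {
        Generator generator = factory.createFeatureGenerators();
        return generator != null ? generator : new Generator("default");
    }

    public static void main(String[] args) {
        System.out.println(resolve(new Factory("bigram.xml")).name);
        System.out.println(resolve(new Factory(null)).name);
    }
}
```

If the real code follows this shape, default features after training with a custom descriptor would mean the descriptor was not available when the method was called.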

In which place, exactly, did you add the getter method to fix the problem, and where in TokenNameFinderModel did you call it? The TokenNameFinderFactory doesn't have an instance variable called featureGenerator.
I am just trying to understand how your proposed fix works.

Usually the model is created by using one of the constructors which take an InputStream,
File or URL. Did you use a different constructor to create the model?

I will try to reproduce the bug you see.

How can I do that?

First train a model with this command:
bin/opennlp TokenNameFinderTrainer -featuregen bigram.xml \
  -factory opennlp.tools.namefind.TokenNameFinderFactory -sequenceCodec BIO \
  -params lang/ml/PerceptronTrainerParams.txt -lang nl -model test.bin \
  -data ~/experiments/nerc/opennlp/data/nl/conll2002/nl_opennlp.testa.train

and this feature generator config:
<generators>
  <cache>
    <generators>
      <window prevLength = "2" nextLength = "2">
        <tokenclass/>
      </window>
      <window prevLength = "2" nextLength = "2">
        <token/>
      </window>
      <definition/>
      <prevmap/>
      <bigram/>
      <sentence begin="true" end="false"/>
      <prefix/>
      <suffix/>
    </generators>
  </cache>
</generators>
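
Before training, it can help to confirm that a descriptor like the one above is well-formed and contains the expected elements. The following is a minimal sketch that uses only the JDK's XML parser, not OpenNLP's own descriptor handling. The class name DescriptorCheck and the helper elementNames are made up for illustration, and the descriptor is inlined so that the example is self-contained.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

class DescriptorCheck {

    // The descriptor from this message, inlined for a self-contained example.
    static final String DESCRIPTOR = String.join("\n",
        "<generators>",
        "  <cache>",
        "    <generators>",
        "      <window prevLength=\"2\" nextLength=\"2\">",
        "        <tokenclass/>",
        "      </window>",
        "      <window prevLength=\"2\" nextLength=\"2\">",
        "        <token/>",
        "      </window>",
        "      <definition/>",
        "      <prevmap/>",
        "      <bigram/>",
        "      <sentence begin=\"true\" end=\"false\"/>",
        "      <prefix/>",
        "      <suffix/>",
        "    </generators>",
        "  </cache>",
        "</generators>");

    /** Parses the XML and returns all element names in document order. */
    static List<String> elementNames(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            List<String> names = new ArrayList<>();
            collect(doc.getDocumentElement(), names);
            return names;
        } catch (Exception e) {
            // Malformed XML ends up here.
            throw new IllegalArgumentException("descriptor is not well-formed XML", e);
        }
    }

    // Depth-first walk that records each element's tag name.
    private static void collect(Element element, List<String> out) {
        out.add(element.getTagName());
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child instanceof Element) {
                collect((Element) child, out);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(elementNames(DESCRIPTOR));
    }
}
```

This only checks the XML itself; whether each element name is accepted as a feature generator is still decided by OpenNLP when the descriptor is loaded.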

Did you use the command line tool for the evaluation too?
Maybe you can post the command for that.

Jörn
