On 10/07/2014 06:40 PM, Rodrigo Agerri wrote:
Hello,
One question regarding the WordClusterFeatureGenerator implementation
which I am using as a template for the Brown features and so on. I
cannot seem to make it work, it complains all the time that the value
of the attribute "dict" I provide is not an instance of a
W2VClassesDictionary:
Exception in thread "main"
opennlp.tools.namefind.TokenNameFinderModel$FeatureGeneratorCreationError:
opennlp.tools.util.InvalidFormatException: Not a W2VClassesDictionary
resource for key: opennlp.tools.util.featuregen.W2VClassesDictionary
I have tried both from the CLI and programmatically and I get the same result.
From the CLI I write an element like this:
<w2vwordcluster dict="opennlp.tools.util.featuregen.W2VClassesDictionary" />
The dict parameter expects the name of the w2v dictionary resource. The
way it works is that you create a resource directory, e.g.
"model-resources", and in this directory you place a file. The name of
the file goes in the dict attribute.
For example:
<w2vwordcluster dict="xyz-classes.txt" />
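As a rough sketch of the whole setup, assuming a resource directory called
"model-resources" that contains the file "xyz-classes.txt" (both names are
only examples), the element would sit in a custom descriptor next to the
usual generators. The surrounding generators below are written from memory
of the default name finder descriptor and may differ between versions, so
treat them as an assumption:

```xml
<!-- Sketch only. Assumed layout on disk:
       model-resources/
         xyz-classes.txt   (the word2vec classes file)
     The dict value is the file name inside that directory,
     not a Java class name. -->
<generators>
  <cache>
    <generators>
      <!-- Assumed default generators; check them against your version. -->
      <window prevLength="2" nextLength="2">
        <tokenclass/>
      </window>
      <window prevLength="2" nextLength="2">
        <token/>
      </window>
      <definition/>
      <prevmap/>
      <bigram/>
      <sentence begin="true" end="false"/>
      <!-- The new generator, pointing at the resource file by name. -->
      <w2vwordcluster dict="xyz-classes.txt"/>
    </generators>
  </cache>
</generators>
```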
which I add to the default descriptor. I also pass the relevant
directory containing the word2vec clusters via the -resources
parameter.
Yes, that has to be done, otherwise it doesn't know where to look for
the resources.
The entire resource directory is included in the model, and there can be
multiple feature generators using files from these resources.
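To illustrate the last point, a descriptor fragment with two cluster
generators reading different files from the same resource directory might
look roughly like this. The file names are hypothetical, and I have not
checked how two generators of the same type interact, so take it as a
sketch:

```xml
<!-- Sketch only: both files are assumed to live in the directory
     passed via -resources. The file names are made up for illustration. -->
<generators>
  <w2vwordcluster dict="news-classes.txt"/>
  <w2vwordcluster dict="wiki-classes.txt"/>
</generators>
```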
Jörn