And what about sequence validators? How would we swap out the default one? The factory should also be used to load custom resources, such as a different implementation of a dictionary, am I right?

Thank you,
William

On Tue, Feb 7, 2012 at 11:57 AM, Joern Kottmann <[email protected]> wrote:

> Yes, let's see what we could do.
>
> The name finder already supports custom feature generation,
> the same feature generation code could be reused by the POS Tagger.
> This is actually already half done.
>
> One of the current limitations is that we cannot store "custom" resources
> in a model. If we specify some kind of Factory class it would be nice if it
> can help us to locate the Artifact Serializer for a custom resource.
>
> We could define one Factory class per component which is able to influence
> how this component is created from the model.
>
> What do you think?
>
> Jörn
>
> On Tue, Feb 7, 2012 at 2:17 PM, [email protected] <[email protected]> wrote:
>
> > Hi,
> >
> > I would like to work on that now, passing a Factory class name to the CLI
> > tools and saving it to the model as a configuration.
> > Do you still think it is a good idea? Or should we find a better way to
> > load custom feature generators and custom sequence validators? I would
> > like to do it for SentenceDetector and POS Tagger for now.
> >
> > Thanks,
> > William
> >
> > On Tue, Jun 21, 2011 at 11:58 AM, Jörn Kottmann <[email protected]> wrote:
> >
> > > On 6/14/11 4:23 AM, [email protected] wrote:
> > >
> > >> Hi,
> > >>
> > >> Currently we have only implemented custom feature generators that we
> > >> can pass from the command line for NameFinder, but it would be very
> > >> nice to have it for all tools.
> > >> The Thai sentence detector customization is nice and simple, but to do
> > >> something for other languages the user would need to branch the code.
> > >> We should allow users to pass a factory class name from the command
> > >> line. Maybe we could do it for every tool that doesn't use sequence
> > >> feature generator.
> > >> It would also be nice to save the factory class name to the model to
> > >> make sure we are using the same feature generator during runtime and
> > >> evaluation.
> > >>
> > >> What do you think? Maybe you have thought of a better solution for that.
> > >>
> > >
> > > The first approach OpenNLP came up with to customize the feature
> > > generation of a component is to simply pass in a context generator.
> > > Well, that does not really work with the new model packages and the
> > > command line.
> > > We never really came up with a solution to this problem or discussed it.
> > >
> > > William suggests that we should use a class name to load a factory class.
> > > And I think we then should also remove the support to pass in a context
> > > generator.
> > >
> > > I believe it is a good way of solving the issue, since the model can then
> > > be used by any code which integrates OpenNLP and has an additional jar on
> > > the classpath.
> > > That will for example work well with our UIMA integration.
> > >
> > > These models might not be well suited for distribution to a wider group
> > > of people since they always need the factory class which we cannot put
> > > inside the model because of security issues.
> > >
> > > For components where we need to adapt the feature generation to a
> > > language I still suggest that we continue to define default feature
> > > generation which is dependent on the language, as we already do for Thai
> > > in the sentence detector.
> > >
> > > Well, I am not yet sure how it should be done for the parser, doccat and
> > > coref.
> > >
> > > Jörn
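
To make the proposal in this thread concrete, here is a minimal sketch of the idea: a factory class name is stored with the model's configuration and the factory is instantiated reflectively at load time, falling back to a default when none is named. Everything in it is an assumption made for illustration and not OpenNLP API: the `ToolFactory` interface, the two factory classes, the `factory` manifest key, and the use of `java.util.Properties` to stand in for the model's configuration.

```java
import java.util.Properties;

public class FactoryLoadingSketch {

    /** Hypothetical extension point; not an actual OpenNLP interface. */
    public interface ToolFactory {
        String describeContextGenerator();
    }

    /** Used when the model configuration names no factory. */
    public static class DefaultToolFactory implements ToolFactory {
        @Override
        public String describeContextGenerator() {
            return "default";
        }
    }

    /** Stands in for a user-supplied factory shipped in an extra jar on the classpath. */
    public static class CustomToolFactory implements ToolFactory {
        @Override
        public String describeContextGenerator() {
            return "custom";
        }
    }

    /** Hypothetical key under which the factory class name is saved with the model. */
    public static final String FACTORY_KEY = "factory";

    /** Instantiates the factory named in the manifest, or the default if none is named. */
    public static ToolFactory loadFactory(Properties manifest) {
        String className = manifest.getProperty(FACTORY_KEY);
        if (className == null) {
            return new DefaultToolFactory();
        }
        try {
            // The class must be on the classpath; it is never read from the model itself.
            Class<?> clazz = Class.forName(className);
            return clazz.asSubclass(ToolFactory.class)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalArgumentException("Cannot load factory: " + className, e);
        }
    }

    public static void main(String[] args) {
        // Training side: remember which factory was used.
        Properties manifest = new Properties();
        manifest.setProperty(FACTORY_KEY, CustomToolFactory.class.getName());

        // Runtime and evaluation side: the same factory is recovered from the manifest.
        System.out.println(loadFactory(manifest).describeContextGenerator());         // prints "custom"
        System.out.println(loadFactory(new Properties()).describeContextGenerator()); // prints "default"
    }
}
```

Only the class name travels with the model, so a model trained with a custom factory can be loaded only where that class is on the classpath. That is the distribution caveat raised above.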
