[ https://issues.apache.org/jira/browse/OPENNLP-17?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12989920#comment-12989920 ]
Jörn Kottmann commented on OPENNLP-17:
--------------------------------------
The custom XML solution wasn't that bad. I created a small XML-based DSL to
specify the feature generation. Compared to the Spring XML file it looks much
nicer and is very easy to understand and specify. It also avoids the extra
dependency, which is a bit annoying in some use cases.
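To make the idea concrete, here is a rough sketch of how such a descriptor could be read with only the JDK's XML parser. The element and attribute names (generators, window, token, tokenclass, prevLength, nextLength) and the class name are assumptions made up for illustration, not the actual DSL. The code only walks the element tree; it does not build real feature generators.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

class FeatureDescriptorSketch {

    // Hypothetical descriptor: the element and attribute names are illustrative only.
    static final String SAMPLE =
        "<generators>"
      + "<window prevLength=\"2\" nextLength=\"2\"><token/></window>"
      + "<tokenclass/>"
      + "</generators>";

    // Parses the descriptor and returns one "depth:elementName" entry per
    // element, in document order (depth-first).
    static List<String> describe(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        List<String> out = new ArrayList<>();
        collect(doc.getDocumentElement(), 0, out);
        return out;
    }

    // Records the current element, then recurses into its child elements.
    // A real implementation would create a feature generator here instead.
    private static void collect(Element element, int depth, List<String> out) {
        out.add(depth + ":" + element.getTagName());
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child instanceof Element) {
                collect((Element) child, depth + 1, out);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        for (String line : describe(SAMPLE)) {
            System.out.println(line);
        }
    }
}
```

The nesting is what makes a DSL like this readable: a wrapping generator (here the hypothetical window) simply contains the generators it applies to.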
I still really like the idea of using the Java scripting API. Just place a
script file in the model which constructs the built-in feature generators and
can even implement a new feature generator. The new custom feature generator
can be distributed inside the model. This way the model can simply be deployed
without deploying additional jar files.
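Shipping a script or descriptor inside the model could look roughly like the sketch below. It assumes the model package is a zip-style archive; the entry names (model.bin, generator.featuregen) and the class name are hypothetical. It only shows the round trip: the configuration is written once at training time and read back from the same package at tagging time.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

class ModelPackageSketch {

    // Hypothetical entry name for the embedded script or descriptor.
    static final String DESCRIPTOR_ENTRY = "generator.featuregen";

    // Training side: bundle the model bytes and the configuration text
    // into a single in-memory zip archive.
    static byte[] writePackage(byte[] modelBytes, String descriptor) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(buffer)) {
            zip.putNextEntry(new ZipEntry("model.bin")); // hypothetical entry name
            zip.write(modelBytes);
            zip.closeEntry();
            zip.putNextEntry(new ZipEntry(DESCRIPTOR_ENTRY));
            zip.write(descriptor.getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        }
        return buffer.toByteArray();
    }

    // Tagging side: load the configuration back from the package.
    // Returns null if the package does not contain one.
    static String readDescriptor(byte[] packageBytes) throws IOException {
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(packageBytes))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (DESCRIPTOR_ENTRY.equals(entry.getName())) {
                    ByteArrayOutputStream out = new ByteArrayOutputStream();
                    byte[] chunk = new byte[4096];
                    int n;
                    while ((n = zip.read(chunk)) != -1) {
                        out.write(chunk, 0, n);
                    }
                    return new String(out.toByteArray(), StandardCharsets.UTF_8);
                }
            }
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        byte[] pkg = writePackage(new byte[] {1, 2, 3}, "<generators/>");
        System.out.println(readDescriptor(pkg));
    }
}
```

Because the tagging code reads whatever was stored at training time, the two configurations cannot drift apart.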
We also discussed simply placing an entire jar file inside the model, but that
might be problematic in certain deployments.
I am actually -1 for dependency injection, because of the additional
dependencies and because the XML descriptor does not look nice.
+1 for custom XML and +1 for the Java scripting API.
> Add support for custom feature generator configuration embedded in the model
> package
> ------------------------------------------------------------------------------------
>
> Key: OPENNLP-17
> URL: https://issues.apache.org/jira/browse/OPENNLP-17
> Project: OpenNLP
> Issue Type: Improvement
> Components: Chunker, Name Finder, POS Tagger
> Affects Versions: tools-1.5.0-sourceforge
> Reporter: Jörn Kottmann
>
> Add support for custom feature generator configuration embedded in the model
> package.
> The configuration of the feature generators for the name finder component can
> be quite complex, and it must always be done twice: once for training and once
> for tagging. Doing it at two different points in time makes the feature
> generation very error-prone. Small mistakes lead to a drop in detection
> performance which might be difficult to notice.
> To solve this issue, add the configuration to the model; then it only needs to
> be specified during training and can be loaded from the model during tagging.
> Another advantage is that custom feature generation is otherwise difficult to
> use, because the integration code itself must deal with setting up the feature
> generators. In some cases the user does not even have control over the code, or
> does not want to change it, e.g. in the UIMA wrappers.
> The same logic should be used for the POS Tagger and Chunker.
> This issue was migrated from SourceForge:
> https://sourceforge.net/tracker/?func=detail&aid=1941380&group_id=3368&atid=353368