[
https://issues.apache.org/jira/browse/CONNECTORS-1270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15118175#comment-15118175
]
Rafa Haro commented on CONNECTORS-1270:
---------------------------------------
Hi [~daddywri]. Regarding multithreaded operation, OpenNLP's models are thread
safe, and because the models' binary files are usually large, they must be
loaded into memory once, held statically as singletons, and shared across the
whole system. This is precisely what the OpenNlpExtractorConfig class does.
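To make the sharing pattern concrete, here is a minimal sketch. It is not the
OpenNlpExtractorConfig code: SharedModelCache and its loader function are
hypothetical names, and the model type is left generic so that no particular
OpenNLP call is assumed.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/**
 * Hypothetical cache: each model file is loaded at most once per cache
 * instance and the resulting object is handed to every caller.
 */
final class SharedModelCache<M> {
    // Keyed by model path; ConcurrentHashMap makes concurrent lookups safe.
    private final Map<String, M> models = new ConcurrentHashMap<>();
    // Assumed to do the expensive work of reading the binary model file.
    private final Function<String, M> loader;

    SharedModelCache(Function<String, M> loader) {
        this.loader = loader;
    }

    /** Returns the shared model for this path, loading it on first use. */
    M get(String modelPath) {
        // computeIfAbsent runs the loader at most once per key, even when
        // several threads ask for the same model at the same time.
        return models.computeIfAbsent(modelPath, loader);
    }
}
```

Holding one such cache in a static field would give the process-wide
singleton behaviour. Sharing is only safe if the loaded model objects are
themselves safe to read from several threads.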
About the model files: in the Apache Stanbol project we already had a similar
problem regarding distribution. Most of the available models for different
languages are not compatible with the Apache license, so we can't distribute
them as part of the project.
I see two main problems with downloading the model files on the fly. First,
ManifoldCF could be running offline, without access to the internet. Second,
OpenNLP is in the end a tool for training your own annotation models, so
providing only URL-based access to the models would prevent people from using
their own trained ones for a custom annotation job.
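One way to cover both concerns would be to let the configured model location
be either a local file path or a URL. The sketch below only illustrates that
idea; ModelLocator, isRemote and open are hypothetical names and are not part
of the connector or of OpenNLP.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Paths;

/** Hypothetical helper: opens a model from a local path or an http(s) URL. */
final class ModelLocator {
    private ModelLocator() {
    }

    /** True when the location looks like an http or https URL. */
    static boolean isRemote(String location) {
        return location.startsWith("http://") || location.startsWith("https://");
    }

    /** Opens the model bytes; the caller is responsible for closing the stream. */
    static InputStream open(String location) throws IOException {
        if (isRemote(location)) {
            // Needs network access, so this branch fails on an offline install.
            return URI.create(location).toURL().openStream();
        }
        // Local file: works offline and with models the user trained themselves.
        return Files.newInputStream(Paths.get(location));
    }
}
```

With something like this, an offline installation or a custom-trained model
would simply be configured with a path on the local filesystem.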
> Import OpenNLP connector into trunk
> -----------------------------------
>
> Key: CONNECTORS-1270
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1270
> Project: ManifoldCF
> Issue Type: Task
> Reporter: Karl Wright
> Assignee: Rafa Haro
> Fix For: ManifoldCF 2.4
>
>
> An OpenNLP connector has been contributed on github. Need to import it into
> MCF, first to a branch, then to trunk.