GitHub user kottmann opened a pull request:
https://github.com/apache/opennlp/pull/51
OPENNLP-923: Wrap all lines longer than 110 chars
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/kottmann/opennlp OPENNLP-923-2
GitHub user smarthi opened a pull request:
https://github.com/apache/opennlp/pull/50
OPENNLP-930: [WIP Don't Merge] Write test for RegexNameFinderFactory
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/smarthi/opennlp
Github user asfgit closed the pull request at:
https://github.com/apache/opennlp/pull/49
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is
Github user asfgit closed the pull request at:
https://github.com/apache/opennlp/pull/48
GitHub user kottmann opened a pull request:
https://github.com/apache/opennlp/pull/49
OPENNLP-923: Wrap all lines longer than 110 chars
And also add checkstyle enforcement
You can merge this pull request into a Git repository by running:
$ git pull
On Wed, 2017-01-11 at 17:14 +0000, Russ, Daniel (NIH/CIT) [E] wrote:
> Hi,
>
> I am a little confused. Why do you want to share an instance of a
> SentenceDetectorME across threads? Are your documents very long single
> sentences? I don’t think there is enough work for the
> SentenceDetectorME to
+1 ease of use is important for us and has always been a strong focus
here.
Jörn
On Wed, 2017-01-11 at 17:39 +0100, Thilo Goetz wrote:
> You can do all sorts of things. I implemented a version now that
> uses
> ThreadLocals. Works fine, but quite frankly, it's a pain in the
> butt.
> The world
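The ThreadLocal workaround Thilo describes can be sketched roughly as follows. The `Detector` class here is a hypothetical stand-in for a non-thread-safe component such as SentenceDetectorME (the real OpenNLP classes are omitted so the sketch is self-contained); the point is the pattern, not the API.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Stand-in for a non-thread-safe component: it keeps mutable
// per-call state in a member variable, like SentenceDetectorME does.
class Detector {
    static final AtomicInteger instances = new AtomicInteger();
    private String lastInput; // mutable per-call state -> not thread safe
    Detector() { instances.incrementAndGet(); }
    int detect(String text) {
        lastInput = text;
        return text.split("\\. ").length;
    }
}

public class ThreadLocalDemo {
    // One Detector per thread, created lazily on first use. In real code
    // the (shared, read-only) model would be captured by this factory.
    private static final ThreadLocal<Detector> DETECTOR =
        ThreadLocal.withInitial(Detector::new);

    static int detect(String text) {
        return DETECTOR.get().detect(text);
    }

    public static void main(String[] args) throws Exception {
        Runnable task = () -> detect("One. Two. Three.");
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start(); t2.start();
        t1.join(); t2.join();
        // Two worker threads -> two Detector instances were created.
        System.out.println(Detector.instances.get()); // prints 2
    }
}
```

This works with any threading model, including pools you don't control, which is why it comes up here; the downside ("pain in the butt") is the extra indirection and the risk of leaking instances when pool threads are long-lived.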
On Wed, 2017-01-11 at 11:05 +0100, Thilo Goetz wrote:
> in a recent project, I was using SentenceDetectorME, TokenizerME and
> POSTaggerME. It turns out that none of those is thread safe. This is
> because the classification probabilities for the last tag() call
> (for
> example) are stored in
Hi,
I am a little confused. Why do you want to share an instance of a
SentenceDetectorME across threads? Are your documents very long single
sentences? I don’t think there is enough work for the SentenceDetectorME to
make up the cost of multithreading on 4 cores.
Previously, I had
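Daniel's cost/benefit point can be made concrete with a rough Amdahl's-law estimate. The numbers below are illustrative assumptions, not measurements of OpenNLP: if per-sentence work is small, task hand-off and synchronization dominate, the effective parallel fraction p drops, and 4 cores buy much less than 4x.

```java
public class AmdahlDemo {
    // Amdahl's law: speedup = 1 / ((1 - p) + p / n),
    // where p is the parallelizable fraction and n the core count.
    static double speedup(double p, int n) {
        return 1.0 / ((1.0 - p) + p / n);
    }

    public static void main(String[] args) {
        // Mostly parallel work: close to the ideal 4x.
        System.out.printf("p=0.95, 4 cores: %.2fx%n", speedup(0.95, 4));
        // Cheap per-item work, overhead dominates: barely worth it.
        System.out.printf("p=0.50, 4 cores: %.2fx%n", speedup(0.50, 4));
    }
}
```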
Control over threading is not required to "share the model between threads
and create one instance of the component per thread".
One could use a scope in which variables are guaranteed to be confined to
the call stack (for example, method-local variables in Java).
You could then:
a) Instantiate
+1 to make SentenceDetectorME and TokenizerME thread safe and everything else
where it works out for us.
Making it thread safe only makes sense if throughput scales almost linearly
with the number of cores. This works with the current model.
For the POSTagger we would have to change the API
Correct me if I'm wrong, but that approach only works if you control the
thread creation yourself. In my case, for example, I was using Scala's
parallel collection API, and had no control over the threading. I will
usually want to create one service that does tokenization or POS tagging
or
Hello Thilo,
I am interested in your opinion about how this is done currently.
We say: "Share the model between threads and create one instance of the
component per thread".
Wouldn't that work well in your use case?
Jörn
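The recommended pattern ("share the model between threads and create one instance of the component per thread") can be sketched like this. `Model` and `Tokenizer` are hypothetical stand-ins, not the OpenNLP classes: the immutable model is loaded once and shared, while each task constructs its own cheap component around it.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical immutable model: expensive to load, safe to share.
class Model {
    final String data;
    Model(String data) { this.data = data; }
}

// Hypothetical non-thread-safe component, analogous to TokenizerME.
class Tokenizer {
    private final Model model;      // shared, read-only
    private String[] lastTokens;    // per-call mutable state
    Tokenizer(Model model) { this.model = model; }
    String[] tokenize(String text) {
        lastTokens = text.split("\\s+");
        return lastTokens;
    }
}

public class PerThreadDemo {
    public static void main(String[] args) throws Exception {
        Model shared = new Model("loaded once");
        ExecutorService pool = Executors.newFixedThreadPool(4);
        // Each task builds its own component; only the model is shared.
        List<Future<Integer>> results = pool.invokeAll(
            java.util.Collections.nCopies(4,
                (Callable<Integer>) () -> {
                    Tokenizer t = new Tokenizer(shared);
                    return t.tokenize("a b c").length;
                }));
        for (Future<Integer> f : results)
            System.out.println(f.get()); // each task counts 3 tokens
        pool.shutdown();
    }
}
```

This is exactly what breaks down in Thilo's case: with Scala's parallel collections the threads are created for you, so there is no natural place to hang the per-thread construction.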
On Wed, Jan 11, 2017 at 11:05 AM, Thilo Goetz wrote:
Github user asfgit closed the pull request at:
https://github.com/apache/opennlp/pull/44
Hi,
in a recent project, I was using SentenceDetectorME, TokenizerME and
POSTaggerME. It turns out that none of those is thread safe. This is
because the classification probabilities for the last tag() call (for
example) are stored in a member variable and can be retrieved by a
separate API
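The hazard described here can be shown with a toy class (not the real POSTaggerME) that mimics the pattern: the last call's results are stashed in a member and read back through a second method, so a second caller between `tag()` and `probs()` silently replaces the state.

```java
// Toy illustration of the "results of the last call stored in a member
// variable" pattern that makes the real components non-thread-safe.
class ToyTagger {
    private double[] lastProbs; // shared mutable state between methods

    String[] tag(String[] tokens) {
        double[] probs = new double[tokens.length];
        java.util.Arrays.fill(probs, 1.0 / tokens.length);
        lastProbs = probs; // visible to every caller of probs()
        String[] tags = new String[tokens.length];
        java.util.Arrays.fill(tags, "NN");
        return tags;
    }

    // May return the probabilities of *another* caller's tag() call.
    double[] probs() { return lastProbs; }
}

public class RaceDemo {
    public static void main(String[] args) {
        ToyTagger tagger = new ToyTagger();
        tagger.tag(new String[] {"a", "b"});
        // Fine single-threaded: probs() matches the last tag() call...
        System.out.println(tagger.probs().length);        // prints 2
        // ...but if another thread tags in between, the state is clobbered:
        tagger.tag(new String[] {"x", "y", "z"});
        System.out.println(tagger.probs().length);        // prints 3
    }
}
```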
GitHub user kottmann opened a pull request:
https://github.com/apache/opennlp/pull/47
OPENNLP-932: Use checkstyle suppression instead of mvn exclude
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/kottmann/opennlp OPENNLP-932