On Wed, 2017-01-11 at 17:14 +0000, Russ, Daniel (NIH/CIT) [E] wrote:
> Hi,
> 
>    I am a little confused. Why do you want to share an instance of a
> SentenceDetectorME across threads? Are your documents very long single
> sentences? I don’t think there is enough work for the
> SentenceDetectorME to make up the cost of multithreading on 4
> cores.
> 
>    Previously, I had multiple threads (each with a separate
> SentenceDetectorME/TokenizerME/POSTaggerME) work on different parts
> of a document.   Have you considered decomposing the problem at the
> document level or higher instead of the sentence level?  Maybe you
> could use regex to break the document into paragraphs and have the
> threads work on the paragraphs.
> 

To me it reads like he is mostly concerned with convenience: he wants
to use our tools in the way that suits him. That is fine.

He wants to create a SentenceDetectorME instance, for example, and then
have multiple threads call it to split some text blocks. The motivation
is that this is the easiest and most convenient thing to do, not that
there is no other way to solve the problem. And when a component is
thread safe, you have fewer worries about using it.
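
To make that concrete, here is a minimal sketch of one way to offer a
single shareable object without making the detector itself thread safe:
a facade that hands every calling thread its own private instance via
ThreadLocal. Splitter, NaiveSplitter, ThreadLocalSplitter and
SharedSplitterDemo are hypothetical stand-ins written for this mail, not
OpenNLP classes. In real code the factory would presumably build a
SentenceDetectorME from one loaded SentenceModel, assuming the model
itself can safely be shared between threads.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Supplier;

// Hypothetical stand-in for a component such as a sentence detector.
interface Splitter {
    List<String> split(String text);
}

// Deliberately NOT thread safe: it keeps scratch state in a field,
// so two threads calling split() on one instance could interfere.
final class NaiveSplitter implements Splitter {
    private final List<String> buffer = new ArrayList<>();

    @Override
    public List<String> split(String text) {
        buffer.clear();
        // Crude rule for illustration only: break after '.', '!' or '?'.
        for (String part : text.split("(?<=[.!?])\\s+")) {
            if (!part.isEmpty()) {
                buffer.add(part);
            }
        }
        return new ArrayList<>(buffer);
    }
}

// Thread-safe facade: each calling thread lazily gets its own delegate,
// so callers can share one ThreadLocalSplitter freely.
final class ThreadLocalSplitter implements Splitter {
    private final ThreadLocal<Splitter> perThread;

    ThreadLocalSplitter(Supplier<? extends Splitter> factory) {
        this.perThread = ThreadLocal.withInitial(factory);
    }

    @Override
    public List<String> split(String text) {
        return perThread.get().split(text);
    }
}

final class SharedSplitterDemo {
    // Splits every text block on a fixed-size pool using one shared Splitter.
    // Results come back in the same order as the input blocks.
    static List<List<String>> splitAll(Splitter shared, List<String> blocks, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<List<String>>> pending = new ArrayList<>();
            for (String block : blocks) {
                pending.add(pool.submit(() -> shared.split(block)));
            }
            List<List<String>> results = new ArrayList<>();
            for (Future<List<String>> future : pending) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("splitting failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

The user then writes new ThreadLocalSplitter(NaiveSplitter::new) once
and passes it to any number of threads. The cost is one delegate per
thread that ever calls it, which matters with large or short-lived
thread pools.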

We get reminded once in a while on the user list that this is not
possible, and people never like that much.

+1 to merge PRs that help us with this

Jörn

 

