Hi,
in a recent project, I was using SentenceDetectorME, TokenizerME and
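POSTaggerME. It turns out that none of them is thread-safe. The reason
is that per-call results are kept in the object itself: for example, the
classification probabilities for the last tag() call are stored in a
member variable and retrieved by a separate API call, so a concurrent
call from another thread can overwrite them in between.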
I'm planning to build thread-safe versions for myself, and I'd be happy
to contribute a patch if there is interest. This could be done as a
conservative extension with an additional method such as tagReentrant;
the old API calls would continue to work as before and would still
not be thread-safe. Alternatively, one could redesign the API so that
everything is thread-safe, but that would break backwards compatibility.
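To make the conservative option concrete, here is a minimal sketch of
what I have in mind. It is not OpenNLP code: ToyTagger, TagResult and
the capitalization rule are made-up stand-ins, and only the method names
tag(), probs() and tagReentrant mirror the idea above. The point is that
the reentrant method returns the probabilities together with the tags
instead of parking them in a member variable.

```java
import java.util.Arrays;

public class ReentrantTaggerSketch {

    /** Hypothetical immutable result: tags plus the probabilities of that same call. */
    static final class TagResult {
        private final String[] tags;
        private final double[] probs;

        TagResult(String[] tags, double[] probs) {
            this.tags = tags.clone();
            this.probs = probs.clone();
        }

        String[] tags() { return tags.clone(); }
        double[] probs() { return probs.clone(); }
    }

    /** Hypothetical stand-in for a tagger; the "model" is a toy capitalization rule. */
    static final class ToyTagger {
        // Shared mutable state: this is what makes the old-style API unsafe.
        private double[] lastProbs = new double[0];

        /** Old-style call: stores the probabilities as a side effect. Not thread-safe. */
        String[] tag(String[] tokens) {
            TagResult result = tagReentrant(tokens);
            lastProbs = result.probs();
            return result.tags();
        }

        /** Old-style accessor: returns whatever the most recent tag() call stored. */
        double[] probs() { return lastProbs.clone(); }

        /** New-style call: touches no member state, so it can be called concurrently. */
        TagResult tagReentrant(String[] tokens) {
            String[] tags = new String[tokens.length];
            double[] probs = new double[tokens.length];
            for (int i = 0; i < tokens.length; i++) {
                boolean capitalized = !tokens[i].isEmpty()
                        && Character.isUpperCase(tokens[i].charAt(0));
                tags[i] = capitalized ? "NNP" : "NN";
                probs[i] = capitalized ? 0.9 : 0.6;
            }
            return new TagResult(tags, probs);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        ToyTagger shared = new ToyTagger();

        // Two threads share one tagger; each gets its own result object.
        Runnable first = () -> {
            TagResult r = shared.tagReentrant(new String[] {"Thilo", "writes", "code"});
            System.out.println(Arrays.toString(r.tags()) + " " + Arrays.toString(r.probs()));
        };
        Runnable second = () -> {
            TagResult r = shared.tagReentrant(new String[] {"threads"});
            System.out.println(Arrays.toString(r.tags()) + " " + Arrays.toString(r.probs()));
        };

        Thread t1 = new Thread(first);
        Thread t2 = new Thread(second);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
    }
}
```

Here the old tag()/probs() pair is kept and simply delegates to the new
method, so existing callers would see no change in behavior.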
Final question: if I do this for the classes mentioned above, are there
other tools that should be made thread safe while we're at it?
Opinions?
--Thilo
- Thread-safe versions of some of the tools Thilo Goetz