[
https://issues.apache.org/jira/browse/OPENNLP-1320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17628499#comment-17628499
]
ASF GitHub Bot commented on OPENNLP-1320:
-----------------------------------------
rzo1 commented on code in PR #385:
URL: https://github.com/apache/opennlp/pull/385#discussion_r1013330656
##########
opennlp-morfologik-addon/src/main/java/opennlp/morfologik/lemmatizer/MorfologikLemmatizer.java:
##########
@@ -47,7 +47,7 @@ public MorfologikLemmatizer(Dictionary dictionary) throws IllegalArgumentExcepti
dictLookup = new DictionaryLookup(dictionary);
}
- private List<String> lemmatize(String word, String postag) {
+ private synchronized List<String> lemmatize(String word, String postag) {
Review Comment:
An alternative to `synchronized` would be to re-create and dispose of the
`DictionaryLookup` on each call; it "is cheap to create and dispose (so it
makes no sense to cache)" according to
https://github.com/morfologik/morfologik-stemming/issues/69, i.e. moving to
```java
List<WordData> dictMap = new DictionaryLookup(dictionary).lookup(word.toLowerCase());
```
and just storing the (thread-safe) dictionary in the `MorfologikLemmatizer`
instead of the non-thread-safe `DictionaryLookup`.
That would avoid the synchronization cost. Wdyt?
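To make the idea concrete, here is a minimal sketch of the per-call pattern. It does not use the real morfologik types: `SharedDictionary`, `StatefulLookup` and `PerCallLemmatizer` are hypothetical stand-ins for the thread-safe dictionary, the non-thread-safe `DictionaryLookup` and the lemmatizer, and the real lookup may behave differently in detail.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in for the read-only dictionary: immutable, so safe to share.
final class SharedDictionary {
  private final Map<String, List<String>> entries;

  SharedDictionary(Map<String, List<String>> entries) {
    this.entries = Map.copyOf(entries); // immutable snapshot of the entries
  }

  List<String> get(String key) {
    return entries.getOrDefault(key, List.of());
  }
}

// Hypothetical stand-in for the lookup object: it reuses an internal buffer
// between calls, so a single instance must not be shared across threads.
final class StatefulLookup {
  private final SharedDictionary dictionary;
  private final List<String> buffer = new ArrayList<>(); // mutable, reused state

  StatefulLookup(SharedDictionary dictionary) {
    this.dictionary = dictionary;
  }

  List<String> lookup(String word) {
    buffer.clear();
    buffer.addAll(dictionary.get(word));
    return buffer; // the caller sees the internal buffer
  }
}

// The lemmatizer keeps only the shared dictionary as a field.
final class PerCallLemmatizer {
  private final SharedDictionary dictionary;

  PerCallLemmatizer(SharedDictionary dictionary) {
    this.dictionary = dictionary;
  }

  // No `synchronized`: every call creates and discards its own lookup,
  // so no mutable state is shared between threads.
  List<String> lemmatize(String word) {
    StatefulLookup lookup = new StatefulLookup(dictionary);
    // Locale.ROOT is an assumption of this sketch; the snippet above uses the default locale.
    List<String> hits = lookup.lookup(word.toLowerCase(Locale.ROOT));
    return List.copyOf(hits); // copy, so the lookup's buffer never escapes
  }
}
```

The trade-off is one short-lived allocation per call instead of lock contention on a shared instance; whether that is actually cheaper here would need to be measured.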
> Makes lemmatize of MorfologikLemmatizer thread-safe
> ---------------------------------------------------
>
> Key: OPENNLP-1320
> URL: https://issues.apache.org/jira/browse/OPENNLP-1320
> Project: OpenNLP
> Issue Type: Bug
> Reporter: Lucas Avanço
> Priority: Major
>
> The method lemmatize of MorfologikLemmatizer is not thread-safe.
> Concurrent invocations may raise exceptions and return unpredictable results.
> It seems that the whole method must be synchronized because the variable returned by
> the morfologik lib is shared between threads.