rzo1 commented on PR #1003:
URL: https://github.com/apache/opennlp/pull/1003#issuecomment-4156851451

   Hi,
   
   Thanks for the contribution! Overall, I like the idea of looking into 
built-in thread safety rather than relying on ThreadLocal-based wrappers, which 
have known issues in Jakarta EE and other long-lived thread environments. 
   
   A few concerns I'd like to discuss before this can move forward (imho):      
                               
                                                                                
                                                                     
   1. Benchmarks (no JMH)                                                       
                                                 
                                                                                
                                                                     
     The benchmarks are hand-rolled System.nanoTime() + ExecutorService loops. 
Without JMH, the results are susceptible to JIT warmup, GC pauses, and profile 
pollution, i.e. there's no fork isolation, no warmup iterations, and no 
statistical variance reporting. For a change that removes multiple 
     caching layers and claims "performance within noise," JMH would be 
preferable. We already have a profile for JMH benchmarks.                       
                   
                               
   2. Caches removed without replacement
   
     Three layers of caching were removed as a shortcut to thread safety:       
                                                                     
     
     - `CachedFeatureGenerator`: the class is now a pass-through that caches 
nothing, despite its name. During beam search, the same token position may be 
feature-generated up to k times (beam width) with identical inputs. This cache 
was saving real work.
     - `DefaultPOSContextGenerator` / `ConfigurablePOSContextGenerator`: 
per-sentence context caches removed entirely.                                  
     - `BeamSearch.contextsCache: you note this was buggy (stale references 
from the shared probs[] buffer). That may be valid, but removing it rather than 
fixing it (e.g., storing copies) conflates a bug fix with the thread-safety 
change.                                                 
                                                           
    The regression benchmark reports "performance within noise," but without 
JMH-level statistical rigor that's hard to verify. More importantly, the 
benchmark uses a small set of short sentences: a benchmark against a real-world 
dataset (e.g., from the eval/test corpora: 
https://nightlies.apache.org/opennlp/) would be far more convincing, 
particularly for POS tagging where the feature generation cache had the most 
impact under larger workloads. A thread-safe alternative would be making the 
caches method-local rather than removing them entirely.                         
                                
   
   3. Thread-safety tests are not robust                                        
                                                                   
                               
     - No contention forcing: there's no CyclicBarrier or CountDownLatch to 
ensure threads hit the critical section simultaneously. Threads free-run, which 
reduces the probability of surfacing races.                                     
                                         
     - LemmatizerME was patched but has no thread-safety test.                  
                                                                     
     - probs() under concurrent access is not tested, despite being preserved 
as volatile for backward compatibility.                                
     - The test could pass on a 2-core CI machine and fail on a 64-core box. I 
think, that we sohuld at minimum set higher iteration counts with  
barrier-synchronized thread starts.                                             
                                                                
                                                                                
                                                                     
   4. Missing coverage                                                          
                                                                   
                               
     - Only 4 of 7 ME classes are addressed (ChunkerME, NameFinderME, 
LanguageDetectorME are untouched). This is fine as a scoped PR, but worth 
noting, so the existing ThreadSafe*ME wrappers can't be deprecated yet.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to