Concurrent modification checks dominate index iteration time.
-------------------------------------------------------------

                 Key: UIMA-1364
                 URL: https://issues.apache.org/jira/browse/UIMA-1364
             Project: UIMA
          Issue Type: Improvement
            Reporter: Branimir Lambov
         Attachments: ConcurrentModificationPatch.txt

Iterating over the annotation index with even a moderate number of defined
types is dominated by the time spent checking individual indexes for concurrent
modification. This is because concurrent modification checks are done on the
iterators for all types being iterated over, even if a step only needs to
advance a couple of them. Checking all iterators for modification has linear
complexity in the number of subiterators used, while the actual iteration can
be implemented with logarithmic complexity using, e.g., a binary heap.
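
As a rough illustration of the complexity argument, the sketch below merges
several sorted subiterators with a binary heap (java.util.PriorityQueue), so
each step costs O(log k) for k subiterators. The class and method names are
hypothetical and plain integers stand in for annotations; this is not the UIMA
implementation.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch, not UIMA code: k-way merge of sorted lists via a heap.
class HeapMergeSketch {
    // One heap entry: the current head of a subiterator plus the rest of it.
    private static final class Entry implements Comparable<Entry> {
        final int head;
        final Iterator<Integer> rest;

        Entry(int head, Iterator<Integer> rest) {
            this.head = head;
            this.rest = rest;
        }

        @Override
        public int compareTo(Entry other) {
            return Integer.compare(head, other.head);
        }
    }

    // Merges already-sorted lists into one sorted list.
    static List<Integer> merge(List<List<Integer>> sortedLists) {
        PriorityQueue<Entry> heap = new PriorityQueue<>();
        for (List<Integer> list : sortedLists) {
            Iterator<Integer> it = list.iterator();
            if (it.hasNext()) {
                heap.add(new Entry(it.next(), it));
            }
        }
        List<Integer> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            Entry smallest = heap.poll();          // O(log k)
            out.add(smallest.head);
            if (smallest.rest.hasNext()) {
                // Only the subiterator that was advanced is touched again.
                heap.add(new Entry(smallest.rest.next(), smallest.rest)); // O(log k)
            }
        }
        return out;
    }
}
```

Each step touches only the subiterator whose head was consumed, so a check
that visits all k subiterators on every step would outweigh the merge itself.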

The UIMA documentation and JavaDoc do not state that the iterators should 
always recognize concurrent modification (FSIterator JavaDoc states 
"Implementations of this interface are not required to be fail-fast. That is, 
if the iterator's collection is modified, the effects on the iterator are in 
general undefined."). It thus makes sense to reduce the number of iterators 
being tested for concurrent modification at each moveToNext() step.

The attached patch replaces the checkConcurrentModificationAll() call in
FSIndexRepositoryImpl.PointerIterator.moveToNext() with concurrent modification
checks on only the iterators used by the step; when the iterator becomes
invalid, it also checks all involved iterators for modification. This should
catch almost all concurrent modifications without the excessive overhead.
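
The idea can be sketched roughly as follows. This is not the attached patch:
SubIndex, Cursor, and checksPerformed are hypothetical names, integers stand in
for annotations, and the smallest head is found with a linear scan for brevity
where a real implementation would presumably keep the cursors in a heap.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;

// Hypothetical sketch, not the UIMA patch: check only the cursor used per step.
class SelectiveCheckSketch {
    // A sorted sub-index with a modification counter.
    static final class SubIndex {
        final List<Integer> items = new ArrayList<>();
        int modCount = 0;

        void add(int value) {
            items.add(value);
            items.sort(null);   // natural ordering
            modCount++;
        }
    }

    // Cursor over one sub-index; remembers the counter seen at creation.
    static final class Cursor {
        final SubIndex index;
        final int expectedModCount;
        int pos = 0;

        Cursor(SubIndex index) {
            this.index = index;
            this.expectedModCount = index.modCount;
        }

        boolean isValid() { return pos < index.items.size(); }

        int get() { return index.items.get(pos); }

        void check() {
            if (index.modCount != expectedModCount) {
                throw new ConcurrentModificationException();
            }
        }
    }

    final List<Cursor> cursors = new ArrayList<>();
    int checksPerformed = 0;   // counts checks, for illustration only

    SelectiveCheckSketch(List<SubIndex> indexes) {
        for (SubIndex index : indexes) {
            cursors.add(new Cursor(index));
        }
    }

    // Returns the next smallest element, or null once all cursors are exhausted.
    Integer next() {
        Cursor best = null;
        for (Cursor c : cursors) {   // linear scan for brevity only
            if (c.isValid() && (best == null || c.get() < best.get())) {
                best = c;
            }
        }
        if (best == null) {
            // The merged iterator became invalid: now check every cursor.
            for (Cursor c : cursors) {
                c.check();
                checksPerformed++;
            }
            return null;
        }
        best.check();                // check only the cursor this step uses
        checksPerformed++;
        int value = best.get();
        best.pos++;
        return value;
    }
}
```

In this sketch a modification to a sub-index goes unnoticed until a step
actually uses that sub-index or the iteration runs off the end, which matches
the "almost all" caveat above.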

In one of our performance tests, iterating over the annotation index with 140
types defined is more than twice as fast after the attached patch is applied.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
