[ 
https://issues.apache.org/jira/browse/UIMA-1364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated UIMA-1364:
----------------------------------

    Attachment: ConcurrentModificationPatch.txt

> Concurrent modification checks dominate index iteration time.
> -------------------------------------------------------------
>
>                 Key: UIMA-1364
>                 URL: https://issues.apache.org/jira/browse/UIMA-1364
>             Project: UIMA
>          Issue Type: Improvement
>            Reporter: Branimir Lambov
>         Attachments: ConcurrentModificationPatch.txt
>
>
> Iterating over the annotation index with even a moderate number of defined 
> types is dominated by the time spent checking individual indexes for 
> concurrent modification. This is because concurrent modification checks are 
> done on the iterators for all types being iterated over, even if the 
> iteration only needs to advance a couple of them. Checking all iterators for 
> modification is linear in the number of subiterators used, while the actual 
> iteration can be implemented with logarithmic complexity using, e.g., a 
> binary heap.
> The UIMA documentation and JavaDoc do not state that the iterators should 
> always recognize concurrent modification (FSIterator JavaDoc states 
> "Implementations of this interface are not required to be fail-fast. That is, 
> if the iterator's collection is modified, the effects on the iterator are in 
> general undefined."). It thus makes sense to reduce the number of iterators 
> being tested for concurrent modification at each moveToNext() step.
> The attached patch replaces the checkConcurrentModificationAll() call in 
> FSIndexRepositoryImpl.PointerIterator.moveToNext() with concurrent 
> modification checks on only the iterators used by that step; when the 
> iterator becomes invalid, it also checks all involved iterators for 
> modification. This should catch almost all concurrent modifications 
> without the excessive overhead.
> In one of our performance tests, iterating over the annotation index with 140 
> types defined is more than twice as fast after the attached patch is applied.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
