[ https://issues.apache.org/jira/browse/UIMA-2332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13865437#comment-13865437 ]
Peter Klügl commented on UIMA-2332:
-----------------------------------
Some more information about the profiling:
- the test script consists essentially of the first three phases of the ANNIE
NER together with the gazetteers. The third phase contains about 54 rules. The
Ruta script contains overall about 50 rules.
- ignoring the initialization, 82% of the time is spent on inference (including
dictionaries) and 18% on initializing the RutaStream, that is, the seeding (2%)
and the RutaBasics (16%)
- the main hotspot is TOP.getAddress() with 37%. Of that, 60% is caused by
FSIteratorWrapper.get() and 25% by FeatureStructureImpl.getType()

The next step of improvement could be to reduce the usage of convenient but
comparatively slow collection classes (lists/sets/maps), e.g., use arrays in
RutaBasic.
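
To illustrate the kind of change meant by replacing collections with arrays,
here is a minimal sketch. It does not show the actual RutaBasic fields: the
class names MapCounter and ArrayCounter are hypothetical, and it assumes that
every type can be mapped to a small dense integer code, which is an assumption
made only for this example.

```java
import java.util.HashMap;
import java.util.Map;

public class TypeCounterSketch {

    /** Collection-based variant: convenient, but every access hashes and boxes. */
    static final class MapCounter {
        private final Map<Integer, Integer> counts = new HashMap<>();

        void increment(int typeCode) {
            // merge() inserts 1 for a new key, otherwise adds 1 to the old value
            counts.merge(typeCode, 1, Integer::sum);
        }

        int get(int typeCode) {
            return counts.getOrDefault(typeCode, 0);
        }
    }

    /** Array-based variant: one primitive slot per (hypothetical) type code. */
    static final class ArrayCounter {
        private final int[] counts;

        ArrayCounter(int numberOfTypes) {
            // all slots start at 0, so unseen types report a count of 0
            counts = new int[numberOfTypes];
        }

        void increment(int typeCode) {
            counts[typeCode]++;
        }

        int get(int typeCode) {
            return counts[typeCode];
        }
    }

    public static void main(String[] args) {
        MapCounter viaMap = new MapCounter();
        ArrayCounter viaArray = new ArrayCounter(4);
        int[] observedTypeCodes = {2, 2, 3};
        for (int code : observedTypeCodes) {
            viaMap.increment(code);
            viaArray.increment(code);
        }
        // both variants hold the same counts; only the storage differs
        System.out.println(viaMap.get(2) + " " + viaArray.get(2));
    }
}
```

The array variant trades flexibility for speed: it needs the number of types
up front and wastes slots for types that never occur, but it avoids hashing
and boxing on every access. Whether that pays off here would have to be
confirmed by profiling again.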
> Profile and optimize Ruta inference performance
> -----------------------------------------------
>
> Key: UIMA-2332
> URL: https://issues.apache.org/jira/browse/UIMA-2332
> Project: UIMA
> Issue Type: Improvement
> Components: ruta
> Affects Versions: 2.0.0TextMarker
> Reporter: Peter Klügl
> Assignee: Peter Klügl
> Priority: Minor
> Fix For: 2.1.1ruta
>
>
> Increase the speed of the ruta rule inference. A starting point is the
> slowdown of UIMA-2330, see RutaTypeMatcher.getMatchingAnnotations()