[jira] [Commented] (UIMA-2332) Profile and optimize Ruta inference performance

JIRA Wed, 08 Jan 2014 05:50:29 -0800

    [ 
https://issues.apache.org/jira/browse/UIMA-2332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13865437#comment-13865437
 ]


Peter Klügl commented on UIMA-2332:
-----------------------------------

Some more information about the profiling:
- The test script essentially consists of the first three phases of the ANNIE 
NER together with the gazetteers. The third phase contains about 54 rules. The 
Ruta script contains about 50 rules overall.
- Ignoring the initialization, 82% of the time is spent on inference (including 
dictionaries) and 18% on initializing the RutaStream, that is, the seeding (2%) 
and the RutaBasics (16%).
- The main hotspot is TOP.getAddress() with 37%. Of that, 60% is caused by 
FSIteratorWrapper.get() and 25% by FeatureStructureImpl.getType().

The next improvement step could be to reduce the use of convenient but costly 
collections (lists/sets/maps), e.g., by using arrays in RutaBasic.
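
A minimal sketch of the idea of replacing a map with an array, under stated 
assumptions: the classes MapBackedCounts and ArrayBackedCounts are hypothetical 
stand-ins and do not reflect the actual fields of RutaBasic, and it is assumed 
that every type can be mapped to a small, non-negative, dense integer code. The 
array variant avoids boxing and hashing on every lookup.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for per-token bookkeeping; NOT the real RutaBasic.
// Collection-based variant: every update boxes the key and the counter.
class MapBackedCounts {
    private final Map<Integer, Integer> partOf = new HashMap<>();

    void add(int typeCode) {
        partOf.merge(typeCode, 1, Integer::sum);
    }

    boolean isPartOf(int typeCode) {
        return partOf.getOrDefault(typeCode, 0) > 0;
    }
}

// Array-based variant: the (assumed dense, non-negative) type code is used
// directly as an index, so a lookup is a bounds check plus one array read.
class ArrayBackedCounts {
    private int[] partOf;

    ArrayBackedCounts(int expectedTypes) {
        partOf = new int[Math.max(1, expectedTypes)];
    }

    void add(int typeCode) {
        if (typeCode >= partOf.length) {
            // Grow on demand so that previously unseen type codes still fit.
            partOf = Arrays.copyOf(partOf, Math.max(typeCode + 1, partOf.length * 2));
        }
        partOf[typeCode]++;
    }

    void remove(int typeCode) {
        if (typeCode < partOf.length && partOf[typeCode] > 0) {
            partOf[typeCode]--;
        }
    }

    boolean isPartOf(int typeCode) {
        return typeCode < partOf.length && partOf[typeCode] > 0;
    }
}

class ArrayVsMapDemo {
    public static void main(String[] args) {
        MapBackedCounts viaMap = new MapBackedCounts();
        ArrayBackedCounts viaArray = new ArrayBackedCounts(4);
        int typeCode = 7; // hypothetical code of some annotation type
        viaMap.add(typeCode);
        viaArray.add(typeCode);
        // Both variants answer the same question for the same input.
        System.out.println(viaMap.isPartOf(typeCode) == viaArray.isPartOf(typeCode));
    }
}
```

The trade-off is memory: each array has as many slots as the largest type code 
seen, so this only pays off if the number of types is moderate. Whether it 
actually helps here would have to be confirmed by profiling again.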

> Profile and optimize Ruta inference performance
> -----------------------------------------------
>
>                 Key: UIMA-2332
>                 URL: https://issues.apache.org/jira/browse/UIMA-2332
>             Project: UIMA
>          Issue Type: Improvement
>          Components: ruta
>    Affects Versions: 2.0.0TextMarker
>            Reporter: Peter Klügl
>            Assignee: Peter Klügl
>            Priority: Minor
>             Fix For: 2.1.1ruta
>
>
> Increase the speed of the Ruta rule inference. A starting point is the 
> slowdown of UIMA-2330, see RutaTypeMatcher.getMatchingAnnotations().



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
