[ 
https://issues.apache.org/jira/browse/UIMA-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Richard Eckart de Castilho updated UIMA-6232:
---------------------------------------------
    Summary: Reduce overhead of createTypeSystemDescription() and friends  
(was: Reduce overhead of createTypeSystemDescription())

> Reduce overhead of createTypeSystemDescription() and friends
> ------------------------------------------------------------
>
>                 Key: UIMA-6232
>                 URL: https://issues.apache.org/jira/browse/UIMA-6232
>             Project: UIMA
>          Issue Type: Improvement
>            Reporter: Richard Eckart de Castilho
>            Assignee: Richard Eckart de Castilho
>            Priority: Major
>             Fix For: 2.6.0uimaFIT
>
>
> uimaFIT offers a range of factory methods which use classpath scanning to 
> locate type system descriptions, type priority definitions and index 
> definitions. 
> The present implementation scans for each type of object once and then stores 
> the locations in which the descriptors were found in a global static 
> variable. The user can call a method to clear this variable and force a 
> re-scan.
> Whenever client code calls a method such as {{createTypeSystemDescription()}} 
> the descriptors at the cached locations are read and parsed, and a 
> corresponding Java descriptor object is created and returned.
> This issue is about two problems with this approach:
> 1) The lookup of descriptor locations only considers the ClassLoader 
> situation at the time of the first scan. If 
> {{createTypeSystemDescription()}} is later called in the context of a 
> ClassLoader with access to a different set of descriptions, this is not 
> taken into account.
> 2) Parsing the XML files on every call to e.g. 
> {{createTypeSystemDescription()}} slows uimaFIT down overall. These methods 
> are potentially called very often, in particular every time 
> {{createEngineDescription()}} or a similar method is called. Depending on the 
> context, the parse overhead can have a significant impact on the overall 
> execution time.
> As a solution for 1), we could adopt an approach similar to the one used for 
> JCas wrapper classes in the JCasImpl: the locations are stored in a 
> {{WeakHashMap}} mapping the current ClassLoader to the discovered locations. 
> The "current" ClassLoader is obtained via the Spring 
> {{ClassUtils.getDefaultClassLoader()}} which is also (indirectly) used in 
> many other places in uimaFIT. In particular, this method uses the thread 
> context ClassLoader if one is available.
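
The per-ClassLoader location cache described above could be sketched roughly as follows. This is only an illustration, not the uimaFIT implementation: `LocationCache`, `currentClassLoader()` and the `scanner` callback are hypothetical names, and the helper only loosely mimics Spring's `ClassUtils.getDefaultClassLoader()` using the JDK alone.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.WeakHashMap;
import java.util.function.Function;

/** Hypothetical sketch: descriptor locations cached per ClassLoader. */
final class LocationCache {
    // Weak keys: an entry becomes collectable once its ClassLoader is no
    // longer strongly referenced elsewhere. The values must not reference
    // the key, otherwise the entry would never be released.
    private static final Map<ClassLoader, List<String>> LOCATIONS =
            Collections.synchronizedMap(new WeakHashMap<>());

    /** Assumption: prefer the thread context ClassLoader, else fall back. */
    static ClassLoader currentClassLoader() {
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        return cl != null ? cl : LocationCache.class.getClassLoader();
    }

    /** Runs the (hypothetical) classpath scan at most once per ClassLoader. */
    static List<String> locations(Function<ClassLoader, List<String>> scanner) {
        return LOCATIONS.computeIfAbsent(currentClassLoader(), scanner);
    }
}
```

With a cache keyed this way, a call made under a different context ClassLoader triggers a fresh scan instead of reusing locations found for another ClassLoader.
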
> As a solution for 2), we do not only keep a {{WeakHashMap}} cache for the 
> locations, but also for the parsed and aggregated XML files. When calling 
> e.g. {{createTypeSystemDescription()}} and the cache already contains a 
> respective descriptor, then a deep clone of it is returned. A similar 
> approach (cloning a descriptor) was recently also introduced into UIMA Core 
> to avoid repeatedly loading and parsing default flow controller definitions.
>  
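
The second part of the proposal (cache the parsed descriptor, hand out deep clones) might look roughly like the sketch below. It is a JDK-only illustration under stated assumptions: `Descriptor`, `deepCopy()` and `DescriptorCache` are hypothetical stand-ins, not UIMA or uimaFIT classes, and the `parser` callback represents the expensive XML parsing step. The clone matters because descriptors are mutable: returning the cached instance directly would let one caller's changes leak into every later call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.WeakHashMap;
import java.util.function.Supplier;

/** Hypothetical stand-in for a parsed, mutable descriptor. */
final class Descriptor {
    final List<String> typeNames = new ArrayList<>();

    Descriptor deepCopy() {
        Descriptor copy = new Descriptor();
        // Strings are immutable, so copying the list is a full deep copy here.
        copy.typeNames.addAll(typeNames);
        return copy;
    }
}

/** Hypothetical sketch: parsed descriptors cached per ClassLoader. */
final class DescriptorCache {
    private static final Map<ClassLoader, Descriptor> PARSED = new WeakHashMap<>();

    /** Parses at most once per context ClassLoader; always returns a copy. */
    static synchronized Descriptor get(Supplier<Descriptor> parser) {
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        Descriptor master = PARSED.get(cl);
        if (master == null) {
            master = parser.get(); // expensive step, e.g. reading and parsing XML
            PARSED.put(cl, master);
        }
        return master.deepCopy(); // callers may modify their copy freely
    }
}
```
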



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
