[ https://issues.apache.org/jira/browse/TIKA-1654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594308#comment-14594308 ]

Hudson commented on TIKA-1654:
------------------------------

FAILURE: Integrated in tika-trunk-jdk1.7 #758 (See [https://builds.apache.org/job/tika-trunk-jdk1.7/758/])
TIKA-1654: Reset cTAKES CAS into CTAKESParser (totaro: http://svn.apache.org/viewvc/tika/trunk/?view=rev&rev=1686518)
* /tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/ctakes/CTAKESContentHandler.java
* /tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/ctakes/CTAKESParser.java
* /tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/ctakes/CTAKESUtils.java


> Reset cTAKES CAS into CTAKESParser
> ----------------------------------
>
>                 Key: TIKA-1654
>                 URL: https://issues.apache.org/jira/browse/TIKA-1654
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>            Reporter: Giuseppe Totaro
>            Assignee: Giuseppe Totaro
>              Labels: patch
>             Fix For: 1.10
>
>         Attachments: TIKA-1654.patch, TIKA-1654.v02.patch
>
>
> While using [CTAKESParser from Tika Server|https://wiki.apache.org/tika/cTAKESParser], 
> I noticed that the following exception is thrown when CTAKESParser is used 
> more than once:
> {noformat}
> org.apache.uima.cas.CASRuntimeException: Data for Sofa feature setLocalSofaData() has already been set.
> {noformat}
> This is caused by the CAS (Common Analysis System) used by CTAKESParser. The 
> CAS, like the AE (AnalysisEngine), is a static field in CTAKESParser, so it 
> acts as a sort of singleton that is shared across runs.
> For context, an AnalysisEngine is a cTAKES/UIMA component responsible for 
> analyzing unstructured information and for discovering and representing its 
> semantic content. An AnalysisEngine operates on an "analysis structure" 
> (implemented by the CAS).
> Reusing the CAS is highly recommended, but it has to be reset before the 
> next run. The CTAKESUtils class ({{org.apache.tika.parser.ctakes}}) provides 
> a reset method that releases all resources held by both the AnalysisEngine 
> and the CAS and then "destroys" them. This prevents the CASRuntimeException, 
> but both objects then have to be built again for the next run.
> The attached patch adds two new methods (resetCAS and resetAE) that reset, 
> but do not destroy, the CAS and the AnalysisEngine respectively.
> By calling only resetCAS, CTAKESParser can reuse both the CAS and the AE 
> instead of building them again for each run.
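
The reset-before-reuse pattern described above can be illustrated with a small, self-contained sketch. It does not use the UIMA or Tika APIs: ToyCas, ToyParser and CasReuseDemo are hypothetical stand-ins invented for this example, and the exception type and message are only modelled on the behaviour reported in the issue (document data may be set once per run). It is not the code from the attached patch.

```java
/** Hypothetical stand-in for a CAS: document text may be set only once per run. */
final class ToyCas {
    private String documentText; // null means "not set yet"

    void setDocumentText(String text) {
        if (documentText != null) {
            // Models the "has already been set" failure from the issue.
            throw new IllegalStateException("document text has already been set");
        }
        documentText = text;
    }

    String getDocumentText() {
        return documentText;
    }

    /** Clears per-document state so that the same instance can be reused. */
    void reset() {
        documentText = null;
    }
}

/** Hypothetical parser holding one shared CAS in a static field. */
final class ToyParser {
    private static final ToyCas CAS = new ToyCas();

    /** Resets the shared CAS first, so repeated calls succeed. */
    static int parse(String text) {
        CAS.reset();
        CAS.setDocumentText(text);
        // Stand-in for the real analysis: just report the text length.
        return CAS.getDocumentText().length();
    }

    /** Skips the reset; fails whenever the shared CAS already holds text. */
    static int parseWithoutReset(String text) {
        CAS.setDocumentText(text);
        return CAS.getDocumentText().length();
    }
}

class CasReuseDemo {
    public static void main(String[] args) {
        System.out.println(ToyParser.parse("first document"));
        System.out.println(ToyParser.parse("second document")); // reuse works
        try {
            ToyParser.parseWithoutReset("third document");
        } catch (IllegalStateException e) {
            System.out.println("without reset: " + e.getMessage());
        }
    }
}
```

In this sketch the shared instance is created once and only its per-document state is cleared between runs, which is the trade-off the description argues for. A static shared object like this is not safe for concurrent calls without extra synchronization.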



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
