[
https://issues.apache.org/jira/browse/TIKA-1654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14581293#comment-14581293
]
Hudson commented on TIKA-1654:
------------------------------
SUCCESS: Integrated in tika-trunk-jdk1.7 #744 (See
[https://builds.apache.org/job/tika-trunk-jdk1.7/744/])
TIKA-1654 Reset cTAKES CAS into CTAKESParser (Fix for TIKA-1645) (totaro:
http://svn.apache.org/viewvc/tika/trunk/?view=rev&rev=1684801)
* /tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/ctakes/CTAKESContentHandler.java
* /tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/ctakes/CTAKESParser.java
* /tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/ctakes/CTAKESUtils.java
> Reset cTAKES CAS into CTAKESParser
> ----------------------------------
>
> Key: TIKA-1654
> URL: https://issues.apache.org/jira/browse/TIKA-1654
> Project: Tika
> Issue Type: Bug
> Components: parser
> Reporter: Giuseppe Totaro
> Assignee: Giuseppe Totaro
> Labels: patch
> Fix For: 1.9
>
> Attachments: TIKA-1654.patch
>
>
> Using [CTAKESParser from Tika
> Server|https://wiki.apache.org/tika/cTAKESParser], I noticed that an
> exception occurs when the CTAKESParser is used multiple times:
> {noformat}
> org.apache.uima.cas.CASRuntimeException: Data for Sofa feature
> setLocalSofaData() has already been set.
> {noformat}
> The cause is the CAS (Common Analysis System) used by CTAKESParser. The
> CAS, like the AE (AnalysisEngine), is held in a static field of CTAKESParser
> so that it acts as a kind of singleton.
> For context, an AnalysisEngine is a cTAKES/UIMA component responsible for
> analyzing unstructured information and for discovering and representing its
> semantic content. An AnalysisEngine operates on an "analysis structure"
> (implemented by the CAS).
> Reusing the CAS is highly recommended, but it must be reset before the
> next run. The CTAKESUtils class ({{org.apache.tika.parser.ctakes}}) provides
> a reset method that releases all resources held by both the AnalysisEngine
> and the CAS and then "destroys" them. Calling it prevents the
> CASRuntimeException.
> The attached patch adds two new methods, resetCAS and resetAE, which reset,
> but do not destroy, the CAS and the AnalysisEngine respectively.
> By calling only resetCAS, CTAKESParser can reuse both the CAS and the AE
> instead of rebuilding them for each run.
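
The reset-before-reuse pattern described in the issue can be sketched roughly as follows. This is a toy model, not the code from TIKA-1654.patch: ToyCas is a hypothetical stand-in for a UIMA CAS that refuses a second document until it is reset, and analyze() is an assumed name for a parser entry point. In real UIMA code the corresponding call would presumably be reset() on the CAS before setting the next document text.

```java
// Toy model of reusing one shared analysis structure across runs.
// ToyCas and analyze() are hypothetical names; this is NOT the UIMA API.
final class CasReuseSketch {

    /** Stand-in for a CAS: document text may be set only once between resets. */
    static final class ToyCas {
        private String documentText;

        void setDocumentText(String text) {
            if (documentText != null) {
                // Mimics the failure mode reported in the issue.
                throw new IllegalStateException("Document text has already been set.");
            }
            documentText = text;
        }

        String getDocumentText() {
            return documentText;
        }

        /** Clears per-document state but keeps the object alive for reuse. */
        void reset() {
            documentText = null;
        }
    }

    // Built once and shared, like the static CAS field described in the issue.
    private static final ToyCas CAS = new ToyCas();

    /** Resets the shared structure, loads the new text, and returns its length. */
    static synchronized int analyze(String text) {
        CAS.reset();               // without this, the second call would throw
        CAS.setDocumentText(text);
        return CAS.getDocumentText().length();
    }

    public static void main(String[] args) {
        // Two consecutive runs against the same shared instance.
        System.out.println(analyze("first document"));
        System.out.println(analyze("second document"));
    }
}
```

Resetting rather than destroying keeps the expensive object alive between runs, which is the motivation given for resetCAS. The method is synchronized here only because the shared instance is static; whether CTAKESParser needs similar locking is not stated in the issue.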