Re: question about InterOP between Apache UIMA and Omnifind Annotators (CAS2JDBC)

Thilo Goetz Wed, 24 Jun 2009 06:37:44 -0700

Chengmin Ding wrote:
> Thanks Thilo!  I didn't mean to cross-post to the other list but I didn't
> see my question posted in my gmail account so just tried again. Sorry about
> it.
> 
> A couple of years ago when we used the IBM UIMA framework, we could run
> CAS2JDBC out of Omnifind by including the Omnifind base annotators into the
> aggregate analysis engine (following the Omnifind handbook and suggestions
> from Sebastian, cf.
> http://www.ibm.com/developerworks/forums/thread.jspa?threadID=157872&tstart=0&messageID=13941628
> ).
> 
> I guess my question would be better phrased this way: we tried to use the
> IBM UIMA Adapter to wrap the Omnifind base annotator
> (of_tokenization.xml). Is this supposed to work? In our pipeline, we
> used the Adapter twice: once for the Omnifind base annotator (at the
> beginning) and once for the CAS2JDBC consumer (at the end).
> 
> I appreciate any suggestions/comments on this.


Sorry, I told you everything I could dredge up from the depths
of my memory.  Please try the OF forum on developerworks (not
the UIMA forum): http://www.ibm.com/developerworks/forums/forum.jspa?forumID=757
You may have more luck there.

--Thilo

> 
> -Chengmin
> 
> On Wed, Jun 24, 2009 at 3:05 AM, Thilo Goetz <[email protected]> wrote:
> 
>> Hi Chengmin,
>>
>> please don't cross post.  Answers below.
>>
>> Chengmin Ding wrote:
>>> Hello,
>>>
>>> We have used the UIMA Adapter for IBM annotators and it worked for some of
>>> our test annotators.  However, when we tried it on cas2jdbc, we got the
>>> error shown at the end of this message.
>>>
>>> We have a CPE pipeline, and CAS2JDBC is the only consumer/engine based on
>>> the IBM UIMA framework. We are using Apache UIMA 2.2 for the entire pipeline.
>>> We thought this was caused by a missing Omnifind-specific annotator that
>>> fills out the DocumentAnnotation or the Omnifind-specific
>>> com.ibm.es.tt.DocumentMetaData feature structure (which contains documentid
>>> etc. features). We then added the base annotator from Omnifind
>>> (OF_Tokenization.xml etc.) and also wrapped it with the adapter, but we
>>> still got the same error. Our questions are:
>>>
>>> 1) Is the error indeed caused by missing some Omnifind specific annotator
>>> that fills out the DocumentAnnotation feature structure?
>> Not quite sure from the error message, but very likely yes.  I suppose
>> that cas2jdbc was never intended to be run outside the OF UIMA pipeline.
>> OF has an internal document model that is shared between its annotators,
>> and I assume that cas2jdbc relies on that model.  Seems reasonable, given
>> that you will later need to identify documents in the DB based on some ID
>> or other.
>>
>>> 2) Is there any way to further isolate the problem via any tools, considering
>>> we do not have the source code for cas2jdbc?
>> I can't think of any.  A better place to ask would be the IBM OF
>> support forum.
>>
>>> 3) Can the IBM UIMA Adapter be used the same way to wrap regular annotators,
>>> aggregate analysis engines, and consumers?
>> Yes for primitive and aggregate AEs.  Consumers I actually don't know,
>> they used to have a special status in IBM UIMA.  It doesn't look like
>> that's your problem, though.
>>
>>> 4) Does Apache UIMA have any plan to come up with a CAS2JDBC-compatible DB
>>> consumer?
>> If there is one, I don't know of it.
>>
>> --Thilo
>>
>>> Thanks a lot!
>>> ================================================
>>> org.apache.uima.analysis_engine.AnalysisEngineProcessException
>>>     at com.ibm.uima.adapter.ibm.IBMAnalysisEngineWrapper.processAndOutputNewCASes(Unknown Source)
>>>     at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:218)
>>>     at org.apache.uima.collection.impl.cpm.engine.ProcessingUnit.processNext(ProcessingUnit.java:892)
>>>     at org.apache.uima.collection.impl.cpm.engine.ProcessingUnit.run(ProcessingUnit.java:577)
>>> Caused by: com.ibm.uima.analysis_engine.AnalysisEngineProcessException: The common analysis structure cannot be processed. See the previous exception for details.
>>>     at com.ibm.uima.reference_impl.analysis_engine.compatibility.CasConsumerAdapter.process(CasConsumerAdapter.java:93)
>>>     at com.ibm.uima.reference_impl.analysis_engine.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:392)
>>>     at com.ibm.uima.reference_impl.analysis_engine.PrimitiveAnalysisEngine_impl.processAndOutputNewCASes(PrimitiveAnalysisEngine_impl.java:297)
>>>     at com.ibm.uima.reference_impl.analysis_engine.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:218)
>>>     ... 4 more
>>> Caused by: com.ibm.uima.resource.ResourceProcessException: The common analysis structure cannot be processed. See the previous exception for details.
>>>     at com.ibm.uima.consumer.cas2jdbc.utils.Cas2JdbcLogger.log_PROCESS_CAS__SEVERE(Unknown Source)
>>>     at com.ibm.uima.consumer.cas2jdbc.Cas2Jdbc.processCas(Unknown Source)
>>>     at com.ibm.uima.reference_impl.analysis_engine.compatibility.CasConsumerAdapter.process(CasConsumerAdapter.java:89)
>>>     ... 7 more
>>> Caused by: com.ibm.uima.resource.ResourceProcessException: The document's ID cannot be parsed. See the previous exception for details.
>>>     at com.ibm.uima.consumer.cas2jdbc.utils.Cas2JdbcLogger.log_BAD_DOCID__SEVERE(Unknown Source)
>>>     at com.ibm.uima.consumer.cas2jdbc.Cas2Jdbc.parseDocID(Unknown Source)
>>>     ... 9 more
>>> Caused by: java.lang.NullPointerException
>>>     ... 10 more
>>>
>>> -Chengmin
>>>
>>
>
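
On question 2 (isolating the problem without the cas2jdbc source): the
trace quoted above nests several exceptions, and the innermost one, a
NullPointerException reported under Cas2Jdbc.parseDocID, is the most
informative link. Below is a minimal sketch, in plain Java, of walking
such a cause chain programmatically, for example from wherever the
pipeline's exceptions are caught and logged. The class name CauseChain
and the stand-in exceptions in main are hypothetical; nothing here is
UIMA or cas2jdbc API, and it only helps locate the failing step, not
fix the missing document ID.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

/** Hypothetical helper (not part of UIMA or cas2jdbc) for inspecting nested exceptions. */
final class CauseChain {

    private CauseChain() {}

    /** Returns the throwable followed by each of its causes, outermost first. */
    static List<Throwable> of(Throwable t) {
        List<Throwable> chain = new ArrayList<>();
        // Identity-based set guards against a (rare) cyclic cause chain.
        Set<Throwable> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        for (Throwable cur = t; cur != null && seen.add(cur); cur = cur.getCause()) {
            chain.add(cur);
        }
        return chain;
    }

    /** Returns the innermost cause, or the throwable itself if it has none; null stays null. */
    static Throwable rootCause(Throwable t) {
        List<Throwable> chain = of(t);
        return chain.isEmpty() ? null : chain.get(chain.size() - 1);
    }

    public static void main(String[] args) {
        // Stand-in exceptions that only mimic the shape of the trace in this thread.
        Throwable failure = new RuntimeException(
                "The common analysis structure cannot be processed.",
                new IllegalStateException(
                        "The document's ID cannot be parsed.",
                        new NullPointerException()));
        for (Throwable link : of(failure)) {
            System.out.println(link.getClass().getName() + ": " + link.getMessage());
        }
        System.out.println("Root cause: " + rootCause(failure).getClass().getSimpleName());
    }
}
```

Printing every link rather than only the outermost message matters here
because the top-level AnalysisEngineProcessException carries no detail;
the "document's ID cannot be parsed" message only appears two levels
further down.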
