Just to clarify, the CPM has no support for CasMultipliers. So any use
of a CM in a CPE would only be supported inside the AE of a single CAS
processor. That is, no child CAS would ever come out of one CAS
processor and flow into another.

So in your case, all the action is within a single [aggregate] AE?


On Fri, Mar 16, 2012 at 5:07 PM, Eric Riebling <[email protected]> wrote:
> Last one, sorry list members for the spam.
>
> The reason things were funky is that this was a system that
> used CAS Multipliers to create the CASes that my Component
> was seeing.  If I run my Component in a straight-line pipeline,
> getEmptyCAS() produces CASes with the full type system as it
> is supposed to do.
>
> I don't fully understand the architecture of the surrounding
> system, but once I do, will supply you guys with the details in
> case this is a bug with the way UIMA handles CASes that are
> multiplied more than once.
>
>
> On 3/16/2012 2:16 PM, Eric Riebling wrote:
>>
>> And the difference in environment:
>>
>> * use SimpleRunCPE - user defined types don't show up
>> * use CPE GUI - they DO show up
>>
>> This is interesting!
>>
>> On 3/15/2012 6:50 PM, Eddie Epstein wrote:
>>>
>>> My last note was incorrect. Here is a paraphrase of working code:
>>>
>>> public AbstractCas next() throws AnalysisEngineProcessException {
>>>   CAS aCAS = getEmptyCAS();
>>>   try {
>>>     ByteArrayInputStream casIn = getNextXmiCas();
>>>     // deserialize in a lenient fashion
>>>     XmiCasDeserializer.deserialize(casIn, aCAS, true);
>>>     return aCAS;
>>>   } catch (SAXException e) {
>>>     throw new AnalysisEngineProcessException(e);
>>>   } catch (IOException e) {
>>>     throw new AnalysisEngineProcessException(e);
>>>   }
>>> ...
>>>
>>>
>>> On Thu, Mar 15, 2012 at 5:59 PM, Marshall Schor<[email protected]> wrote:
>>>>
>>>>
>>>>
>>>> On 3/15/2012 4:38 PM, Eddie Epstein wrote:
>>>>>
>>>>>
>>>>> Cannot deserialize into a CAS from getEmptyCas().
>>>>
>>>>
>>>> This is not right. More information soon (ran out of time today).
>>>> -Marshall
>>>>
>>>>> Must use a CAS from
>>>>> CasCreationUtils.createCas for deserialization, and then use casCopier
>>>>> to copy to the CAS from getEmptyCas().
>>>>>
>>>>> Pick the version of createCas that specifies a typesystem, and use the
>>>>> typesystem from the pipeline CAS (i.e. the one from getEmptyCas).
>>>>>
>>>>> On Thu, Mar 15, 2012 at 2:44 PM, Eric Riebling<[email protected]> wrote:
>>>>>>
>>>>>>
>>>>>> Thanks, guys. This is getting me closer to the goal, and explains the
>>>>>> observed behaviors. Now I'm facing issues when implemented as a CAS
>>>>>> Multiplier. I try creating a new CAS first with getEmptyJCas().
>>>>>>
>>>>>> Here are some various strategies and what resulted:
>>>>>>
>>>>>> * create a deserializer with the typesystem from the AE (which
>>>>>> includes types in the 'external' CAS to be deserialized)
>>>>>> * use it to deserialize into the empty CAS created with getEmptyJCas()
>>>>>>
>>>>>> -> The deserialized CAS for some reason has only the base TOP
>>>>>>    typesystem
>>>>>> -> Trying to access an annotation from an index (that should be there)
>>>>>>    generates the "used in Java code, but was not declared in the XML
>>>>>>    type descriptor" exception
>>>>>>
>>>>>> * same as above, but use CasCopier to try and copy the type system
>>>>>> (and everything else) from the CAS in the AE's process() method
>>>>>> into the empty CAS
>>>>>>
>>>>>> -> Attempted to copy a FeatureStructure of type "(my type name)",
>>>>>>    which is not defined in the type system of the destination CAS.
>>>>>>
>>>>>> It seems the ONLY way to obtain a CAS (empty or otherwise) that has
>>>>>> the type system able to accept the external CAS being deserialized is
>>>>>> to use the very CAS passed into the AE's process() method. Doing so
>>>>>> obviously mangles that CAS for the rest of the pipeline.
>>>>>>
>>>>>>
>>>>>> On 3/15/2012 1:50 PM, Marshall Schor wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 3/15/2012 10:38 AM, Eric Riebling wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> I have a pipeline with its own type system.
>>>>>>>> I also have deserialized, annotated CASes on disk with a different
>>>>>>>> type system.
>>>>>>>> Suppose I want an Analysis Engine in the pipeline to read in the
>>>>>>>> deserialized CASes in order to obtain annotations and 'do things
>>>>>>>> with them'.
>>>>>>>>
>>>>>>>> I understand some limitations in the UIMA framework prevent this,
>>>>>>>> but could it be done by making the first type system include that of
>>>>>>>> the CASes to deserialize?
>>>>>>>
>>>>>>>
>>>>>>> Yes, I think so.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Also, it would necessitate creating new CASes within the Analysis
>>>>>>>> Engine.
>>>>>>>> I could think of several approaches, and have tried some without
>>>>>>>> success:
>>>>>>>>
>>>>>>>> * Create a new, 'temporary' View in the AE's process() method,
>>>>>>>> obtain a JCas, obtain its CAS, and use that to store the
>>>>>>>> deserialized CASes
>>>>>>>> (seems to mangle the original CAS and break downstream AEs in the
>>>>>>>> pipeline, and seems to not be able to find any annotations in the
>>>>>>>> deserialized CAS)
>>>>>>>>
>>>>>>> This won't work. The deserialize method effectively "resets" the CAS
>>>>>>> before loading it.
>>>>>>> A view is not a new CAS; it is a new view of the same CAS.
>>>>>>>
>>>>>>>> * Use the CAS in the process() method to store the deserialized CASes
>>>>>>>> (also mangles the original CAS, breaks downstream AEs, but DOES
>>>>>>>> permit obtaining annotations from the deserialized CASes)
>>>>>>>
>>>>>>>
>>>>>>> Right, deserializing into an existing CAS resets it in flight.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> * Make the Analysis Engine be a CAS Multiplier, and deserialize into
>>>>>>>> a CAS created with getEmptyCAS()
>>>>>>>> (I haven't tried this yet)
>>>>>>>
>>>>>>>
>>>>>>> Yes, this is the way to get a separate CAS instance to deserialize
>>>>>>> into.
>>>>>>> It's how Collection Readers do it.
>>>>>>> -Marshall
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> It's kind of a use case for a hybrid Component that behaves in some
>>>>>>>> ways like an AE (has a process() method), in some ways like an XMI
>>>>>>>> Collection Reader, and in some ways like a CAS Multiplier.
>>>>>>>>
>>>>>>>> But it's a useful use case! It is also a very bizarre one because
>>>>>>>> you could almost think of it as a pipeline within a pipeline, which
>>>>>>>> processes a set of deserialized annotated XMI documents, within a
>>>>>>>> pipeline that processes ... in our case, a Question Answering system
>>>>>>>> with question keyterms, ranked lists of documents and answer
>>>>>>>> candidates.
>>>>>>>>
>>>>
>>>
>>
>
