Just to clarify, the CPM has no support for CasMultipliers. So any use of a CM in a CPE would only be supported inside the AE of a single CAS processor. That is, no child CAS would ever come out of one CAS processor and flow into another.
So in your case, all the action is within a single [aggregate] AE?

On Fri, Mar 16, 2012 at 5:07 PM, Eric Riebling <[email protected]> wrote:
> Last one, sorry list members for the spam.
>
> The reason things were funky is that this was a system that
> used CAS Multipliers to create the CASes that my Component
> was seeing. If I run my Component in a straight-line pipeline,
> getEmptyCAS() produces CASes with the full type system as it
> is supposed to do.
>
> I don't fully understand the architecture of the surrounding
> system, but once I do, will supply you guys with the details in
> case this is a bug with the way UIMA handles CASes that are
> multiplied more than once.
>
> On 3/16/2012 2:16 PM, Eric Riebling wrote:
>>
>> And the difference in environment:
>>
>> * use SimpleRunCPE - user defined types don't show up
>> * use CPE GUI - they DO show up
>>
>> This is interesting!
>>
>> On 3/15/2012 6:50 PM, Eddie Epstein wrote:
>>>
>>> My last note was incorrect. Here is a paraphrase of working code:
>>>
>>> public AbstractCas next() throws AnalysisEngineProcessException {
>>>   CAS aCAS = getEmptyCAS();
>>>   try {
>>>     ByteArrayInputStream casIn = getNextXmiCas();
>>>     // deserialize in a lenient fashion
>>>     XmiCasDeserializer.deserialize(casIn, aCAS, true);
>>>     return aCAS;
>>>   } catch (SAXException e) {
>>>     throw new AnalysisEngineProcessException(e);
>>>   } catch (IOException e) {
>>>     throw new AnalysisEngineProcessException(e);
>>>   }
>>>   ...
>>>
>>> On Thu, Mar 15, 2012 at 5:59 PM, Marshall Schor<[email protected]> wrote:
>>>>
>>>> On 3/15/2012 4:38 PM, Eddie Epstein wrote:
>>>>>
>>>>> Cannot deserialize into a CAS from getEmptyCas().
>>>>
>>>> This is not right. More information soon (ran out of time today).
>>>> -Marshall
>>>>
>>>>> Must use a CAS from
>>>>> CasCreationUtils.createCas for deserialization, and then use casCopier
>>>>> to copy to the CAS from getEmptyCas().
>>>>>
>>>>> Pick the version of createCas that specifies a typesystem, and use the
>>>>> typesystem from the pipeline CAS (i.e. the one from getEmptyCas).
>>>>>
>>>>> On Thu, Mar 15, 2012 at 2:44 PM, Eric Riebling<[email protected]> wrote:
>>>>>>
>>>>>> Thanks, guys. This is getting me closer to the goal, and explains the
>>>>>> observed behaviors. Now I'm facing issues when implemented as a CAS
>>>>>> Multiplier. I try creating a new CAS first with getEmptyJCas().
>>>>>>
>>>>>> Here are some various strategies and what resulted:
>>>>>>
>>>>>> * create a deserializer with the typesystem from the AE (which
>>>>>>   includes types in the 'external' CAS to be deserialized)
>>>>>> * use it to deserialize into the empty CAS created with getEmptyJCas()
>>>>>>
>>>>>> -> The deserialized CAS for some reason has only the base TOP
>>>>>>    typesystem
>>>>>> -> Trying to access an annotation from an index (that should be there)
>>>>>>    generates the "used in Java code, but was not declared in the XML
>>>>>>    type descriptor" exception
>>>>>>
>>>>>> * same as above, but use CasCopier to try and copy the type system
>>>>>>   (and everything else) from the CAS in the AE's process() method
>>>>>>   into the empty CAS
>>>>>>
>>>>>> -> Attempted to copy a FeatureStructure of type "(my type name)",
>>>>>>    which is not defined in the type system of the destination CAS.
>>>>>>
>>>>>> It seems the ONLY way to obtain a CAS (empty or otherwise) that has
>>>>>> the type system able to accept the external CAS being deserialized is
>>>>>> to use the very CAS passed into the AE's process() method. Doing so
>>>>>> obviously mangles that CAS for the rest of the pipeline.
>>>>>>
>>>>>> On 3/15/2012 1:50 PM, Marshall Schor wrote:
>>>>>>>
>>>>>>> On 3/15/2012 10:38 AM, Eric Riebling wrote:
>>>>>>>>
>>>>>>>> I have a pipeline with its own type system.
>>>>>>>> I also have deserialized, annotated CASes on disk with a different
>>>>>>>> type system.
>>>>>>>> Suppose I want an Analysis Engine in the pipeline to read in the
>>>>>>>> deserialized CASes in order to obtain annotations and 'do things
>>>>>>>> with them'
>>>>>>>>
>>>>>>>> I understand some limitations in the UIMA framework prevent this,
>>>>>>>> but could it be done by making the first type system include that
>>>>>>>> of the CASes to deserialize?
>>>>>>>
>>>>>>> Yes, I think so.
>>>>>>>>
>>>>>>>> Also, it would necessitate creating new CASes within the Analysis
>>>>>>>> Engine. I could think of several approaches, and have tried some
>>>>>>>> without success:
>>>>>>>>
>>>>>>>> * Create a new, 'temporary' View in the AE's process() method,
>>>>>>>>   obtain a JCas, obtain its CAS, and use that to store the
>>>>>>>>   deserialized CASes
>>>>>>>>   (seems to mangle the original CAS and break downstream AEs in the
>>>>>>>>   pipeline, and seems to not be able to find any annotations in the
>>>>>>>>   deserialized CAS)
>>>>>>>>
>>>>>>> This won't work. The deserialize method effectively "resets" the CAS
>>>>>>> before loading it.
>>>>>>> A view is not a new CAS; it is a new view of the same CAS.
>>>>>>>
>>>>>>>> * Use the CAS in the process() method to store the deserialized
>>>>>>>>   CASes
>>>>>>>>   (also mangles the original CAS, breaks downstream AEs, but DOES
>>>>>>>>   permit obtaining annotations from the deserialized CASes)
>>>>>>>
>>>>>>> Right, deserializing into an existing CAS resets it in flight.
>>>>>>>>
>>>>>>>> * Make the Analysis Engine be a CAS Multiplier, and deserialize into
>>>>>>>>   a CAS created with createEmtpyCas()
>>>>>>>>   (I haven't tried this yet)
>>>>>>>
>>>>>>> Yes, this is the way to get a separate CAS instance to deserialize
>>>>>>> into. It's how Collection Readers do it.
>>>>>>> -Marshall
>>>>>>>>
>>>>>>>> It's kind of a use case for a hybrid Component that behaves in some
>>>>>>>> ways like an AE (has a process() method), in some ways like an XMI
>>>>>>>> Collection Reader, and in some ways like a CAS Multiplier.
>>>>>>>>
>>>>>>>> But it's a useful use case! It is also a very bizarre one because
>>>>>>>> you could almost think of it as a pipeline within a pipeline, which
>>>>>>>> processes a set of deserialized annotated XMI documents, within a
>>>>>>>> pipeline that processes ...
>>>>>>>> in our case, a Question Answering system with question keyterms,
>>>>>>>> ranked lists of documents and answer candidates.
>>>>>>>>
