Exceptions on generated CASes are returned to client without parentage 
information
----------------------------------------------------------------------------------

                 Key: UIMA-1358
                 URL: https://issues.apache.org/jira/browse/UIMA-1358
             Project: UIMA
          Issue Type: Bug
          Components: Async Scaleout
    Affects Versions: 2.2.2
            Reporter: Burn Lewis
             Fix For: 2.3S


The client should be told which of its input CASes might have (indirectly) 
generated this failing CAS.  Also, is there any value in sending more than one 
exception if many children fail?  If the aggregate is not a CM then the client 
should just be told that the input CAS failed.

Here is part of a recent email discussion on this problem:


I think I have a somewhat clearer picture of how we might handle errors on 
child CASes.  

First consider Primitive & Aggregate CMs, and then a non-CM aggregate that 
contains a CM. 
I can see 3 different ways an application may wish to handle errors on child 
CASes:

Primitive CM
 1. Stop generating children/descendants of the input CAS and return an
    exception on the input CAS
 2. Generate an "incomplete" CAS -- perhaps marked as "damaged" (useful when
    the total count must be preserved and a place-holder provided)
 3. Ignore the error, generate no CAS and carry on to generate the next CAS
    (if any)

Aggregate CM
 1. Stop generating any more children/descendants from the input CAS and
    return the exception on the input CAS
 2. Allow the CAS to continue in the flow
 3. Quietly drop the CAS, do not return it and do not generate an exception

Simple Aggregate with internal CM
 1. Stop generating any more children/descendants from the input CAS and
    return the exception on the input CAS
 2. Allow the CAS to continue in the flow (it will be dropped at the end of
    the flow)
 3. Quietly drop the CAS as if it reached the end of the flow, and do not
    generate an exception

Currently our aggregate error-handling supports #2, while #3 doesn't depend on 
the framework.  I have added aggregate support for #3 to the 
AdvancedFixedFlowController in the UIMA-AS test suite (as part of Jira 1353) in 
the form of a new AllowDropOnFailure option which specifies the delegates for 
which a failing CAS can be dropped, i.e. skip to the end of the flow with the 
forceCasToBeDropped flag set.  (I used it to test the thresholdWindow error 
handling to verify that an intermittently failing delegate is disabled when N 
of the last M CASes fail.)
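
As a rough illustration of the "N of the last M CASes fail" check mentioned 
above, here is a minimal sketch.  The class and method names (ThresholdWindow, 
record) are hypothetical and this is not the UIMA-AS implementation; it only 
shows one way a sliding window over recent outcomes could be kept.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sketch: reports when N of the last M outcomes were failures. */
final class ThresholdWindow {
    private final int threshold;                 // N: failures that trip the check
    private final int window;                    // M: number of recent CASes considered
    private final Deque<Boolean> recent = new ArrayDeque<>();
    private int failures = 0;                    // failures currently inside the window

    ThresholdWindow(int threshold, int window) {
        if (threshold < 1 || window < threshold) {
            throw new IllegalArgumentException("need 1 <= threshold <= window");
        }
        this.threshold = threshold;
        this.window = window;
    }

    /** Records one CAS outcome; returns true if the delegate should be disabled. */
    boolean record(boolean failed) {
        recent.addLast(failed);
        if (failed) {
            failures++;
        }
        // Slide the window: forget the oldest outcome once more than M are held.
        if (recent.size() > window && recent.removeFirst()) {
            failures--;
        }
        return failures >= threshold;
    }
}
```

With threshold 2 and window 3, two failures separated by two successes do not 
trip the check, but two consecutive failures do.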

But I don't think our docs indicate what should happen in #1 and the current 
implementation handles it differently ... the exception is associated with the 
child CAS without any reference to the input CAS, and the CM continues to 
generate children, so the client can get many exceptions that refer to unknown 
CASes.  The getParentCasReferenceId() method in the UimaASProcessStatus (which 
I could not find in the JavaDocs) can be used to associate a child CAS with the 
input CAS that generated it, but it is always null when an exception is 
returned. 

Consider the information available to the entityProcessComplete callback when 
an input CAS successfully generates 2 children:

returnedCAS  getCasReferenceId()  getParentCasReferenceId()  isException()

  Child1     ID-of-Child1         ID-of-Parent               false
  Child2     ID-of-Child2         ID-of-Parent               false
  Parent     ID-of-Parent         null                       false

If the 2nd child causes an exception then the client might see (Option A)

returnedCAS  getCasReferenceId()  getParentCasReferenceId()  isException()

  Child1     ID-of-Child1         ID-of-Parent               false
  null       ID-of-Parent         null                       true

Or we could put the failing child's ID in the status (Option B)

returnedCAS  getCasReferenceId()  getParentCasReferenceId()  isException()

  Child1     ID-of-Child1         ID-of-Parent               false
  null       ID-of-Child2         ID-of-Parent               true

Note that in an Aggregate CM the failing CAS may not have been generated 
directly by the parent, but by any one of its descendants.
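
To make the descendant case concrete, here is a minimal sketch of how a 
failing CAS could be traced back to the input CAS when parent links are 
recorded.  CasLineage, recordChild and rootOf are hypothetical names used only 
for illustration; they are not part of the UIMA-AS API, and the sketch assumes 
that each generated CAS has exactly one known parent.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: maps a generated CAS back to the input CAS it came from. */
final class CasLineage {
    // child CAS ID -> ID of the CAS that generated it
    private final Map<String, String> parentOf = new HashMap<>();

    /** Remembers that childId was generated from parentId. */
    void recordChild(String childId, String parentId) {
        parentOf.put(childId, parentId);
    }

    /** Follows parent links until reaching a CAS with no recorded parent. */
    String rootOf(String casId) {
        String current = casId;
        while (parentOf.containsKey(current)) {
            current = parentOf.get(current);
        }
        return current;
    }
}
```

For example, if ID-of-Child2 was generated from ID-of-Parent and a further 
CAS was generated from ID-of-Child2, rootOf on that grandchild returns 
ID-of-Parent, which is the ID the exception would be reported against under 
Option A.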

I think option A is cleaner and easier to document ... "exception always on 
input CAS".  If the ID of the failing child is useful we could wrap the 
exception in another that says something like "Exception inherited from 
generated CAS xyz".
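
A wrapper of that kind might look roughly like the sketch below.  The class 
name InheritedCasException and its accessor are hypothetical; nothing here is 
an existing UIMA-AS class, and only the message text is taken from the 
suggestion above.

```java
/** Hypothetical sketch: wraps a child's exception so it can be reported on the input CAS. */
final class InheritedCasException extends Exception {
    private static final long serialVersionUID = 1L;

    private final String failingCasId;   // ID of the generated CAS that actually failed

    InheritedCasException(String failingCasId, Throwable cause) {
        super("Exception inherited from generated CAS " + failingCasId, cause);
        this.failingCasId = failingCasId;
    }

    /** Returns the ID of the generated CAS on which the original exception occurred. */
    String getFailingCasId() {
        return failingCasId;
    }
}
```

The original exception stays reachable through getCause(), so the client 
would lose no detail while still seeing the failure on the input CAS.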

Any other options we should consider?
I'll put this in a Jira as that may be the better place to discuss it.

