[ 
https://issues.apache.org/jira/browse/UIMA-1818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12881925#action_12881925
 ] 

Marshall Schor commented on UIMA-1818:
--------------------------------------

Sounds like a valuable debugging aid.

Is the idea that *every* CAS that comes through a particular specified
annotator would be saved to the file system?
* If so, maybe add some parameter to control how many CASes to save, or how
frequently to sample, etc.?
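
As a rough illustration of the sampling question above, here is a minimal
sketch of a counter-based sampler. The class name CasLogSampler and both
parameters (sampleEvery, maxSaved) are hypothetical; nothing like them is
defined in the issue or in UIMA.

```java
/**
 * Hypothetical sampler: save every Nth CAS, up to a fixed total.
 * Not an existing UIMA class; the parameter names are assumptions.
 */
final class CasLogSampler {
    private final int sampleEvery; // save 1 out of every sampleEvery CASes
    private final long maxSaved;   // stop saving after this many CASes
    private long seen = 0;
    private long saved = 0;

    CasLogSampler(int sampleEvery, long maxSaved) {
        if (sampleEvery < 1) {
            throw new IllegalArgumentException("sampleEvery must be >= 1");
        }
        this.sampleEvery = sampleEvery;
        this.maxSaved = maxSaved;
    }

    /** Call once per CAS reaching the delegate; true means "log this one". */
    synchronized boolean shouldSave() {
        boolean pick = (seen % sampleEvery == 0) && saved < maxSaved;
        seen++;
        if (pick) {
            saved++;
        }
        return pick;
    }
}
```

With sampleEvery = 1 and a very large maxSaved this degenerates to "save
every CAS", which matches the behavior described in the issue.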

The "COMPONENT_ARRAY" delegate keys need the x/y/z syntax for non-UIMA-AS
cases, where an aggregate contains another aggregate, etc.  This is already a
convention in UIMA, so it would be good to just continue using it for both
UIMA-AS and non-UIMA-AS cases.

Would it be valuable to have a spec to say whether the logging should happen
before or after the AnalysisEngine, for each delegate? For instance, the spec
could be someAggName/somePrimName:before:after (showing both).  "before" could
be the default.

Would it be valuable to dump only the changed data (à la "delta CAS")?
(Possible syntax: add a modifier, :delta.)
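
To make the suggested syntax concrete, here is a minimal sketch of how one
entry such as someAggName/somePrimName:before:after:delta might be parsed.
The class CasLogSpec is hypothetical and not part of UIMA; treating
"before" as the default and rejecting unknown modifiers are assumptions
based on the suggestions above.

```java
import java.util.Arrays;
import java.util.List;

/**
 * Hypothetical parser for one delegate entry of the form
 * "x/y/z[:before][:after][:delta]". Not an existing UIMA API.
 */
final class CasLogSpec {
    final List<String> keyPath; // e.g. [someAggName, somePrimName]
    final boolean before;       // log the CAS before the delegate runs
    final boolean after;        // log the CAS after the delegate runs
    final boolean delta;        // log only changed data (proposed modifier)

    private CasLogSpec(List<String> keyPath, boolean before,
                       boolean after, boolean delta) {
        this.keyPath = keyPath;
        this.before = before;
        this.after = after;
        this.delta = delta;
    }

    static CasLogSpec parse(String spec) {
        String[] parts = spec.trim().split(":");
        if (parts[0].isEmpty()) {
            throw new IllegalArgumentException("missing delegate key: " + spec);
        }
        boolean before = false;
        boolean after = false;
        boolean delta = false;
        for (int i = 1; i < parts.length; i++) {
            switch (parts[i]) {
                case "before": before = true; break;
                case "after":  after = true;  break;
                case "delta":  delta = true;  break;
                default:
                    throw new IllegalArgumentException(
                        "unknown modifier '" + parts[i] + "' in " + spec);
            }
        }
        if (!before && !after) {
            before = true; // assumed default when no position is given
        }
        // The x/y/z convention: each '/' descends into a nested aggregate.
        return new CasLogSpec(Arrays.asList(parts[0].split("/")),
                              before, after, delta);
    }
}
```

A plain key such as someAggName/somePrimName then behaves as described in
the issue (log on input only), so existing property values would keep
working.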

It would be good if the output was consumable by the CAS Viewer, too :-).

> Provide simple mechanism to capture all CASes input to specified delegate
> -------------------------------------------------------------------------
>
>                 Key: UIMA-1818
>                 URL: https://issues.apache.org/jira/browse/UIMA-1818
>             Project: UIMA
>          Issue Type: New Feature
>          Components: Async Scaleout
>            Reporter: Eddie Epstein
>            Assignee: Eddie Epstein
>
> The existing approach to capturing CASes sent to a component is to insert a 
> new CAS-serializer-annotator just before it in the flow, or modify the 
> component itself to serialize CASes. Both of these approaches require 
> modifications to existing code and/or component descriptors, and are 
> somewhat time-consuming and error-prone.
> A much simpler approach is to just "turn on" CAS logging for a particular 
> component using Java properties before starting the process, or to turn CAS 
> logging on/off for an already running process using JMX operations.
> This issue covers using Java properties to turn on CAS logging for any 
> delegate of an asynchronous aggregate.
> CAS logging would be controlled by the following properties:
> UIMA_CASLOG_BASE_DIRECTORY - optional; this is the directory under which 
> other directories with XmiCas files will be created. If not specified, the 
> process's current directory will be the base.
> UIMA_CASLOG_COMPONENT_ARRAY - This is a space-separated list of delegate 
> keys. If a delegate is nested inside a co-located async aggregate, the name 
> would include the key name of the aggregate, e.g. "someAggName/someDelName". 
> The XmiCas files will then be written into 
> $UIMA_CASLOG_BASE_DIRECTORY/someAggName/someDelName/
> UIMA_CASLOG_TYPE_NAME - optional; this is the name of a FeatureStructure in 
> the CAS containing a unique string to use to name each XmiCas file. If not 
> specified, the XmiCas file name will be NNN.xmi, where NNN is the time in 
> microseconds since the component was initialized.
> UIMA_CASLOG_FEATURE_NAME - optional unless TYPE_NAME is specified; 
> this parameter gives the string feature to use. An example of type and 
> feature names to use would be 
> "org.apache.uima.examples.SourceDocumentInformation" and "uri".
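
For reference, the directory and file naming rules in the quoted description
can be sketched roughly as follows. This is only an illustration under
stated assumptions: the class CasLogConfig and its methods are hypothetical,
the properties are passed in as a java.util.Properties object rather than
read from the JVM, and the unique string (from TYPE_NAME/FEATURE_NAME) is
assumed to have been extracted from the CAS already.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

/** Hypothetical helper; not the actual UIMA-AS implementation. */
final class CasLogConfig {
    final Path baseDir;
    final List<String> delegateKeys = new ArrayList<>();

    CasLogConfig(Properties props) {
        // Default to the current directory when no base directory is given.
        baseDir = Paths.get(props.getProperty("UIMA_CASLOG_BASE_DIRECTORY", "."));
        // Space-separated list of delegate keys, e.g. "someAggName/someDelName".
        String raw = props.getProperty("UIMA_CASLOG_COMPONENT_ARRAY", "").trim();
        if (!raw.isEmpty()) {
            for (String key : raw.split("\\s+")) {
                delegateKeys.add(key);
            }
        }
    }

    boolean isLogged(String delegateKey) {
        return delegateKeys.contains(delegateKey);
    }

    /**
     * Target file for one CAS. uniqueName is the string taken from the
     * configured type/feature, or null to fall back to NNN.xmi, where NNN
     * is the time in microseconds since the component was initialized.
     */
    Path fileFor(String delegateKey, String uniqueName, long microsSinceInit) {
        String stem = (uniqueName != null) ? uniqueName
                                           : Long.toString(microsSinceInit);
        Path dir = baseDir;
        for (String part : delegateKey.split("/")) {
            dir = dir.resolve(part); // one directory level per key segment
        }
        return dir.resolve(stem + ".xmi");
    }
}
```

Note that a unique string such as a "uri" value may contain characters that
are awkward in file names (for example '/'), so a real implementation would
probably need to sanitize it; the sketch does not.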

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
