[ 
https://issues.apache.org/jira/browse/UIMA-2431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419309#comment-13419309
 ] 

Mike Barborak commented on UIMA-2431:
-------------------------------------

In our real use case, information is generally added to the CAS and not 
removed or overwritten. We would therefore expect the memory overhead of 
journaling to be bounded by some constant factor times the number of data 
entries in our final CAS. Since the number of data entries is necessarily no 
greater than the final CAS size, the overhead is bounded by that constant 
factor times the final CAS size, and is likely much smaller, since total data 
size typically dwarfs the number of data entries. 

Whatever this constant factor turns out to be, I imagine it is acceptable for 
some use cases, and its size simply shapes how journaling gets used in 
practice. If the factor is low, users turn journaling on and leave it on all 
the time. If it is high, they disable it at times, enable it at others, 
"commit" the journal at certain points, and so on. In other words, I think the 
features you mention are conveniences that help users reach the point where 
they can leave journaling on all the time, but even without them journaling is 
useful and practical as long as the user has the means to deal effectively 
with the overhead.
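
To make the trade-off concrete, here is a minimal sketch of the kind of journal 
being discussed. It is not UIMA code: JournaledStore, Mark, setJournaling, 
commit and rollback are hypothetical names, and a plain string map stands in 
for the CAS. It assumes one journal entry per write, so the overhead grows with 
the number of writes rather than with the size of the data.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a CAS: a key/value store with an undo journal. */
class JournaledStore {
    /** One journal entry: the key that changed and its previous value (null if absent). */
    private static final class Entry {
        final String key;
        final String previous;
        Entry(String key, String previous) { this.key = key; this.previous = previous; }
    }

    /** A position in the journal; only valid until the next commit. */
    static final class Mark {
        private final int epoch;
        private final int position;
        private Mark(int epoch, int position) { this.epoch = epoch; this.position = position; }
    }

    private final Map<String, String> data = new HashMap<>();
    private final Deque<Entry> journal = new ArrayDeque<>();
    private boolean journaling = true;
    private int epoch = 0;

    /** Writes a value; records one journal entry per write while journaling is on. */
    void put(String key, String value) {
        if (journaling) journal.push(new Entry(key, data.get(key)));
        data.put(key, value);
    }

    String get(String key) { return data.get(key); }

    int journalSize() { return journal.size(); }

    Mark mark() { return new Mark(epoch, journal.size()); }

    /** Undoes every journaled write made since the mark was taken. */
    void rollback(Mark m) {
        // Sketch limitation: marks taken after m are not invalidated by this rollback.
        if (m.epoch != epoch || m.position > journal.size()) {
            throw new IllegalStateException("stale mark");
        }
        while (journal.size() > m.position) {
            Entry e = journal.pop();
            if (e.previous == null) data.remove(e.key); else data.put(e.key, e.previous);
        }
    }

    /** Discards the journal to reclaim memory; earlier marks become stale. */
    void commit() {
        journal.clear();
        epoch++;
    }

    /** Turning journaling off also commits, since unjournaled writes cannot be undone. */
    void setJournaling(boolean enabled) {
        if (!enabled) commit();
        journaling = enabled;
    }

    public static void main(String[] args) {
        JournaledStore cas = new JournaledStore();
        cas.put("text", "hello");
        Mark beforeComponent = cas.mark();   // mark written before a component runs
        try {
            cas.put("token", "hel");
            throw new RuntimeException("component failed");
        } catch (RuntimeException e) {
            cas.rollback(beforeComponent);   // restore the component's input state
        }
        System.out.println(cas.get("token"));
        System.out.println(cas.journalSize());
    }
}
```

In this sketch a user with a low overhead factor would never call commit or 
setJournaling(false); a user with a high one would call them between pipeline 
stages and accept that earlier marks can no longer be rolled back to.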
                
> allow CAS changes to be rolled back to specified marks
> ------------------------------------------------------
>
>                 Key: UIMA-2431
>                 URL: https://issues.apache.org/jira/browse/UIMA-2431
>             Project: UIMA
>          Issue Type: New Feature
>          Components: Core Java Framework
>            Reporter: Mike Barborak
>
> As a CAS moves through a pipeline, there is a well defined sequence of 
> changes being applied to it. Currently, each change is applied to the CAS 
> immediately and so the sequence is lost. It would be nice to preserve the 
> sequence as it would make possible some useful features.
>
> One such feature would be the ability to roll back the sequence of changes 
> to some specified point. Imagine then a pipeline where a mark is written to 
> the sequence before each component is run. Then, if the component threw an 
> exception or entered some other undesirable state, the CAS could be rolled 
> back to the last mark. It could then be serialized to disk for debugging or a 
> flow controller could reroute it in the pipeline.
>
> Even when a component didn't enter such a state, being able to roll back a 
> CAS to a particular component's input state is useful for debugging 
> performance (in function or in resource usage) of that component.
>
> Being able to track changes also means being able to model CAS deltas. This 
> functionality could become the base for communicating CAS changes 
> efficiently, for example in the case that a UIMA-AS service needs to report 
> changes it made to a large input CAS.

        

Reply via email to