Thilo Goetz wrote:
Marshall Schor wrote:
Thilo Goetz wrote:
Bhavani Iyer wrote:
Hi Thilo,

There are two separate requirements being addressed here:
1) delta CAS for optimizing remote services.
Here it's agreed that there should be no measurable overhead when there
is no remoting.
There will be a single test against the high water mark. The high
water mark defaults to 0. Only when the high water mark is set to a
value greater than 0 is logging of CAS operations on FSs below the
high water mark enabled.
2) Journaling for debugging aggregate components.
This capability is for Core UIMA as well as for remote services. It
will have some additional overhead and will have to be explicitly
enabled by the aggregate controller for a component. Basically, the
aggregate controller enables journaling by setting the high water mark
before the call to process.
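
As a rough illustration of the single test described in 1), here is a
minimal Java sketch. It is not the UIMA CAS code: the class
HighWaterMarkSketch, its int-array heap, and the methods allocate,
setMark, setValue and modifiedBelowMark are all hypothetical names
assumed for this example only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only; names and layout are assumptions, not UIMA's API.
final class HighWaterMarkSketch {
    private final int[] heap = new int[1024];
    private int next = 1;            // next free cell; address 0 is reserved
    private int highWaterMark = 0;   // default 0 means journaling is off
    private final List<Integer> journal = new ArrayList<>();

    // Reserve `size` cells and return the address of the first one.
    public int allocate(int size) {
        int addr = next;
        next += size;
        return addr;
    }

    // Everything allocated so far now lies below the mark.
    public void setMark() {
        highWaterMark = next;
    }

    public void setValue(int addr, int value) {
        // The single test: with the default mark of 0 no valid address
        // is below it, so nothing is ever journaled.
        if (addr < highWaterMark) {
            journal.add(addr);
        }
        heap[addr] = value;
    }

    // Addresses below the mark that were written after the mark was set.
    public List<Integer> modifiedBelowMark() {
        return journal;
    }
}
```

In this sketch a controller would call setMark() before handing the CAS
to a component, then read modifiedBelowMark() afterwards to see which
pre-existing cells were touched.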

Regarding using the high water mark, this is already being used for merging
CAS.

That's not a good thing, and certainly no justification for using
the same design here.
Can you say more about why this is not a good thing? I see it as an internal design detail.

Precisely.  It's an implementation detail of the CAS heap that
we should be able to change -- that we must be able to change
if we would like to improve on the heap.
We could change it if we found a better approach.
The CAS heap and
in particular the way it grows is a major performance bottleneck
for large documents.  If we have other parts of UIMA depend on
the (bad) implementation details now, we'll never be able to
improve on the design.
Hmmm, I guess I was thinking that if we wanted to change this in the
future, we could. I agree it would be more difficult; we've changed
things like this before using the refactoring tools that let you see
pretty clearly various dependencies...

-Marshall
