On 12/2/2014 3:20 PM, Richard Eckart de Castilho wrote:
> On 02.12.2014, at 20:47, Marshall Schor <[email protected]> wrote:
>
>> Assuming we have "normally" the "automatic" style in effect, then yes, inside
>> the protection block the automatic style would be temporarily disabled. The
>> "removes" would still be done, but the info needed to do the addbacks would be
>> kept. So, at the point of a feature update, the remove (only if needed, of
>> course) would be done, and the update, but not the "add-back". (Remember that
>> doing the update before the remove causes index corruption.) For the second and
>> subsequent update to (another) feature of that FS, no index operations would be
>> done (it would already be "removed"). And then at the end, only the re-adding
>> of whatever was removed would be done.
> I wonder if this would lead to surprising behaviour, in the sense that a second
> query over an index, happening within the first one, would suddenly not see
> the modified items anymore, because they have been removed as part of an update
> and only get added back at the end of the protection block. But then again, since
> we assume this is the unusual case (and I think it is), this would just be
> something the users need to be aware of.

+1.

>
>>> A flow controller or the component base classes could forcibly put the CAS back
>>> into protection mode in case the component coder forgot it (and log a warning) -
>>> or it could even throw an exception in such a case.
>> This would be a partial solution, because there are cases where there would not
>> be a flow controller involved, or even a base class. A complete solution is to
>> have the API follow the style of using an inner class as discussed in previous
>> notes in this change.
> I don't claim it would be a solution. It would just be a precaution to alert
> users.
> A logged warning could look like this:
>
>   WARNING: CAS index protection disabled after processing - please use a
>   try/finally construct to ensure the protection mode is reenabled.
>
> The inner class would be implemented in terms of the beginProtection() and
> endProtection() methods and use the try/finally construction. It would be the
> recommended approach to use it, but there might be cases where a user has a
> strange control flow or might want to use checked exceptions and would prefer
> doing the beginProtection/endProtection manually.

+1, but I was thinking of not offering anything other than the inner class
structure. I doubt this would be used often if we had the automatic method, so
I'd like to postpone making even more varieties available until someone says
they need an alternative :-)
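
To make the proposal concrete, here is a minimal sketch of the protection-block idea in Java. It is a toy model, not the UIMA API: ToyCas, Fs, setBegin(), isIndexed() and protect() are hypothetical names invented for illustration, and only beginProtection()/endProtection() are taken from the discussion above. The sketch assumes a single sorted index keyed on one feature. It removes an FS before updating it, defers the add-back while protection is active, and wraps the begin/end pair in try/finally.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

public class ProtectionSketch {

    /** Toy feature structure with one indexed feature (hypothetical). */
    public static final class Fs {
        public final int id;   // tie-breaker so equal 'begin' values can coexist
        public int begin;      // the feature used as the index key
        public Fs(int id, int begin) { this.id = id; this.begin = begin; }
    }

    /** Toy CAS with one sorted index and a protection mode; NOT the UIMA API. */
    public static final class ToyCas {
        private final TreeSet<Fs> index = new TreeSet<>(
            Comparator.comparingInt((Fs fs) -> fs.begin)
                      .thenComparingInt(fs -> fs.id));
        private final List<Fs> removed = new ArrayList<>();
        private boolean protecting = false;

        public void addToIndexes(Fs fs) { index.add(fs); }
        public boolean isIndexed(Fs fs) { return index.contains(fs); }

        public void beginProtection() { protecting = true; }

        /** Re-adds everything that was removed while protection was active. */
        public void endProtection() {
            protecting = false;
            index.addAll(removed);
            removed.clear();
        }

        /** Updates the key feature; the remove always happens BEFORE the update. */
        public void setBegin(Fs fs, int value) {
            // Returns false if the FS is not indexed or was already removed.
            boolean wasIndexed = index.remove(fs);
            fs.begin = value;
            if (wasIndexed) {
                if (protecting) {
                    removed.add(fs);   // "bulk" mode: defer the add-back
                } else {
                    index.add(fs);     // "automatic" mode: add back right away
                }
            }
        }

        /** Wrapper that guarantees endProtection() runs, even on exceptions. */
        public void protect(Runnable body) {
            beginProtection();
            try {
                body.run();
            } finally {
                endProtection();
            }
        }
    }

    public static void main(String[] args) {
        ToyCas cas = new ToyCas();
        Fs fs = new Fs(1, 5);
        cas.addToIndexes(fs);
        cas.protect(() -> {
            cas.setBegin(fs, 10);   // removed from the index, add-back deferred
            cas.setBegin(fs, 20);   // already removed: no further index work
            System.out.println("indexed inside block: " + cas.isIndexed(fs));
        });
        System.out.println("indexed after block: " + cas.isIndexed(fs));
    }
}
```

In this sketch a lookup made inside the block does not see the updated FS, which is the surprising behaviour Richard describes; the FS becomes visible again once the block ends. Offering only protect() in the API would mean callers cannot forget the endProtection() call.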
>
>> This "bulk" mode, though, I think would be the exception, because most users set
>> lots of features when a new FS is created, but then the "typical" mode is to
>> update just a few (I'm guessing :-) ). If this is true, then the "automatic"
>> mode (discussed at the top) would work, and the "bulk" mode would be relegated
>> to just an optimization for a less-usual case. So, most people would not need
>> to do anything, and their code would start working without corrupting indices.
> For the use-cases I know, setting features before adding to the index is
> definitely the common case - updating features is the rare case.
>
> I have the feeling that some concepts of UIMA are somewhat more geared towards
> immutable FSes than mutable FSes, but this appears to change now. Isn't it the
> case that feature updates are also a problem for delta CAS because the delta
> mechanism doesn't notice the update?

Yes, but that's being fixed, see: UIMA-4100 and UIMA-4126.

>
> I mean, another approach to what is currently being done would be to create a
> copy of an FS whenever a feature is updated, remove the original from the
> indexes and re-add the new one - but then we'd run into other problems...

Right. Also, a main issue is that the heap is not garbage collected (normally),
which makes the approach of creating a new FS and throwing the old one away not
so appealing.

-Marshall

>
> I think Peter mentioned that Ruta does update features as part of its normal
> operation. But that's probably one of the very special use-cases that is rarely
> encountered and where the bulk-mode fits in nicely.
>
> Cheers,
>
> -- Richard
