On 02.12.2014, at 17:14, Marshall Schor <[email protected]> wrote:

> A subsequent discussion with Burn L. produced the following two good ideas:
>
> 1) The UIMA framework could automatically do the safe thing on each feature
> modification that required it. Although this might seem inefficient, it is
> likely that in most cases, only one feature (used as a key in some index spec)
> is being modified at any one time. For those cases where this isn't true, the
> alternative of an index protection block encapsulating multiple updates could
> be used; but it's likely that would rarely be needed.
>
> The automatic approach would, in effect, do a remove, modify, add-back cycle
> for each feature modification, in all indices where the FS was in the index,
> if the feature was used as a key.
>
> This would be a boon to users - as their code would now work without the
> danger of accidentally corrupting indices.

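
To make the remove / modify / add-back cycle concrete, here is a rough sketch in
plain Java. It does not use any UIMA API: the class ProtectedIndexSketch, the
Item type, and the setBeginSafe() method are all made-up names, and a TreeSet
stands in for a sorted index whose key is the "begin" feature.

```java
import java.util.Comparator;
import java.util.TreeSet;

public class ProtectedIndexSketch {

    /** Hypothetical stand-in for a feature structure with one key feature. */
    static final class Item {
        final String name;
        int begin; // plays the role of a feature used as an index key

        Item(String name, int begin) {
            this.name = name;
            this.begin = begin;
        }
    }

    /** Hypothetical sorted index: ordered by begin, with name as tie-breaker. */
    private final TreeSet<Item> index = new TreeSet<>(
            Comparator.<Item>comparingInt(i -> i.begin).thenComparing(i -> i.name));

    void add(Item item) {
        index.add(item);
    }

    boolean contains(Item item) {
        return index.contains(item);
    }

    String firstName() {
        return index.first().name;
    }

    /**
     * Safe update: take the item out of the index while its key still has the
     * old value, change the key, then add the item back if it was indexed.
     */
    void setBeginSafe(Item item, int newBegin) {
        boolean wasIndexed = index.remove(item);
        item.begin = newBegin;
        if (wasIndexed) {
            index.add(item);
        }
    }

    public static void main(String[] args) {
        ProtectedIndexSketch idx = new ProtectedIndexSketch();
        Item a = new Item("a", 1);
        idx.add(a);
        idx.add(new Item("b", 3));
        idx.setBeginSafe(a, 10);
        System.out.println(idx.contains(a) + " " + idx.firstName());
    }
}
```

Writing item.begin directly while the item sits in the TreeSet would leave it at
a position that no longer matches its key, so later lookups may miss it. That is
the kind of index corruption the automatic approach is meant to rule out.
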
Sounds good :) So by default, the CAS would protect itself. When a protection
block (I cannot help thinking of this as a kind of transaction) is used, the
protection would be temporarily disabled and the modifications would be written
to some kind of "transaction log". When the block is closed, the log is
"committed", basically removing/re-adding all the modified FSes. Did I
paraphrase this correctly?

A flow controller or the component base classes could forcibly put the CAS back
into protection mode in case the component coder forgot to do so (and log a
warning) - or it could even throw an exception in such a case.

> 2) Because this would turn a feature update into (potentially) a remove -
> update - add operation, users writing feature updates inside an iterator would
> be exposed to suddenly getting illegal index modification while iterating
> exceptions.
>
> This has long been an issue, I think, causing users to write loops that
> extract FSs into array lists and then iterate over those, while doing UIMA
> index adds/removes.

Totally :)

> How about we add a method to our iterator creation suite, perhaps named
> safeIterator(), which creates a snapshot of the index it's iterating over at
> the start, and then allows the user code to do arbitrary index adds/removes?

Sounds good as well. I think that some UIMA core iterator already copies FSes
to some collection before returning it. Some of the uimaFIT select*() methods
certainly do this (but not all - and it is not advertised to users).

> It seems this occurs frequently enough to warrant UIMA built-in support, and
> some optimizations may be available. It seems it could be especially helpful
> if (1) were implemented, because the remove/add could occur unbeknownst to
> the user. For example, the component writer may not have had a feature in any
> index, but when his component was combined with others, an index could have
> been added that used the feature.
>
> WDYT?

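
For illustration, the snapshot idea could look roughly like the sketch below. It
is plain Java with no UIMA classes. Only the name safeIterator() comes from the
proposal above; the signature, the SnapshotIterationSketch class, and the
dropEvens() helper are assumptions made up for this example.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Iterator;
import java.util.TreeSet;

public class SnapshotIterationSketch {

    /**
     * Hypothetical safeIterator(): copies the index contents when called and
     * iterates over the copy, so later changes to the index do not affect it.
     */
    static <T> Iterator<T> safeIterator(Collection<T> index) {
        return new ArrayList<>(index).iterator();
    }

    /** Removes all even numbers from the index while iterating over a snapshot. */
    static TreeSet<Integer> dropEvens(TreeSet<Integer> index) {
        Iterator<Integer> it = safeIterator(index);
        while (it.hasNext()) {
            Integer value = it.next();
            if (value % 2 == 0) {
                // With a live TreeSet iterator, this removal would typically
                // make a later next() call throw ConcurrentModificationException.
                index.remove(value);
            }
        }
        return index;
    }

    public static void main(String[] args) {
        TreeSet<Integer> index = new TreeSet<>();
        for (int i = 1; i <= 5; i++) {
            index.add(i);
        }
        System.out.println(dropEvens(index));
    }
}
```

This is the same copy-into-a-list workaround users write by hand today, just
hidden behind one call. The cost is one full copy of the index per iteration,
which is presumably where built-in support could optimize.
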
It is probably not a common problem, but from the perspective of the
architecture, it would be good to avoid negative side-effects from a third
component adding an index that could cause undesired or even wrong behavior.

Cheers,

-- Richard
