Taking your last point a bit further: the reason for doing all this in the first place was to have the JCas class setup for feature offsets work when subsequent type systems implemented additional features that were already in the JCas class.
The "trick" used was to merge into the type system, during type system commit, any features defined in the JCas class but not in the type system.

A better thing to do, maybe, would be to put these extra features into the type system, not as features, but only as something used when setting up the "offsets", so the offsets get set properly.

Advantages: things like serialization / deserialization, which depend on the exact type system spec, continue to work as before; serialization would serialize as if the extra features were not present in the type system, so interfacing with other systems would continue to work unchanged. Attempting to access a feature not in the type system could be made to give an error.

This seems the most compatible way to introduce this capability, so I'll see if I can figure out how to do something like this.

-Marshall

On 1/19/2018 4:58 AM, Richard Eckart de Castilho wrote:
> On 19.01.2018, at 03:37, Marshall Schor <[email protected]> wrote:
>> The trouble with this is that it breaks several serializations (where the type
>> system is required to be "known", for example Cas Complete serialization),
>> because the layout of the serialized format is with respect to the type system
>> which created the serialization.
>>
>> I'm not sure what a reasonable solution to this issue is; thoughts welcomed :-)
> Several thoughts:
>
> - The CasCompleteSerializer (as opposed to the CASSerializer) includes a
>   CASMgrSerializer. We use the CASMgrSerializer in CasIOUtils to optionally
>   reinitialize the CAS with a type system different from what it had at
>   creation time via setupCasFromCasMgrSerializer().
>   Doesn't that contain sufficient information about the type system to decode it?
>
> - Allow disabling the augmentation of the CAS from JCas classes, e.g. for people
>   that need full control over the type system in the CAS and do not want to use
>   JCas with that CAS (and also cannot easily use classloader isolation).
>   The idea was brought up earlier already when we observed that UIMA v2 had an
>   option to disable JCas.
>
> - It may be possible to mark features which were automatically obtained from
>   the JCas and to not take these into account when deserializing with CasComplete.
>   It would cause an inconsistency though: when loading the data into a CAS and
>   storing the data again (both with CasComplete), the type systems in the input
>   and output would differ.
>
> Cheers,
>
> -- Richard
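
P.S. To make the proposal at the top of this message concrete (keep the JCas-only features out of the visible type system, but still reserve offsets for them), here is a rough, self-contained sketch. Everything in it is hypothetical: SketchTypeLayout and its methods are made-up names for illustration and are not the UIMA API, and the real offset setup at type system commit may well look different.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the per-type layout computed at commit time.
// Not a UIMA class; names and structure are assumptions for illustration.
final class SketchTypeLayout {
    // Slot (offset) for every field the JCas class knows about.
    private final Map<String, Integer> slotByName = new LinkedHashMap<>();
    // Only the features that are really declared in the type system.
    private final Set<String> declared = new LinkedHashSet<>();

    SketchTypeLayout(List<String> declaredFeatures, List<String> jcasFields) {
        // Declared features get their slots first, in declaration order.
        for (String name : declaredFeatures) {
            declared.add(name);
            slotByName.putIfAbsent(name, slotByName.size());
        }
        // JCas-only fields get a reserved slot, but are NOT added to 'declared'.
        for (String name : jcasFields) {
            slotByName.putIfAbsent(name, slotByName.size());
        }
    }

    // Used only when wiring up the JCas class offsets.
    int slotForJCas(String name) {
        Integer slot = slotByName.get(name);
        if (slot == null) {
            throw new IllegalArgumentException("No slot reserved for: " + name);
        }
        return slot;
    }

    // Normal feature access: a reserved-only name is an error.
    int featureSlot(String name) {
        if (!declared.contains(name)) {
            throw new IllegalStateException("Feature not in type system: " + name);
        }
        return slotByName.get(name);
    }

    // What a serializer would see: the type system as specified, nothing extra.
    List<String> serializedFeatures() {
        return new ArrayList<>(declared);
    }

    int slotCount() {
        return slotByName.size();
    }
}
```

With a type system declaring "begin" and "end" and a JCas class that also has a "score" field, slotForJCas("score") would return a reserved slot, serializedFeatures() would list only "begin" and "end", and featureSlot("score") would throw. Whether reserved slots should come after the declared ones, as here, is just a choice made for this sketch.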
