On 19.01.2018, at 03:37, Marshall Schor <[email protected]> wrote:
>
> The trouble with this is that it breaks several serializations (where the type
> system is required to be "known", for example Cas Complete serialization),
> because the layout of the serialized format is with respect to the type system
> which created the serialization.
>
> I'm not sure what a reasonable solution to this issue is; thoughts welcomed
> :-)

Several thoughts:

- The CasCompleteSerializer (as opposed to the CASSerializer) includes a
  CASMgrSerializer. We use the CASMgrSerializer in CasIOUtils to optionally
  reinitialize the CAS with a type system different from the one it had at
  creation time, via setupCasFromCasMgrSerializer(). Doesn't the
  CASMgrSerializer contain sufficient information about the type system to
  decode the data?

- Allow disabling the augmentation of the CAS type system from JCas classes,
  e.g. for people who need full control over the type system in the CAS, do
  not want to use JCas with that CAS, and cannot easily use classloader
  isolation. This idea was already brought up earlier, when we observed that
  UIMA v2 had an option to disable JCas.

- It may be possible to mark features which were automatically obtained from
  the JCas classes and to not take these into account when deserializing with
  CasComplete. That would cause an inconsistency, though: when loading the
  data into a CAS and storing it again (both with CasComplete), the type
  systems in the input and the output would differ.

Cheers,

-- Richard
