Yes, these were just generated from the type system file using JCasGen.
> On 16 Sep 2019, at 15:32, Marshall Schor <[email protected]> wrote:
>
> oops, ignore that - I see Container is a JCas class ... -M
>
> On 9/16/2019 9:30 AM, Marshall Schor wrote:
>> I may have some version pblms. The LoadCompressedBinary has refs to a class
>> "Container", but I don't seem to have that class - where is it coming from?
>>
>> -Marshall
>>
>> On 9/16/2019 8:11 AM, Mario Juric wrote:
>>> Best Regards,
>>>
>>> Mario Juric
>>> Principal Engineer
>>> *UNSILO.ai* <http://unsilo.ai/>
>>> mobile: +45 3082 4100
>>> skype: mario.juric.dk <http://mario.juric.dk>
>>>
>>> Hi Marshall,
>>>
>>> I have a small test case with 3 files excluding any JCasGen generated types
>>> and uimaFIT types file.
>>>
>>> First you will have to generate the types and run the SaveCompressedBinary to
>>> produce the 3 binary forms I have been experimenting with. You should then be
>>> able to run LoadCompressedBinaries as expected.
>>>
>>> Next you need to change the element type of Container.features from
>>> FeatureAnnotation to FeatureRecord in the type system and generate the type
>>> system again. Also change the FeatureAnnotation reference in
>>> LoadCompressedBinaries l. 25 to FeatureRecord and then try to reload the
>>> previously stored binaries again without saving them first using the new type
>>> system.
>>>
>>> You can see I have played with different ways of loading just to see if
>>> anything worked, but much of it seems to result in exactly the same calls in
>>> the lower layers. I didn’t get entirely the same results with the CAS we
>>> actually store as in this example. E.g. I experienced some EOF with the
>>> compressed filtered whereas I only get a class cast exception during
>>> verification in this example. Note also that we keep both types in the new
>>> type system, but we want to change the element type of the FSArray in the
>>> Container.
>>>
>>> Hope this will yield some useful insights and thanks a lot :)
>>>
>>> Cheers
>>> Mario
>>>
>>>> On 13 Sep 2019, at 21:55, Mario Juric <[email protected]> wrote:
>>>>
>>>> Thanks Marshall,
>>>>
>>>> I’ll get back to you with a small sample as soon as I get the time to do it.
>>>> This will also get me a better understanding of the format.
>>>>
>>>> Cheers,
>>>> Mario
>>>>
>>>>> On 13 Sep 2019, at 19:32, Marshall Schor <[email protected]> wrote:
>>>>>
>>>>> I'm wondering if you could post a very small test case showing this
>>>>> problem with a small type system.
>>>>>
>>>>> With that, I could run in the debugger and see exactly what was
>>>>> happening, and see whether or not some small fix would make this work.
>>>>>
>>>>> The Deserializer for this already supports a certain type of mismatch
>>>>> between type systems, but mainly one where one is a subset of the other -
>>>>> see the javadoc for the method
>>>>>
>>>>> org.apache.uima.cas.impl.BinaryCasSerDes6.java.
>>>>>
>>>>> But it must not currently cover this particular case.
>>>>>
>>>>> -Marshall
>>>>>
>>>>> On 9/13/2019 10:48 AM, Mario Juric wrote:
>>>>>> Just a quick follow up.
>>>>>>
>>>>>> I played around a bit with the CasIOUtils, and it seems that it is
>>>>>> possible to load and use the embedded type system, i.e. the old type
>>>>>> system with X, but I found no way to replace it with the new type system
>>>>>> and make the necessary mappings to Y. I tried to see if I could use the
>>>>>> CasCopier in a separate step, but as expected it fails when it reaches
>>>>>> the FSArray of X in the source CAS because the destination type system
>>>>>> requires elements of type Y. I could make my own modified version of the
>>>>>> CasCopier that could take some mapping functions for each pair of source
>>>>>> and destination types that need to be mapped, but this is where it starts
>>>>>> to get too complicated, so I found it not to be worth it at this point,
>>>>>> since we might then want to reprocess everything from scratch anyway.
>>>>>>
>>>>>> Cheers,
>>>>>> Mario
>>>>>>
>>>>>>> On 12 Sep 2019, at 10:41, Mario Juric <[email protected]> wrote:
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> We use form 6 compressed binaries to persist the CAS. We now want to
>>>>>>> make a change to the type system that is not directly compatible,
>>>>>>> although in principle the new type system is really a subset from a data
>>>>>>> perspective, so we want to migrate existing binaries to the new type
>>>>>>> system, but we don’t know how. The change is as follows:
>>>>>>>
>>>>>>> In the existing type system we have a type A with a FSArray feature of
>>>>>>> element type X, and we want to change X to Y where Y contains a genuine
>>>>>>> feature subset of X. This means we basically want to replace X with Y
>>>>>>> for the FSArray and ditch a few attributes of X when loading the CAS
>>>>>>> into the new type system.
>>>>>>>
>>>>>>> Had the CAS been stored in JSON this would be trivial by just mapping
>>>>>>> the attributes that they have in common, but when I try to load the CAS
>>>>>>> binary into the new target type system it chokes with an EOF, so I don’t
>>>>>>> know if that is at all possible with a form 6 compressed CAS binary?
>>>>>>>
>>>>>>> I poked around a bit in the reference, API and mailing list archive
>>>>>>> but I was not able to find anything useful. I can of course keep
>>>>>>> parallel attributes for both X and Y and then have a separate step that
>>>>>>> makes an explicit conversion/copy, but I prefer to avoid this. I would
>>>>>>> appreciate any input to the problem, thanks :)
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Mario
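
For readers following the thread: the X-to-Y mapping described in the original message (keep the features the two types share, drop the rest) can be sketched roughly as below. This is only an illustration in plain Java, not UIMA API and not code from the attached test case. Feature structures are stood in for by maps, and the class name `FeatureProjection`, its methods, and the feature names in the usage note are all hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustration only: plain maps stand in for feature structures; this is not UIMA API. */
class FeatureProjection {

    /** Copies only the features that the target type Y declares; all other features of X are dropped. */
    static Map<String, Object> project(Map<String, Object> source, Set<String> targetFeatures) {
        Map<String, Object> target = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : source.entrySet()) {
            if (targetFeatures.contains(entry.getKey())) {
                target.put(entry.getKey(), entry.getValue());
            }
        }
        return target;
    }

    /** Projects every element of an array-valued feature (the FSArray of X) onto Y, preserving order. */
    static List<Map<String, Object>> projectAll(List<Map<String, Object>> elements,
                                                Set<String> targetFeatures) {
        List<Map<String, Object>> projected = new ArrayList<>();
        for (Map<String, Object> element : elements) {
            projected.add(project(element, targetFeatures));
        }
        return projected;
    }
}
```

With an X element holding hypothetical features `begin`, `end` and `score`, and Y declaring only `begin` and `end`, `project` would return a map with just those two entries. Doing the same against a real CAS would still need the source binary to be loaded with its own (old) type system first, which is the part the thread is trying to resolve.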
