If no one currently needs COMPRESSED_FILTERED_TS, I'm +1 for removing it, for the following reasons:
- the current impl combines a very non-compressed format for the Type System and Index definition with a highly compressed CAS representation; not a good format. (I've seen some applications which have close to 1000 types defined.)
- the two-header problem could be solved by pushing the implementation down into the base form 6 serialization, but that would take a bit of time and thinking :-), and I'd rather get this release out and add that later.

-Marshall

On 8/4/2016 5:24 AM, Peter Klügl wrote:
> So, what should we do?
>
> deactivate COMPRESSED_FILTERED_TS completely, or even remove the
> SerialFormat?
>
> Best,
>
> Peter
>
> On 04.08.2016 at 11:22, Richard Eckart de Castilho wrote:
>> I'd personally prefer only one header, but looks like that would require
>> more refactoring, e.g. extracting the reading of the header out
>> of org.apache.uima.cas.impl.CASImpl.reinit(InputStream)...
>>
>> Cheers,
>>
>> -- Richard
>>
>>> On 04.08.2016, at 09:07, Peter Klügl <[email protected]> wrote:
>>>
>>> Yes, form6+ gets two headers. The first one for identifying the format
>>> and typesystem inclusion for the utils class, the second one for the
>>> actual serialization code. I didn't see any better solution for this.
>>>
>>> On 03.08.2016 at 18:28, Richard Eckart de Castilho wrote:
>>>> It is a bit hard to see... do we have cases now where two headers are
>>>> written to the file?
>>>> E.g. in a form6 + TS, one before the type system and another one before
>>>> the actual CAS data?
>>>>
>>>> -- Richard
