Thank you both for your hints,
Jörn, this exact topic came to my mind earlier. I want to have different
"annotation stages" of the same artifacts, so some kind of delta storage would
make a lot of sense. However, I don't have time to write such a thing on my own
right now (I currently don't see an easy way to do it; I want to preserve the
basic annotation storage so I can experiment with the components doing the
"higher" annotations). Is there anything usable out-of-the-box for this?
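
Just to make clear what I mean by "annotation stages": below is a rough sketch
in plain Python with SQLite, not UIMA code and not a real CAS format. All names
(open_store, add_stage, load, the table layout) are made up for illustration,
and the document text itself is left out. Each stage is stored as its own set
of stand-off rows, so adding a "higher" stage only appends rows and the basic
stage is never rewritten.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the store; one row per stand-off annotation."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS annotations ("
        " doc_id TEXT NOT NULL, stage TEXT NOT NULL,"
        " span_begin INTEGER NOT NULL, span_end INTEGER NOT NULL,"
        " ann_type TEXT NOT NULL)"
    )
    return conn


def add_stage(conn, doc_id, stage, annotations):
    """Append one stage as new rows; rows of other stages are not touched."""
    conn.executemany(
        "INSERT INTO annotations VALUES (?, ?, ?, ?, ?)",
        [(doc_id, stage, b, e, t) for (b, e, t) in annotations],
    )
    conn.commit()


def load(conn, doc_id, stages):
    """Read only the requested stages of one document, ordered by offset."""
    marks = ",".join("?" for _ in stages)
    rows = conn.execute(
        "SELECT stage, span_begin, span_end, ann_type FROM annotations"
        f" WHERE doc_id = ? AND stage IN ({marks})"
        " ORDER BY span_begin, span_end, stage",
        [doc_id, *stages],
    )
    return rows.fetchall()


# Hypothetical usage: the expensive basic stage is written once,
# a higher stage is added later as a delta.
store = open_store()
add_stage(store, "doc1", "tokens", [(0, 5, "Token"), (6, 11, "Token")])
add_stage(store, "doc1", "entities", [(0, 5, "Gene")])
base_only = load(store, "doc1", ["tokens"])
both = load(store, "doc1", ["tokens", "entities"])
```

With something like this I could throw away and recompute the "entities" stage
as often as I like while the "tokens" rows stay as they are.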
Thanks!
Erik
On 12.12.2012 at 18:28, Jörn Kottmann <[email protected]> wrote:
> On 12/12/2012 05:27 PM, Erik Fäßler wrote:
>> I am currently looking for a good approach to store a lot of CAS data. What
>> I want to do is to annotate a lot of text with basic annotations and save
>> that. Then, I can read the CAS objects with these basic annotations and
>> don't have to do them over and over because they are basically never
>> changing. However, "basic" does not necessarily mean that the computation is
>> fast - that's why I want the storage.
>
> In my experience it's sometimes better to define a custom format to store the
> data in a database and not use CAS serialization.
>
> CAS serialization has some disadvantages. To read a piece of the data in a
> CAS it is necessary to load the entire CAS, even though many operations
> (e.g. text indexing, calculating statistics) only need part of it.
> To add new annotations to an existing CAS you need to re-write the entire CAS
> data instead of just adding a few bytes.
>
> Jörn