Looking at the code in CASImpl, I see
(line 112) cas.impl.Serialization: public static void
deserializeCASComplete(CASCompleteSerializer casCompSer ...) ->
(line 1231) reinit(CASCompleteSerializer....) calls (line 1260)
(line 1293) reinit( a bunch of arrays, including int[] fsIndex which is an array
of things to add to indexes) calls (line 1307)
(line 1730) reinitIndexedFSs(int[] fsIndex), which has a double loop: the
outer loop runs over the views, the inner loop over all indexed FSs, where it
calls
(line 1770) addFS(fsIndex[i])

which does a one-element add of the feature structure to the indexes.
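
To make the shape of that double loop concrete, here is a minimal,
self-contained sketch. It is not the CASImpl code: the class ViewIndex, its
addCalls counter, and the simplified fsIndex layout (number of views, then for
each view a count followed by that many FS addresses) are assumptions made
for illustration only. The point it shows is that every indexed FS goes
through its own single-element add.

```java
import java.util.ArrayList;
import java.util.List;

public class ReinitIndexSketch {

    /** Hypothetical stand-in for one view's index repository. */
    public static final class ViewIndex {
        public final List<Integer> indexed = new ArrayList<>();
        public int addCalls = 0;

        /** One-element add, playing the role of addFS(fsIndex[i]). */
        public void addFS(int fsAddr) {
            addCalls++;
            indexed.add(fsAddr);
        }
    }

    /**
     * Assumed layout of fsIndex (simplified, not the real UIMA format):
     * [numViews, count_1, fs..., count_2, fs..., ...]
     */
    public static List<ViewIndex> reinitIndexedFSs(int[] fsIndex) {
        List<ViewIndex> views = new ArrayList<>();
        int pos = 0;
        int numViews = fsIndex[pos++];
        for (int v = 0; v < numViews; v++) {        // outer loop: views
            ViewIndex view = new ViewIndex();
            int count = fsIndex[pos++];
            for (int i = 0; i < count; i++) {       // inner loop: indexed FSs
                view.addFS(fsIndex[pos++]);         // one add per FS
            }
            views.add(view);
        }
        return views;
    }

    public static void main(String[] args) {
        // Two views: the first indexes three FSs, the second indexes one.
        int[] fsIndex = {2, 3, 100, 104, 108, 1, 200};
        List<ViewIndex> views = reinitIndexedFSs(fsIndex);
        for (int v = 0; v < views.size(); v++) {
            System.out.println("view " + v + ": "
                    + views.get(v).addCalls + " addFS calls");
        }
    }
}
```

If the real path looks like this, the number of add calls equals the number
of indexed FSs, so any per-add cost would be paid once per FS here as well.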

Perhaps though I'm following the wrong code path ...

-Marshall

On 1/7/2016 10:36 AM, Richard Eckart de Castilho wrote:
> On 07.01.2016, at 15:12, Marshall Schor <[email protected]> wrote:
>> Thanks for explaining this "use case". 
>>
>> I was a bit unclear on the two instances of deserialization time. 
>> One (the 70%) was xmi, the other (2%) was S+.  From reading the email chain,
>> it seems S+ is the "CasCompleteSerializer".  This switches to plain binary
>> mode.  So you would avoid the XML parsing overhead.
>>
>> But I think both deserializations would have the same issue around
>> "allow_dups" if that was where the substantial part of the slowdown was
>> being spent, since both would add all those annotations to the index.
>> Perhaps that was another use case though...  Am I mixing these up?
> My understanding is that the CasCompleteSerializer is (de)serializing the heap
> structures and indexes as-is. So on loading, FSes are not passing through
> addToIndexes() and allow_dups at all. This should be what makes the S and S+
> faster than the other approaches that call addToIndexes().
>
> Btw. Cas(Complete)Serializer also has the nice effect that the addresses of
> FSes remain fully stable as even unindexed/unreachable FSes are stored and
> loaded. I think all other serializers drop unreachable FSes, which can cause
> addresses to change. I have a use case in WebAnno where I'm absolutely
> relying on the stable addresses provided by the Cas(Complete)Serializer.
>
> Cheers,
>
> -- Richard
