[
https://issues.apache.org/jira/browse/UIMA-4820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Marshall Schor updated UIMA-4820:
---------------------------------
Summary: uv3 Supporting Delta deserialization requires preserving simulated
heap addresses (was: uv3 Supporting Delta deserialization requires holding on
to FSs serialized)
> uv3 Supporting Delta deserialization requires preserving simulated heap
> addresses
> ---------------------------------------------------------------------------------
>
> Key: UIMA-4820
> URL: https://issues.apache.org/jira/browse/UIMA-4820
> Project: UIMA
> Issue Type: Bug
> Components: Core Java Framework
> Reporter: Marshall Schor
> Assignee: Marshall Schor
> Fix For: 3.0.0SDKexp
>
>
> UIMA supports several formats of delta deserialization. In this pattern, a
> CAS is serialized (for example, to a remote service), and a later delta
> serialization returns only the changes to the original CAS.
> There are two approaches used to get the set of FSs to serialize.
> * One way, used for plain binary and form4 compressed, scans the "heap"
> sequentially, and sends all those FSs, including potentially FSs that are not
> "reachable".
> * The other way is to use the indexes plus following reference chains to
> locate all "reachable" FSs, and only send those. This is used for XCAS, XMI,
> JSON, and Form6 compressed.
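The difference between the two selection approaches above can be illustrated with a minimal sketch. It is not UIMA code: `ToyFs`, `FsSelection`, `scanHeap`, and `reachable` are hypothetical names, and the "heap" is modeled as a plain list in allocation order.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a feature structure; not the UIMA FeatureStructure API.
final class ToyFs {
    final String name;
    final List<ToyFs> refs = new ArrayList<>();  // outgoing FS references

    ToyFs(String name) { this.name = name; }
}

final class FsSelection {
    // Approach 1 (plain binary, form4): walk the simulated heap in order and
    // take every FS, including ones nothing refers to any more.
    static List<ToyFs> scanHeap(List<ToyFs> heap) {
        return new ArrayList<>(heap);
    }

    // Approach 2 (XCAS, XMI, JSON, Form6): start from the indexed FSs and
    // follow reference chains, so only reachable FSs are selected.
    static List<ToyFs> reachable(List<ToyFs> indexed) {
        Set<ToyFs> seen = new LinkedHashSet<>();
        Deque<ToyFs> work = new ArrayDeque<>(indexed);
        while (!work.isEmpty()) {
            ToyFs fs = work.pop();
            if (seen.add(fs)) {          // first visit: queue its references
                for (ToyFs ref : fs.refs) {
                    work.push(ref);
                }
            }
        }
        return new ArrayList<>(seen);
    }
}
```

With three FSs on the heap where only the first is indexed and refers to the second, `scanHeap` selects all three, while `reachable` selects two and skips the unreferenced third.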
> In V3, the plain and form4 serializations need to preserve simulated heap
> "addresses" (per CAS) for the FSs sent, so that future delta
> deserializations see the proper "heap" addresses. These addresses cannot be
> recalculated from the CAS FS contents, because intervening GCs may have
> collected some unreachable FSs.
> Furthermore, a plain or form4 non-delta deserialization that is to be
> followed by a delta serialization must likewise preserve these simulated heap
> addresses (per CAS) for all deserialized FSs.
> This preservation is needed to ensure that the simulated "addresses" of FSs
> are constant, even if unreachable FSs are reclaimed. In practice, this means
> that various maps involving simulated heap "addresses" need to be retained
> and not recreated.
> Because they are retained, their storage needs to be released when no longer
> needed: at CAS Reset time, after a services delta deserializer has completed
> deserializing (potentially multiple) delta CASes, or when a new non-delta
> serialization is started (this will re-create this storage). For services
> use, we may add a new API to release this storage; the service would call it
> after all delta deserializations for this CAS have been received (this
> supports the use case of multiple remotes working on a common CAS and having
> their delta results merged back into the original CAS).
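The retain-then-release lifecycle described above might look roughly like the following sketch. It is an illustration under stated assumptions, not the UIMA implementation: `SimulatedHeapAddresses`, `addressOf`, `isRetained`, and `release` are hypothetical names, each FS is assumed to carry a stable integer id, and address 0 is assumed to be reserved for the null reference.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical per-CAS table of simulated heap addresses; not UIMA API.
final class SimulatedHeapAddresses {
    // Keyed by an assumed stable FS id, so the table does not hold the FSs themselves.
    private Map<Integer, Integer> addrByFsId = new HashMap<>();
    private int nextAddr = 1;  // assumption: address 0 stands for "null"

    // Returns the address already assigned to this FS, or assigns the next
    // free one. An assigned address never changes while the table is retained.
    int addressOf(int fsId, int sizeInHeapCells) {
        Integer known = addrByFsId.get(fsId);
        if (known != null) {
            return known;
        }
        int addr = nextAddr;
        addrByFsId.put(fsId, addr);
        nextAddr += sizeInHeapCells;
        return addr;
    }

    boolean isRetained() {
        return !addrByFsId.isEmpty();
    }

    // Would be called at CAS reset, once the last expected delta has been
    // merged, or when a new non-delta serialization starts over.
    void release() {
        addrByFsId = new HashMap<>();
        nextAddr = 1;
    }
}
```

Because the table is consulted rather than rebuilt, an FS keeps its address across several delta round trips even if other FSs have been garbage collected in between; after `release()`, addresses are assigned from scratch.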