[ https://issues.apache.org/jira/browse/UIMA-5662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16279200#comment-16279200 ]
Marshall Schor commented on UIMA-5662:
--------------------------------------

There are two parts to this solution. The first is having the FsIds of the deserialized items (retrievable using fs.hashCode()) match what was serialized. This is doable for Object serialization, Binary (non-compressed) serialization, and XMI and XCAS serialization, but not for Binary compressed serialization (forms 4 and 6).

The second part is to have low-level FS retrieval using the ids (fs.hashCode()) work.

These two parts are independent; there is already a "global" switch (-Duima.enable_id_to_feature_structure_map_for_all_fss) that does the second part (at the expense of disabling garbage collection of unreachable FSs).

A use case for this would be v2 code that did things like:
  a) get a ref to a feature structure
  b) extract the fsId (e.g. fs.hashCode())
  c) later, try to use this id with a low-level CAS API to retrieve the FS.

This use case doesn't require any serialization/deserialization. It can be supported in v3 by enabling that flag.

I'm wondering if the right implementation for this issue is to do just part 1, for the subset of serialized forms from which the FS id can be retrieved. Users of the low-level CAS APIs would need to set the global -D switch, and then this change would make things work.

I'm also wondering whether we can dispense with all other configuration switches (keeping only -Duima.enable_id_to_feature_structure_map_for_all_fss) by having the deserialization logic test whether this -D switch is on and, if so, install the right fsIds when deserializing (for those forms that carry the fsIds).

That would seem to be a good set of trade-offs. WDYT?
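
To make the two parts concrete, here is a rough toy model in plain Java. It is not UIMA code: ToyCas, ToyFs, createFromSerialized and lookup are hypothetical names invented for this sketch, and the boolean constructor argument merely stands in for the global -D switch. It assumes the behaviour discussed in this issue: when the id map is on, deserialization installs the serialized ids and moves the next free id to one beyond the highest id seen; when it is off, ids are assigned afresh and lookup by id finds nothing.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model only, not UIMA code. Every name in this file is hypothetical. */
final class IdPreservingSketch {

    /** Stand-in for a feature structure; it only carries an id and a label. */
    static final class ToyFs {
        final int id;
        final String label;

        ToyFs(int id, String label) {
            this.id = id;
            this.label = label;
        }
    }

    /** Stand-in for a CAS with an optional id -> FS table. */
    static final class ToyCas {
        private final boolean keepIdMap; // models the global -D switch
        private final Map<Integer, ToyFs> idToFs = new HashMap<>();
        private int nextId = 1;

        ToyCas(boolean keepIdMap) {
            this.keepIdMap = keepIdMap;
        }

        /** Normal creation: hand out the next free id. */
        ToyFs create(String label) {
            return register(new ToyFs(nextId++, label));
        }

        /** Deserialization: reuse the serialized id only when the map is on. */
        ToyFs createFromSerialized(int serializedId, String label) {
            if (!keepIdMap) {
                return create(label); // switch off: ids are assigned afresh
            }
            // Keep future ids beyond every id installed so far.
            nextId = Math.max(nextId, serializedId + 1);
            return register(new ToyFs(serializedId, label));
        }

        /** Low-level style lookup; null if the map is off or the id is unknown. */
        ToyFs lookup(int id) {
            return idToFs.get(id);
        }

        private ToyFs register(ToyFs fs) {
            if (keepIdMap) {
                idToFs.put(fs.id, fs); // strong reference: the FS cannot be collected
            }
            return fs;
        }
    }

    public static void main(String[] args) {
        ToyCas cas = new ToyCas(true);
        ToyFs restored = cas.createFromSerialized(17, "restored");
        ToyFs fresh = cas.create("fresh");
        System.out.println(cas.lookup(17) == restored);
        System.out.println(fresh.id);
    }
}
```

The strong HashMap reference is what stands for the garbage-collection cost mentioned above: anything registered stays reachable for as long as the ToyCas does.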

> uv3 support CAS deserialization subsequent low level access
> -----------------------------------------------------------
>
>                 Key: UIMA-5662
>                 URL: https://issues.apache.org/jira/browse/UIMA-5662
>             Project: UIMA
>          Issue Type: Improvement
>          Components: Core Java Framework
>    Affects Versions: 3.0.0SDK-beta
>            Reporter: Marshall Schor
>            Assignee: Marshall Schor
>            Priority: Minor
>             Fix For: 3.0.0SDK
>
>
> Some users depend on 1) constant v2-ids for FSs preserved in deserialization and
> serialization, and 2) low level cas API access to these.
> V3 normally doesn't maintain tables linking ids to FSs, as these (unless weak
> refs are used) prevent GC of unreachable FSs.
> Based on a mode, set by -Duima.deserialize_perserve_ids, and also
> controllable by new config option per deserialize call, alter the
> deserialization for those deserializers which know about v2 ids, to put these
> into the map used for low-level CAS access, using the actual v2 ids, and
> change the v3 next available id for future new FSs to be 1 beyond the end.
> The -Duima.deserialize-preserve_ids global setting is needed to handle the
> use case of some annotators using low-level APIs, when part of a pipeline is
> "remoted".

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)