[ 
https://issues.apache.org/jira/browse/UIMA-5106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15674095#comment-15674095
 ] 

Daniel Gruhl commented on UIMA-5106:
------------------------------------

In systems with persistent analytics (that is, where CASes are stored long term 
and incrementally annotated, often by humans) it is very helpful to have a 
stable UUID for a feature structure. For example, there may be a document in a 
CAS that is under analysis. Being able to refer to a span of that sofa and send 
it to a human for review or adjudication is very helpful. It also allows the 
use of CASes to hold "entity information", that is, frames of knowledge, or to 
represent higher level concepts (e.g., a web site CAS can be pointed to by all 
of its page CASes).

This was critical in large persistent UIMA systems such as WebFountain, and it 
would be nice to see it make its way into the standard.

> uv3 constant "id" for FSs (Proposed new Feature for uv3)
> --------------------------------------------------------
>
>                 Key: UIMA-5106
>                 URL: https://issues.apache.org/jira/browse/UIMA-5106
>             Project: UIMA
>          Issue Type: New Feature
>          Components: Core Java Framework
>            Reporter: Marshall Schor
>            Priority: Minor
>
> Add constant ID for FSs. This would be an incrementing, long value. It would 
> be constant through serialization/deserialization cycles. There would be a 
> lazily created map from longs to FSs (via weak links) to allow direct access 
> from the ID to the FS.  Lazy intent is to not have a cost for this 
> (space/time) other than the cost for 1 long / FS, if it is not used.
> We could make this feature optional, as well, to avoid the 8 bytes per FS 
> overhead, but in V3, I think that's not a good tradeoff (space savings vs 
> complexity).  
> Issues: 
> * Current design allows parallelism of services, with returned results 
> "stacked" into receiving CAS; would need to change (some of) the IDs coming 
> back.
> * CAS would need to have the high-water-mark value as part of serializations.
> Backwards compatibility:
> * loading V2 CASs: generate new IDs upon loading.
> * serializing to V2: (for connecting to V2 services): drop the IDs.
> This is a proposed new V3 feature; comments appreciated.
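
The design quoted above (an incrementing long ID per FS, a map from IDs to FSs 
that is built lazily and holds weak links, and a high-water mark carried 
through serialization) could be sketched roughly as follows. This is only an 
illustration under stated assumptions: FsIdSketch, Fs, CasSketch, createFs, 
findById, isMapBuilt and highWaterMark are hypothetical names, not part of the 
UIMA API, and the "indexed" list merely stands in for the CAS indexes.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: none of these names come from the UIMA API. */
final class FsIdSketch {

    /** Hypothetical feature structure carrying a constant long ID. */
    static final class Fs {
        final long id;
        Fs(long id) { this.id = id; }
    }

    /** Hypothetical CAS stand-in that assigns IDs and resolves them on demand. */
    static final class CasSketch {
        private long highWaterMark;                           // last ID handed out
        private final List<Fs> indexed = new ArrayList<>();   // stands in for the CAS indexes
        private Map<Long, WeakReference<Fs>> idToFs;          // stays null until the first lookup

        /** Pass 0 for a fresh CAS, or a deserialized high-water mark to continue numbering. */
        CasSketch(long highWaterMark) { this.highWaterMark = highWaterMark; }

        /** Creates an FS with the next ID; costs one long per FS if lookup is never used. */
        Fs createFs() {
            Fs fs = new Fs(++highWaterMark);
            indexed.add(fs);
            if (idToFs != null) {
                // The map already exists, so keep it current for later lookups.
                idToFs.put(fs.id, new WeakReference<>(fs));
            }
            return fs;
        }

        /** Resolves an ID to its FS, building the weak map on first use. */
        Fs findById(long id) {
            if (idToFs == null) {
                idToFs = new HashMap<>();
                for (Fs fs : indexed) {
                    idToFs.put(fs.id, new WeakReference<>(fs));
                }
            }
            WeakReference<Fs> ref = idToFs.get(id);
            return ref == null ? null : ref.get();
        }

        boolean isMapBuilt() { return idToFs != null; }

        /** Value that would have to be written out with the serialized CAS. */
        long highWaterMark() { return highWaterMark; }
    }
}
```

In this sketch the map entries are weak, so the map by itself would not keep 
an FS alive; here the "indexed" list still holds strong references, which 
means nothing is actually collected. A CAS reloaded with the saved high-water 
mark continues numbering after the last ID it handed out, so IDs assigned 
before serialization are never reissued.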



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
