Author: schor
Date: Wed Mar 27 18:52:24 2013
New Revision: 1461791

URL: http://svn.apache.org/r1461791
Log:
[UIMA-2498] add some documentation on type mapping in compressed serialization/deserialization

Added:
    
uima/uimaj/branches/filteredCompress-uima-2498/uima-docbook-tutorials-and-users-guides/src/docbook/tug.type_mapping.xml
   (with props)
Modified:
    
uima/uimaj/branches/filteredCompress-uima-2498/uima-docbook-tutorials-and-users-guides/src/docbook/tutorials_and_users_guides.xml

Added: 
uima/uimaj/branches/filteredCompress-uima-2498/uima-docbook-tutorials-and-users-guides/src/docbook/tug.type_mapping.xml
URL: 
http://svn.apache.org/viewvc/uima/uimaj/branches/filteredCompress-uima-2498/uima-docbook-tutorials-and-users-guides/src/docbook/tug.type_mapping.xml?rev=1461791&view=auto
==============================================================================
--- 
uima/uimaj/branches/filteredCompress-uima-2498/uima-docbook-tutorials-and-users-guides/src/docbook/tug.type_mapping.xml
 (added)
+++ 
uima/uimaj/branches/filteredCompress-uima-2498/uima-docbook-tutorials-and-users-guides/src/docbook/tug.type_mapping.xml
 Wed Mar 27 18:52:24 2013
@@ -0,0 +1,142 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
+"http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd"[
+<!ENTITY % uimaents SYSTEM "../../target/docbook-shared/entities.ent" >  
+%uimaents;
+]>
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+<chapter id="ugr.tug.type_mapping">
+  <title>Managing different Type Systems</title>
+  <titleabbrev>Managing different Type Systems</titleabbrev>
+  
+  <section id="ugr.tug.type_mapping.type_merging">
+    <title>Annotators, Type Merging, and Remotes</title>
+    
+         <para>UIMA supports combining Annotators that have different type systems.
+         This is normally done by "merging" the type systems when the Annotators
+         are first loaded and instantiated. The merge process produces a logical
+         union of the type systems; types having the same name have their feature sets combined.
+         The merge rules require that same-named features of a type have the same range.
+         The merged type system is then used for the CAS that is passed to
+         all of the annotators.  Details of type merging are described in
+         <olink targetdoc="%uima_docs_ref;" targetptr="ugr.ref.cas.typemerging"/>.
+         </para>
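+         
+         <para>As a minimal sketch of these rules, suppose two annotators each declare
+         a type with the same name.  The type name <literal>org.example.Person</literal>
+         and its features are hypothetical, chosen only for illustration, and the
+         descriptor fragments are abbreviated.  The last fragment shows what the merged
+         type would be expected to look like under the rules described above.</para>
+         
+         <programlisting><![CDATA[<!-- Declared by the first annotator (hypothetical names) -->
+<typeDescription>
+  <name>org.example.Person</name>
+  <supertypeName>uima.tcas.Annotation</supertypeName>
+  <features>
+    <featureDescription>
+      <name>firstName</name>
+      <rangeTypeName>uima.cas.String</rangeTypeName>
+    </featureDescription>
+  </features>
+</typeDescription>
+
+<!-- Declared by the second annotator: same type name, a different feature -->
+<typeDescription>
+  <name>org.example.Person</name>
+  <supertypeName>uima.tcas.Annotation</supertypeName>
+  <features>
+    <featureDescription>
+      <name>age</name>
+      <rangeTypeName>uima.cas.Integer</rangeTypeName>
+    </featureDescription>
+  </features>
+</typeDescription>
+
+<!-- Expected merged type: the union of the two feature sets -->
+<typeDescription>
+  <name>org.example.Person</name>
+  <supertypeName>uima.tcas.Annotation</supertypeName>
+  <features>
+    <featureDescription>
+      <name>firstName</name>
+      <rangeTypeName>uima.cas.String</rangeTypeName>
+    </featureDescription>
+    <featureDescription>
+      <name>age</name>
+      <rangeTypeName>uima.cas.Integer</rangeTypeName>
+    </featureDescription>
+  </features>
+</typeDescription>]]></programlisting>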
+         
+         <para>This approach (of merging the type systems together) works well for
+         annotators that are run together in one UIMA pipeline instantiation on one
+         machine.  Extensions are needed when UIMA is scaled out so that the pipeline
+         includes remote annotators, acting as servers, serving
+         potentially multiple clients, each of which might have a different type system.
+         A Client, when initializing, queries all of its remote server parts to get their
+         type system definitions, and merges them together with its own
+         to make the type system for the CAS that will be sent among all of those
+         annotators. The Client's type system is the union of the type systems of
+         all of its annotators, even when some of them are remote.
+         </para>
+  </section>
+  
+  <section id="ugr.tug.type_mapping.remote_support">
+    <title>Supporting Remote Annotators</title>
+  
+         <para>Servers, in providing service to multiple clients, may receive 
CASes from
+         different Clients having different type systems.  UIMA has 
implemented several
+         different approaches to support this.</para>
+         
+         <para>
+         Base UIMA includes support for SOAP and VINCI
+         protocols.  These send the Client's type system definition (which is 
+         guaranteed to be a superset of the Server's), along with the CAS.  
The Server
+         Annotators will get a "typeSystemInit" call to let them reinitialize 
their type
+         system information to correspond to the new CAS coming in.  
+         </para>
+         
+         <para>When a server is a UIMA-AS server, the communication sends 
CASes without
+         type system information.  Several protocols and variations are 
possible in this case.
+         </para>
+         <para>
+         When using XMI serialization for sending/receiving CASes, the Client
+         sends all reachable Feature Structures to the Server.  The Server can receive a CAS
+         having instances of types it doesn't know about, or perhaps feature-slots it doesn't
+         know about within a type it does know about.  In these cases, the Server, while
+         deserializing, holds aside those type instances and/or feature instances that
+         are not defined in the Server's type system.  When the Server returns the CAS
+         to the Client, it combines
+         those held-out types and/or features with the serialized Feature Structures it sends back.
+         </para>
+         <para>
+    This approach avoids the need to send the type system along with the CAS 
on every
+    invocation of a remote part of a UIMA Pipeline.
+         </para>
+  </section>
+  
+  <section id="ugr.tug.type_mapping.allowed_differences">
+    <title>Type filtering support in Binary Compressed 
Serialization/Deserialization</title>
+    
+    <para>The built-in support for Binary Compressed Serialization/Deserialization
+    supports filtering between non-identical type systems.  The filtering is designed
+    so that things (types and/or features) that are defined in one type system
+    but not in the other are neither sent (when serializing) nor received
+    (when deserializing).  When deserializing, features that are not received are
+    given the value 0.  For numeric built-in types, like integer and float, this is the
+    number 0. </para>
+    
+    <para>Some kinds of type mappings cannot be supported, and will signal errors.
+    The two types being mapped between must be "mergeable" according to the normal
+    type merging rules (see above); otherwise, errors are signaled.</para>
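+    
+    <para>For instance, going by the rule that same-named features must have the same
+    range, the following two declarations of a feature on the same type would not be
+    mergeable, so a mapping between them would be expected to signal an error.  The
+    feature name <literal>age</literal> is hypothetical and used only for this sketch.</para>
+    
+    <programlisting><![CDATA[<!-- In one type system (hypothetical feature) -->
+<featureDescription>
+  <name>age</name>
+  <rangeTypeName>uima.cas.Integer</rangeTypeName>
+</featureDescription>
+
+<!-- In the other type system: same feature name on the same type, different range -->
+<featureDescription>
+  <name>age</name>
+  <rangeTypeName>uima.cas.String</rangeTypeName>
+</featureDescription>]]></programlisting>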
+  </section>
+  
+  <section id="ugr.tug.type_mapping.compressed">
+    <title>Remote Services support with Compressed Binary Serialization</title>
+    
+    <para>Using uncompressed Binary Serialization protocols for communicating with
+    remote UIMA-AS services requires that the Client's and Server's type systems
+    be identical.  Compressed Binary Serialization protocols support
+    Server type systems which are a subset of the Client's.  Types and/or features
+    not in the Server's type system are not sent to the Server.  Because of this, there's
+    no need to hold aside types and features at the Server, as is the case with XMI
+    transports (see above).
+    </para>
+    
+    <para>Typically, for efficiency reasons, services use the Delta-CAS 
protocol to return the 
+    CAS back to the Client.  Delta protocols send only newly created Feature 
Structures, 
+    along with modifications made to existing Feature Structures.
+    </para>
+  </section>
+  
+  <section id="ugr.tug.type_filtering.compressed_file">
+    <title>Compressed Binary serialization to/from files</title>
+    
+    <para>When invoking compressed binary serialization to a file, you can specify
+    a target type system which is a subset of the original type system.  The
+    serialization will then exclude types and features that are not in the target.
+    You can use this to filter the CAS, serializing just the parts you want.
+    </para>
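+    
+    <para>As a sketch of such a subset, assume the original type system contains the
+    two types shown first below, and the target type system contains only the
+    fragment shown second.  All type and feature names here are hypothetical, and the
+    fragments are abbreviated.  With this target, the serialized form would be expected to
+    omit all instances of <literal>org.example.Scratch</literal> and the values of the
+    <literal>age</literal> feature.</para>
+    
+    <programlisting><![CDATA[<!-- Original type system (hypothetical names) -->
+<typeDescription>
+  <name>org.example.Person</name>
+  <supertypeName>uima.tcas.Annotation</supertypeName>
+  <features>
+    <featureDescription>
+      <name>firstName</name>
+      <rangeTypeName>uima.cas.String</rangeTypeName>
+    </featureDescription>
+    <featureDescription>
+      <name>age</name>
+      <rangeTypeName>uima.cas.Integer</rangeTypeName>
+    </featureDescription>
+  </features>
+</typeDescription>
+<typeDescription>
+  <name>org.example.Scratch</name>
+  <supertypeName>uima.tcas.Annotation</supertypeName>
+</typeDescription>
+
+<!-- Target type system: a subset that drops one type and one feature -->
+<typeDescription>
+  <name>org.example.Person</name>
+  <supertypeName>uima.tcas.Annotation</supertypeName>
+  <features>
+    <featureDescription>
+      <name>firstName</name>
+      <rangeTypeName>uima.cas.String</rangeTypeName>
+    </featureDescription>
+  </features>
+</typeDescription>]]></programlisting>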
+    
+    <para>When using binary compressed deserialization from a file, the target type system
+    must be the one that the file was serialized with.  The source
+    type system can be different; if it is missing types/features, these will be
+    filtered out during deserialization.  If it has additional features, these will be
+    set to 0 (the default value) in the CAS heap.  For numeric features, this means
+    the value will be 0 (including floating point 0); for feature structure references
+    and strings, the value will be null.
+    </para>
+  </section>
+</chapter>

Propchange: 
uima/uimaj/branches/filteredCompress-uima-2498/uima-docbook-tutorials-and-users-guides/src/docbook/tug.type_mapping.xml
------------------------------------------------------------------------------
    svn:eol-style = native

Modified: 
uima/uimaj/branches/filteredCompress-uima-2498/uima-docbook-tutorials-and-users-guides/src/docbook/tutorials_and_users_guides.xml
URL: 
http://svn.apache.org/viewvc/uima/uimaj/branches/filteredCompress-uima-2498/uima-docbook-tutorials-and-users-guides/src/docbook/tutorials_and_users_guides.xml?rev=1461791&r1=1461790&r2=1461791&view=diff
==============================================================================
--- 
uima/uimaj/branches/filteredCompress-uima-2498/uima-docbook-tutorials-and-users-guides/src/docbook/tutorials_and_users_guides.xml
 (original)
+++ 
uima/uimaj/branches/filteredCompress-uima-2498/uima-docbook-tutorials-and-users-guides/src/docbook/tutorials_and_users_guides.xml
 Wed Mar 27 18:52:24 2013
@@ -33,5 +33,6 @@ under the License.
  <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="tug.multi_views.xml"/>
  <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="tug.cas_multiplier.xml"/>
  <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="tug.xmi_emf.xml"/>
-  <!-- xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="tug.configuration.xml"/-->
+  <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="tug.configuration.xml"/>
+  <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="tug.type_mapping.xml"/>
 </book>

