Author: eae
Date: Wed Oct 16 13:48:04 2013
New Revision: 1532764

URL: http://svn.apache.org/r1532764
Log:
UIMA-2682 More progress on DUCC application development

Modified:
    uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/part3/ducc-applications.tex

Modified: uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/part3/ducc-applications.tex
URL: http://svn.apache.org/viewvc/uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/part3/ducc-applications.tex?rev=1532764&r1=1532763&r2=1532764&view=diff
==============================================================================
--- uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/part3/ducc-applications.tex (original)
+++ uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/part3/ducc-applications.tex Wed Oct 16 13:48:04 2013
@@ -129,7 +129,7 @@ which will connect back to the debug con
 A DUCC job is a UIMA application comprised of user code broken into a Collection
 Reader running in the Job Driver and an Aggregate Analysis Engine (analysis pipeline) running in one
 or more Job Processes, with every Job Process running multiple instances of the pipeline, each in a different
-thread. The major components of this UIMA application are as follows:
+thread. The major components of the basic Job Process application are as follows:
 
 \begin{itemize}
   \item User Collection Reader - segments the input collection into Work Items
@@ -139,9 +139,46 @@ thread. The major components of this UIM
   \item DUCC built-in Flow Controller - routes Work Item CASes to the CM and optionally to the CC or AE \& CC.
 \end{itemize}
 
-It is best to develop and debug the interactions between these components as 
one, 
-single-threaded UIMA aggregate. DUCC provides an easy way to accomplish this, 
using
-the all\_in\_one specification parameter.
+\subsection{DUCC built-in Flow Controller}
+This flow controller provides separate flows for Work Item CASes and for CASes produced by the CM and/or AE.
+Its behavior depends on whether a CM component is defined, and is further refined by the
+org.apache.uima.ducc.Workitem feature structure in the Work Item CAS.
+
+When no CM is defined, the Work Item CAS is simply delivered to the AE, and then to the CC if one is defined.
+Any CASes created by the AE are routed to the CC.
+
+With a defined CM, the Work Item CAS is delivered only to the CM, and is returned from the JP once processing
+of all child CASes created by the CM and AE has completed. The CR can further refine the Work Item CAS flow by
+creating an org.apache.uima.ducc.Workitem feature structure and setting either the sendToLast feature to true,
+which also sends the Work Item CAS to the CC, or the sendToAll feature to true, which also sends it to the AE and CC.
+
+\subsection{Workitem Feature Structure}
+This feature structure is defined in DuccJobFlowControlTS.xml, located in uima-ducc-common.jar.
+In addition to the Work Item CAS flow control features, the Workitem feature structure includes features that are useful
+for a DUCC job application. Here is the complete list of features:
+
+\begin{description}
+  \item[sendToLast] (Boolean) - indicates that the Work Item CAS should be sent to the CC
+  \item[sendToAll] (Boolean) - indicates that the Work Item CAS should be sent to the AE and CC
+  \item[inputspec] (String) - reference to Work Item input data
+  \item[outputspec] (String) - reference to Work Item output data
+  \item[encoding] (String) - useful for reading Work Item input data
+  \item[language] (String) - used by the CM for setting document text language
+  \item[bytelength] (Integer) - size of Work Item
+  \item[blockindex] (Integer) - used if a Work Item is one of multiple pieces of an input resource
+  \item[blocksize] (Integer) - used to indicate block size for splitting an input resource
+  \item[lastBlock] (Boolean) - indicates this is the last block of an input resource
+\end{description}
+
+\subsection{Deployment Descriptor (DD) Jobs}
+Job Processes with arbitrary aggregate hierarchy, flow control and threading can be fully specified
+via a complete UIMA AS Deployment Descriptor. DUCC will modify the input queue to use DUCC's private
+broker and will change the input queue name to correspond to the DUCC job ID.
+
+\subsection{Debugging}
+It is best to develop and debug the interactions between job application components as one
+single-threaded UIMA aggregate. DUCC provides an easy way to accomplish this, for both basic
+and DD job models, using the all\_in\_one specification parameter.
 
 \begin{description}
     \item[all\_in\_one=local] When set to local, all Job components are run in 
the same

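The Work Item CAS routing described in the "DUCC built-in Flow Controller" subsection of the diff can be sketched as a small decision function. This is a minimal, hypothetical model for illustration only, not DUCC source code: the class name `WorkItemFlowSketch`, the method `route`, and the string labels are invented here. Two behaviors are assumptions the text does not state: that sendToAll takes precedence when both features are true, and that a CC step is skipped when no CC is defined.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of the documented Work Item CAS routing; not DUCC code. */
final class WorkItemFlowSketch {

    /**
     * Returns, in order, the components that would receive the Work Item CAS.
     * hasCM and hasCC say whether the job defines a CM or a CC;
     * sendToLast and sendToAll mirror the Workitem features of the same name.
     */
    static List<String> route(boolean hasCM, boolean hasCC,
                              boolean sendToLast, boolean sendToAll) {
        List<String> steps = new ArrayList<>();
        if (!hasCM) {
            // No CM: the Work Item CAS goes to the AE, then to the CC if defined.
            steps.add("AE");
            if (hasCC) {
                steps.add("CC");
            }
            return steps;
        }
        // With a CM, the Work Item CAS is delivered to the CM first.
        steps.add("CM");
        if (sendToAll) {
            // Assumption: sendToAll wins if both features are set.
            steps.add("AE");
            if (hasCC) {
                steps.add("CC");
            }
        } else if (sendToLast && hasCC) {
            // Assumption: sendToLast has no effect without a CC.
            steps.add("CC");
        }
        return steps;
    }

    public static void main(String[] args) {
        System.out.println(route(false, true, false, false)); // no CM
        System.out.println(route(true, true, false, false));  // CM only
        System.out.println(route(true, true, true, false));   // sendToLast
        System.out.println(route(true, true, false, true));   // sendToAll
    }
}
```

The sketch covers only the Work Item CAS itself; child CASes created by the CM and AE follow their own flow and are not modeled.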

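The block-related Workitem features listed in the diff (blockindex, blocksize, bytelength, lastBlock) suggest how a CR might split one large input resource into several Work Items. The following is a rough sketch under stated assumptions, not the DUCC API: `BlockSplitSketch` and its nested `Block` class are hypothetical stand-ins for the real feature structure, blockindex is assumed to start at 0, and bytelength is assumed to hold the size of the individual block.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of splitting an input resource into blocks. */
final class BlockSplitSketch {

    /** Stand-in for the block-related Workitem features; not the DUCC class. */
    static final class Block {
        final int blockindex;    // assumed 0-based position of this block
        final int blocksize;     // nominal block size used for splitting
        final int bytelength;    // assumed: actual size of this block in bytes
        final boolean lastBlock; // true only for the final block

        Block(int blockindex, int blocksize, int bytelength, boolean lastBlock) {
            this.blockindex = blockindex;
            this.blocksize = blocksize;
            this.bytelength = bytelength;
            this.lastBlock = lastBlock;
        }
    }

    /** Splits a resource of resourceLength bytes into blocks of at most blocksize bytes. */
    static List<Block> split(long resourceLength, int blocksize) {
        if (resourceLength < 0 || blocksize <= 0) {
            throw new IllegalArgumentException("length must be >= 0 and blocksize > 0");
        }
        List<Block> blocks = new ArrayList<>();
        int index = 0;
        for (long offset = 0; offset < resourceLength; offset += blocksize) {
            // The final block may be shorter than the nominal block size.
            int length = (int) Math.min((long) blocksize, resourceLength - offset);
            boolean last = offset + length == resourceLength;
            blocks.add(new Block(index++, blocksize, length, last));
        }
        return blocks;
    }

    public static void main(String[] args) {
        for (Block b : split(2500, 1000)) {
            System.out.println(b.blockindex + " " + b.bytelength + " " + b.lastBlock);
        }
    }
}
```

In a real job the CR would set the corresponding features on each Work Item CAS, along with inputspec to identify the resource being split.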