Author: challngr
Date: Fri May 31 15:07:49 2013
New Revision: 1488268
URL: http://svn.apache.org/r1488268
Log:
UIMA-2682 Update glossary.
Modified:
uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/part1/terminology.tex
URL:
http://svn.apache.org/viewvc/uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/part1/terminology.tex?rev=1488268&r1=1488267&r2=1488268&view=diff
==============================================================================
---
uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/part1/terminology.tex
(original)
+++
uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/part1/terminology.tex
Fri May 31 15:07:49 2013
@@ -3,127 +3,95 @@
\else
\HCode{<a name='DUCC_TERMINOLOGY'></a>}
\fi
-\chapter{Terminology and Acronyms}
-
-\section{Terms }
- This section defines terms and phrases as used in the context of DUCC.
+\chapter{Glossary}
\begin{description}
-\item[Automatic Service ] An automatic service is a registered service that is
started automatically
+\item[Autostart Service] An autostart service is a registered service that is
started automatically
by DUCC when the DUCC system is booted.
-\item[Dependent service or job ] A dependent service or job is a job or
service that specifies one
- or more service endpoint in their job specification. The service or job is
dependent upon the
+\item[Dependent service or job] A dependent service or job is a service or job
that specifies one
+ or more service dependencies in their job specification. The service or job
is dependent upon the
referenced service being operational before being started by DUCC.
-\item[DUCC ] DUCC stands for "Distributed UIMA Cluster Computing."
+\item[DUCC] Distributed UIMA Cluster Computing.
-\item[Implicit service ] An emplicit service is a service that is started
externally to DUCC but
- referenced by some dependent service or job.
+\item[Implicit service] An implicit service is a service that is started
externally to DUCC but
+ referenced by some dependent service or job. DUCC will attempt to contact
the service using
+ the dependency string. If contact is successful, the job is started;
otherwise it is
+ terminated before resources are allocated to it.
-\item[Registered service ] A registered service is a service that is
registered with DUCC. DUCC
+\item[Registered service] A registered service is a service that is registered
with DUCC. DUCC
saves the service specification and fully manages the service, ensuring it
is running when needed,
- and shutdown when not. DUCC manages the usage of the service and (in a
future verseion of DUCC)
- automatically increases and decreases the number of service instances as
dictated by demand.
+ and shutdown when not.
-\item[On-Demand Service] An on-demand service is a registered service that is
not started when DUCC
+\item[Start-by-Reference Service] A start-by-reference service is a registered service
that is not started when DUCC
is started. Instead, the service is started when referenced in some job's or
service's service
dependency, and stopped when the referencing entity exits.
-\item[Service Instance ] A service instance is one physical process which runs
a CUSTOM or UIMA-AS
- service.
+\item[Service Instance] A service instance is one physical process which runs
a CUSTOM or UIMA-AS
+ service. Note that UIMA-AS services may be scaled out to comprise more than
one service instance.
-\item[Orchestrator (OR) ] The Orchestrator coordinates all work in the system.
All new work enters
- through the orchestrator which guides it through the various DUCC components.
+\item[Orchestrator (OR)] The Orchestrator manages the lifecycle of all
entities within DUCC.
\item[Process Manager (PM) ] The Process Manager coordinates distribution of
work among the Agents.
-\item[Resource Manager (RM) ] The Resource Manager allocates and schedules
physical resources among
- the jobs.
-
-\item[Service Class ] The service classes are
-
-implicit, referring to a service started independently from DUCC,
-submitted, referring to a service submitted as a job to DUCC, and
-registered, referring to a registered DUCC service.
-
+\item[Resource Manager (RM) ] The Resource Manager schedules physical
resources for DUCC work.
-\item[Service Endpoint ] In DUCC, the service endpoint provides a unique
identifier for a service
+\item[Service Endpoint] In DUCC, the service endpoint provides a unique
identifier for a service
and in the case of UIMA-AS services, a well-known address for contacting the
service. For CUSTOM
services, the endpoint is of the form CUSTOM:string where string is any
alphanumeric string
provided by the service owner. For UIMA-AS services, the endpoint is of the
form UIMA-AS:queue
- name:ActiveMQ broker URL.
+ name:ActiveMQ-broker-URL.
-\item[Service Manager (SM)] The Service Manager manages the life-cycles of
UIMA-AS and custom
+\item[Service Manager (SM)] The Service Manager manages the life-cycles of
UIMA-AS and CUSTOM
services. It coordinates registration of services, starting and stopping of
services, and ensures
- that services are available and remain available for the lifetime of the
jobs.
+ that services are available and remain available for the lifetime of the
jobs. Note that the
+ Orchestrator manages the individual service instances; the Service Manager
manages the collection
+ of instances which comprise a service.
\item[Agent] DUCC Agent processes run on every node in the system. The Agent
receives orders to
- start and stop processes on each node. Agents also monitor nodes, sending
heartbeat packets with
- node statistics to interested components (such as the RM and web-server).
All Job Driver and Job
- Process processes are managed as children of the agents.
-
-\item[Ducc-mon] Ducc-mon is the DUCC web-server. All DUCC state of import or
interest is presented
- here including job state, cluster state, DUCC daemon state, and
visualization of the system.
- Various controlling actions such as canceling jobs, submitting reservations,
and administrative
- functions are supported.
+ start and stop processes on each node. Agents monitor nodes, sending
heartbeat packets with node
+ statistics to interested components (such as the RM and web-server). If
CGroups are installed in
+ the cluster, the Agent is responsible for managing the CGroups for each job
process. All processes
+ other than the DUCC management processes are managed as children of the
agents.
-\item[Job Driver (JD)]The Job Driver is a thin Java wrapper that encapsulates
a Job's Collection
+\item[DUCC-MON] DUCC-MON is the DUCC web-server.
+
+\item[Job Driver (JD)]The Job Driver is a thin wrapper that encapsulates a
Job's Collection
Reader. The JD executes as a process that is scheduled and deployed by DUCC.
-\item[Job Process (JP) ]The Job Process is a thin java wrapper that
encapsulates a job's Analysis
- Engine. The JP executes in a process that is scheduled and deployed by DUCC.
+\item[Job Process (JP)] The Job Process is a thin wrapper that encapsulates a
job's pipeline
+ components. The JP executes in a process that is scheduled and deployed by
DUCC.
-\item[Job specification ]The Job Specification is a collection of properties
that describe a job. It
- identifies the UIMA components (CR, AE, etc) that comprise the job, and it
specifies system-wide
- properties of the job (classpaths, RAM requirements, etc). The properties
may be provided as (key,
- value) pairs to the CLI/API, or in a Java propeties file.
+\item[Job specification] The Job Specification is a collection of properties
that describe work to be
+ scheduled and deployed by DUCC. It
+ identifies the UIMA components (CR, AE, etc) that comprise the job and the
system-wide
+ properties of the job (classpaths, RAM requirements, etc).
-\item[Job ] A DUCC job consists of the components required to deploy and
execute a UIMA pipeline over
+\item[Job] A DUCC job consists of the components required to deploy and
execute a UIMA pipeline over
a computing cluster. It consists of a JD to run the Collection Reader, a set
of JPs to run the UIMA
AEs, and a Job Specification to describe how the parts fit together.
-\item[Share Quantum ] In DUCC, a "share quantum" refers to some quantity of
memory; for example,
- 15GB. The RM schedules resources according to share quanta. The share
quantum is the smallest unit
- of memory that can be assigned. See the section describing the Resource
Manager for details.
-
- The terms "share" and "share quantum" are synonymous in DUCC.
-
-\item[Process ]A process is one physical process executing on a machine in the
DUCC cluster. DUCC
- jobs are comprised of one or more processes (JDs and JPs).
+\item[Share Quantum] The DUCC scheduler abstracts the nodes in the cluster as
a single large
+ conglomerate of resources: memory, processor cores, etc. The scheduler
logically decomposes
+ the collection of resources into some number of equal-sized atomic units.
Each unit of work requiring
+ resources is apportioned one or more of these atomic units. The smallest
possible atomic
+ unit is called the {\em share quantum}, or simply, {\em share}.
+
+\item[Process]A process is one physical process executing on a machine in the
DUCC cluster. DUCC
+ jobs are comprised of one or more processes (JDs and JPs). Each process is
assigned one or
+ more {\em shares} by the DUCC scheduler.
+
+\item[Weighted Fair Share] A weighted fair share calculation is used to
apportion resources
+ equitably to the outstanding work in the system. In a non-weighted
fair-share system, all
+ work requests are given equal consideration to all resources. To provide
some (``more important'')
+ work more than equal resources, weights are used to give larger proportions
of the resources to
+ some classes of work.
- From the Resource Management view, a process is comprised of one or more
share quanta.
-
-\item[Weighted Fair Share ] The Weighted Fair Share calculation is used to
apportion resources in a
- "fair" manner to the outstanding work in the system. To account for some
work being more
- "important" than others, a weighting factor may be applied to bias the
fair-share calculations in
- favor of such work.
-
- See the Resource Manager section for more details on Weighted Fair Share in
DUCC.
-
-\item[Work Items ] A work item is one unit of work to be completed in a single
DUCC process. It is
+\item[Work Items] A DUCC work item is one unit of work to be completed in a
single DUCC process. It is
usually initiated by the submission of a single CAS from the CR to a UIMA
service. It could be
thought of as a single "question" to be answered by a UIMA analytic. Usually
each DUCC JP executes
many work items per job.
\end{description}
-\section{Acronyms}
-This section defines acronims as used in the context of DUCC.
-
-\begin{description}
-\item[AE:] UIMA Analysis Engine
-\item[CAS:] UIMA Common Analysis Structure
-\item[CC:] CAS Consumer
-\item[CM:] UIMA CAS Multiplier
-\item[CR:] UIMA Collection Reader
-\item[DUCC:] Distributed UIMA Cluster Computing
-\item[JD:] Job Driver
-\item[JP:] Job Process
-\item[OR:] Orchestrator
-\item[PM:] Process Manager
-\item[RM:] Resource Manager
-\item[SM:] Service Manager
-\item[UIMA:] Unstructured Information Management Architecture (see
http://uima.apache.org/)
-\item[UIMA-AS:] UIMA Asynchronous Scaleout (see
http://uima.apache.org/doc-uimaas-what.html)
-\end{description}
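
The Service Endpoint entry in the glossary describes two endpoint forms,
CUSTOM:string and UIMA-AS:queue name:ActiveMQ-broker-URL. The following is a
minimal Python sketch of how such strings might be split into their parts. It
is not DUCC code: the names ServiceEndpoint and parse_endpoint are
hypothetical, and the sketch assumes that a queue name contains no colon, so
that everything after the second colon is the broker URL.

```python
from typing import NamedTuple, Optional


class ServiceEndpoint(NamedTuple):
    """Parts of a service endpoint string (hypothetical type, not a DUCC class)."""
    kind: str                  # "CUSTOM" or "UIMA-AS"
    name: str                  # the CUSTOM string, or the UIMA-AS queue name
    broker_url: Optional[str]  # set only for UIMA-AS endpoints


def parse_endpoint(endpoint: str) -> ServiceEndpoint:
    """Split an endpoint string into its parts.

    Assumption: a UIMA-AS queue name contains no ':' character, so the
    first ':' after the prefix separates the queue name from the broker URL.
    """
    if endpoint.startswith("CUSTOM:"):
        name = endpoint[len("CUSTOM:"):]
        # The glossary says the string is alphanumeric; an empty string
        # also fails this check because "".isalnum() is False.
        if not name.isalnum():
            raise ValueError("CUSTOM endpoint needs an alphanumeric string")
        return ServiceEndpoint("CUSTOM", name, None)

    if endpoint.startswith("UIMA-AS:"):
        rest = endpoint[len("UIMA-AS:"):]
        # Split once only: the broker URL itself contains colons.
        queue, sep, broker = rest.partition(":")
        if not queue or not sep or not broker:
            raise ValueError("UIMA-AS endpoint needs a queue name and a broker URL")
        return ServiceEndpoint("UIMA-AS", queue, broker)

    raise ValueError("endpoint must start with CUSTOM: or UIMA-AS:")
```

For example, parse_endpoint("UIMA-AS:MyQueue:tcp://broker.example.org:61616")
yields the queue name MyQueue and the broker URL
tcp://broker.example.org:61616; the queue name and broker address here are
made-up values for illustration only.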