Author: challngr
Date: Wed Jan 13 21:09:43 2016
New Revision: 1724512

URL: http://svn.apache.org/viewvc?rev=1724512&view=rev
Log:
UIMA-4745 Database updates to duccbook.

Added:
    
uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/part4/admin/ducc-database.tex
Modified:
    
uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/part4/admin/admin-commands.tex
    
uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/part4/admin/ducc-properties.tex
    
uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/part4/ducc-aguide.tex
    
uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/part4/install.tex

Modified: 
uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/part4/admin/admin-commands.tex
URL: 
http://svn.apache.org/viewvc/uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/part4/admin/admin-commands.tex?rev=1724512&r1=1724511&r2=1724512&view=diff
==============================================================================
--- 
uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/part4/admin/admin-commands.tex
 (original)
+++ 
uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/part4/admin/admin-commands.tex
 Wed Jan 13 21:09:43 2016
@@ -35,6 +35,8 @@
     The command \ducchome/admin/start\_ducc is used to start DUCC processes. 
If run with no parameters
     it takes the following actions:
     \begin{itemize}
+      \item Starts the ActiveMQ server.
+      \item Starts the database.
       \item Starts the management processes Resource Manager, Orchestrator, 
Process Manager,      
       Services Manager, and Web Server on the local node (where start\_ducc is 
executed).
       \item Starts an agent process on every node named in the default node 
list. 
@@ -76,6 +78,8 @@ start\_ducc -c sm -c pm -c rm -c or@bj22
             \item[sm]The Service Manager
             \item[ws]The Web Server
             \item[agent]Node Agents
+            \item[broker] ActiveMQ broker
+            \item[db] Database
           \end{description}
 
           \item[--nothreading] If specified, the command does not run in 
multi-threaded mode
@@ -138,9 +142,16 @@ start\_ducc -c sm -c pm -c rm -c or@bj22
 \label{subsec:admin.stop-ducc}
 
     \subsubsection{{\em Description:}}
-    Stop\_ducc is used to stop DUCC processes. If run with no parameters it 
takes the following 
-    actions:
-    \todo Garbled by maven or docbook, update this
+    Stop\_ducc is used to stop DUCC processes. At least one parameter is 
required.
+    When {\em -a} is specified, the following actions are taken:
+    \begin{itemize}
+       \item Uses the ActiveMQ broker to broadcast a shutdown request to all
+        DUCC components except the ActiveMQ broker itself and the database.
+      \item Waits for all daemons to stop (60 seconds by default; see {\em --wait}).
+      \item Stops the database.
+      \item Stops the ActiveMQ broker.
+    \end{itemize}
+
 
     \subsubsection{\em Usage:}
 
@@ -202,10 +213,19 @@ start_ducc -c rm
               \item[pm] The Process Manager.                 
               \item[sm] The Service Manager.                 
               \item[ws] The Web Server.                 
+              \item[db] The database.
               \item[broker] The ActiveMQ broker (only if the broker is 
auto-managed).
               \item[agent\@node] Node Agent on the specified node.
               \end{description}
 
+          \item[-w, --wait {[time in seconds]}] If given, this specifies the 
time to wait
+            after broadcasting the shutdown signal, and before stopping the 
ActiveMQ broker itself.
+            If not specified, the default is 60 seconds.  
+
+            NOTE: In production systems, it is generally wise to use the 
default of 60 seconds.  For
+            test systems a shorter wait speeds cycle time.  Be sure to use 
{\em check\_ducc -k} after
+            {\em stop\_ducc} if you change the wait time to ensure all 
processes are actually stopped.
+
           \item[--nothreading] If specified, the command does not run in 
multi-threaded mode
             even if it is supported on the local platform.
               
@@ -595,3 +615,90 @@ Nodepool power
         
     \paragraph{Notes:}
     None.
+
+\subsection{db\_create}
+\label{subsec:cli.db.create}
+
+    \paragraph{Description:}
+        This command is used to initialize the database.  Normally the 
database is initialized
+        during {\em ducc\_post\_install} but if this is an existing DUCC 
installation that is 
+        being migrated from a version that does not use the database, it will 
be necessary to
+        initialize the database with this command.
+
+        This command performs the following actions:
+        \begin{enumerate}
+          \item Starts the database.
+          \item Disables the default database superuser.
+          \item Installs a database superuser as ``ducc'' and sets the password
+            to a password of your choice, which you are prompted for.  The 
password is saved
+            in DUCC\_HOME/resources.private/ducc.private.properties.
+          \item Installs the DUCC database schema.
+          \item Stops the database.
+        \end{enumerate}
+        
+
+         This command takes no parameters.  It prompts interactively for your 
desired
+         database superuser password.  
+
+         NOTE: The database user and password are NOT RELATED to any login ID 
on the system,
+         they are used and maintained by the database only.
+
+\subsection{db\_loader}
+\label{subsec:cli.db.loader}
+
+    \paragraph{Description:}
+        This command is used to copy the data from DUCC's older file-based 
persistence
+        into the database.  The database schema must already exist, created 
either
+        with {\em ducc\_post\_install} or with {\em db\_create}.
+
+        This command performs the following actions:
+        \begin{enumerate}
+          \item Starts the database.
+          \item Drops some of the indexes in the database.
+          \item Loads the Orchestrator checkpoint file from {\em 
DUCC\_HOME/state/orchestrator.ckpt}.
+          \item Loads all job history from {\em DUCC\_HOME/history/jobs}.
+          \item Loads all reservation history from {\em 
DUCC\_HOME/history/reservations}.
+          \item Loads all service instance and AP history from {\em 
DUCC\_HOME/history/services}.
+          \item Loads the service registry from {\em 
DUCC\_HOME/state/services}.
+          \item Loads the service registry history from {\em 
DUCC\_HOME/history/service-registry}.  
+          \item Reloads the Orchestrator checkpoint, as a spot-check of the 
loader's instrumentation (to ensure
+            load times stay reasonable).
+          \item Re-installs the DUCC database schema.
+          \item Stops the database.
+          \item Optionally renames the file-based state so if you rerun the 
command, the data does not get reloaded.
+        \end{enumerate}
+        
+        When the command exits, DUCC should be ready to run with all its state 
in the database.
+
+        This command takes two parameters, a pointer to the DUCC\_HOME you 
want to load from, and
+        a flag to disable the rename of the file-based state.
+
+    \paragraph{Usage:}
+    \begin{description}
+    \item[db\_loader -i {\em some-ducc-home} {[--no-archive]}]
+      Load the database from the specified DUCC\_HOME, and optionally do not 
archive the original files
+      by renaming them.  
+    \end{description}
+
+    \paragraph{Options:}
+    \begin{description}
+        \item[$-i$ {\em some-ducc-home}]
+          This specifies the DUCC\_HOME you wish to load.  Most of the time it 
is the DUCC\_HOME you
+          are running within, but it can be some other DUCC\_HOME if you have 
multiple installations and
+          want other history and state loaded.
+        \item[$--no-archive$] 
+          If specified, the original files are not renamed.  Note that only 
the directories in {\em history}
+          and {\em state} are renamed.  To restore these, simply rename them 
back without the {\em archive}
+          suffix.
+     \end{description}
+        
+    \paragraph{Example:}
+\begin{verbatim}
+db_loader -i /home/ducc/ducc_runtime
+db_loader -i /home/ducc.old/ducc_runtime --no-archive
+\end{verbatim}
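+
+    As a hedged sketch only, migrating an existing file-based installation might
+    combine the utilities as shown below.  It assumes that {\em db\_create} and
+    {\em db\_loader} are run from {\em DUCC\_HOME/admin} alongside {\em start\_ducc}
+    and {\em stop\_ducc}, and that DUCC is stopped before the migration; both are
+    assumptions for illustration, not documented requirements.
+\begin{verbatim}
+# Assumed working directory: DUCC_HOME/admin
+./stop_ducc -a                          # stop all DUCC daemons (assumed precaution)
+./db_create                             # prompts for the database superuser password
+./db_loader -i /home/ducc/ducc_runtime  # copy file-based state into the database
+./start_ducc                            # also starts the broker and the database
+\end{verbatim}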
+
+    \paragraph{Notes:}
+    The console shows progress of the loader.  Full details of the load are 
written to a log {\em db-loader-log}
+    in the usual DUCC log directory, for reference and potential problem 
determination if something goes wrong.
+    

Added: 
uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/part4/admin/ducc-database.tex
URL: 
http://svn.apache.org/viewvc/uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/part4/admin/ducc-database.tex?rev=1724512&view=auto
==============================================================================
--- 
uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/part4/admin/ducc-database.tex
 (added)
+++ 
uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/part4/admin/ducc-database.tex
 Wed Jan 13 21:09:43 2016
@@ -0,0 +1,108 @@
+% 
+% Licensed to the Apache Software Foundation (ASF) under one
+% or more contributor license agreements.  See the NOTICE file
+% distributed with this work for additional information
+% regarding copyright ownership.  The ASF licenses this file
+% to you under the Apache License, Version 2.0 (the
+% "License"); you may not use this file except in compliance
+% with the License.  You may obtain a copy of the License at
+% 
+%   http://www.apache.org/licenses/LICENSE-2.0
+% 
+% Unless required by applicable law or agreed to in writing,
+% software distributed under the License is distributed on an
+% "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+% KIND, either express or implied.  See the License for the
+% specific language governing permissions and limitations
+% under the License.
+% 
+\section{DUCC Database Integration}
+\label{sec:ducc.database}
+
+    As of Version 2.1.0, DUCC uses the 
\href{https://cassandra.apache.org/}{Apache Cassandra}
+    database instead of the filesystem to manage
+    history and the service registry.  Additionally, the Resource Manager 
maintains
+    current scheduling and node state in the database.
+
+   \subsection{Overview}
+
+    During first-time installation, the 
\hyperref[subsec:install.single-user]{\em ducc\_post\_install} utility
+    prompts for a (database) super-user password.  If a password is provided, 
the utility 
+    proceeds to configure the database and install the schema.  If a password
+    of ``bypass'' is given, database installation is bypassed and the file 
system
+    is used for history and services.  The Resource Manager does not attempt to
+    persist its state in the filesystem.
+    
+    If database integration is bypassed during 
\hyperref[subsec:install.single-user]{\em ducc\_post\_install}, it may be
+    installed later with the utilities \hyperref[subsec:cli.db.create]{\em 
db\_create} and \hyperref[subsec:cli.db.loader]{\em db\_loader}.
+
+    If DUCC is being upgraded, generally 
\hyperref[subsec:install.single-user]{\em ducc\_post\_install} is not used, in 
+    which case, again, \hyperref[subsec:cli.db.create]{\em db\_create} and 
\hyperref[subsec:cli.db.loader]{\em db\_loader} may be used to
+    convert the older file-based state to the database.
+
+    \subsubsection{Orchestrator use of the Database}
+
+    The Orchestrator persists two types of work:
+    \begin{enumerate}
+      \item All work history.  This includes jobs, reservations, service 
instances, and 
+        arbitrary processes.  This history is what the webserver uses to 
display details
+        on previously run jobs.  Prior to the database, this data was saved in 
the
+        {\em DUCC\_HOME/history} directory.
+      \item Checkpoint.  On every state change, the Orchestrator saves the 
state of 
+        all running and allocated work in the system.  This is used to recover 
reservations
+        when DUCC is started, and to allow hot-start of the Orchestrator 
without losing work.
+        Prior to the database, this data was saved in the file {\em 
DUCC\_HOME/state/orchestrator.ckpt}.
+    \end{enumerate}
+    
+    \subsubsection{Service Manager use of the Database}
+    The service manager uses the database to store the service registry and all 
state
+    of active services.  Prior to the database, this data was saved in Java 
properties files
+    in the directory {\em DUCC\_HOME/state/services}.
+
+    When a service is ``unregistered'' it is not physically removed from the 
database.  Instead,
+    a bit is set indicating the service is no longer active.  These 
registrations may be
+    recovered if needed by querying the database.  Prior to the database, this 
data was saved
+    in {\em DUCC\_HOME/history/service-registry}.
+
+    \subsubsection{Resource Manager use of the Database}
+    The resource manager saves its entire runtime state in the database.  
Prior to the
+    database, this dynamic state was not saved or directly accessible.
+
+    \subsubsection{Webserver use of the Database}
+    The web server uses the database in read-only mode to fetch work history, 
service
+    registrations, and node status.  Prior to the database, most of this 
information
+    was fetched from the filesystem.  Node status was inferred using the Agent 
publications;
+    with the database, the webserver has direct access to the Resource 
Manager's view of the
+    DUCC nodes, providing a much more accurate picture of the system.
+  
+\subsection{Database Scripting  Utilities}
+    Database support is fully integrated with the DUCC start, stop, and check 
utilities as
+    well as the post installation scripting.
+
+    In addition, two utilities are supplied to migrate older 
installations to
+    use the database:
+
+    \begin{description}
+      \item[db\_create] The \hyperref[subsec:cli.db.create]{db\_create} 
utility creates the database schema, disables the
+        default database superuser, installs a read-only guest id, and 
installs the
+        main DUCC super user ID.  Note that database IDs are in no way related 
to 
+        operating system IDs.
+      \item[db\_loader] The \hyperref[subsec:cli.db.loader]{db\_loader} 
utility migrates an existing file-based DUCC
+        system to use the database.  It copies in the job history, 
orchestrator checkpoint,
+        and the service registry.
+    \end{description}
+      
+    Use the cross-references above for additional details on the utilities.
+    
+\subsection{Database Configuration}
+    Most database configuration is accomplished by setting appropriate values 
in 
into 
+    your local \hyperref[subsec:ducc.database.properties]{\em 
site.ducc.properties}.  See
+    the linked section for details.
+    
+    For first-time installations, the utility {\em ducc\_post\_install} prompts
+    for database installation, and if it is not bypassed, reasonable defaults 
for
+    all database-related properties are established.
+
+    For existing installations, the {\em db\_create} utility installs the
+    database schema and updates your {\em site.ducc.properties} with reasonable
+    defaults.

Modified: 
uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/part4/admin/ducc-properties.tex
URL: 
http://svn.apache.org/viewvc/uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/part4/admin/ducc-properties.tex?rev=1724512&r1=1724511&r2=1724512&view=diff
==============================================================================
--- 
uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/part4/admin/ducc-properties.tex
 (original)
+++ 
uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/part4/admin/ducc-properties.tex
 Wed Jan 13 21:09:43 2016
@@ -1603,6 +1603,143 @@
       \end{description}
       
 
+\subsection{Database Configuration Properties}
+\label{subsec:ducc.database.properties}
+
+    \begin{description}
+
+      \item[ducc.database.host] \hfill \\
+        This is the name of the host where the database is run.  It usually 
defaults to the
+        same host as the ducc.head.  Those knowledgeable about the database can 
install the 
+        database elsewhere.  Use this parameter to specify that location.
+
+        To disable use of the database, set this parameter to the string {\em 
--disabled--}.
+        \begin{description}
+          \item[Default Value] The same as your ducc.head.
+          \item[Type] Tuning
+        \end{description} 
+
+      \item[ducc.database.jmx.port] \hfill \\
+        This is the JMX port used by the database.  Normally it need not be 
changed.  This
+        port is ONLY available on the host where the database runs.  To allow 
access from
+        other hosts, check the Cassandra documentation.  DUCC only directly 
supports
+        access from the configured database host.
+        \begin{description}
+          \item[Default Value] 7199
+          \item[Type] Tuning
+        \end{description} 
+
+      \item[ducc.database.mem.heap] \hfill \\
+        This is the value used to set {\em Xmx and Xms} when the database 
starts.  The
+        Cassandra database makes an attempt to determine the best value of 
this.  The
+        default is one-half of real memory, up to a maximum of 8G.  It is 
recommended that
+        the default be used.  However, small installations may reduce this to 
as little
+        as 512M.  Note that both Xmx and Xms are set.
+        \begin{description}
+          \item[Default Value] Determined by Cassandra, up to 8G max.
+          \item[Type] Tuning
+        \end{description} 
+
+      \item[ducc.database.mem.new] \hfill \\
+        This is the default for the ``young'' generation when the JVM needs 
more memory.
+        In general, the default is correct.  If you're not familiar with 
Java's memory
+        management it is safest to not modify this.
+        \begin{description}
+          \item[Default Value] 100M
+          \item[Type] Tuning
+        \end{description} 
+
+      \item[ducc.service.persistence.impl] \hfill \\
+        This specifies the class used to implement persistence for the Service 
Manager's registry.  
+        The installation procedures for the database automatically update your 
{\em site.ducc.properties}
+        to use the correct default.
+
+        There
+        are two supported values:
+\begin{verbatim}
+org.apache.uima.ducc.common.persistence.services.StateServices
+org.apache.uima.ducc.database.StateServicesDb
+\end{verbatim}
+
+        The first value implements the service registry in the file system in 
the directory
+        {\tt DUCC\_HOME/state/services}.
+
+        When the database is installed, the service registry is implemented 
over the database.
+
+        \begin{description}
+          \item[Default Value] When the database is enabled:
+\begin{verbatim}
+   org.apache.uima.ducc.database.StateServicesDb
+\end{verbatim}
+
+            When the database is not enabled:
+\begin{verbatim}
+   org.apache.uima.ducc.common.persistence.services.StateServices
+\end{verbatim}
+          \item[Type] Private
+        \end{description} 
+
+
+      \item[ducc.job.history.impl] \hfill \\
+        This specifies the class used to implement persistence for job history 
and the 
+        Orchestrator checkpoint.  
+        The installation procedures for the database automatically update your 
{\em site.ducc.properties}
+        to use the correct default.
+
+        The two supported values are:
+\begin{verbatim}
+org.apache.uima.ducc.transport.event.common.history.HistoryPersistenceManager
+org.apache.uima.ducc.database.HistoryManagerDb
+\end{verbatim}
+
+        The first value causes job history to be stored in {\tt 
DUCC\_HOME/history}
+        and the Orchestrator checkpoint to be stored in {\tt 
DUCC\_HOME/state/orchestrator.ckpt}.
+
+        The second causes both history and checkpoint to be saved in the 
database.
+
+        \begin{description}
+          \item[Default Value] If the database is enabled:
+\begin{verbatim}
+   org.apache.uima.ducc.database.HistoryManagerDb
+\end{verbatim}
+            If the database is not enabled:
+\begin{verbatim}
+   
org.apache.uima.ducc.transport.event.common.history.HistoryPersistenceManager
+\end{verbatim}
+          \item[Type] Private
+        \end{description} 
+
+
+      \item[ducc.rm.persistence.impl] \hfill \\
+        This specifies the class used to implement persistence for the 
Resource Manager's
+        dynamic state.  
+        The installation procedures for the database automatically update your 
{\em site.ducc.properties}
+        to use the correct default.
+
+        The two supported values are:
+\begin{verbatim}
+org.apache.uima.ducc.database.RmStatePersistence
+org.apache.uima.ducc.common.persistence.rm.NullRmStatePersistence
+\end{verbatim}
+
+        The first value implements RM's use of the database to store its 
dynamic state.  The second
+        disables RM state persistence.  There is no implementation that 
persists RM state
+        in the filesystem.
+
+        \begin{description}
+          \item[Default Value] If the
+            database is enabled:
+\begin{verbatim}
+   org.apache.uima.ducc.database.RmStatePersistence
+\end{verbatim}
+            If the database is not enabled:
+\begin{verbatim}
+   org.apache.uima.ducc.common.persistence.rm.NullRmStatePersistence
+\end{verbatim}
+          \item[Type] Private
+        \end{description} 
+      \end{description}
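+
+    The fragments below are a hedged illustration of how the properties described
+    above might appear together in {\em site.ducc.properties}.  The host name
+    {\tt ducchead.example.org} and the reduced heap size are hypothetical values
+    chosen for the example; the persistence classes are normally set by the
+    installation procedures and rarely need to be edited by hand.
+\begin{verbatim}
+# Database enabled (host name is a hypothetical placeholder)
+ducc.database.host            = ducchead.example.org
+ducc.database.jmx.port        = 7199
+ducc.database.mem.heap        = 512M
+ducc.database.mem.new         = 100M
+ducc.service.persistence.impl = org.apache.uima.ducc.database.StateServicesDb
+ducc.job.history.impl         = org.apache.uima.ducc.database.HistoryManagerDb
+ducc.rm.persistence.impl      = org.apache.uima.ducc.database.RmStatePersistence
+\end{verbatim}
+
+    With the database disabled, the corresponding file-based settings would be:
+\begin{verbatim}
+# Database disabled: history and services use the filesystem, RM state is not persisted
+ducc.database.host            = --disabled--
+ducc.service.persistence.impl = org.apache.uima.ducc.common.persistence.services.StateServices
+ducc.job.history.impl         = org.apache.uima.ducc.transport.event.common.history.HistoryPersistenceManager
+ducc.rm.persistence.impl      = org.apache.uima.ducc.common.persistence.rm.NullRmStatePersistence
+\end{verbatim}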
+
 \section{ducc.private.properties}
 \label{sec:ducc.private.properties}
 
@@ -1618,4 +1755,20 @@
             \item[Type] Local
           \end{description}
     \end{description}    
+        
+
+\subsection{Database Properties}
+
+    \begin{description}
+    
+        \item[db\_password] \hfill \\
+          This is the database superuser password.  It is set during {\em 
ducc\_post\_install} or {\em db\_create}.  Both
+          these procedures prompt for the password.  There is no default; the 
value must be supplied by the DUCC administrator.
+
+          NOTE: The database superuser ID is always ``ducc'', and is set 
during database installation.
+          \begin{description}
+            \item[Default Value] None, must be set by the administrator.
+            \item[Type] Local
+          \end{description}
+    \end{description}    
         

Modified: 
uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/part4/ducc-aguide.tex
URL: 
http://svn.apache.org/viewvc/uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/part4/ducc-aguide.tex?rev=1724512&r1=1724511&r2=1724512&view=diff
==============================================================================
--- 
uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/part4/ducc-aguide.tex
 (original)
+++ 
uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/part4/ducc-aguide.tex
 Wed Jan 13 21:09:43 2016
@@ -38,6 +38,7 @@
 \input{part4/admin/ducc-classes.tex}
 \input{part4/admin/ducc-nodes.tex}
 \input{part4/admin/ducc-users.tex}
+\input{part4/admin/ducc-database.tex}
 
 %% This is a section
 \input{part4/admin/admin-commands.tex}
@@ -56,10 +57,11 @@
 \input{part4/sm.tex}
 
 %% A chapter
-\input {part4/sim.tex}
+\input {part4/web.tex}
 
 %% A chapter
-\input {part4/web.tex}
+\input {part4/sim.tex}
+
 
 \chapter{Understanding the DUCC logs}
 \input{part4/system-logs.tex}

Modified: 
uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/part4/install.tex
URL: 
http://svn.apache.org/viewvc/uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/part4/install.tex?rev=1724512&r1=1724511&r2=1724512&view=diff
==============================================================================
--- 
uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/part4/install.tex
 (original)
+++ 
uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/part4/install.tex
 Wed Jan 13 21:09:43 2016
@@ -226,6 +226,9 @@ The post-installation script performs th
     \item Verifies that the correct level of Java and Python are installed and 
available.
     \item Creates a default nodelist, \duccruntime/resources/ducc.nodes, 
containing the name of the node you are installing on.
     \item Defines the ``ducc head'' node to be the node you are installing from.
+    \item Initializes the database. A prompt for the database password is 
given; to bypass
+      database installation give the password {\em bypass}. (The database can 
be installed
+      at a later date.)
     \item Sets up the default https keystore for the webserver.
     \item Installs the DUCC documentation ``ducc book'' into the DUCC 
webserver root.
     \item Builds and installs the C program, ``ducc\_ling'', into the default 
location.

