Author: challngr Date: Mon Feb 1 17:36:08 2016 New Revision: 1727979 URL: http://svn.apache.org/viewvc?rev=1727979&view=rev Log: UIMA-4777 Internals documentation.
Added: uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/images/ducc-internals/ uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/images/ducc-internals/db-structure.png (with props) uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/images/ducc-internals/rm-structure-1.png (with props) uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/images/ducc-internals/rm-structure-2.png (with props) uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/images/ducc-internals/rm-structure.png (with props) uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/images/ducc-internals/rm-structure.vsd (with props) uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/images/ducc-internals/sm-structure.png (with props) uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/internals-book.tex uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/part5/ducc-pops-component-database.tex uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/part5/ducc-pops-component-rm.tex uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/part5/ducc-pops-component-sm.tex Added: uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/images/ducc-internals/db-structure.png URL: http://svn.apache.org/viewvc/uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/images/ducc-internals/db-structure.png?rev=1727979&view=auto ============================================================================== Binary file - no diff available. Propchange: uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/images/ducc-internals/db-structure.png ------------------------------------------------------------------------------ svn:mime-type = application/octet-stream Added: uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/images/ducc-internals/rm-structure-1.png URL: http://svn.apache.org/viewvc/uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/images/ducc-internals/rm-structure-1.png?rev=1727979&view=auto ============================================================================== Binary file - no diff available. Propchange: uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/images/ducc-internals/rm-structure-1.png ------------------------------------------------------------------------------ svn:mime-type = application/octet-stream Added: uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/images/ducc-internals/rm-structure-2.png URL: http://svn.apache.org/viewvc/uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/images/ducc-internals/rm-structure-2.png?rev=1727979&view=auto ============================================================================== Binary file - no diff available. Propchange: uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/images/ducc-internals/rm-structure-2.png ------------------------------------------------------------------------------ svn:mime-type = application/octet-stream Added: uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/images/ducc-internals/rm-structure.png URL: http://svn.apache.org/viewvc/uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/images/ducc-internals/rm-structure.png?rev=1727979&view=auto ============================================================================== Binary file - no diff available. Propchange: uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/images/ducc-internals/rm-structure.png ------------------------------------------------------------------------------ svn:mime-type = application/octet-stream Added: uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/images/ducc-internals/rm-structure.vsd URL: http://svn.apache.org/viewvc/uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/images/ducc-internals/rm-structure.vsd?rev=1727979&view=auto ============================================================================== Binary file - no diff available. Propchange: uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/images/ducc-internals/rm-structure.vsd ------------------------------------------------------------------------------ svn:mime-type = application/octet-stream Added: uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/images/ducc-internals/sm-structure.png URL: http://svn.apache.org/viewvc/uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/images/ducc-internals/sm-structure.png?rev=1727979&view=auto ============================================================================== Binary file - no diff available. Propchange: uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/images/ducc-internals/sm-structure.png ------------------------------------------------------------------------------ svn:mime-type = application/octet-stream Added: uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/internals-book.tex URL: http://svn.apache.org/viewvc/uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/internals-book.tex?rev=1727979&view=auto ============================================================================== --- uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/internals-book.tex (added) +++ uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/internals-book.tex Mon Feb 1 17:36:08 2016 @@ -0,0 +1,90 @@ +% +% Licensed to the Apache Software Foundation (ASF) under one +% or more contributor license agreements. See the NOTICE file +% distributed with this work for additional information +% regarding copyright ownership. The ASF licenses this file +% to you under the Apache License, Version 2.0 (the +% "License"); you may not use this file except in compliance +% with the License. You may obtain a copy of the License at +% +% http://www.apache.org/licenses/LICENSE-2.0 +% +% Unless required by applicable law or agreed to in writing, +% software distributed under the License is distributed on an +% "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +% KIND, either express or implied. See the License for the +% specific language governing permissions and limitations +% under the License. +% +\documentclass[oneside]{book} + +% space between paragraphs +\usepackage{parskip} + +% import graphics +%\usepackage[pdftex]{graphicx} +\usepackage{graphicx} + +% Better control of figure placement +\usepackage{float} + +% hyperlinks +\usepackage[colorlinks,linkcolor=blue]{hyperref} + +% Conditionally execute based on PDF or HTML output +\usepackage{ifpdf} + +% Margins +\usepackage[top=1in, bottom=.75in, left=.75in, right=.75in ]{geometry} + +\usepackage{xcolor} + +\usepackage{caption} +\captionsetup[table]{skip=18pt} + +%list margins +\usepackage{enumitem} + +% better control over date formatting +\usepackage{datetime} + +% get version number from a maven-updated file +\input{version} +\title{\Huge \textbf{DUCC Internals Documentation}} +\author{Written and maintained by the Apache\\ +UIMA\texttrademark Development Community \\ +\\ +\\ +\\ +Version \versionnumber} + +\date{} + +\begin{document} + +\frontmatter +\maketitle + +\input{legal.tex} + +%% \setcounter{tocdepth}{4} +% Call it Table Of Contents, same as other UIMA books do +\renewcommand\contentsname{Table of Contents} +\tableofcontents +\listoffigures +% \listoftables + +\mainmatter + +\input{common.tex} + +\chapter{Database} +\input{part5/ducc-pops-component-database.tex} + +\chapter{Resource Manager} +\input{part5/ducc-pops-component-rm.tex} + +\chapter{Service Manager} +\input{part5/ducc-pops-component-sm.tex} + +\end{document} Added: uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/part5/ducc-pops-component-database.tex URL: http://svn.apache.org/viewvc/uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/part5/ducc-pops-component-database.tex?rev=1727979&view=auto ============================================================================== --- uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/part5/ducc-pops-component-database.tex (added) +++ uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/part5/ducc-pops-component-database.tex Mon Feb 1 17:36:08 2016 @@ -0,0 +1,436 @@ +% +% Licensed to the Apache Software Foundation (ASF) under one +% or more contributor license agreements. See the NOTICE file +% distributed with this work for additional information +% regarding copyright ownership. The ASF licenses this file +% to you under the Apache License, Version 2.0 (the +% "License"); you may not use this file except in compliance +% with the License. You may obtain a copy of the License at +% +% http://www.apache.org/licenses/LICENSE-2.0 +% +% Unless required by applicable law or agreed to in writing, +% software distributed under the License is distributed on an +% "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +% KIND, either express or implied. See the License for the +% specific language governing permissions and limitations +% under the License. +% + +\section{DUCC Database Integration} + + DUCC is integrated with the Apache Cassandra database (\url{https://cassandra.apache.org/}. As of + DUCC release 2.1.0 the database is used for the following functions: + \begin{itemize} + \item History. Previously a history file for all work in the system was written to the + DUCC {\em history} directory. These files are now written to the database. As of this + writing, we write the serialized DUCC objects as blobs for future reference with + several tables summarizing the contents of the blob for use by command-line utilities + and the webserver. + \item Service registry. Previously, the service registry was maintained as a collection of + Java {\em properties} files in the DUCC {\em state} directory. As of 2.1.0, the registry + is maintained in a set of database tables. + \item Service registry history. Previously, when a service was unregistered, it's registry + files were moved to the DUCC {\em history} directory. As of 2.1.0, a property in + the database tables for the registry is updated to indicate the entry is archived. + \item Orchestrator checkpoint. Previously, the DUCC Orchestrator would write a file + containing the state of all active work in the system, used for restart of the system. + As of 2.1.0, this checkpoint file is written as a BLOB to the database. + \item Resource Manager dynamic state. Previously, this was not persisted. As of 2.1.0, + the current state of all hosts in the system, and all work scheduled on these hosts + is maintained in the database. This state is deleted when the RM starts, and is rebuilt + or updated as nodes check in to the RM and as work enters and leaves. + \end{itemize} + +\section{Code Organization} + + \paragraph{Dependencies} All code that interfaces with the database resides in a single project, + in a single directory in the DUCC source, {\em uima-ducc-database}. All access to this function + by the DUCC daemons is through interfaces. There are no compile-time dependencies on this + project by other DUCC projects; conversely, this project has compile-time dependencies only on + the low-level common code: {\em uima-ducc-common, uima-ducc-transport, and uima-ducc-user}. + + Figure ~\ref{fig:db-structure} provides a visual overview of the Database component structure. + \begin{figure}[H] + \centering + \includegraphics[width=5.5in]{images/ducc-internals/db-structure.png} + \caption{Database Structure} + \label{fig:db-structure} + \end{figure} + + + Runtime dependencies are resolved with reflection. Entries in ducc.properties + are used to specify the classes which interface with the database. The DUCC scripting + insures that the CLASSPATHs of components using the database contain the + necessary entries. + + \paragraph{Factories} + The Cassandra Java client is thread-safe and manages connection pooling. Only a single + Cassandra {\em session} should be acquired for all threads in a process. To enforce this + the {\em Factory} pattern is used to acquire database handles. The factory creates + a single static handle and returns this singleton on every call. + + The objects returned by the factories are referenced through their interfaces that describe all + legal actions against the persistent store. + + There are three factories: + \begin{description} + \item[HistoryFactory.java] This resides in +\begin{verbatim} +uima-ducc-transport/src/main/java/org/apache/uima/ducc/transport/event/common/history +\end{verbatim} + This is used by the Orchestrator to write history files and its checkpoint, and to restore + the checkpoint on startups. It is used by the Web Server to read work history. + \item[StateServicesFactory.java] This resides in +\begin{verbatim} +uima-ducc-common/src/main/java/org/apache/uima/ducc/common/persistence/services +\end{verbatim} + This is used by the Service Manager to maintaining service registrations, service metadata, + and service registration history. It is used by the Web Server to show the service + registration and meta details. + \item[RmPersistenceFactory.java] This resides in +\begin{verbatim} +uima-ducc-common/src/main/java/org/apache/uima/ducc/common/persistence/rm +\end{verbatim} + This is used by the Resource Manager to maintain its internal scheduling state for the + purpose of inspection by other agents. This is used by the Web Server to show machine + details. This is used by the admin CLI to show the state of all machines and work in the + system. + \end{description} + + \paragraph{Interfaces} + All higher-level communication to the database is done through objects returned from the + factories which must conform to specific interfaces. There are three interfaces: + + \begin{description} + \item[IHistoryPersistenceManager.java] This resides in +\begin{verbatim} +uima-ducc-transport/src/main/java/org/apache/uima/ducc/transport/event/common/history +\end{verbatim} + See its Javadoc for details of its calling sequence. + \item[IStateServices.java] This resides in +\begin{verbatim} +uima-ducc-common/src/main/java/org/apache/uima/ducc/common/persistence/services +\end{verbatim} + See its Javadoc for details of its calling sequence. + \item[IRmPersistence.java] This resides in +\begin{verbatim} +uima-ducc-common/src/main/java/org/apache/uima/ducc/common/persistence/rm +\end{verbatim} + See its Javadoc for details of its calling sequence. + \end{description} + + In addition to the calling sequences, these interfaces contain Java {\em enum} structures that + describe the database schema. See below for how these enums are designed. + + \paragraph{Implementations} + Multiple implementations of each interface are provided. In all cases, a ``null'' + implementation for which all methods are empty stubs is used as a fallback in the event that a + more functional interface cannot be provided. There are both {\em file-based} and {\em + database-based} implementations for Orchestrator state and for the Service registry. Resource + manager state is provided via the database only. See the DuccBook for details on how to select + a specific implementation at runtime. + + In the case of the implementations that interface with the database, an additional method is + required, but is not part of the public interface: +\begin{verbatim} + static RETURN-TYPE mkSchema(); +\end{verbatim} + The specific type of object returned by this method varies with database implementations. It + must return a collection of objects that the database creation methods can use to create the + database schema. + +\section{Database Schema} + The schema for all tables is controlled by Java {\em enum} objects in the various interfaces. These enums must adhere to a specific + interface, defined in +\begin{verbatim} + uima-ducc-common/src/main/java/org/apache/uima/ducc/common/persistence/IDbProperty.java +\end{verbatim} + + There are five methods defined in this interface, used by the database package to automatically generate + the schema. These interfaces may also be used by applications when querying the database to determine + the types and actual database column names for each table. + + Most elements in the enum define columns of a table in the database. Methods on the enum + contain meta-data required to correctly create and interpret the data in a column. Some elements in the + enum are meta-data about the column itself. + + These methods are: + \begin{description} + \item[String pname()] This the name of a column as known by DUCC and may contain any ASCII + characters. Note this need not be the name of the column in the database. + \item[String columnName()] This is the name of the column as used in the database. It must + conform to the column-naming standards of the database being used. + \item[Type type()] This specifies the type of data in the column. Rather than specifying the + database-specific type names, we supply an abstract name in the Type object which the + database package translates to the correct form for the specific database implementation. + \item[isPrimaryKey()] If true, the data in the column defined by this enum is a primary key. + It is legal to specify multiple columns as primary keys, in which case, the database + component will create a compound primary key. The keys are generated in the order + they occur in the enum. + \item[boolean isPrivate()] This enum element is used by the database package only and + should never be passed back to applications. It allows the + database package to maintain table-specific information that is not accidentally + translated into a return element. For example this is used when a row corresponds to + a collection of Java properties, but the enum does not correspond to one of the + returned properties. + \item[boolean isMeta()] This is the converse of isPrivate(). This allows + an application to pass information to the database component that does not get + placed into the schema or database. For example, the name of the table may be + defined in the enum, but this should not become the name of a column in any table. + \item[isIndex()] If {\em true}, this column is indexed in the schema. Multiple columns + may be specified for indexing. + \end{description} + + \paragraph{Types} We maintain a level of indirection between DUCC and specific database types, to enable + disparate database implementation from a common meta-schema. The DUCC-defined database + types are: + + \begin{description} + \item[String] The database implementation translates this into the appropriate + type for the database, for example, {\em varchar} for a DB2 database. + \item[Blob] This specifies a binary large object, e.g. a serialized + Java object. + \item[Integer] This specifies a 32-bit integer. + \item[Long] This specifies a 64-bit integer. + \item[Double] This specifies a Java object of type {\em double} + \item[UUID] Some modern databases have native support for Java UUIDs. This specifies + an object conforming to that type. Older databases may translate this to {\em char} + or {\em varchar}. + \end{description} + +\section{The uima-ducc-database package} + This package is intended to be isolated as much as possible from the rest of DUCC. The + design-point is that it should be mostly straightforward to change the database implementation, + or to create additional persistence implementations, as long as the functions described + in the previous sections are maintained. + + \paragraph{Database Core} Most of the database interface is contained in two classes: + \begin{description} + \item[DbManager] This object is responsible for directly interfacing with the + specific database implementation. It knows how to manage the + database URL, how to contact the database, how to execute commands (e.g. SQL) + against the database, how to create users and manage security, and the + general structure of the DB API. + + This object is to be used only to initiate database communication. It generally does not + know much about the specific query language used (e.g. CQL vs SQL), which is left to the + DbHandle. + \item[DbHandle] This provides a level of indirection between {\em clients} of the + database, and the {\em implementation} of the database. A {\em client} + instantiates a DbManager and then requests a {\em DbHandle} whenever it actually + needs to communicate to the DB. If session pooling is supported, the DbHandle should + transparently enable this so higher-level layers need not be concerned with it. + + The handle rarely communicates directly with the database itself. Instead, it + requests the DbManager that created it to do actual communication. + \end{description} + + \paragraph{Bootstrap modules} Some specialized functions are separated into discrete classes: + \begin{description} + \item[DbAlive] This module communicates directly with the database, bypassing both + the DbManager and DbHandle. It is considered a {\em bootstrap} object. It assumes + the database has been started, and attempts to contact it, determine if the + {\em ducc} and {\em guest} userids are defined, and queries the schema. This + implements retry logic as the database can take time to start up. It bypasses + the DbManager because if the database is in some way compromised, it may not be + possible to successfully instantiate a DbManager or DbHandle. + \item[DbCreate] This module also bypasses the DbManager and communicates directly + with the database. It creates the {\em ducc} superuser id, disables the + default superuser, and creates a restricted {\em guest} userid. + + This is also considered a bootstrap object. + \item[DbLoader] This module is used to load an existing DUCC file-based {\em history, + checkpoint,} and {\em service registry} into the database. It is considered a bootstrap + module and communicates directly to the database when it can for best performance, and and + indirectly through the implementations of the DUCC persistence objects to create summary + tables of the various objects. + \end{description} + + \paragraph{Schema Creation} + Each of the DUCC-component-specific database implementations must implement a method +\begin{verbatim} + static RETURN-TYPE mkSchema(); +\end{verbatim} + where the {\em RETURN-TYPE} depends on the specific database implementation. In the case + of Cassandra, the full signature is +\begin{verbatim} + static List<PreparedStatement mkSchema(); +\end{verbatim} + + The DUCC-component implementations inspect their schema definitions, as defined in the + IDbProperty enums in their interfaces, and create, in the case of Cassandra, a collection + of PreparedStatements which the {\em DbCreate} then uses to generate schema. + + \paragraph{Utility Modules} + \begin{description} + \item[DbUtil] This contains common, static, methods that know how to manipulate + the IDbProperty enums to create schemas, indexes, convert property files into + {\em INSERT / UPDATE} statements, and so on. + \item[RmNodeState] This is example code that demonstrates one way to query the database + and generate a {\em json} object of current Resource Manager state for clients. + \item[RmQLoad] This is example code that demonstrates one way to query the database + and generate a {\em json} object of current Resource Manager demand for clients. + \end{description} + + \paragraph{DUCC component-specific implementations} + These modules implement persistence for the Orchestrator, Service Manager, and Resource + Manager, implementing their indicated interfaces as well as the required {\em mkSchema} + methods. They should never be directly accessed outside of the database package. Instead, + they must be instantiated by the correct {\em Factory} as described in earlier sections. + + \begin{description} + \item[HistoryManagerDb]This implements persistence for Orchestrator-generated + history and checkpoint. + \item[StateServicesDb] This implements persistence for the Service Manager's + registry and history. + \item[RmStatePersistence] This implements persistence for the Resource Manager's + dynamic state. + + Note that this state is always deleted whenever RM initializes or reconfigures, + and is rebuilt as the RM itself builds or recovers its dynamic state. + \end{description} + +\section{Tables} + This section describes all of the tables. + + + \paragraph{HistoryManagerDb} The {\em HistoryManagerDb} module is responsible for the + schema and maintenance of the tables used for most of the history objects and the + Orchestrator checkpoint. + + \begin{description} + \item[job\_history] This contains the serialized objects for all Job history as {\em BLOB}s. + \item[res\_history] This contains the serialized objects for all Reservation history as {\em BLOB}s. + \item[svc\_history] This contains the serialized objects for all Service history as {\em BLOB}s. + \item[orckpt] This contains the Orchestrator checkpoint. There are two {\em BLOB}s in this object: + the current OR map, and the job-to-process map. + \item[jobs] This contains details for all jobs, extracted from the {\em BLOB}s that are written + to {\em job\_history}. It does not include any process history however. + \item[processes] This contains details for all objects that get allocated space by the RM: + job processes, service processes, AP processes. + \item[reservations] This contains details for all reservations, extracted from the {\em BLOB}s that are written + to {\em res\_history}. + \end{description} + + \paragraph{StateServicesDb} The {\em StateServicesDb} module is responsible for the + service registry. + + \begin{description} + \item[smreg] This contains the service registrations as submitted by users. + \item[smmeta] This contains active state of services. + \end{description} + + \paragraph{RmStatePersistence} The {\em RmStatePersistence} module is responsible for all the + dynamic state produced by the RM. + + \begin{description} + \item[rmnodes] This contains the state of all nodes known to the RM. + \item[rmshares] This contains details on all the shares currently allocated by RM. + \item[rmload] This contains the ``demand'' on the RM: counts of all services that are + requested by jobs, and counts of services RM is able to satisfy. (The intended purpose + of this table is to allow external agents to inspect RM load and in conjunction with + rmnodes and rmshares, determine whether RM is under-, over-, or sufficiently provisioned.) + \end{description} + +\section{Scripting and Configuration} + The goal of the DUCC scripting support for the database is to make database start-up, shutdown, + schema initialization, migration, and configuration as transparent as possible. + + \paragraph{Configuration} Here we define {\em configuration} to refer to the files + that define the database URL, the hosts it may be running on, the + location of the physical data, etc. A number of these values are determined + by virtue of the way DUCC and the database are designed to work together. + + There are two relevant files. Pre-configured versions of these file reside in the + DUCC source base in the directory +\begin{verbatim} +src/main/resources +\end{verbatim} + + During system build these files are copied into the database configuration + directory. + + Note that if the database is updated or replaced it will generally be required + to obtain re-configure these files and replace them in the build directory. + + The files are: + \begin{description} + \item[cassandra.yaml] This is the primary configuration file. Details of its + contents are found in the standard Cassandra documentation. We prepare this + configuration thus: + \begin{itemize} + \item Set the database {\em cluster name} to DUCC. + \item Set the hostname where the Cassandra server resides in three places: + the {\em seed\_provider}, the {\em listen\_address}, and the {\em rpc\_address}. + The reconfigured {\em cassandra.yaml} sets these all to the constant string + {\em DUCCHOST}; the DUCC startup scripting changes these to the value of + {\em ducc.head} before starting Cassandra. + \item Set the authentication scheme to {\em PasswordAuthenticator} to force + userid and password access. + \item Set the authorizer module to {\em CassandraAuthorizer} to enable specific + permissions to be set on the configured userids. + \item Set the location of the database files in {\em data\_file\_directories}. + \item Set the location of the database commitlog in {\em commitlog\_directory}. + \item Set the location of the database saved caches in {\em saved\_caches\_directory}. + \end{itemize} + \item[cassandra.env.sh] This is a shell script that is run by Cassandra as it is + starting up to detect the environment and set its internal parameters. The following + DUCC changes are applied: + \begin{itemize} + \item Alter checks for the JVM vendor so it will start with the IBM JVM. + \item Parameterize some things so the can be pulled from the environment, and + thus enable Cassandra to be customized from {\em ducc.properties}. The following + items have been modified in this file for this purpose: JMX\_PORT. (Note that + {\em Xmx and Xms} are already customizable by setting environment variables.) + \end{itemize} + \end{description} + + \paragraph{Scripting} + + + \paragraph{} + + The following updates to the DUCC scripting support the database: + \begin{description} + \item[ducc.head Configuration] When DUCC is started, a small bit of code is executed to insure + the {\em ducc.head} node is properly configured in the Cassandra configuration {\em + cassandra.yaml}. If not, a message is emitted and the configuration file is updated + before attempting to start DUCC. + \item[ducc\_util.py] This contains common database routines used by all scripts that extend + the base class {\em DuccUtil}. These methods perform these functions: + \begin{description} + \item[Enable DB] db\_configure reads {\em ducc.database.host} and if it is set to + ``--disabled--'', a global variable is set to indicate the DB is disabled. Otherwise + it reads the database password from {\em ducc.private.properties} and sets that into + a global variable. If there is no password set, the DB is set disabled regardless of + the value of {\em ducc.database.host}. + \item[DB process running] db\_process\_alive attempts to determine if the database process + is running (which is not equivalent to the database being functional). + \item[DB functional] db\_alive attempts to contact the database by calling the Java bootstrap + routine ``DbAlive'' (see previous section). It returns true or false to indicate whether + the DB appears functional. + \item[DB stop] db\_stop uses the Cassandra pid to send {\em kill -TERM} to stop the DB process. + \end{description} + \item[db\_util.py] This contains database methods that can be called from any scripting that + need not extend {\em DuccUtil}. It contains methods to stop the database, update + {\em cassandra.yaml} with the value of {\em ducc.head}, and assist parsing and formatting + the results of executing {\em cqlsh}. + \item[ducc.py] This contains a method to start Cassandra. It is called from {\em start\_ducc} + and {\em startsim}. + \item[start\_ducc] This contains calls to {\em ducc\_util.py} and {\em ducc.py} to configure + and start the DB, and then insure it comes up before starting the rest of DUCC. + \item[stop\_ducc] This contains calls to {\em ducc\_util.py} to stop the DB. + \item[check\_ducc] This contains calls to {\em ducc\_util.py} to determine if the DB is running or not. + \item[ducc\_post\_install] This contains calls to {\em db\_util.py} to configure the + {\em ducc.head}, prompt for the database password, and initialize the schema. + \item[db\_create] This contains methods to define a database, independently of + {\em ducc\_post\_install} intended for migration purposes. + \item[db\_loader] This contains calls to the java utility {\em DbLoader} to + migrate existing state and history files to the database. + \item[startsim] This contains calls to {\em ducc\_util.y} to start the database and + insure it starts correctly. + \item[stop\_sim] This contains calls to {\em ducc\_util.y} to stop the databse. + \end{description} + +