[2/5] incubator-trafodion git commit: Replaced logo, added drafts of new manuals, updated SQL Reference, corrected issues in other manuals.

dbirdsall Thu, 03 Mar 2016 12:13:59 -0800

http://git-wip-us.apache.org/repos/asf/incubator-trafodion/blob/33d796d6/docs/cqd_reference/src/asciidoc/_chapters/schema_controls.adoc
----------------------------------------------------------------------
diff --git a/docs/cqd_reference/src/asciidoc/_chapters/schema_controls.adoc 
b/docs/cqd_reference/src/asciidoc/_chapters/schema_controls.adoc
new file mode 100644
index 0000000..0112b9a
--- /dev/null
+++ b/docs/cqd_reference/src/asciidoc/_chapters/schema_controls.adoc
@@ -0,0 +1,49 @@
+////
+/**
+* @@@ START COPYRIGHT @@@
+*
+* Licensed to the Apache Software Foundation (ASF) under one
+* or more contributor license agreements.  See the NOTICE file
+* distributed with this work for additional information
+* regarding copyright ownership.  The ASF licenses this file
+* to you under the Apache License, Version 2.0 (the
+* "License"); you may not use this file except in compliance
+* with the License.  You may obtain a copy of the License at
+*
+*   http://www.apache.org/licenses/LICENSE-2.0
+*
+* Unless required by applicable law or agreed to in writing,
+* software distributed under the License is distributed on an
+* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+* KIND, either express or implied.  See the License for the
+* specific language governing permissions and limitations
+* under the License.
+*
+* @@@ END COPYRIGHT @@@
+  */
+////
+
+[[schema-controls]]
+= Schema Controls
+
+This section describes CQDs that are used for schema controls.
+
+[[schema]]
+== Schema
+
+[cols="25%h,75%"]
+|===
+| *Category*                  | Influencing Query Plans 
+| *Description*               | Sets the default schema for the session. 
+| *Values*                    | SQL identifier. +
+ +
+The default is *'TRAFODION'*.
+| *Usage*                     | A SET SCHEMA statement or a CONTROL QUERY 
DEFAULT SCHEMA statement can be used to override the default schema name.
+| *Production Usage*          | It is a convenience so you do not have to type 
in two-part names. 
+| *Impact*                    | Not applicable.
+| *Level*                     | Any.
+| *Conflicts/Synergies*       | Alternately you can use the SET SCHEMA 
statement.
+| *Real Problem Addressed*    | Not applicable.
+| *Introduced In Release*     | Trafodion 1.3.0.
+| *Deprecated In Release*     | Not applicable.
+|===


http://git-wip-us.apache.org/repos/asf/incubator-trafodion/blob/33d796d6/docs/cqd_reference/src/asciidoc/_chapters/transactions.adoc
----------------------------------------------------------------------
diff --git a/docs/cqd_reference/src/asciidoc/_chapters/transactions.adoc 
b/docs/cqd_reference/src/asciidoc/_chapters/transactions.adoc
new file mode 100644
index 0000000..5f590fb
--- /dev/null
+++ b/docs/cqd_reference/src/asciidoc/_chapters/transactions.adoc
@@ -0,0 +1,94 @@
+////
+/**
+* @@@ START COPYRIGHT @@@
+*
+* Licensed to the Apache Software Foundation (ASF) under one
+* or more contributor license agreements.  See the NOTICE file
+* distributed with this work for additional information
+* regarding copyright ownership.  The ASF licenses this file
+* to you under the Apache License, Version 2.0 (the
+* "License"); you may not use this file except in compliance
+* with the License.  You may obtain a copy of the License at
+*
+*   http://www.apache.org/licenses/LICENSE-2.0
+*
+* Unless required by applicable law or agreed to in writing,
+* software distributed under the License is distributed on an
+* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+* KIND, either express or implied.  See the License for the
+* specific language governing permissions and limitations
+* under the License.
+*
+* @@@ END COPYRIGHT @@@
+  */
+////
+
+[[transaction-control-and-locking]]
+= Transaction Control and Locking
+
+This section describes CQDs that are used for transaction control and locking.
+
+[[isolation-level]]
+== ISOLATION_LEVEL
+
+[cols="25%h,75%"]
+|===
+| *Category*                  | Transaction Control and Locking
+| *Description*               | Specifies the default transaction isolation 
level that queries use.
+| *Values*                    | *'READ UNCOMMITTED'*, *'READ COMMITTED'*, 
*'REPEATABLE READ'*, or *'SERIALIZABLE'*. +
+ +
+The default value is *'READ COMMITTED'* (ANSI).
+| *Usage*                     | If you use uncommitted access (reading "dirty" 
data when queries are accessing data that is
+being simultaneously updated), then you can set the default isolation level as 
READ UNCOMMITTED. The default isolation level
+of READ COMMITTED can cause concurrency issues because reads would wait on 
locked rows. If rows are locked by long-running
+transactions with infrequent commits, this can cause severe concurrency issues 
for SELECT queries. See Conflicts/Synergies.
+| *Production Usage*          | Not applicable.
+| *Impact*                    | Using this CQD has implications on locking and 
concurrency. +
+ +
+If set to READ UNCOMMITTED, then select queries read through locks and don't 
have to wait on locks. But they won't see committed consistent data. +
+ +
+If set to READ COMMITTED (the default setting), then the reads wait on locked 
rows before they proceed with the scan.
+The read can proceed only when the rows locked by another transaction are 
released after that transaction commits. The reader does not lock rows. +
+ +
+If set to REPEATABLE READ or SERIALIZABLE, then it has severe implications on 
concurrency because every row read is also locked.
+| *Level*                     | While you can use this at a query or a service 
level, the most common use is a system-wide setting.
+If query tools are being used, then the query level setting cannot be used. +
+ +
+A service level setting may provide uncommitted access to certain users while 
providing the default committed access to the other users,
+depending which users need to see consistent data. +
+ +
+If however, access to tables during updates is well controlled and read 
uncommitted is acceptable, then this can be set at the system level.
+| *Conflicts/Synergies*       | The problem with using READ UNCOMMITTED as the 
isolation level default value is that in a SET TRANSACTION statement,
+the only possible access mode is READ ONLY. Any query that attempts to update 
the database would fail. +
+ +
+To facilitate updates and DDL statements while the isolation level is set to 
READ UNCOMMITTED, a new default attribute ISOLATION_LEVEL_FOR_UPDATES
+is provided. This default attribute specifies the isolation level for update 
and DDL statements. If not specified, or if not present in the
+SYSTEM_DEFAULTS table, the default value is the same as the ISOLATION_LEVEL 
default attribute. However, if specified or present in the SYSTEM_DEFAULTS 
table,
+then its value is used as the isolation level for updates and DDL statements. 
UPDATE in ISOLATION_LEVEL_FOR_UPDATES refers to INSERT, UPDATE, and DELETE 
statements.
+| *Real Problem Addressed*    | Not applicable.
+| *Introduced In Release*     | Trafodion 1.3.0.
+| *Deprecated In Release*     | Not applicable.
+|===
+
+<<<
+[[isolation-level-for-updates]]
+== ISOLATION_LEVEL_FOR_UPDATES
+
+[cols="25%h,75%"]
+|===
+| *Category*                  | Transaction Control and Locking
+| *Description*               | Specifies the default transaction isolation 
level for these update operations: INSERT, UPDATE, or DELETE.
+| *Values*                    | *'READ UNCOMMITTED'*, *'READ COMMITTED'*, 
*'REPEATABLE READ'*, or *'SERIALIZABLE'*. +
+ +
+The default value is *'READ COMMITTED'* (ANSI).
+| *Usage*                     | Set this CQD to READ UNCOMMITTED to prevent 
users from performing any updates. 
+| *Production Usage*          | Not applicable.
+| *Impact*                    | When set, this CQD prevents users from doing 
any of these update operations: INSERT, UPDATE, or DELETE.
+| *Level*                     | Service.
+| *Conflicts/Synergies*       | Works with the ISOLATION_LEVEL setting. Both 
settings are READ COMMITTED by default.
+ISOLATION_LEVEL can be set to READ UNCOMMITTED. This CQD still remains READ 
COMMITTED. You can change it to READ UNCOMMITED to prevent queries
+running at the service level to not perform any updates.
+| *Real Problem Addressed*    | Not applicable.
+| *Introduced In Release*     | Trafodion 1.3.0.
+| *Deprecated In Release*     | Not applicable.
+|===

http://git-wip-us.apache.org/repos/asf/incubator-trafodion/blob/33d796d6/docs/cqd_reference/src/asciidoc/_chapters/update_stats.adoc
----------------------------------------------------------------------
diff --git a/docs/cqd_reference/src/asciidoc/_chapters/update_stats.adoc 
b/docs/cqd_reference/src/asciidoc/_chapters/update_stats.adoc
new file mode 100644
index 0000000..51c0204
--- /dev/null
+++ b/docs/cqd_reference/src/asciidoc/_chapters/update_stats.adoc
@@ -0,0 +1,92 @@
+////
+/**
+* @@@ START COPYRIGHT @@@
+*
+* Licensed to the Apache Software Foundation (ASF) under one
+* or more contributor license agreements.  See the NOTICE file
+* distributed with this work for additional information
+* regarding copyright ownership.  The ASF licenses this file
+* to you under the Apache License, Version 2.0 (the
+* "License"); you may not use this file except in compliance
+* with the License.  You may obtain a copy of the License at
+*
+*   http://www.apache.org/licenses/LICENSE-2.0
+*
+* Unless required by applicable law or agreed to in writing,
+* software distributed under the License is distributed on an
+* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+* KIND, either express or implied.  See the License for the
+* specific language governing permissions and limitations
+* under the License.
+*
+* @@@ END COPYRIGHT @@@
+  */
+////
+
+[[update-statistics-and-reorg]]
+= Update Statistics and Reorg
+
+[[ustat-max-read-age-in-min]]
+== USTAT_MAX_READ_AGE_IN_MIN
+
+[cols="25%h,75%"]
+|===
+| *Category*                  | Update Statistics and Reorg
+| *Description*               | When performing update statistics with the 
NECESSARY keyword or with automation, this is
+the number of minutes that are allowed to have elapsed since a histogram was 
marked as read for it to be regenerated. +
+ +
+Histograms that were marked more than USTAT_MAX_READ_AGE_IN_MIN minutes ago 
are not regenerated.
+| *Values*                    | 0 through max unsigned integer. +
+ +
+Setting this CQD to a value less than 2 * USTAT_AUTOMATION_INTERVAL is 
silently ignored and
+2 * USTAT_AUTOMATION_INTERVAL (default value of 2880) is used. +
+ +
+The default value is 5760 (4 days).
+| *Usage*                     | Influences how frequently the histograms for a 
table are regenerated.
+If a table is being used frequently, then chances are that its histograms are 
also be considered for update frequently.
+However, if a table is not used frequently, then this CQD influences how 
frequently the histograms for that table are updated. +
+ +
+A smaller setting reduces the number of histograms being updated if there are 
many tables that have not been used within that interval.
+A larger setting updates histogram for many more tables that are not being 
accessed that often.
+| *Production Usage*          | Not applicable.
+| *Impact*                    | Influences the number of histograms that need 
to be regenerated and therefore the time it takes for
+update statistics automation to regenerate histograms for all the tables that 
so qualify.
+| *Level*                     | System.
+| *Conflicts/Synergies*       | USTAT_AUTOMATION_INTERVAL sets a lower bound 
on this CQD. +
+ +
+It is influenced by USTAT_AUTO_READTIME_UPDATE_INTERVAL, which influences how 
often READ_TIME is updated for the histogram.
+This CQD qualifies the histogram to be regenerated; it is the maximum time 
since READ_TIME was updated. $$$
+| *Real Problem Addressed*    | Not applicable.
+| *Introduced In Release*     | Trafodion 1.3.0.
+| *Deprecated In Release*     | Not applicable.
+|===
+
+<<<
+[[ustat-min-rowcount-for-sample]]
+== USTAT_MIN_ROWCOUNT_FOR_SAMPLE
+
+[cols="25%h,75%"]
+|===
+| *Category*                  | Update Statistics and Reorg
+| *Description*               | Sets the minimum rows that need to be in a 
table before sampling is used to update statistics for that table.
+If a table has a fewer rows than the value of this CQD, then the SAMPLE option 
is silently ignored when performing update statistics.
+| *Values*                    | 1 through max unsigned integer. +
+ +
+The default value is 10000.
+| *Usage*                     | Influences for what tables sampling is used 
for histogram statistics generation. If the setting is smaller,
+then more tables qualify for sampling. If the setting is larger, then fewer 
tables qualify for sampling. Sampling can result in faster update
+statistics run times. But for smaller tables, it can also result in poor 
histogram accuracy which could result in poor query plans. 
+| *Production Usage*          | Not applicable.
+| *Impact*                    | Setting this CQD to a smaller value means that 
sampling is used for tables with fewer rows, when the SAMPLE option is
+specified as part of update statistics. This can result in less accurate 
histograms and poor query plans, because the sample size may be too small to
+generate good estimates for histograms. +
+ +
+Setting this CQD to a larger value can result in sampling not being used for 
many tables and therefore longer update statistics run times.
+However, these tables may also have more accurate histograms.
+| *Level*                     | System.
+| *Conflicts/Synergies*       | Not applicable.
+| *Real Problem Addressed*    | Not applicable.
+| *Introduced In Release*     | Trafodion 1.3.0.
+| *Deprecated In Release*     | Not applicable.
+|===
+

http://git-wip-us.apache.org/repos/asf/incubator-trafodion/blob/33d796d6/docs/cqd_reference/src/asciidoc/index.adoc
----------------------------------------------------------------------
diff --git a/docs/cqd_reference/src/asciidoc/index.adoc 
b/docs/cqd_reference/src/asciidoc/index.adoc
new file mode 100644
index 0000000..f67e482
--- /dev/null
+++ b/docs/cqd_reference/src/asciidoc/index.adoc
@@ -0,0 +1,69 @@
+////
+* @@@ START COPYRIGHT @@@                                                      
   
+*
+* Licensed to the Apache Software Foundation (ASF) under one
+* or more contributor license agreements.  See the NOTICE file
+* distributed with this work for additional information
+* regarding copyright ownership.  The ASF licenses this file
+* to you under the Apache License, Version 2.0 (the
+* "License"); you may not use this file except in compliance
+* with the License.  You may obtain a copy of the License at
+*
+*   http://www.apache.org/licenses/LICENSE-2.0
+*
+* Unless required by applicable law or agreed to in writing,
+* software distributed under the License is distributed on an
+* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+* KIND, either express or implied.  See the License for the
+* specific language governing permissions and limitations
+* under the License.
+*
+* @@@ END COPYRIGHT @@@ 
+////
+
+= Control Query Default (CQD) Reference Guide
+:doctype: book
+:numbered:
+:toc: left
+:toclevels: 3
+:toc-title: Table of Contents
+:icons: font
+:iconsdir: icons
+:experimental:
+:source-language: text
+:revnumber: {project-version}
+:title-logo-image: ../../../shared/trafodion-logo.jpg
+
+:images: ../images
+:leveloffset: 1
+
+// The directory is called _chapters because asciidoctor skips direct
+// processing of files found in directories starting with an _. This
+// prevents each chapter being built as its own book.
+
+**License Statement**
+
+Licensed to the Apache Software Foundation (ASF) under one or more contributor 
license agreements.  See the NOTICE file
+distributed with this work for additional information regarding copyright 
ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the "License"); you may not use 
this file except in compliance with
+the License.  You may obtain a copy of the License at 
http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software 
distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 
implied.  See the License for the
+specific language governing permissions and limitations under the License.
+
+[cols="2",options="header"]
+|===
+| Version    | Date
+| 1.3.0      | January, 2016  
+|===
+
+include::asciidoc/_chapters/about.adoc[]
+include::asciidoc/_chapters/introduction.adoc[]
+include::asciidoc/_chapters/query_plans.adoc[]
+include::asciidoc/_chapters/histograms.adoc[]
+include::asciidoc/_chapters/transactions.adoc[]
+include::asciidoc/_chapters/runtime_controls.adoc[]
+include::asciidoc/_chapters/schema_controls.adoc[]
+include::asciidoc/_chapters/update_stats.adoc[]
+include::asciidoc/_chapters/operational_controls.adoc[]

http://git-wip-us.apache.org/repos/asf/incubator-trafodion/blob/33d796d6/docs/load_transform/pom.xml
----------------------------------------------------------------------
diff --git a/docs/load_transform/pom.xml b/docs/load_transform/pom.xml
new file mode 100644
index 0000000..ef00fd5
--- /dev/null
+++ b/docs/load_transform/pom.xml
@@ -0,0 +1,289 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<project xmlns="http://maven.apache.org/POM/4.0.0"; 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance";
+  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 
http://maven.apache.org/maven-v4_0_0.xsd";>
+ <!-- 
+* @@@ START COPYRIGHT @@@                                                      
 
+*
+* Licensed to the Apache Software Foundation (ASF) under one
+* or more contributor license agreements.  See the NOTICE file
+* distributed with this work for additional information
+* regarding copyright ownership.  The ASF licenses this file
+* to you under the Apache License, Version 2.0 (the
+* "License"); you may not use this file except in compliance
+* with the License.  You may obtain a copy of the License at
+*
+*   http://www.apache.org/licenses/LICENSE-2.0
+*
+* Unless required by applicable law or agreed to in writing,
+* software distributed under the License is distributed on an
+* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+* KIND, either express or implied.  See the License for the
+* specific language governing permissions and limitations
+* under the License.
+*
+* @@@ END COPYRIGHT @@@
+-->
+  <modelVersion>4.0.0</modelVersion>
+  <groupId>org.apache.trafodion</groupId>
+  <artifactId>load-transform-guide</artifactId>
+  <version>${env.TRAFODION_VER}</version>
+  <packaging>pom</packaging>
+  <name>Trafodion Load and Transform Guide</name>
+  <description>This guide describes how to load and transform data into 
Trafodion.</description>
+  <url>http://trafodion.incubator.apache.org</url>
+  <inceptionYear>2015</inceptionYear>
+
+  <parent>
+    <groupId>org.apache.trafodion</groupId>
+    <artifactId>trafodion</artifactId>
+    <relativePath>../../pom.xml</relativePath>
+    <version>1.3.0</version>
+  </parent>
+
+
+  <licenses>
+    <license>
+      <name>The Apache Software License, Version 2.0</name>
+      <url>http://www.apache.org/licenses/LICENSE-2.0.txt</url>
+      <distribution>repo</distribution>
+      <comments>A business-friendly OSS license</comments>
+    </license>
+  </licenses>
+
+  <organization>
+    <name>Apache Software Foundation</name>
+    <url>http://www.apache.org</url>
+  </organization>
+
+  <issueManagement>
+    <system>JIRA</system>
+    <url>http://issues.apache.org/jira/browse/TRAFODION</url>
+  </issueManagement>
+
+  <scm>
+    
<connection>scm:git:http://git-wip-us.apache.org/repos/asf/incubator-trafodion.git</connection>
+    
<developerConnection>scm:git:https://git-wip-us.apache.org/repos/asf/incubator-trafodion.git</developerConnection>
+    
<url>https://git-wip-us.apache.org/repos/asf?p=incubator-trafodion.git</url>
+    <tag>HEAD</tag>
+  </scm>
+
+  <ciManagement>
+    <system>Jenkins</system>
+    <url>https://jenkins.esgyn.com</url>
+  </ciManagement>
+
+  <properties>
+    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
+    
<asciidoctor.maven.plugin.version>1.5.2.1</asciidoctor.maven.plugin.version>
+    <asciidoctorj.pdf.version>1.5.0-alpha.11</asciidoctorj.pdf.version>
+    <asciidoctorj.version>1.5.4</asciidoctorj.version>
+    <rubygems.prawn.version>2.0.2</rubygems.prawn.version>
+    <jruby.version>9.0.4.0</jruby.version>
+  </properties>
+
+  <repositories>
+    <repository>
+      <id>rubygems-proxy-releases</id>
+      <name>RubyGems.org Proxy (Releases)</name>
+      <url>http://rubygems-proxy.torquebox.org/releases</url>
+      <releases>
+        <enabled>true</enabled>
+      </releases>
+      <snapshots>
+        <enabled>false</enabled>
+      </snapshots>
+    </repository>
+  </repositories>
+  
+  <dependencies>
+    <dependency>
+      <groupId>rubygems</groupId>
+      <artifactId>prawn</artifactId>
+      <version>${rubygems.prawn.version}</version>
+      <type>gem</type>
+      <scope>provided</scope>
+    </dependency>
+    <dependency>
+      <groupId>org.jruby</groupId>
+      <artifactId>jruby-complete</artifactId>
+      <version>${jruby.version}</version>
+    </dependency>
+    <dependency>
+      <groupId>org.asciidoctor</groupId>
+      <artifactId>asciidoctorj</artifactId>
+      <version>${asciidoctorj.version}</version>
+    </dependency>
+  </dependencies>
+
+  <build>
+    <plugins>
+      <plugin>
+        <groupId>de.saumya.mojo</groupId>
+        <artifactId>gem-maven-plugin</artifactId>
+        <version>1.0.10</version>
+        <configuration>
+          <!-- align JRuby version with AsciidoctorJ to avoid redundant 
downloading -->
+          <jrubyVersion>${jruby.version}</jrubyVersion>
+          <gemHome>${project.build.directory}/gems</gemHome>
+          <gemPath>${project.build.directory}/gems</gemPath>
+        </configuration>
+        <executions>
+          <execution>
+            <goals>
+              <goal>initialize</goal>
+            </goals>
+          </execution>
+        </executions>
+      </plugin>
+      <plugin>
+        <groupId>org.apache.maven.plugins</groupId>
+        <artifactId>maven-resources-plugin</artifactId>
+        <version>2.7</version>
+        <configuration>
+          <encoding>UTF-8</encoding>
+          <attributes>
+            <generateReports>false</generateReports>
+          </attributes>
+        </configuration>
+      </plugin>
+      <plugin>
+        <groupId>org.asciidoctor</groupId>
+        <artifactId>asciidoctor-maven-plugin</artifactId>
+        <version>${asciidoctor.maven.plugin.version}</version> 
+        <dependencies>
+          <dependency>
+            <groupId>org.asciidoctor</groupId>
+            <artifactId>asciidoctorj-pdf</artifactId>
+            <version>${asciidoctorj.pdf.version}</version>
+          </dependency>
+          <dependency>
+            <groupId>org.asciidoctor</groupId>
+            <artifactId>asciidoctorj</artifactId>
+            <version>${asciidoctorj.version}</version>
+          </dependency>
+        </dependencies>
+        <configuration>
+          <sourceDirectory>${basedir}/src</sourceDirectory>
+        </configuration>
+        <executions>
+          <execution>
+            <id>generate-html-doc</id> 
+            <goals>
+              <goal>process-asciidoc</goal> 
+            </goals>
+            <phase>site</phase>
+            <configuration>
+              <doctype>book</doctype>
+              <backend>html5</backend>
+              <sourceHighlighter>coderay</sourceHighlighter>
+              <outputDirectory>${basedir}/target/site</outputDirectory>
+              <requires>
+                
<require>${basedir}/../shared/google-analytics-postprocessor.rb</require>
+              </requires>
+              <attributes>
+                <!-- Location of centralized stylesheet -->
+                
<stylesheet>${basedir}/../shared/trafodion-manuals.css</stylesheet>
+                <project-version>${env.TRAFODION_VER}</project-version>
+                <build-date>${maven.build.timestamp}</build-date>
+                
<google-analytics-account>UA-72491210-1</google-analytics-account>
+              </attributes>
+            </configuration>
+          </execution>
+          <execution>
+            <id>generate-pdf-doc</id>
+            <phase>site</phase>
+            <goals>
+              <goal>process-asciidoc</goal>
+            </goals>
+            <configuration>
+              <doctype>book</doctype>
+              <backend>pdf</backend>
+              <sourceHighlighter>coderay</sourceHighlighter>
+              <outputDirectory>${basedir}/target</outputDirectory>
+              <attributes>
+                <project-version>${env.TRAFODION_VER}</project-version>
+                <build-date>${maven.build.timestamp}</build-date>
+                <pdf-stylesdir>${basedir}/../shared</pdf-stylesdir>
+                <pdf-style>trafodion</pdf-style>
+                <icons>font</icons>
+                <pagenums/>
+                <toc/>
+                <idprefix/>
+                <idseparator>-</idseparator>
+              </attributes>
+            </configuration>
+          </execution>
+        </executions>
+      </plugin> 
+      <!-- Rename target/site/index.pdf to client-install-guide.pdf -->
+      <plugin>
+        <groupId>org.apache.maven.plugins</groupId>
+        <artifactId>maven-antrun-plugin</artifactId>
+        <version>1.8</version>
+        <inherited>false</inherited>
+        <executions>
+          <execution>
+            <id>populate-release-directories</id>
+            <phase>post-site</phase>
+            <configuration>
+              <target name="Populate Release Directories">
+                <!-- The website uses the following organization for the 
docs/target/docs directory:
+                  - To ensure a known location, the base directory contains 
the LATEST version of the web book and the PDF files.
+                  - The know location is docs/target/docs/<document>
+                  - target/docs/<version>/<document> contains version-specific 
renderings of the documents.
+                  - target/docs/<version>/<document> contains the PDF version 
and the web book. The web book is named index.html
+                --> 
+                <!-- Copy the PDF file to its target directories -->
+                <copy file="${basedir}/target/index.pdf" 
tofile="${basedir}/../target/docs/load_transform_guide/Trafodion_Load_Transform_Guide.pdf"
 />
+                <copy file="${basedir}/target/index.pdf" 
tofile="${basedir}/../target/docs/${project.version}/load_transform_guide/Trafodion_Load_Transform_Guide.pdf"
 />
+                <!-- Copy the Web Book files to their target directories -->
+                <copy todir="${basedir}/../target/docs/load_transform_guide">
+                  <fileset dir="${basedir}/target/site">
+                    <include name="**/*.*"/>  <!--All sub-directories, too-->
+                  </fileset>
+                </copy>
+                <copy 
todir="${basedir}/../target/docs/${project.version}/load_transform_guide">
+                  <fileset dir="${basedir}/target/site">
+                    <include name="**/*.*"/>  <!--All sub-directories, too-->
+                  </fileset>
+                </copy>
+              </target>
+            </configuration>
+            <goals>
+              <goal>run</goal>
+            </goals>
+          </execution>
+        </executions>
+      </plugin>
+    </plugins>
+  </build>
+
+  <!-- Included because this is required. No reports are generated. -->
+  <reporting>
+    <excludeDefaults>true</excludeDefaults>
+    <plugins>
+      <plugin>
+        <groupId>org.apache.maven.plugins</groupId>
+        <artifactId>maven-project-info-reports-plugin</artifactId>
+        <version>2.8</version>
+        <reportSets>
+          <reportSet>
+            <reports>
+            </reports>
+          </reportSet>
+        </reportSets>
+      </plugin>
+    </plugins>
+  </reporting>
+
+  <distributionManagement>
+    <site>
+      <id>trafodion.incubator.apache.org</id>
+      <name>Trafodion Website at incubator.apache.org</name>
+      <!-- On why this is the tmp dir and not trafodion.incubator.apache.org, 
see
+      
https://issues.apache.org/jira/browse/HBASE-7593?focusedCommentId=13555866&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13555866
+      -->
+      <url>file:///tmp</url>
+    </site>
+  </distributionManagement>
+</project>

http://git-wip-us.apache.org/repos/asf/incubator-trafodion/blob/33d796d6/docs/load_transform/src/asciidoc/_chapters/about.adoc
----------------------------------------------------------------------
diff --git a/docs/load_transform/src/asciidoc/_chapters/about.adoc 
b/docs/load_transform/src/asciidoc/_chapters/about.adoc
new file mode 100644
index 0000000..4ef4bd0
--- /dev/null
+++ b/docs/load_transform/src/asciidoc/_chapters/about.adoc
@@ -0,0 +1,186 @@
+////
+/**
+* @@@ START COPYRIGHT @@@
+*
+* Licensed to the Apache Software Foundation (ASF) under one
+* or more contributor license agreements.  See the NOTICE file
+* distributed with this work for additional information
+* regarding copyright ownership.  The ASF licenses this file
+* to you under the Apache License, Version 2.0 (the
+* "License"); you may not use this file except in compliance
+* with the License.  You may obtain a copy of the License at
+*
+*   http://www.apache.org/licenses/LICENSE-2.0
+*
+* Unless required by applicable law or agreed to in writing,
+* software distributed under the License is distributed on an
+* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+* KIND, either express or implied.  See the License for the
+* specific language governing permissions and limitations
+* under the License.
+*
+* @@@ END COPYRIGHT @@@
+  */
+////
+
+= About This Document
+This guide describes how to load and transform data into a Trafodion database. 
+
+The information herein is complementary to the following Trafodion 
documentation:
+
+* http://trafodion.apache.org/docs/sql_reference/index.html[Trafodion SQL 
Reference Manual]
+* http://trafodion.apache.org/docs/odb/index.html[Trafodion odb User Guide]
+
+== Intended Audience
+This guide targets anyone wanting to load data into a Trafodion database. You 
need to have the
+following skill sets to make full of the information in this guide:
+
+* SQL DDL and DML.
+* Install and configure Linux software.
+* Trafodion administration.
+* Depending on your data source, Java and/or Hadoop ecosystem usage.
+
+== New and Changed Information
+This is a new guide.
+
+== Notation Conventions
+This list summarizes the notation conventions for syntax presentation in this 
manual.
+
+* UPPERCASE LETTERS
++
+Uppercase letters indicate keywords and reserved words. Type these items 
exactly as shown. Items not enclosed in brackets are required. 
++
+```
+SELECT
+```
+
+* lowercase letters
++
+Lowercase letters, regardless of font, indicate variable items that you 
supply. Items not enclosed in brackets are required.
++
+```
+file-name
+```
+
+* &#91; &#93; Brackets 
++
+Brackets enclose optional syntax items.
++
+```
+DATETIME [start-field TO] end-field
+```
++
+A group of items enclosed in brackets is a list from which you can choose one 
item or none.
++
+The items in the list can be arranged either vertically, with aligned brackets 
on each side of the list, or horizontally, enclosed in a pair of brackets and 
separated by vertical lines.
++
+For example: 
++
+```
+DROP SCHEMA schema [CASCADE]
+DROP SCHEMA schema [ CASCADE | RESTRICT ]
+```
+
+* { } Braces 
++
+Braces enclose required syntax items.
++
+```
+FROM { grantee [, grantee ] ... }
+```
++ 
+A group of items enclosed in braces is a list from which you are required to 
choose one item.
++
+The items in the list can be arranged either vertically, with aligned braces 
on each side of the list, or horizontally, enclosed in a pair of braces and 
separated by vertical lines.
++
+For example:
++
+```
+INTERVAL { start-field TO end-field }
+{ single-field } 
+INTERVAL { start-field TO end-field | single-field }
+``` 
+
+* | Vertical Line 
++
+A vertical line separates alternatives in a horizontal list that is enclosed 
in brackets or braces.
++
+```
+{expression | NULL} 
+```
+
+* &#8230; Ellipsis
++
+An ellipsis immediately following a pair of brackets or braces indicates that 
you can repeat the enclosed sequence of syntax items any number of times.
++
+```
+ATTRIBUTE[S] attribute [, attribute] ...
+{, sql-expression } ...
+```
++ 
+An ellipsis immediately following a single syntax item indicates that you can 
repeat that syntax item any number of times.
++
+For example:
++
+```
+expression-n ...
+```
+
+* Punctuation
++
+Parentheses, commas, semicolons, and other symbols not previously described 
must be typed as shown.
++
+```
+DAY (datetime-expression)
+@script-file 
+```
++
+Quotation marks around a symbol such as a bracket or brace indicate the symbol 
is a required character that you must type as shown.
++
+For example:
++
+```
+"{" module-name [, module-name] ... "}"
+```
+
+* Item Spacing
++
+Spaces shown between items are required unless one of the items is a 
punctuation symbol such as a parenthesis or a comma.
++
+```
+DAY (datetime-expression) DAY(datetime-expression)
+```
++
+If there is no space between two items, spaces are not permitted. In this 
example, no spaces are permitted between the period and any other items:
++
+```
+myfile.sh
+```
+
+* Line Spacing
++
+If the syntax of a command is too long to fit on a single line, each 
continuation line is indented three spaces and is separated from the preceding 
line by a blank line.
++
+This spacing distinguishes items in a continuation line from items in a 
vertical list of selections. 
++
+```
+match-value [NOT] LIKE _pattern
+   [ESCAPE esc-char-expression] 
+```
+
+== Publishing History
+[cols="2*",options="header"]
+|===
+| Product Version | Publication Date
+| Trafodion Release 1.3.0 | January 2016.
+|===
+
+== Comments Encouraged
+The Trafodion community encourages your comments concerning this document. We 
are committed to providing documentation that meets your
+needs. Send any errors found, suggestions for improvement, or compliments to:
+
[email protected]
+
+Include the document title and any comment, error found, or suggestion for 
improvement you have concerning this document. Or, even
+better, join our community and help us improve our documentation. Please refer 
to 
+http://trafodion.incubator.apache.org/contributing_redirect.html[Trafodion 
Contributor Guide] for details.

http://git-wip-us.apache.org/repos/asf/incubator-trafodion/blob/33d796d6/docs/load_transform/src/asciidoc/_chapters/bulk_load.adoc
----------------------------------------------------------------------
diff --git a/docs/load_transform/src/asciidoc/_chapters/bulk_load.adoc 
b/docs/load_transform/src/asciidoc/_chapters/bulk_load.adoc
new file mode 100644
index 0000000..03f1385
--- /dev/null
+++ b/docs/load_transform/src/asciidoc/_chapters/bulk_load.adoc
@@ -0,0 +1,291 @@
+////
+/**
+* @@@ START COPYRIGHT @@@
+*
+* Licensed to the Apache Software Foundation (ASF) under one
+* or more contributor license agreements.  See the NOTICE file
+* distributed with this work for additional information
+* regarding copyright ownership.  The ASF licenses this file
+* to you under the Apache License, Version 2.0 (the
+* "License"); you may not use this file except in compliance
+* with the License.  You may obtain a copy of the License at
+*
+*   http://www.apache.org/licenses/LICENSE-2.0
+*
+* Unless required by applicable law or agreed to in writing,
+* software distributed under the License is distributed on an
+* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+* KIND, either express or implied.  See the License for the
+* specific language governing permissions and limitations
+* under the License.
+*
+* @@@ END COPYRIGHT @@@
+  */
+////
+
+[[bulk-load]]
+= Bulk Load
+
+The LOAD statement enables batch loading large volumes of data efficiently in 
a scalable manner.
+
+See the 
http://trafodion.apache.org/docs/sql_reference/index.html#load_statement[Trafodion
 SQL Reference Manual]
+for a full description of this SQL statement.
+
+You can bulk-load data using one of the following methods:
+
+* <<bulk-load-data-from-trafodion-tables,Load Data From Trafodion Tables>>
+* <<bulk-load-data-from-hdfs-files, Load Data From HDFS Files>>
+* <<bulk-load-data-from-hive-tables,Load Data From Hive Tables>>
+* <<bulk-load-data-from-external-databases,Load Data From External Databases>>
+
+[[bulk-load-from-trafodion-tables]]
+== Load Data From Trafodion Tables
+
+You copy data between two Trafodion tables by using the appropriate SELECT 
statement in the LOAD command.
+
+=== Example
+
+```
+```
+
+[[bulk-load-data-from-hdfs-files]]
+== Load Data From HDFS Files
+
+You can copy your data into an HDFS folder and then create an external Hive 
tables with the correct fields that
+you point to the HDFS folder containing the data. You can also specify a WHERE 
clause on the source data as a filter.
+See the 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-ExternalTables[External
 Tables]
+page on the https://cwiki.apache.org/confluence/display/Hive[Hive Wiki] for 
more information.
+
+When Hive tables are accessed through Trafodion columns with integral, string, 
and char types are accessible.
+See the 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types[LanguageManual
 Types]
+page on the https://cwiki.apache.org/confluence/display/Hive[Hive Wiki] for 
the data types available in Hive.
+
+Overall, you do the following:
+
+1. Export the data on the remote cluster.
+2. Transport files to Trafodion cluster via FTP, scp, or some other method.
+3. Use LOAD referencing HIVE external tables.
+
+=== Example
+
+
+NEED A DESCRIPTION OF THE USE CASE, THE LOAD FILE, AND ITS LAYOUT.
+
+Do the following:
+
+1. Using trafci, define the Trafodion table where you want to load the data.
++
+```
+CREATE TABLE customer_demographics_salt
+(
+    cd_demo_sk            INT NOT NULL
+  , cd_gender             CHAR(1)
+  , cd_marital_status     CHAR(1)
+  , cd_education_status   CHAR(20)
+  , cd_purchase_estimate  INT
+  , cd_credit_rating      CHAR(10)
+  , cd_dep_count          INT
+  , cd_dep_employed_count INT
+  , cd_dep_college_count  INT
+  , PRIMARY KEY (cd_demo_sk)
+)
+SALT USING 4 PARTITIONS ON (cd_demo_sk)
+;
+```
+
+2. Copy the data into HDFS 
++
+```
+hadoop fs -copyFromLocal $HOME/data/customer_demographics 
/hive/tpcds/customer_demographics
+```
+
+3. Using the Hive shell, create an external Hive table:
++
+```
+CREATE EXTERNAL TABLE customer_demographics
+(
+    cd_demo_sk            INT
+  , cd_gender             STRING
+  , cd_marital_status     STRING
+  , cd_education_status   STRING
+  , cd_purchase_estimate  INT
+  , cd_credit_rating      STRING
+  , cd_dep_count          INT
+  , cd_dep_employed_count INT
+  , cd_dep_college_count  INT
+)
+ROW FORMAT DELIMITED
+FIELDS TERMINATED BY '|'
+LOCATION '/hive/tpcds/customer_demographics'
+;
+```
+
+4. Using trafci, load the Trafodion `customer_demographics_salt` table from 
the Hive table named
+`hive.hive.customer_demographics`:
++
+```
+>>LOAD INTO customer_demographics_salt
++>SELECT * FROM hive.hive.customer_demographics WHERE cd_demo_sk <= 5000;
+Task: LOAD Status: Started Object: TRAFODION.HBASE.CUSTOMER_DEMOGRAPHICS_SALT
+Task: DISABLE INDEX Status: Started Object: 
TRAFODION.HBASE.CUSTOMER_DEMOGRAPHICS_SALT
+Task: DISABLE INDEX Status: Ended Object: 
TRAFODION.HBASE.CUSTOMER_DEMOGRAPHICS_SALT
+Task: PREPARATION Status: Started Object: 
TRAFODION.HBASE.CUSTOMER_DEMOGRAPHICS_SALT
+       Rows Processed: 5000
+Task: PREPARATION Status: Ended ET: 00:00:03.199
+Task: COMPLETION Status: Started Object: 
TRAFODION.HBASE.CUSTOMER_DEMOGRAPHICS_SALT
+Task: COMPLETION Status: Ended ET: 00:00:00.331
+Task: POPULATE INDEX Status: Started Object: 
TRAFODION.HBASE.CUSTOMER_DEMOGRAPHICS_SALT
+Task: POPULATE INDEX Status: Ended ET: 00:00:05.262
+```
+
+[[bulk-load-data-from-hive-tables]]
+== Load Data From Hive Tables
+
+You can import data from Hive using the TrafCI or sqlci command interface. Do 
the following:
+
+1. Set these required Control Query Defaults (CQDs) to improve load 
performance:
++
+```
+CQD HIVE_MAX_STRING_LENGTH '1000'; // if the widest column is 1KB
+```
++
+This setting is required if there are time-related column types in the target 
Trafodion table.
++
+```
+CQD ALLOW_INCOMPATIBLE_ASSIGNMENT 'on'; 
+```
+
+2. Issue the LOAD statement to load data into Trafodion tables from Hive. For 
example:
++
+```
+LOAD WITH NO POPULATE INDEXES INTO trafodion.sch.demo SELECT * FROM 
hive.hive.demo;
+```
+
+See the 
http://trafodion.incubator.apache.org/docs/sql_reference/index.html#load_statement[Trafodion
 SQL Reference Manual].
+for the completed syntax of the LOAD statement.
+
+If multiple invocations of the LOAD statement are used to incrementally load 
sets of data to a single target table, then several HFiles
+for each partition of the target table are created. This causes inefficient 
access during SELECT queries and may also cause a compaction
+to be triggered based on the policies configured in the HBase settings.
+
+To avoid this issue, it is good practice to perform a major compaction on a 
table that has been the target of more than two LOAD statements
+in a short period of time. To perform compaction, use this `hbase shell` 
command:
+
+```
+major_compact 'TRAFODION.SCH.DEMO'
+```
+
+NOTE: The `major_compact` command returns immediately since it's not waited. 
Typically, compaction of a large table takes a long time
+(several minutes to hours) to complete. You can monitor the progress of 
compaction from the HBase Master Web user interface.
+
+=== Example
+
+```
+>> CQD HIVE_MAX_STRING_LENGTH '1000' ;
+>> CQD ALLOW_INCOMPATIBLE_ASSIGNMENT 'on' ;
+>> LOAD WITH NO POPULATE INDEXES INTO trafodion.sch.demo SELECT * FROM 
hive.hive.demo ;
+
+<NEED MORE HERE, INCLUDING CREATING THE TRAFODION TABLE AND WHAT TO DO ABOUT 
INDEXES>
+```
+
+[[bulk-load-data-from-external-databases]]
+== Load Data From External Databases
+
+You need to import data into Hive when loading data from external databases.
+Use http://sqoop.apache.org/[Apache Sqoop], an open-source tools to move the 
data from the external database
+into Hive tables on the Trafodion cluster.
+
+Source data can be in the following formats:
+
+[cols="40%,60%", options="header"]
+|===
+| Format | Examples
+| *Structured* | Relational databases such as Oracle or MySQL.
+| *Semi-Structured* | Cassandra or HBase
+| *Unstructured* | HDFS
+|===
+
+You use the Sqoop command-line shell for interactive commands and basic 
scripting.
+
+Sqoop basics: 
+
+* Generic JDBC Connector: supports JDBC T-4 Driver.
+* Configuration Language for FROM/TO jobs that specify in SQL terms.
+* Partitioner:  Divide/parallelize the data streams; uses primary key by 
default.
+* Extractor:  Uses FROM configuration for SQL statements, plus partitioner 
information to query data subsets.
+* Loader:  Uses TO job configuration; INSERT INTO could be generated from col 
list or explicitly specified.
+* Destroyer:  Copies staging table to final table and deletes staging table.
+
+See the http://sqoop.apache.org/docs/1.99.6/Sqoop5MinutesDemo.html[Sqoop 5 
Minutes Demo] for a quick
+introduction to Sqoop.
+ 
+
+=== Install Required Software
+
+By default, Sqoop is not installed on Trafodion clusters. Do the following:
+
+* Install and start Sqoop on the Trafodion cluster using either the Ambari or 
Cloudera Manager GUI.
+See the http://sqoop.apache.org/docs/1.99.3/Installation.html[Sqoop 
installation instructions].
+
+* Install 
http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html[JDK
 1.8]
+
+* Install the 
http://www.oracle.com/technetwork/database/features/jdbc/index-091264.html[Oracle
 JDBC driver] 
+
+* Set the following environment variables:
++
+```
+export JAVA_HOME=/opt/java/jdk1.8.0_11
+export 
JAVA_OPTIONS=-Dmapred.child.java.opts=\-Djava.security.egd=file:/dev/urandom+
+```
+
+=== Sample Sqoop Commands
+
+==== List All Oracle Tables
+
+```
+sqoop list-tables --driver oracle.jdbc.OracleDriver --connect 
jdbc:oracle:thin:@<Oracle host name>:<port>/<database>
+--username <user-name> --password <password>
+```
+
+==== Import Data to Hive
+
+*Syntax*
+
+```
+sqoop import --connect jdbc:oracle:thin:@<Oracle host name:port>/<database>
+--username <user-name> --password <password> --table <tablename>
+--split-by <column-name> --hive-import --create-hive-table
+--hive-table <hive-table-name> --hive-overwrite --null-string ''
+--null-non-string '' --hive-drop-import-delims--verbose
+```
+
+[cols="40%,60%",options="header"]
+|===
+| Parameter | Guidelines
+| `--split-by <column-name>`
+| By default, if not specified, sqoop uses the primary key column as a 
splitting column, which is not optimal most of the time.
+If the table does not contain a primary key, then you must manually specify 
the splitting column.
+| `--null-string <null-string>`
+| This is the string to be written for a null value in a string column.
+| `--null-non-string <null-string>`
+| This is the string to be written for a null value in a non-string column.
+| `--hive-drop-import-delims`
+| This drops `\n`, `\r`, and `\01` string fields when importing to Hive. +
+ +
+*NOTE*: If the data contains \n or \r and if you do not use the 
hive-drop-import-delims option, then data is truncated.
+You need to use additional Sqoop options during migration by specifying the 
delimiter that you would like to use,
+which does not exist in the data itself.
+|===
+
+=== Example
+
+ADD EXAMPLE HERE.
+
+```
+```
+
+
+
+
+
+

http://git-wip-us.apache.org/repos/asf/incubator-trafodion/blob/33d796d6/docs/load_transform/src/asciidoc/_chapters/bulk_unload.adoc
----------------------------------------------------------------------
diff --git a/docs/load_transform/src/asciidoc/_chapters/bulk_unload.adoc 
b/docs/load_transform/src/asciidoc/_chapters/bulk_unload.adoc
new file mode 100644
index 0000000..5ba2349
--- /dev/null
+++ b/docs/load_transform/src/asciidoc/_chapters/bulk_unload.adoc
@@ -0,0 +1,73 @@
+////
+/**
+* @@@ START COPYRIGHT @@@
+*
+* Licensed to the Apache Software Foundation (ASF) under one
+* or more contributor license agreements.  See the NOTICE file
+* distributed with this work for additional information
+* regarding copyright ownership.  The ASF licenses this file
+* to you under the Apache License, Version 2.0 (the
+* "License"); you may not use this file except in compliance
+* with the License.  You may obtain a copy of the License at
+*
+*   http://www.apache.org/licenses/LICENSE-2.0
+*
+* Unless required by applicable law or agreed to in writing,
+* software distributed under the License is distributed on an
+* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+* KIND, either express or implied.  See the License for the
+* specific language governing permissions and limitations
+* under the License.
+*
+* @@@ END COPYRIGHT @@@
+  */
+////
+
+[[bulk-unload]]
+= Bulk Unload
+
+The 
http://trafodion.apache.org/docs/sql_reference/index.html#unload_statement[UNLOAD]
 statement is a
+Trafodion extension that exports data from Trafodion tables into an HDFS 
location that you specify. 
+Refer to the 
http://trafodion.apache.org/docs/sql_reference/index.html#unload_statement[Trafodion
 SQL Reference Manual]
+for complete documentation.
+
+The extracted data can be either compressed or uncompressed based on what you 
choose. 
+UNLOAD performs multiple steps based on the options you give; for example:
+
+* If using snapshot scan:
+** Get list of Trafodion tables from the query plan.
+** Create/verify snapshots.
+
+* Purge target location, if specified
+
+* Extract: 
+** Copy table data to data files.
+** Non-compressed is straight copy.
+** Compressed means compression takes place while writing data (no extra step)
+
+* Merge Data Files if specified
+
+*Example*
+
+This example shows how the UNLOAD statement extracts data from a
+Trafodion table, `TRAFODION.HBASE.CUSTOMER_DEMOGRAPHICS`, into an HDFS
+folder, `/bulkload/customer_demographics`:
+
+```
+>>UNLOAD
++>WITH PURGEDATA FROM TARGET
++>MERGE FILE 'merged_customer_demogs.gz' OVERWRITE
++>COMPRESSION GZIP
++>INTO '/bulkload/customer_demographics'
++>SELECT * FROM trafodion.hbase.customer_demographics
++><<+ cardinality 10e10 ,+ cardinality 10e10 >>;
+Task: UNLOAD Status: Started
+Task: EMPTY TARGET Status: Started
+Task: EMPTY TARGET Status: Ended ET: 00:00:00.014
+Task: EXTRACT Status: Started
+       Rows Processed: 200000
+Task: EXTRACT Status: Ended ET: 00:00:04.743 Task: MERGE FILES Status: Started
+Task: MERGE FILES Status: Ended ET: 00:00:00.063
+
+--- 200000 row(s) unloaded.
+```

http://git-wip-us.apache.org/repos/asf/incubator-trafodion/blob/33d796d6/docs/load_transform/src/asciidoc/_chapters/introduction.adoc
----------------------------------------------------------------------
diff --git a/docs/load_transform/src/asciidoc/_chapters/introduction.adoc 
b/docs/load_transform/src/asciidoc/_chapters/introduction.adoc
new file mode 100644
index 0000000..2bbc3e1
--- /dev/null
+++ b/docs/load_transform/src/asciidoc/_chapters/introduction.adoc
@@ -0,0 +1,117 @@
+////
+/**
+* @@@ START COPYRIGHT @@@
+*
+* Licensed to the Apache Software Foundation (ASF) under one
+* or more contributor license agreements.  See the NOTICE file
+* distributed with this work for additional information
+* regarding copyright ownership.  The ASF licenses this file
+* to you under the Apache License, Version 2.0 (the
+* "License"); you may not use this file except in compliance
+* with the License.  You may obtain a copy of the License at
+*
+*   http://www.apache.org/licenses/LICENSE-2.0
+*
+* Unless required by applicable law or agreed to in writing,
+* software distributed under the License is distributed on an
+* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+* KIND, either express or implied.  See the License for the
+* specific language governing permissions and limitations
+* under the License.
+*
+* @@@ END COPYRIGHT @@@
+  */
+////
+
+[[introduction]]
+= Introduction
+
+[[introduction-load-methods]]
+== Load Methods
+There are two methods used to load data into a Trafodion table. Both methods 
can run while the database is concurrently queried: 
+
+[cols="20%,40%,40%",options="header"]
+|===
+| Type           | Description         | Methods/Tools
+| *Bulk Load* | Large data volumes +
+Stage data and load in the batches | Trafodion Bulk Loader
+| *Trickle Load* | Small data volumes +
+Insert data at it arrives | odb tool +
+ETL tool or custom ODBC/JDBC application
+|===
+
+These two methods use four types of SQL insert statements
+
+* *Bulk Load*
+** 
http://trafodion.apache.org/docs/sql_reference/index.html#load_statement[LOAD]
+
+* *Trickle Load*
+** 
http://trafodion.incubator.apache.org/docs/sql_reference/index.html#insert_statement[INSERT]
+** 
http://trafodion.apache.org/docs/sql_reference/index.html#upsert_statement[UPSERT]
+** 
http://trafodion.apache.org/docs/sql_reference/index.html#upsert_statement[UPSERT
 USING LOAD]
+
+The 
http://trafodion.incubator.apache.org/docs/sql_reference/index.html[Trafodion 
SQL Reference Manual]
+provides syntax descriptions for these statements.
+
+The data source defines what type of load approach and method you use:
+
+* *Bulk Load* (LOAD statement)
+** _Text Files_: Map an external Hive table.
+** _JDBC-Compliant Database_: Load into Hive on the Trafodion cluster using 
`sqoop`.
+** _Hive Tables_: Direct load.
+** _Native HBase Tables_: Direct load.
+** _Disparate Data Source_: Write Java/C++ UDF to read data from source and 
pass rows to LOAD.
+
+* *Trickle Load* (odb utility)
+** _Text Files_: Direct access
+** _pipes_: Via `stdin`
+** _ODBC-Compliant Database_: odb COPY command, no intermediate storage
+
+For more information, refer to:
+
+* <<bulk-load,Bulk Load>>
+* <<trickle-load, Trickle Load>>
+
+[[introduction-insert-types]]
+=== Insert Types
+
+The following insert types are supported in Trafodion:
+
+* `INSERT INTO T &#8230;`
+* `UPSERT INTO T &#8230;`
+* `UPSERT USING LOAD INTO T &#8230;`
+* `LOAD INTO T &#8230;`
+
+The following table compares the different insert types:
+
+[cols="20%,20%,20%,20%,20%",options="header"]
+|===
+| Characteristic | INSERT | UPSERT | UPSERT USING LOAD | LOAD
+| *Transaction* | Yes | Yes | No, uses HBase WAL for recovery | No, uses 
snapshot for recovery
+| *Method of Operation* | Uses the standard HBase write path through its 
`CheckAndPut` call. Rows are held in transaction co-processor memory until the 
transaction is committed. | Uses the standard HBase write path through its 
`Put` call. | Uses the standard HBase write path through its `Put` call. | Uses 
the HBase bulk load write path and creates HFiles directly, bypassing HBase 
RegionServers for most of its operation.
+| *Uniqueness Constraint* | Enforced | Not enforced. New row with the same key 
value overwrites previous row. | Not enforced. New row with same key value 
overwrites the previous row. | Enforced only within the set of rows in a single 
statement. Not enforced with rows already in the table.
+| *Index* | Can be used on a table with an index. | Can be used on a table 
with an index. | When used on a table with an index, it reverts to UPSERT. | 
Can be used on a table with an index. Index is off-line during the LOAD.
+| *Max Size/Invocation* | 10,000 * n^1^ rows | 10,000 * n^1^ rows | 5 million 
* n^1^ rows | 2 billion * n^1^ rows
+| *Min Size/Invocation* | 1 row | 1 row | 1 row | Suitable for greater than 1 
million * n^1^ rows
+| *Speed* | Slowest | Faster than INSERT | Faster than UPSERT | Fastest
+|===
+
+^1^ *n* is the number of nodes in each invocation.
+
+Throughput, max/min sizes depends on multiple factors:
+
+* Format of rows in Trafodion table (aligned format or not).
+* Length of row.
+* Number of columns in row.
+* Data type of columns.
+* Network between nodes in cluster.
+* WAL setting.
+* Number of clients.
+* Use of rowsets.
+
+== Unload
+
+The Trafodion UNLOAD statement exports data from Trafodion tables into an HDFS 
directory. Refer to <<bulk-unload,Bulk Unload>> for more information.
+
+
+

http://git-wip-us.apache.org/repos/asf/incubator-trafodion/blob/33d796d6/docs/load_transform/src/asciidoc/_chapters/monitor.adoc
----------------------------------------------------------------------
diff --git a/docs/load_transform/src/asciidoc/_chapters/monitor.adoc 
b/docs/load_transform/src/asciidoc/_chapters/monitor.adoc
new file mode 100644
index 0000000..ab66bf8
--- /dev/null
+++ b/docs/load_transform/src/asciidoc/_chapters/monitor.adoc
@@ -0,0 +1,77 @@
+////
+/**
+* @@@ START COPYRIGHT @@@
+*
+* Licensed to the Apache Software Foundation (ASF) under one
+* or more contributor license agreements.  See the NOTICE file
+* distributed with this work for additional information
+* regarding copyright ownership.  The ASF licenses this file
+* to you under the Apache License, Version 2.0 (the
+* "License"); you may not use this file except in compliance
+* with the License.  You may obtain a copy of the License at
+*
+*   http://www.apache.org/licenses/LICENSE-2.0
+*
+* Unless required by applicable law or agreed to in writing,
+* software distributed under the License is distributed on an
+* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+* KIND, either express or implied.  See the License for the
+* specific language governing permissions and limitations
+* under the License.
+*
+* @@@ END COPYRIGHT @@@
+  */
+////
+
+[[monitor-progress]]
+= Monitor Progress
+
+== INSERT and UPSERT
+
+For an INSERT statement, rows are written to the HBase table that represents 
the Trafodion table when the transaction commits. 
+It is more difficult to see query progress here.
+
+== UPSERT USING LOAD
+
+For an UPSERT USING LOAD statement, rows added are visible in the Trafodion 
table after each `ListOfPut` call succeeds. 
+You can use a `SELECT COUNT(*)` statement to monitor progress. That way, you 
know how many rows are already in the table when the 
+statement starts executing. 
+
+```
+SELECT COUNT(*) FROM trafodion.sch.demo ;
+```
+
+== LOAD
+For LOAD, query progress goes through a few phases, which sometimes overlap:
+
+1. Hive scan.
+2. Sort.
+3. Create prep HFiles in HDFS bulkload staging directory (`/bulkload` by 
default).
+4. Move HFiles into HBase.
+
+You can monitor progress in step 2, sort, with this shell command:
+
+```
+lsof +L1 | grep SCR | wc -l
+```
+
+This command returns a count of the number of overflow files for sort. Each 
file is 2GB in size.
+You need to have an approximate idea of  the volume of data being loaded to 
know how much more
+data needs to be sorted. On a cluster, sort is done on all nodes with a 
pdsh-like utility.
+Trafodion data volume can also be larger than Hive data volume by a factor of 
2 or 3. 
+
+In step 3, create prep HFiles, use the following command to monitor the volume 
of data written
+out to the staging directory:
+
+```
+hadoop fs -dus /bulkload 
+```
+
+The `hadoop fs` command must be run from one node and does not have to be 
repeated across the cluster.
+
+If compression and encoding are used, then the size should be similar to the 
Hive source data volume.
+There may be some remnant data in the staging directory from previous 
commands, so we have to 
+take that into account. This step will start only when sort has completed.
+
+Step 4 is usually the shortest and typically does not exceed a few minutes.
+

http://git-wip-us.apache.org/repos/asf/incubator-trafodion/blob/33d796d6/docs/load_transform/src/asciidoc/_chapters/tables_indexes.adoc
----------------------------------------------------------------------
diff --git a/docs/load_transform/src/asciidoc/_chapters/tables_indexes.adoc 
b/docs/load_transform/src/asciidoc/_chapters/tables_indexes.adoc
new file mode 100644
index 0000000..71ea7f7
--- /dev/null
+++ b/docs/load_transform/src/asciidoc/_chapters/tables_indexes.adoc
@@ -0,0 +1,208 @@
+////
+/**
+* @@@ START COPYRIGHT @@@
+*
+* Licensed to the Apache Software Foundation (ASF) under one
+* or more contributor license agreements.  See the NOTICE file
+* distributed with this work for additional information
+* regarding copyright ownership.  The ASF licenses this file
+* to you under the Apache License, Version 2.0 (the
+* "License"); you may not use this file except in compliance
+* with the License.  You may obtain a copy of the License at
+*
+*   http://www.apache.org/licenses/LICENSE-2.0
+*
+* Unless required by applicable law or agreed to in writing,
+* software distributed under the License is distributed on an
+* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+* KIND, either express or implied.  See the License for the
+* specific language governing permissions and limitations
+* under the License.
+*
+* @@@ END COPYRIGHT @@@
+  */
+////
+
+[[tables-indexes]]
+= Tables and Indexes
+
+The following guidance helps you set up your tables and indexes for better 
load performance.
+
+[[choose-primary-key]]
+== Choose Primary Key
+
+The primary key for a Trafodion table must be chosen based on the workload 
that  accesses the table.
+
+Keyed access to Trafodion tables is very efficient since HBase is a key-value 
store. You need to analyze the queries
+that are used to access the tables to understand their predicates and join 
conditions. Once identified, you can 
+choose a primary key that ensures that the leading key columns have highly 
selective predicates applied to them.
+
+This technique limits the number of rows that need to scanned in the HBase. 
Trafodion uses MDAM (Multi Dimensional Access Method) to limit
+the rows scanned when predicates are present to only trailing key columns and 
not the leading key column. MDAM works best when the
+unique entry count of leading key columns (on which predicates are absent) is 
low.
+
+[[salting]]
+== Salting 
+
+With range partitioned data in some workloads, certain key ranges of data may 
see more access than other key ranges. This can lead to an
+unbalanced usage pattern with some HBase RegionServers handling most of the 
load. This behavior is referred to as "hot-spotting."
+
+With Native HBase tables, hot-spotting is often addressed by designing 
appropriate keys. In Trafodion, once you choose the key to a table, as
+discussed in <<choose-primary-key,Choose Primary Key>>, you can use *salting* 
to distribute the data evenly. Salting applies a
+hash function to the salt keys and distributes data to partitions based on 
this hash value. The hash value is physically stored in the
+table as the leading key value. Each split of the table will have only one 
salt key value. 
+
+The salting key can be any subset (including the whole set) of the primary 
key. It is a good practice to keep the salting key as small 
+as possible. The key should provide an even distribution of data, which can be 
achieved when the key values have a large unique entry 
+count and no significant skew.
+
+The number of partitions must also be specified during table creation. You 
choose the number of partition depending on the size of the 
+cluster and the expected size of the table. A salted table can split if more 
data is added to it than initially estimated. If this
+happens, then more than one partition having rows with the same salt value, 
which may result in suboptimal execution plans for the table.
+
+You can also choose not to salt Trafodion tables. This is similar to range 
partitioning in a traditional database. The number of partitions
+grows with the size of the table, and range boundaries are determined by HBase 
based on the specified split policy.
+
+[[compression-encoding]]
+== Compression and Encoding
+
+Large Trafodion tables must be encoded and compressed. Trafodion tables that 
have a large key or several columns grow in size to 10X or more
+when compared to a Hive table with equivalent data since HBase stores the key 
separately for every column in a row. 
+
+HBase provides several types of encoding to avoid storing the same key value 
to disk for every column in the row. HBase also supports various
+types of compression of the entire data block, regardless whether it is 
encoded or not.
+
+See http://hbase.apache.org/book.html#compression[Appendix E: Compression and 
Data Block Encoding In HBase] in the
+http://hbase.apache.org/book.html[Apache HBase Reference Guide] for a 
comparison of various compression and encoding algorithms. 
+
+For a discussion on the space savings achieved by various types of encoding 
and compression schemes, please see the
+https://blogs.apache.org/hbase/category/General["The Effect of ColumnFamily, 
RowKey and KeyValue Design on HFile Size"] HBase blog entry,
+dated Friday April 11, 2014. 
+
+The performance of an algorithm is also a relevant factor when deciding on the 
type of compression. For more information, see slides 6, 7, 
+and 8 of Govind Kamat's and Sumeet Singh's presentation,
+http://www.slideshare.net/Hadoop_Summit/kamat-singh-june27425pmroom210cv2["Compression
 Options in Hadoop - A Tale of Tradeoffs"].
+
+[[create-trafodion-tables-and-indexes]]
+== Create Tables and Indexes 
+
+Create Trafodion tables using the CREATE TABLE statements with the `SALT USING 
<num> PARTITIONS` clause for salting and
+the `HBASE_OPTIONS` clause for compression and encoding.
+
+*Example*
+
+```
+CREATE TABLE trafodion.sch.demo
+( demo_sk INT NOT NULL
+, name VARCHAR(100)
+, PRIMARY KEY (demo_sk)
+)
+HBASE_OPTIONS
+( DATA_BLOCK_ENCODING = 'FAST_DIFF'
+, COMPRESSION = 'SNAPPY'
+, MEMSTORE_FLUSH_SIZE = '1073741824'
+)
+SALT USING 8 PARTITIONS ON (demo_sk);
+```
+
+ANY indexes on the table may be salted or not. However, if they are salted, 
their salting key and number of partitions must be the same as the table.
+
+*Example*
+
+```
+CREATE INDEX demo_ix ON sch.demo(name)
+HBASE_OPTIONS
+( DATA_BLOCK_ENCODING = 'FAST_DIFF'
+, COMPRESSION = 'GZ'
+)
+SALT LIKE TABLE;
+```
+
+== Update Statistics
+
+To generate good plans that allow queries to execute quickly and use resources 
wisely, the Trafodion Optimizer must have a good idea about how the
+values of columns are distributed, the number of distinct values, and so on. 
Trafodion supplies this information to the optimizer in the
+form of histograms generated by executing the UPDATE STATISTICS statement. See 
the
+http://trafodion.apache.org/docs/sql_reference/index.html#update_statistics_statement[Trafodion
 SQL Reference Manual] for a full
+description of this statement.
+
+=== Default Sampling
+While accurate statistics are important, the time required to generate them by 
reading every row in the table may be prohibitive and is
+usually unnecessary. Random sampling of the rows of the table can give 
adequate results in a fraction of the time required to read all
+the values in the table. For most situations, the best option is to simply 
specify SAMPLE at the end of the UPDATE STATISTICS statement,
+which will use the default sampling protocol. For example, to use default 
sampling in the construction of histograms for each column of
+table T1, you would execute the following statement:
+
+```
+UPDATE STATISTICS FOR TABLE t1 ON EVERY COLUMN SAMPLE;
+```
+
+This default sampling protocol uses a high sampling rate for small tables, 
reducing the rate with a steep gradient until hitting 1% and
+capping the sample size at one million rows. The specific details of default 
sampling are as follows:
+
+* Use the full table for tables up to 10,000 rows.
+* For table sizes from 10,000 up to a million rows, 10,000 rows are randomly 
sampled. In effect, this causes the sampling rate to decline
+from 100% to 1% as a function of increasing table size.
+* For tables with one million to 100 million rows, use a 1% random sample.
+* For tables exceeding 100 million rows, the sampling rate is calculated as 1 
million divided by the number of rows in the table.
+This limits the overall sample size to 1 million rows while ensuring uniform 
random sampling across the entire table.
+
+== Generate Single-Column and Multi-Column Histograms From One Statement
+
+If you use the ON EVERY COLUMN syntax in an UPDATE STATISTICS statement, then 
it is important to realize that multi-column histograms can be
+requested in the same statement. For example, if you wanted to generate a 
histogram for each single column of table T1, as well as
+multi-column histograms for column sets (c1, c2) and (c5, c6, c7), then you 
could use the following statement:
+
+```
+UPDATE STATISTICS FOR TABLE t1 ON EVERY COLUMN, (c1,c2), (c5,c6,c7) SAMPLE;
+```
+
+In terms of the end result, this is equivalent to the following pair of 
statements:
+
+```
+UPDATE STATISTICS FOR TABLE t1 ON EVERY COLUMN SAMPLE;
+UPDATE STATISTICS FOR TABLE t1 ON (c1, c2), (c5, c6, c7) SAMPLE;
+```
+
+However, the performance is superior when they are combined into a single 
statement because a multi-column histogram depends
+on the single-column histograms of its component columns. Therefore, 
separating the generation of single-column and multi-column histograms
+for a table into two statements leads to redundantly calculating some of the 
single-column histograms. Even though the
+relevant single-column histograms already exist, they are recomputed at the 
time the multi-column histograms are generated.
+
+=== Enable Update Statistics Automation
+
+If a standard set of queries is run on a regular basis, then one way to 
generate only those histograms that are needed for efficient execution
+of those queries is to enable update statistics automation, and then PREPARE 
each of the queries:
+
+```
+CONTROL QUERY DEFAULT USTAT_AUTOMATION_INTERVAL '1440';
+PREPARE s FROM SELECT...;
+```
+
+The value of the CQD USTAT_AUTOMATION_INTERVAL is intended to determine the 
automation interval (in minutes) for update statistics
+automation. The PREPARE statement causes the Trafodion Compiler to compile and 
optimize a query without executing it. In the process
+of doing so with automation enabled, any histograms needed by the optimizer 
that are missing causes those columns to be marked
+as needing histograms. Then, the following UPDATE STATISTICS statement can be 
run against each table to generate the needed histograms:
+
+```
+UPDATE STATISTICS FOR TABLE <table-name> ON NECESSARY COLUMNS SAMPLE;
+```
+
+=== Regenerate Histograms
+
+Histograms can become "stale" as the underlying data changes and possibly 
reflects a different distribution of values, although
+it is possible that data turnover or accumulation can be high while 
maintaining the same distribution. To ensure that statistics
+remain accurate, you should regenerate histograms for a table once significant 
changes have been made to that table since its
+histograms were last generated. To refresh existing histograms without adding 
new ones, use the following statement:
+
+```
+UPDATE STATISTICS FOR TABLE <table-name> ON EXISTING COLUMNS SAMPLE;
+```
+
+The critical set of histograms that were previously generated with the ON 
NECESSARY COLUMNS syntax can be periodically regenerated
+using ON EXISTING COLUMNS. Note that using ON NECESSARY COLUMNS will only 
identify those columns that have been previously
+requested by the optimizer but do not exist. The current implementation of 
automation does not know which existing histograms might be stale.
+
+
+
+LocalWords:  c7

http://git-wip-us.apache.org/repos/asf/incubator-trafodion/blob/33d796d6/docs/load_transform/src/asciidoc/_chapters/trickle_load.adoc
----------------------------------------------------------------------
diff --git a/docs/load_transform/src/asciidoc/_chapters/trickle_load.adoc 
b/docs/load_transform/src/asciidoc/_chapters/trickle_load.adoc
new file mode 100644
index 0000000..3d3bd1b
--- /dev/null
+++ b/docs/load_transform/src/asciidoc/_chapters/trickle_load.adoc
@@ -0,0 +1,496 @@
+////
+/**
+* @@@ START COPYRIGHT @@@
+*
+* Licensed to the Apache Software Foundation (ASF) under one
+* or more contributor license agreements.  See the NOTICE file
+* distributed with this work for additional information
+* regarding copyright ownership.  The ASF licenses this file
+* to you under the Apache License, Version 2.0 (the
+* "License"); you may not use this file except in compliance
+* with the License.  You may obtain a copy of the License at
+*
+*   http://www.apache.org/licenses/LICENSE-2.0
+*
+* Unless required by applicable law or agreed to in writing,
+* software distributed under the License is distributed on an
+* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+* KIND, either express or implied.  See the License for the
+* specific language governing permissions and limitations
+* under the License.
+*
+* @@@ END COPYRIGHT @@@
+  */
+////
+
+[[trickle-load]]
+= Trickle Load
+
+Trafodion Trickle Load allows data to be committed in batches, with sizes 
ranging from 1 row to a several
+thousand rows in each commit. Trickle Load uses the following SQL statements 
(defined in the
+http://trafodion.incubator.apache.org/docs/sql_reference/index.html[Trafodion 
SQL Reference Manual]:
+
+* 
http://trafodion.incubator.apache.org/docs/sql_reference/index.html#insert_statement[INSERT]
+* 
http://trafodion.apache.org/docs/sql_reference/index.html#upsert_statement[UPSERT]
+* 
http://trafodion.apache.org/docs/sql_reference/index.html#upsert_statement[UPSERT
 USING LOAD]
+
+Contrary to <<bulk-load,Bulk Load>>, committed rows are immediately visible 
from other transactions
+thereby leading to minimal latency in making newly ingested rows visible to 
applications and end users. 
+
+You use Trickle Load in the following situations:
+
+* Inserting and/or updating data on an ongoing basis. Typically, you create a 
custom JDBC or ODBC
+application for this approach.
+
+* You want to migrate a smaller amount of data (a few millions rows). 
Typically, you use JDBC- or
+ODBC-based ETL tools for this approach; for example:
+** <<trickle-load-odb,Trafodion odb^1^>>
+** http://squirrel-sql.sourceforge.net[SQuirrel-SQL]
+** http://www.pentaho.com/[Pentaho]
+** http://www.informatica.com/us/[Informatica].
+
+^1^ Trafodion obd typically achieves better load throughput than third-party 
ETL tools.
+
+[[trickle-load-improving-throughput]]
+== Improving Throughput
+
+Trickle Load uses the HBase write path, with every row being written to the 
WAL (Write-Ahead Log) and HBase MemStore.
+When memstore is full data is flushed to HStorefile in background.
+
+Throughput can be improved by use of:
+
+* Rowsets or Batch Updates.
+* UPSERT instead of INSERT statements, if applicable.
+* Multiple simultaneous connections.
+
+In addition, when using INSERT or UPSERT USING LOAD with the objective of 
maximizing data ingestion throughput,
+increasing the HBase table attribute MEMSTORE_FLUSHSIZE from its default value 
helps.
+
+The actual value you use depends on the heap size allocated to each Region 
Server, the concurrent query workload, and the
+number of tables for which simultaneous fast data ingestion is needed. With a 
heap size of 31 GB for each Region Server in
+an environment with heavy concurrent query workload, setting this attribute 1 
GB gives good performance.
+
+You can specify this attribute in the HBASE_OPTIONS clause when creating the 
table. Alternatively, you can also set it from
+the `hbase shell` through an `ALTER 'TRAFODION.<schema-name>.<table-name>', 
MEMSTORE_FLUSHSIZE &#62;&#61; '1073741824'` command.
+
+[[trickle-load-odb]]
+== odb
+
+odb is a Linux and Windows Trafodion client that is:
+
+* ODBC based
+* Database agnostic query driver
+* Query interpreter
+* Loader and extractor
+
+odb may be installed on:
+
+* The Trafodion cluster.
+* The machine that contains source data
+* An intermediate machine that is being used for data loading.
+
+Source data can be in any database (local or remote) that supports ODBC or in 
flat files local to the machine hosting
+the odb tool.
+
+odd uses threads to achieve parallelism, rowsets to improve throughput. You 
can specify INSERT, UPSERT or UPSERT USING LOAD
+insert types.
+
+NOTE: odb does not use the bulk load command LOAD, and, therefore, throughput 
when using odb may be lower than what can be achieved
+with the bulk loader. However, when using the odb tool, source data need not 
be moved to the Trafodion cluster in a separate step.
+
+odb allows you to access Hadoop data using one of the following methods:
+
+1.  *Use Hive and its ODBC Driver*: odb can access HIVE like any other 
relational database.
+For example, you can copy to from HIVE and other databases using odb's copy 
option.
+2.  *Add the hdfs.*` prefix to the input or output file during 
loads/extracts*: The file is read/written
+from/to Hadoop. odb interacts directly with the HDFS file system using 
*libhdfs*.
++
+This option is currently available only under Linux.
+
+The following odb commands/features are discussed in this guide:
+
+* <<trickle-load-odb-load, odb Load>>
+* <<trickle-load-odb-copy, odb Copy>>
+* <<trickle-load-odb-extract, odb Extract>>
+* <<trickle-load-odb-transform, odb Transform>>
+
+See the 
http://trafodion.incubator.apache.org/docs/odb_user/index.html[Trafodion odb 
User Guide]
+for installation instructions and usage syntax for the odb tool.
+
+The following subsections assume that you've installed odb.
+
+[[trickle-load-odb-throughput]]
+=== odb Throughput 
+
+You achieve the best throughput with odb if using the UPSERT USING LOAD option.
+
+The default insert type used by odb is INSERT; to use UPSERT USING LOAD, 
please specify `:loadcmd=UL` in odb's `load` or
+`copy` command.
+
+*Example*
+
+Copy the table `mytable` from `<source_catalog>.<source_schema>` on the source 
database to `trafodion.my
+schema.mytable`
+on Trafodion.
+
+```
+odb64luo -u <src_username>:<tgt_username> -p <src_pswd>:<tgt_pswd>
+-d <src_dsn>:<tgt_dsn>
+-cp src:<source_catalog>.<source_schema>.mytable tgt:trafodion.myschema.mytable
+:splitby=<col-name>:parallel=4:loadcmd=UL
+```
+
+[cols="25%,75%", options="header"]
+|===
+| Option             | Defines
+| `src_username`     | User name for the source database.
+| `src_pswd`         | Password for the source database.
+| `src_dsn`          | ODBC DSN for the source database.
+| `tgt_username`     | User name for the Trafodion database.
+| `tgt_pswd`         | Password for the Trafodion database.
+| `tgt_dsn`          | ODBC DSN for the Trafodion database.
+| `splitby`          | Defines the column used to evenly distributed values 
for parallelism. Consider using a leading key column.
+| `parallel=4`       | Use four connections to extract data from the source 
database and another four connections to write data to the target Trafodion 
database.
+| `loadcmd=UL`       | Use UPSERT USING LOAD syntax to write data. 
+|===
+
+[[trickle-load-odb-load]]
+=== odb Load
+
+Refer to the http://trafodion.apache.org/docs/odb/index.html#_load_files[Load 
Files] section
+in the http://trafodion.apache.org/docs/odb/index.html[Trafodion odb User 
Guide] for complete
+documentation of this option.
+
+You use the `-l` option to load into a table from:
+
+* File or standard input (pipe)
+* gzip compressed files with no external dependencies
+* HDFS 
+* Load XML files 
+* Delimited and fixed format files
+* "Binary" files (example images)
+* Generic SQL scripts before/after loads
+
+The `-l` option provides:
+
+* Data generation (constant, sequences, random values, from external datasets)
+* Configurable rowsets
+
+You can load single tables or list of tables in the same session using 
single/parallel threads.
+Limited "ETL like" functionalities are provided; for example:
+SUBSTR, TRANSLITERATION, TRUNCATE target, DATE/TIME format conversion, and 
TOUPPER. 
+
+*Important Options*
+
+[cols="25%,75%", options="header"]
+|===
+| Option             | Defines
+| `src`              | Source file. If empty, then odb generates sample data.
+| `fs`               | Field separator.
+| `tgt`              | Target table, required.
+| `map`              | Map file. A text file describing which input column is 
mapped to which target table column. See
+<<trickle-load-odb-transform, odb Transform>> below.
+| `rows`             | Rowset size to be used.
+| `parallel`         | Number of connections/threads to be used.`
+| `loadcmd`          | `IN`, `UP` or `UL`. INSERT, UPSERT or UPSERT USING 
LOAD. Use UL for best throughput.
+|===
+
+*Example*
+
+```
+$ odb64luo -u user -p xx -d dsn -l 
src=customer.tbl:tgt=TRAFODION.MAURIZIO.CUSTOMER \
+:fs=\|:rows=1000:loadcmd=UL:truncate:parallel=4
+```
+
+This command:
+
+* Loads the file named `customer.tbl` (`src=customer.tbl`)
+* in the table `TRAFODION.MAURIZIO.CUSTOMER` 
(`tgt=TRAFODION.MAURIZIO.CUSTOMER`)
+* using `|` (vertical bar) as a field separator (`fs=\|`)
+* using `1000 rows` as row-set buffer (`rows=1000`)
+* using UPSERT USING LOAD syntax to achieve better throughput
+* truncating the target table before loading (`truncate`)
+* using `4 parallel threads` to load the target table (`parallel=4`)
+
+```
+./odb64luo -u xx -p yy -d traf_sqws125 -l 
src=myfile:fs=|:tgt=TRAFODION.SEABASE.REGION:map=region.map:max=10000:rows=500:parallel=2:loadcmd=UL
+```
+
+You can load multiple files using different `-l` options. By default odb 
creates as many threads (and ODBC connections) as the sum
+of parallel load threads.
+
+*Example*
+
+Truncates and load the CUSTOMER, ORDERS and LINEITEM tables in parallel.
+
+```
+odb64luo -u user -p xx -d dsn -T 5 \
+-l src=./data/%t.tbl.gz:tgt=TRAFODION.MAURO.CUSTOMER:fs=\
+|:rows=m2:truncate:norb:parallel=4 \
+-l src=./data/%t.tbl.gz:tgt=TRAFODION.MAURO.ORDERS:fs=\
+|:rows=1000:truncate:norb:parallel=4 \
+-l src=./data/%t.tbl.gz:tgt=TRAFODION.MAURO.LINEITEM:fs=\
+|:rows=m10:truncate:norb:parallel=4
+```
+
+[[trickle-load-odb-copy]]
+=== odb Copy
+
+Refer to the
+http://trafodion.apache.org/docs/odb/index.html#_copy_tables_from_one_database_to_another[Copy
 Tables From One Database to Another]
+section in the http://trafodion.apache.org/docs/odb/index.html[Trafodion odb 
User Guide] for complete documentation of this option.
+
+Use the `-cp` option to copy tables *directly* from one data-source to another 
using ODBC (for example, from Trafodion to Teradata
+or vice-versa):
+
+* Single/Multiple table(s) copy from one database to another
+* Data never lands to disk (ODBC buffers moved from source to target)
+* Multi-threaded copy: single/multiple tables in parallel using 
single/multiple "data streams"/table
+* Each "data stream" consists of one "extractor" and one or more "loaders"
+* Table subsets copy (columns and/or rows)
+* No data conversion required
+* Other functionalities: sequence creation, limit text col length, max rows to 
copy, . . .
+* Each data stream is "multiple buffered" with loaders and extractors working 
in parallel (no need to extract before loading).
+
+The target table has to be be created in advance and should have a compatible 
structure.
+
+*Important Options*
+
+[cols="25%,75%", options="header"]
+|===
+| Option             | Defines
+| `src`              | Source file. If empty, then odb generates sample data.
+| `fs`               | Field separator.
+| `tgt`              | Target table, required.
+| `parallel`         | Number of connections/threads to be used.`
+| `splitby`          | Source column to parallelize copy operation on.
+| `pwhere`           | `where` condition on source 
+| `loadcmd`          | `IN`, `UP` or `UL`. INSERT, UPSERT or UPSERT USING 
LOAD. Use UL for best throughput.
+|===
+
+When copying data from one data source to another, odb needs user/password/dsn 
for both source and target system.
+User credentials and DSN for the target system are specified this way:
+
+```
+$ odb64luo -u src_user:tgt_user -p src_pwd:tgt:pwd -d src_dsn:tgt_dsn ... -cp 
src=...:tgt=...
+```
+
+You can use odb to copy a list of tables from one database to another.
+
+*Example*
+
+```
+$ cat tlist.txt 
+# List of tables to extract
+src=TRAFODION.MAURIZIO.ORDERS
+src=TRAFODION.MAURIZIO.CUSTOMER
+src=TRAFODION.MAURIZIO.PART
+src=TRAFODION.MAURIZIO.LINEITEM
+```
+
+You can extract all these tables by running:
+
+```
+$ odb64luo -u user1:user2 -p xx:yy -d dsn1:dsn2 \
+-cp src=-tlist.txt:tgt=tpch.stg_%t:rows=m2:truncate:parallel=4
+```
+
+Please note the `src=-tlist.txt`. This command copies:
+
+[cols="50%,50%",options="header",]
+|===
+| Source                        | Target 
+| `TRAFODION.MAURIZIO.ORDERS`   | `tpch.stg_orders`
+| `TRAFODION.MAURIZIO.CUSTOMER` | `tpch.stg_customer`
+| `TRAFODION.MAURIZIO.PART`     | `tpch.stg_part`
+| `TRAFODION.MAURIZIO.LINEITEM` | `tpch.stg_lineitem`
+|===
+
+Optionally, you can define any other command-line options in the input file.
+
+*Example*
+
+Using different _splitby columns_.
+
+```
+$ cat tlist2.txt
+# List of tables to extract and their "splitby columns" 
+src=TRAFODION.MAURIZIO.ORDERS:splitby=O_ORDERKEY 
+src=TRAFODION.MAURIZIO.CUSTOMER:splitby=C_CUSTOMERKEY 
+src=TRAFODION.MAURIZIO.PART:splitby=P_PARTKEY 
+src=TRAFODION.MAURIZIO.LINEITEM:splitby=L_PARTKEY
+```
+
+[[trickle-load-odb-extract]]
+=== odb Extract
+
+Refer to the 
http://trafodion.apache.org/docs/odb/index.html#_extract_tables[Extract Tables]
+section in the http://trafodion.apache.org/docs/odb/index.html[Trafodion odb 
User Guide] for complete documentation of this option.
+
+Use then -e option to extract from data a table and write it to standard files 
or named pipes.
+
+You can:
+
+* Export single tables, list of tables or generic SQL output.
+* Export table subsets (columns and/or rows).
+* Exports one or multiple tables in parallel using one or multiple data 
streams for each table
+* Invoke other functionalities (trim, remote trim, cast, limit text col 
length, max rows to export,. . .)
+
+You can write the extracted data to:
+
+* Single/multiple files or standard output (pipe).
+* gzip compressed files (no external libraries required).
+* XML formatted files (no external libraries required).
+* Hadoop File System (requires libhdfs).
+
+Other useful features:
+
+* Configurable NULL/EMPTY strings, field/record separators
+* Configurable rowset
+* Possibility to run generic SQL scripts before/after extracts
+* Multi-threaded export
+
+*Important Options*
+
+[cols="25%,75%", options="header"]
+|===
+| Option             | Defines
+| `src`              | Source file. If empty, then odb generates sample data.
+| `fs`               | Field separator.
+| `tgt`              | Target table, required.
+| `parallel`         | Number of connections/threads to be used.`
+| `splitby`          | Source column to parallelize extract operation on.
+| `pwhere`           | `where` condition on source 
+|===
+
+*Example*
+
+```
+$ odb64luo -u user -p xx -d dsn -T 3 \
+-e 
src=TRAFODION.MAURIZIO.LIN%:tgt=$\{DATA}/ext_%t.csv.gz:rows=m10:fs=\|:trim:gzip:
 \
+-e 
src=TRAFODION.MAURIZIO.REGION:tgt=$\{DATA}/ext_%t.csv.gz:rows=m10:fs=\|:trim:gzip
 \
+-e 
src=TRAFODION.MAURIZIO.NATION:tgt=$\{DATA}/ext_%t.csv.gz:rows=m10:fs=\|:trim:gzip
+```
+
+The example above:
+
+* Extracts tables `REGION`, `NATION`, and all tables starting with `LIN` from 
the `TRAFODION.MAURIZIO` schema.
+* Saves data into files `ext_%t.csv.gz` (`%t` is expanded to the real table 
name).
+* Compresses the output file (gzip) on the fly (uncompressed data never lands 
to disk).
+* Trims text fields.
+* Uses a 10 MB IO buffer.
+* Uses three threads (ODBC connection) for the extraction process.
+
+*Example*
+
+Use odb to extract all tables listed in a file.
+
+```
+$ cat tlist.txt
+
+# List of tables to extract src=TRAFODION.MAURIZIO.ORDERS
+src=TRAFODION.MAURIZIO.CUSTOMER src=TRAFODION.MAURIZIO.PART
+src=TRAFODION.MAURIZIO.LINEITEM
+
+```
+
+Extract all these tables by running:
+
+```
+$ odb64luo -u user -p xx -d dsn -e src=-tlist.txt:tgt=%t_%d%m:rows=m20:sq=\"
+```
+
+The example above:
+
+* Reads the list of source tables from `tlist.txt`.
+* Extracts the data into file using the table name in lowercase (`%t`).
+appending extraction data and time (`_%d%m`) for the target file name.
+* Uses a 20MB I/O buffer for each extraction thread.
+* Encloses strings with double-quote characters (`sq=\"`).
+
+
+[[trickle-load-odb-transform]]
+=== odb Transform
+
+Refer to the 
http://trafodion.apache.org/docs/odb/index.html#load_map_fields[Map Source File 
Fields to Target Table Columns]
+section in the http://trafodion.apache.org/docs/odb/index.html[Trafodion odb 
User Guide] for complete documentation of
+odb's mapping/transformation capabilities.
+
+odb provides mapping/transformation capabilities though mapfiles. By 
specifying `map=<mapfile>` load option you can:
+
+* Associate any input file field to any table column
+* Skip input file fields
+* Generate sequences
+* Insert constants
+* Transform dates/timestamps formats
+* Extract substrings
+* Replace input file strings. For example: insert Maurizio Felici when you 
read MF
+* Generate random values
+* And much more
+
+A generic mapfile contains:
+
+* *Comments* (line starting with #)
+* *Mappings* to link input file fields to the corresponding target table 
columns.
+
+Mappings use the following syntax:
+
+```
+<colname>:<field>[:transformation operator]
+```
+
+*Example*
+
+Suppose you have a target table like this:
+
+```
++------+---------------+----+-------+------------+
+|COLUMN|TYPE           |NULL|DEFAULT|INDEX       |
++------+---------------+----+-------+------------+
+|ID    |INTEGER SIGNED |NO  |       |mf_pkey 1 U |
+|NAME  |CHAR(10)       |YES |       |            |
+|AGE   |SMALLINT SIGNED|YES |       |            |
+|BDATE |DATE           |YES |       |            |
++------+---------------+----+-------+------------+
+```
+
+And an input file like this:
+
+***
+uno,00,*51*,due,_Maurizio_,tre,[underline]#07 Mar 1959#, ignore,remaining, 
fields +
+uno,00,*46*,due,_Lucia_,tre,[underline]#13 Oct 1964#, ignore, this +
+uno,00,*34*,due,_Giovanni_,tre,[underline]#30 Mar 1976# +
+uno,00,*48*,due,_Antonella_,tre,[underline]#24 Apr 1962#
+***
+
+* *Bold text* represents age.
+* _Italics  text_ represents name.
+* [underline]#Underline text# represents birth date.
+
+You want to load the marked fields into the appropriate column, generate a 
unique key for ID and ignore the remaining fields,
+In addition, you need to convert the date format and replace all occurrences 
of `Lucia` with `Lucy`.
+
+The following map file accomplishes these goals:
+
+***
+$ cat test/load_map/ml1.map +
+# Map file to load TRAFODION.MFTEST.FRIENDS from friends.dat +
+ID:seq:1                  # Inserts into ID column a sequence starting from 1 +
+NAME:4:REPLACE:Lucia:Lucy # Loads field #4 into NAME and replace all 
occurrences of Lucia with Lucy +
+AGE:2                     # Loads field #2 (they start from zero) into AGE +
+BDATE:6:DCONV:d.b.y       # Loads field #6 into BDATE converting date format 
from dd mmm yyyy
+***
+
+Load as follows:
+
+```
+$ odb64luo -u user -p xx -d dsn \
+  -l src=friends.dat:tgt=TRAFODION.MFTEST.FRIENDS:map=ml1.map:fs=,
+```
+
+The above example:
+
+* Reads data from `friends.dat` (`src`).
+* Writes data to the `TRAFODION.MFTEST.FRIENDS` Trafodion table (`tgt`).
+* Uses `ml1.map` to define transformation specifications (`map`).
+* Uses comma as a field separator (`fs`).

[2/5] incubator-trafodion git commit: Replaced logo, added drafts of new manuals, updated SQL Reference, corrected issues in other manuals.

Reply via email to