Repository: sqoop
Updated Branches:
  refs/heads/branch-1.99.4 f5d29aec8 -> bd7bed1fe

SQOOP-1655: SQOOP2 DOC: Document getSchema() and its use in the connector dev guide

(Gwen Shapira via Jarek Jarcec Cecho)

Project: http://git-wip-us.apache.org/repos/asf/sqoop/repo
Commit: http://git-wip-us.apache.org/repos/asf/sqoop/commit/bd7bed1f
Tree: http://git-wip-us.apache.org/repos/asf/sqoop/tree/bd7bed1f
Diff: http://git-wip-us.apache.org/repos/asf/sqoop/diff/bd7bed1f

Branch: refs/heads/branch-1.99.4
Commit: bd7bed1fe012ed0b975ae3a4c35c07813796d3dd
Parents: f5d29ae
Author: Jarek Jarcec Cecho <[email protected]>
Authored: Sun Nov 2 15:17:05 2014 -0800
Committer: Jarek Jarcec Cecho <[email protected]>
Committed: Sun Nov 2 15:18:01 2014 -0800

----------------------------------------------------------------------
 docs/src/site/sphinx/ConnectorDevelopment.rst | 23 +++++++++++++++-------
 docs/src/site/sphinx/index.rst                |  2 +-
 2 files changed, 17 insertions(+), 8 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/sqoop/blob/bd7bed1f/docs/src/site/sphinx/ConnectorDevelopment.rst
----------------------------------------------------------------------
diff --git a/docs/src/site/sphinx/ConnectorDevelopment.rst b/docs/src/site/sphinx/ConnectorDevelopment.rst
index d700e4c..e4b5402 100644
--- a/docs/src/site/sphinx/ConnectorDevelopment.rst
+++ b/docs/src/site/sphinx/ConnectorDevelopment.rst
@@ -70,7 +70,7 @@ Connectors can optionally override the following methods:
 The ``getFrom`` method returns From_ instance
 which is a placeholder for the modules needed to read from a data source.
 
-The ``getTo`` method returns Exporter_ instance
+The ``getTo`` method returns Extractor_ instance
 which is a placeholder for the modules needed to write to a data source.
 
 Methods such as ``getBundle`` , ``getConnectionConfigurationClass`` ,
@@ -170,11 +170,22 @@ Connectors can define the design of ``Partition`` on their own.
 
 Initializer and Destroyer
 -------------------------
+.. _Initializer:
+.. _Destroyer:
 
 Initializer is instantiated before the submission of MapReduce job
-for doing preparation such as adding dependent jar files.
+for doing preparation such as connecting to the data source, creating temporary tables or adding dependent jar files.
 
-Destroyer is instantiated after MapReduce job is finished for clean up.
+In addition to the Initialize() method where the preparation activities occur, the Initializer must implement a getSchema() method.
+This method is used by the framework to match the data extracted by the ``From`` connector with the data as the ``To`` connector expects it.
+In case of a relational database or columnar database, the returned Schema object will include collection of columns with their data types.
+If the data source is schema-less, such as a file, an empty Schema object can be returned (i.e a Schema object without any columns).
+
+Note that Sqoop2 currently does not support ETL between two schema-less sources. We expect for each job that either the connector providing
+the ``From`` instance or the connector providing the ``To`` instance will have a schema. If both instances have a schema, Sqoop2 will load data by column name.
+I.e, data in column "A" in data source will be loaded to column "A" in target.
+
+Destroyer is instantiated after MapReduce job is finished for clean up, for example dropping temporary tables and closing connections.
 
 
 To
@@ -226,10 +237,8 @@ Loader must iterate in the ``load`` method until the data from ``DataReader`` is
 Initializer and Destroyer
 -------------------------
 
-Initializer is instantiated before the submission of MapReduce job
-for doing preparation such as adding dependent jar files.
-
-Destroyer is instantiated after MapReduce job is finished for clean up.
+Initializer_ and Destroyer_ of a ``To`` instance are used in a similar way to those of a ``From`` instance.
+Refer to the previous section for more details.
 
 
 Connector Configurations

http://git-wip-us.apache.org/repos/asf/sqoop/blob/bd7bed1f/docs/src/site/sphinx/index.rst
----------------------------------------------------------------------
diff --git a/docs/src/site/sphinx/index.rst b/docs/src/site/sphinx/index.rst
index e9bfd51..1bea5c3 100644
--- a/docs/src/site/sphinx/index.rst
+++ b/docs/src/site/sphinx/index.rst
@@ -59,7 +59,7 @@ Developer Guide
 - `Building Sqoop2 <BuildingSqoop2.html>`_
 - `Development Environment Setup <DevEnv.html>`_
 - `Java Client API Guide <ClientAPI.html>`_
-- `Developping Connector <ConnectorDevelopment.html>`_
+- `Developing a Connector <ConnectorDevelopment.html>`_
 - `REST API Guide <RESTAPI.html>`_
 
 Overview
