Repository: sqoop
Updated Branches:
  refs/heads/sqoop2 d013d94db -> 53ecc01b0
SQOOP-1766: Sqoop2: Add Kite Connector Usage into User Manual (Abraham Elmahrek via Jarek Jarcec Cecho)

Project: http://git-wip-us.apache.org/repos/asf/sqoop/repo
Commit: http://git-wip-us.apache.org/repos/asf/sqoop/commit/53ecc01b
Tree: http://git-wip-us.apache.org/repos/asf/sqoop/tree/53ecc01b
Diff: http://git-wip-us.apache.org/repos/asf/sqoop/diff/53ecc01b

Branch: refs/heads/sqoop2
Commit: 53ecc01b0e499ce7568cd786dd49145c12709e8a
Parents: d013d94
Author: Jarek Jarcec Cecho <[email protected]>
Authored: Wed May 27 10:21:42 2015 -0700
Committer: Jarek Jarcec Cecho <[email protected]>
Committed: Wed May 27 10:21:42 2015 -0700

----------------------------------------------------------------------
 docs/src/site/sphinx/Connectors.rst | 92 ++++++++++++++++++++++++++++++++
 1 file changed, 92 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/sqoop/blob/53ecc01b/docs/src/site/sphinx/Connectors.rst
----------------------------------------------------------------------
diff --git a/docs/src/site/sphinx/Connectors.rst b/docs/src/site/sphinx/Connectors.rst
index 721e92a..af54467 100644
--- a/docs/src/site/sphinx/Connectors.rst
+++ b/docs/src/site/sphinx/Connectors.rst
@@ -388,6 +388,98 @@ Loader
 During the *loading* phase, Kafka is written to directly from each loader. The order in which data is loaded into Kafka is not guaranteed.
 
+
+++++++++++++++
+Kite Connector
+++++++++++++++
+
+-----
+Usage
+-----
+
+To use the Kite Connector, create a link for the connector and a job that uses the link. For more information on Kite, check out the Kite documentation: http://kitesdk.org/docs/1.0.0/Kite-SDK-Guide.html.
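As a sketch of that workflow, a Sqoop 2 shell session could look roughly like the following. The connector name ``kite-connector``, the link/job names, and the exact flag spellings are assumptions here — they vary across Sqoop 2 (1.99.x) releases — so treat this as an outline rather than a copy-paste recipe:

```
sqoop:000> create link -c kite-connector      # flag may be -c / --cid depending on release
Name: kite-link
...link configuration prompts (authority, etc.)...
sqoop:000> create job -f "jdbc-link" -t "kite-link"
Name: import-to-kite
...job configuration prompts (dataset URI, file format, etc.)...
sqoop:000> start job -j 1                     # job id is illustrative
```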
+
+**Link Configuration**
+++++++++++++++++++++++
+
+Inputs associated with the link configuration include:
+
++-----------------------------+---------+-----------------------------------------------------------------------+----------------------------+
+| Input                       | Type    | Description                                                           | Example                    |
++=============================+=========+=======================================================================+============================+
+| authority                   | String  | The authority of the Kite dataset.                                    | hdfs://example.com:8020/   |
+|                             |         | *Optional*. See note below.                                           |                            |
++-----------------------------+---------+-----------------------------------------------------------------------+----------------------------+
+
+**Notes**
+=========
+
+1. The authority is useful for specifying a Hive metastore or HDFS URI.
+
+**FROM Job Configuration**
+++++++++++++++++++++++++++
+
+Inputs associated with the job configuration for the FROM direction include:
+
++-----------------------------+---------+-----------------------------------------------------------------------+----------------------------+
+| Input                       | Type    | Description                                                           | Example                    |
++=============================+=========+=======================================================================+============================+
+| URI                         | String  | The Kite dataset URI to use.                                          | dataset:hdfs:/tmp/ns/ds    |
+|                             |         | *Required*. See notes below.                                          |                            |
++-----------------------------+---------+-----------------------------------------------------------------------+----------------------------+
+
+**Notes**
+=========
+
+1. The URI and the authority from the link configuration will be merged to create a complete dataset URI internally.
+2. Only *hdfs* and *hive* are supported currently.
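Note 1 describes the URI/authority merge only informally. As a hedged illustration — this is not the connector's actual code, and `merge_dataset_uri` is a hypothetical helper — the merge could behave like this for an *hdfs* dataset URI:

```python
from typing import Optional
from urllib.parse import urlparse


def merge_dataset_uri(dataset_uri: str, authority: Optional[str]) -> str:
    """Hypothetical sketch: splice a link-level authority (e.g. an HDFS
    namenode address) into an hdfs dataset URI that lacks one."""
    prefix = "dataset:hdfs:"
    if not authority or not dataset_uri.startswith(prefix):
        return dataset_uri
    if dataset_uri.startswith(prefix + "//"):
        # The dataset URI already carries its own authority; leave it alone.
        return dataset_uri
    # "hdfs://example.com:8020/" -> "example.com:8020"
    netloc = urlparse(authority).netloc
    path = dataset_uri[len(prefix):]  # e.g. "/tmp/ns/ds"
    return "dataset:hdfs://{0}{1}".format(netloc, path)
```

With the example values from the tables above, `merge_dataset_uri("dataset:hdfs:/tmp/ns/ds", "hdfs://example.com:8020/")` yields `dataset:hdfs://example.com:8020/tmp/ns/ds`.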
+
+**TO Job Configuration**
+++++++++++++++++++++++++
+
+Inputs associated with the job configuration for the TO direction include:
+
++-----------------------------+---------+-----------------------------------------------------------------------+----------------------------+
+| Input                       | Type    | Description                                                           | Example                    |
++=============================+=========+=======================================================================+============================+
+| URI                         | String  | The Kite dataset URI to use.                                          | dataset:hdfs:/tmp/ns/ds    |
+|                             |         | *Required*. See note below.                                           |                            |
++-----------------------------+---------+-----------------------------------------------------------------------+----------------------------+
+| File format                 | Enum    | The format of the data the Kite dataset should write out.             | PARQUET                    |
+|                             |         | *Optional*. See note below.                                           |                            |
++-----------------------------+---------+-----------------------------------------------------------------------+----------------------------+
+
+**Notes**
+=========
+
+1. The URI and the authority from the link configuration will be merged to create a complete dataset URI internally.
+2. Only *hdfs* and *hive* are supported currently.
+
+-----------
+Partitioner
+-----------
+
+The Kite connector currently creates only one partition.
+
+---------
+Extractor
+---------
+
+During the *extraction* phase, Kite is used to query a dataset. Since there is only one dataset to query, only a single reader is created to read the dataset.
+
+**NOTE**: The Avro schema Kite generates will be slightly different from the original schema, because Avro identifiers have strict naming requirements.
+
+------
+Loader
+------
+
+During the *loading* phase, Kite is used to write several temporary datasets. The number of temporary datasets equals the number of *loaders* in use.
+
+----------
+Destroyers
+----------
+
+The Kite connector TO destroyer merges all the temporary datasets into a single dataset.
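The Avro naming caveat in the Extractor note above can be made concrete: Avro identifiers must match `[A-Za-z_][A-Za-z0-9_]*`, so a column name such as `first name` or `1col` cannot survive unchanged. The sketch below is illustrative only — `to_avro_identifier` is a hypothetical helper, not Kite's actual renaming routine — but it shows the kind of transformation involved:

```python
import re


def to_avro_identifier(name: str) -> str:
    """Map an arbitrary column name onto a legal Avro identifier.

    Avro names must match [A-Za-z_][A-Za-z0-9_]*. Illustrative only;
    Kite's actual renaming may differ in its choice of replacements.
    """
    # Replace every character that is not a letter, digit, or underscore.
    cleaned = re.sub(r"[^A-Za-z0-9_]", "_", name)
    # An identifier may not start with a digit (or be empty).
    if not re.match(r"[A-Za-z_]", cleaned[:1] or ""):
        cleaned = "_" + cleaned
    return cleaned
```

For example, `to_avro_identifier("first name")` gives `first_name`, and `to_avro_identifier("1col")` gives `_1col` — which is why a round-tripped schema may not be byte-identical to the original.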
+
 ++++++++++++++
 SFTP Connector
 ++++++++++++++
