Repository: sqoop
Updated Branches:
  refs/heads/sqoop2 25b0df5c8 -> 3613843a7
http://git-wip-us.apache.org/repos/asf/sqoop/blob/3613843a/docs/src/site/sphinx/user/connectors/Connector-Kite.rst
----------------------------------------------------------------------
diff --git a/docs/src/site/sphinx/user/connectors/Connector-Kite.rst b/docs/src/site/sphinx/user/connectors/Connector-Kite.rst
new file mode 100644
index 0000000..414ad8a
--- /dev/null
+++ b/docs/src/site/sphinx/user/connectors/Connector-Kite.rst
@@ -0,0 +1,110 @@
+.. Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements. See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License. You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+
+
+==============
+Kite Connector
+==============
+
+.. contents::
+   :depth: 3
+
+-----
+Usage
+-----
+
+To use the Kite Connector, create a link for the connector and a job that uses the link. For more information on Kite, check out the Kite documentation: http://kitesdk.org/docs/1.0.0/Kite-SDK-Guide.html.
+
+**Link Configuration**
+++++++++++++++++++++++
+
+Inputs associated with the link configuration include:
+
++-----------------------------+---------+-----------------------------------------------------------------------+----------------------------+
+| Input                       | Type    | Description                                                           | Example                    |
++=============================+=========+=======================================================================+============================+
+| authority                   | String  | The authority of the Kite dataset.                                    | hdfs://example.com:8020/   |
+|                             |         | *Optional*. See note below.                                           |                            |
++-----------------------------+---------+-----------------------------------------------------------------------+----------------------------+
+
+**Notes**
+=========
+
+1. The authority is useful for specifying a Hive metastore or HDFS URI.
+
+**FROM Job Configuration**
+++++++++++++++++++++++++++
+
+Inputs associated with the Job configuration for the FROM direction include:
+
++-----------------------------+---------+-----------------------------------------------------------------------+----------------------------+
+| Input                       | Type    | Description                                                           | Example                    |
++=============================+=========+=======================================================================+============================+
+| URI                         | String  | The Kite dataset URI to use.                                          | dataset:hdfs:/tmp/ns/ds    |
+|                             |         | *Required*. See notes below.                                          |                            |
++-----------------------------+---------+-----------------------------------------------------------------------+----------------------------+
+
+**Notes**
+=========
+
+1. The URI and the authority from the link configuration will be merged to create a complete dataset URI internally. If the given dataset URI contains an authority, the authority from the link configuration will be ignored.
+2. Only *hdfs* and *hive* are supported currently.
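The merge rule in note 1 above can be illustrated outside the patch with a rough sketch. The helper name `merge_authority`, the string-based parsing, and the normalization of the authority to `host:port` are assumptions made for illustration only; the connector's actual implementation is not part of this diff and may differ.

```python
from typing import Optional


def merge_authority(uri: str, authority: Optional[str]) -> str:
    """Hypothetical helper: fold a link-level authority into a Kite dataset URI."""
    # A URI that already names an authority ("scheme://host...") is kept as-is,
    # and so is any URI when the link has no authority configured.
    if not authority or "://" in uri:
        return uri
    # Assumed shape: "dataset:<scheme>:<path>", e.g. "dataset:hdfs:/tmp/ns/ds".
    prefix, scheme, path = uri.split(":", 2)
    # Assumption: accept "host:port" or "scheme://host:port/" and keep host:port.
    host = authority.split("://", 1)[-1].rstrip("/")
    if not path.startswith("/"):
        path = "/" + path
    return f"{prefix}:{scheme}://{host}{path}"


# Authority taken from the link configuration:
print(merge_authority("dataset:hdfs:/tmp/ns/ds", "hdfs://example.com:8020/"))
# Authority already present in the URI, so the link value is ignored:
print(merge_authority("dataset:hdfs://nn1:8020/tmp/ns/ds", "example.com:8020"))
```

With the table's example values, the first call yields `dataset:hdfs://example.com:8020/tmp/ns/ds`, while the second returns its input unchanged.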
+
+**TO Job Configuration**
+++++++++++++++++++++++++
+
+Inputs associated with the Job configuration for the TO direction include:
+
++-----------------------------+---------+-----------------------------------------------------------------------+----------------------------+
+| Input                       | Type    | Description                                                           | Example                    |
++=============================+=========+=======================================================================+============================+
+| URI                         | String  | The Kite dataset URI to use.                                          | dataset:hdfs:/tmp/ns/ds    |
+|                             |         | *Required*. See note below.                                           |                            |
++-----------------------------+---------+-----------------------------------------------------------------------+----------------------------+
+| File format                 | Enum    | The format of the data the Kite dataset should write out.             | PARQUET                    |
+|                             |         | *Optional*. See note below.                                           |                            |
++-----------------------------+---------+-----------------------------------------------------------------------+----------------------------+
+
+**Notes**
+=========
+
+1. The URI and the authority from the link configuration will be merged to create a complete dataset URI internally. If the given dataset URI contains an authority, the authority from the link configuration will be ignored.
+2. Only *hdfs* and *hive* are supported currently.
+
+-----------
+Partitioner
+-----------
+
+The Kite connector only creates one partition currently.
+
+---------
+Extractor
+---------
+
+During the *extraction* phase, Kite is used to query a dataset. Since there is only one dataset to query, only a single reader is created to read the dataset.
+
+**NOTE**: The Avro schema Kite generates will be slightly different from the original schema. This is because Avro identifiers have strict naming requirements.
+
+------
+Loader
+------
+
+During the *loading* phase, Kite is used to write several temporary datasets. The number of temporary datasets is equivalent to the number of *loaders* that are being used.
+
+----------
+Destroyers
+----------
+
+The Kite connector TO destroyer merges all the temporary datasets into a single dataset.
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/sqoop/blob/3613843a/docs/src/site/sphinx/user/connectors/Connector-SFTP.rst
----------------------------------------------------------------------
diff --git a/docs/src/site/sphinx/user/connectors/Connector-SFTP.rst b/docs/src/site/sphinx/user/connectors/Connector-SFTP.rst
new file mode 100644
index 0000000..d25ea3f
--- /dev/null
+++ b/docs/src/site/sphinx/user/connectors/Connector-SFTP.rst
@@ -0,0 +1,91 @@
+.. Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements. See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License. You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+
+
+==============
+SFTP Connector
+==============
+
+The SFTP connector supports moving data between a Secure File Transfer Protocol (SFTP) server and other supported Sqoop2 connectors.
+
+Currently only the TO direction is supported to write records to an SFTP server. A FROM connector is pending (SQOOP-2218).
+
+.. contents::
+   :depth: 3
+
+-----
+Usage
+-----
+
+Before executing a Sqoop2 job with the SFTP connector, set **mapreduce.task.classpath.user.precedence** to true in the Hadoop cluster config, for example::
+
+    <property>
+      <name>mapreduce.task.classpath.user.precedence</name>
+      <value>true</value>
+    </property>
+
+This is required since the SFTP connector uses the JSch library (http://www.jcraft.com/jsch/) to provide SFTP functionality. Unfortunately Hadoop currently ships with an earlier version of this library which causes an issue with some SFTP servers. Setting this property ensures that the current version of the library packaged with this connector will appear first in the classpath.
+
+To use the SFTP Connector, create a link for the connector and a job that uses the link.
+
+**Link Configuration**
+++++++++++++++++++++++
+
+Inputs associated with the link configuration include:
+
++-----------------------------+---------+-----------------------------------------------------------------------+----------------------------+
+| Input                       | Type    | Description                                                           | Example                    |
++=============================+=========+=======================================================================+============================+
+| SFTP server hostname        | String  | Hostname for the SFTP server.                                         | sftp.example.com           |
+|                             |         | *Required*.                                                           |                            |
++-----------------------------+---------+-----------------------------------------------------------------------+----------------------------+
+| SFTP server port            | Integer | Port number for the SFTP server. Defaults to 22.                      | 2220                       |
+|                             |         | *Optional*.                                                           |                            |
++-----------------------------+---------+-----------------------------------------------------------------------+----------------------------+
+| Username                    | String  | The username to provide when connecting to the SFTP server.           | sqoop                      |
+|                             |         | *Required*.                                                           |                            |
++-----------------------------+---------+-----------------------------------------------------------------------+----------------------------+
+| Password                    | String  | The password to provide when connecting to the SFTP server.           | sqoop                      |
+|                             |         | *Required*.                                                           |                            |
++-----------------------------+---------+-----------------------------------------------------------------------+----------------------------+
+
+**Notes**
+=========
+
+1. The SFTP connector will attempt to connect to the SFTP server as part of the link validation process. If for some reason a connection cannot be established, you'll see a corresponding error message.
+2. Note that during connection, the SFTP connector explicitly disables *StrictHostKeyChecking* to avoid "UnknownHostKey" errors.
+
+**TO Job Configuration**
+++++++++++++++++++++++++
+
+Inputs associated with the Job configuration for the TO direction include:
+
++-----------------------------+---------+-------------------------------------------------------------------------+-----------------------------------+
+| Input                       | Type    | Description                                                             | Example                           |
++=============================+=========+=========================================================================+===================================+
+| Output directory            | String  | The location on the SFTP server that the connector will write files to. | uploads                           |
+|                             |         | *Required*.                                                             |                                   |
++-----------------------------+---------+-------------------------------------------------------------------------+-----------------------------------+
+
+**Notes**
+=========
+
+1. The *output directory* value needs to be an existing directory on the SFTP server.
+
+------
+Loader
+------
+
+During the *loading* phase, the connector will create uniquely named files in the *output directory* for each partition of data received from the **FROM** connector.
\ No newline at end of file
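The SFTP Loader description says files are uniquely named but does not say how. As a rough sketch only, one common way to get collision-resistant names is a random UUID per file. The helper `unique_remote_path`, the `.txt` suffix, and the use of UUIDs are assumptions for illustration, not the connector's documented naming scheme.

```python
import posixpath
import uuid


def unique_remote_path(output_dir: str, suffix: str = ".txt") -> str:
    """Hypothetical helper: pick a collision-resistant file name in output_dir."""
    # uuid4 yields a random 128-bit identifier, so loaders running in
    # parallel are very unlikely to choose the same remote file name.
    # posixpath is used because SFTP paths use forward slashes.
    return posixpath.join(output_dir, f"{uuid.uuid4()}{suffix}")


# One distinct target path per partition, using the table's example directory.
for _ in range(3):
    print(unique_remote_path("uploads"))
```

Each call returns a different path under `uploads/`, so no coordination between loaders is needed to avoid overwriting each other's output.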
