SQOOP-1556: Sqoop2: Add documentation clarifying connectors vs. engines (Gwen Shapira via Jarek Jarcec Cecho)
Project: http://git-wip-us.apache.org/repos/asf/sqoop/repo
Commit: http://git-wip-us.apache.org/repos/asf/sqoop/commit/196346d5
Tree: http://git-wip-us.apache.org/repos/asf/sqoop/tree/196346d5
Diff: http://git-wip-us.apache.org/repos/asf/sqoop/diff/196346d5

Branch: refs/heads/SQOOP-1367
Commit: 196346d5ccbb7cd6f4f2627cbd937728c61301d2
Parents: 35a060e
Author: Jarek Jarcec Cecho <jar...@apache.org>
Authored: Fri Sep 26 16:12:22 2014 -0700
Committer: Abraham Elmahrek <abra...@elmahrek.com>
Committed: Thu Oct 9 17:59:24 2014 -0700

----------------------------------------------------------------------
 docs/src/site/sphinx/ConnectorDevelopment.rst | 14 ++++++++++++++
 1 file changed, 14 insertions(+)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/sqoop/blob/196346d5/docs/src/site/sphinx/ConnectorDevelopment.rst
----------------------------------------------------------------------
diff --git a/docs/src/site/sphinx/ConnectorDevelopment.rst b/docs/src/site/sphinx/ConnectorDevelopment.rst
index 5121382..ae4f721 100644
--- a/docs/src/site/sphinx/ConnectorDevelopment.rst
+++ b/docs/src/site/sphinx/ConnectorDevelopment.rst
@@ -31,6 +31,20 @@ Connector reads data from databases for import,
 and write data to databases for export.
 Interaction with Hadoop is taken cared by common modules of Sqoop 2 framework.
 
+When do we add a new connector?
+===============================
+You add a new connector when you need to extract data from a new data source, or load
+data to a new target.
+In addition to the connector API, Sqoop 2 also has an engine interface.
+At the moment the only engine is MapReduce,but we may support additional engines in the future.
+Since many parallel execution engines are capable of reading/writing data
+there may be a question of whether support for specific data stores should be done
+through a new connector or new engine.
+
+**Our guideline is:** Connectors should manage all data extract/load. Engines manage job
+life cycles. If you need to support a new data store and don't care how jobs run -
+you are looking to add a connector.
+
 Connector Implementation
 ++++++++++++++++++++++++
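
The guideline added by this commit (connectors manage data extract/load, engines manage job life cycles) can be illustrated with a minimal Java sketch. All names here (``Connector``, ``ExecutionEngine``, ``InMemoryConnector``, ``LocalEngine``) are hypothetical and chosen for this note only; they are not the Sqoop 2 connector or engine API, and the real interfaces may be shaped quite differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical interfaces for illustration only; NOT the Sqoop 2 API.
interface Connector {
    List<String> extract();            // read records from a data source
    void load(List<String> records);   // write records to a target
}

interface ExecutionEngine {
    void run(Connector from, Connector to);  // owns how and when the job runs
}

// A connector knows one data store and nothing about job execution.
class InMemoryConnector implements Connector {
    final List<String> rows = new ArrayList<>();

    InMemoryConnector(String... initial) {
        for (String r : initial) {
            rows.add(r);
        }
    }

    public List<String> extract() {
        return new ArrayList<>(rows);  // hand out a copy of the stored rows
    }

    public void load(List<String> records) {
        rows.addAll(records);
    }
}

// An engine knows how to run a job and nothing about specific data stores.
// This one runs the whole transfer as a single sequential step.
class LocalEngine implements ExecutionEngine {
    public void run(Connector from, Connector to) {
        to.load(from.extract());
    }
}
```

Under these assumptions, supporting a new data store means writing another ``Connector`` and leaving ``LocalEngine`` untouched, while changing how jobs are scheduled means writing another ``ExecutionEngine`` and leaving the connectors untouched.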