Jarek Jarcec Cecho created SQOOP-2201:
-----------------------------------------
Summary: Sqoop2: Add possibility to read Hadoop configuration
files to HDFS connector
Key: SQOOP-2201
URL: https://issues.apache.org/jira/browse/SQOOP-2201
Project: Sqoop
Issue Type: Bug
Affects Versions: 1.99.5
Reporter: Jarek Jarcec Cecho
Assignee: Jarek Jarcec Cecho
Fix For: 1.99.6
Currently the HDFS connector does not explicitly read Hadoop configuration
files. During the
[Initialization|https://github.com/apache/sqoop/blob/sqoop2/connector/connector-hdfs/src/main/java/org/apache/sqoop/connector/hdfs/HdfsToInitializer.java]
phase it does nothing, so the configuration files are not needed there.
During other parts of the workflow, we are [explicitly
casting|https://github.com/apache/sqoop/blob/sqoop2/connector/connector-hdfs/src/main/java/org/apache/sqoop/connector/hdfs/HdfsExtractor.java#L61]
the general {{Context}} object to a Hadoop {{Configuration}}.
This is unfortunate because:
* It couples the HDFS connector to the MapReduce execution engine, so it will
break as soon as a non-MapReduce execution engine is added.
* We can't do any HDFS-specific checks in {{Initializer}}, as the Hadoop
{{Configuration}} object is not available there.
As a result, I would like to propose breaking this coupling between the HDFS
connector and the MapReduce execution engine, and adding a configuration option
to the HDFS Link that specifies the directory from which we should read the
appropriate Hadoop configuration files (with a reasonable default such as
{{/etc/conf/hadoop}}).
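
To make the proposal concrete, here is a minimal sketch of reading Hadoop-style configuration files from a directory. It is not the Sqoop implementation: the class {{HadoopConfDirSketch}} and its methods are hypothetical names, and it uses only the JDK so that it runs without Hadoop on the classpath. It assumes the usual {{*-site.xml}} layout of {{<property>}} elements with {{<name>}} and {{<value>}} children. In the connector itself, one would more likely hand each file to Hadoop's {{Configuration.addResource(Path)}} instead of parsing the XML by hand.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical helper for illustration only; not part of Sqoop or Hadoop. */
final class HadoopConfDirSketch {

  /** Returns the regular *-site.xml files directly under confDir, sorted by path. */
  static List<Path> findSiteFiles(Path confDir) throws IOException {
    List<Path> files = new ArrayList<>();
    try (DirectoryStream<Path> stream = Files.newDirectoryStream(confDir, "*-site.xml")) {
      for (Path p : stream) {
        if (Files.isRegularFile(p)) {
          files.add(p);
        }
      }
    }
    Collections.sort(files);
    return files;
  }

  /**
   * Reads every name/value pair from the site files in confDir.
   * Files are processed in sorted order, so a later file overrides an earlier one.
   */
  static Map<String, String> loadProperties(Path confDir) throws Exception {
    Map<String, String> props = new LinkedHashMap<>();
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    for (Path file : findSiteFiles(confDir)) {
      try (InputStream in = Files.newInputStream(file)) {
        Document doc = factory.newDocumentBuilder().parse(in);
        NodeList nodes = doc.getElementsByTagName("property");
        for (int i = 0; i < nodes.getLength(); i++) {
          Element property = (Element) nodes.item(i);
          String name = childText(property, "name");
          String value = childText(property, "value");
          if (name != null && value != null) {
            props.put(name, value);
          }
        }
      }
    }
    return props;
  }

  /** Returns the trimmed text of the first child element with the given tag, or null. */
  private static String childText(Element parent, String tag) {
    NodeList list = parent.getElementsByTagName(tag);
    return list.getLength() == 0 ? null : list.item(0).getTextContent().trim();
  }
}
```

With something along these lines, the {{Initializer}} could build its own view of the Hadoop settings from the directory configured on the Link, rather than relying on the execution engine to pass in a {{Configuration}}.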