> On July 10, 2014, 8:22 a.m., Venkat Ranganathan wrote:
> > src/java/org/apache/sqoop/manager/MainframeManager.java, line 75
> > <https://reviews.apache.org/r/22516/diff/1/?file=608148#file608148line75>
> >
> >     Is import into HBase and Accumulo supported by this tool?  From the 
> > command help, it looks like the only supported target is HDFS text files.
> 
> Mariappan Asokan wrote:
> >     Each record in a mainframe dataset is treated as a single field (or 
> > column), so HBase, Accumulo, and Hive are theoretically supported, but with 
> > limited usability.  That is why I did not add them to the documentation.  
> > If you feel strongly that they should be documented, I can work on that in 
> > the next version of the patch.
> 
> Venkat Ranganathan wrote:
> >     I feel it would be good to say that we import only as text files and 
> > leave further processing, such as loading into Hive/HBase, up to the user, 
> > since the composition of the records and the processing needed differ and 
> > the schema cannot be inferred.
> 
> Mariappan Asokan wrote:
>     I agree with you.  To avoid confusion, I plan to remove support for 
> parsing input format, output format, hive, hbase, hcatalog, and codegen 
> options.  This will synchronize the document with the code. What do you think?
>
> 
> Venkat Ranganathan wrote:
> >     Sorry for the delay.  I was wondering whether the mainframe connector 
> > could just define connector-specific extra args instead of creating another 
> > tool.  Please see NetezzaManager or DirectNetezzaManager as an example.  
> > Maybe you have to invent a new synthetic URI format, say jdbc:mfftp:<host 
> > address>:<port>/dataset, and choose your connection manager when the 
> > --connect option is given with that URI format.  That would simplify a 
> > whole lot, in my opinion.  What do you think?

Thanks for your suggestions, and sorry I did not get back to you sooner.  In 
Sqoop 1.x, there is a strong assumption that the input source is always a 
database table.  Because of this, the sqoop import tool has many options that 
are relevant only to a source database table.  A mainframe source is totally 
different from a database table, so I think it is better to create a separate 
tool for mainframe import rather than just a new connection manager.  The 
mainframe import tool will not support many of the options that the database 
import tool supports, and it will have its own options that the database 
import tool does not: at present, the host name and the partitioned dataset 
name.  In the future, the mainframe import tool may be enhanced with 
metadata-specific or connection-specific arguments unique to the mainframe.  
Creating a synthetic URI for a connection seems somewhat artificial to me.
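For concreteness, the synthetic-URI idea discussed above could be sketched 
roughly as follows.  This is only an illustration of the suggestion, not code 
from the patch; the class name MainframeUriSketch and its parse method are 
invented here:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MainframeUriSketch {
    // Matches the proposed form: jdbc:mfftp:<host address>:<port>/<dataset>
    private static final Pattern MFFTP_URI =
        Pattern.compile("jdbc:mfftp:([^:/]+):(\\d+)/(.+)");

    /**
     * Returns {host, port, dataset} when the connect string uses the
     * synthetic mfftp form, or null so other connection managers can
     * be tried instead.
     */
    public static String[] parse(String connectString) {
        Matcher m = MFFTP_URI.matcher(connectString);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        String[] parts = parse("jdbc:mfftp:zos.example.com:21/PROD.DATA");
        System.out.println(parts[0] + " " + parts[1] + " " + parts[2]);
    }
}
```

Under this scheme, a connection-manager factory would call parse() on the 
--connect string and choose the mainframe manager only when it returns 
non-null, falling through to the other managers otherwise.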

Contrary to what I stated before, and considering possible future 
enhancements, I now think it is better to retain the support for parsing the 
input format, output format, Hive, HBase, HCatalog, and codegen options.  The 
documentation will be enhanced in the future to reflect this support.


- Mariappan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22516/#review47555
-----------------------------------------------------------


On June 14, 2014, 10:46 p.m., Mariappan Asokan wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22516/
> -----------------------------------------------------------
> 
> (Updated June 14, 2014, 10:46 p.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Repository: sqoop-trunk
> 
> 
> Description
> -------
> 
> This is to move mainframe datasets to Hadoop.
> 
> 
> Diffs
> -----
> 
>   src/java/org/apache/sqoop/manager/MainframeManager.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/MainframeDatasetFTPRecordReader.java 
> PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/MainframeDatasetImportMapper.java 
> PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/MainframeDatasetInputFormat.java 
> PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/MainframeDatasetInputSplit.java 
> PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/MainframeDatasetRecordReader.java 
> PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/MainframeImportJob.java PRE-CREATION 
>   src/java/org/apache/sqoop/tool/MainframeImportTool.java PRE-CREATION 
>   src/java/org/apache/sqoop/tool/SqoopTool.java dbe429a 
>   src/java/org/apache/sqoop/util/MainframeFTPClientUtils.java PRE-CREATION 
>   src/test/org/apache/sqoop/manager/TestMainframeManager.java PRE-CREATION 
>   
> src/test/org/apache/sqoop/mapreduce/TestMainframeDatasetFTPRecordReader.java 
> PRE-CREATION 
>   src/test/org/apache/sqoop/mapreduce/TestMainframeDatasetInputFormat.java 
> PRE-CREATION 
>   src/test/org/apache/sqoop/mapreduce/TestMainframeDatasetInputSplit.java 
> PRE-CREATION 
>   src/test/org/apache/sqoop/mapreduce/TestMainframeImportJob.java 
> PRE-CREATION 
>   src/test/org/apache/sqoop/tool/TestMainframeImportTool.java PRE-CREATION 
>   src/test/org/apache/sqoop/util/TestMainframeFTPClientUtils.java 
> PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/22516/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Mariappan Asokan
> 
>
