[jira] [Commented] (SQOOP-3224) Mainframe FTP transfer should have an option to use binary mode for transfer

2018-05-24 Thread Chris Teoh (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-3224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16490249#comment-16490249
 ] 

Chris Teoh commented on SQOOP-3224:
---

Pull request from GitHub

> Mainframe FTP transfer should have an option to use binary mode for transfer
> 
>
> Key: SQOOP-3224
> URL: https://issues.apache.org/jira/browse/SQOOP-3224
> Project: Sqoop
>  Issue Type: Improvement
>Reporter: Chris Teoh
>Assignee: Chris Teoh
>Priority: Minor
>
> Currently the mainframe FTP module is hard-coded to use ASCII transfer mode. 
> Propose adding a mainframe module flag to allow changing the transfer mode.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] sqoop pull request #46: SQOOP-3224: Mainframe FTP transfer should have an op...

2018-05-24 Thread christeoh
GitHub user christeoh opened a pull request:

https://github.com/apache/sqoop/pull/46

SQOOP-3224: Mainframe FTP transfer should have an option to use binary mode 
for transfer

Added --as-binaryfile and --buffersize for FTP binary mode transfers.
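
The ASCII/binary switch described here can be illustrated outside Sqoop. What follows is a minimal Python sketch built on the standard ftplib calls, not the code in this pull request. The function name fetch_dataset and its binary and buffer_size parameters are hypothetical and only loosely mirror --as-binaryfile and --buffersize.

```python
import io


def fetch_dataset(ftp, dataset_name, binary=False, buffer_size=8192):
    """Download one dataset through an already connected ftplib.FTP-like object.

    Hypothetical helper for illustration only; it is not part of Sqoop.
    Returns bytes in binary mode and a list of text lines in ASCII mode.
    """
    command = f"RETR {dataset_name}"
    if binary:
        payload = io.BytesIO()
        # ftplib's retrbinary uses image (binary) transfer type and passes raw
        # chunks of at most `blocksize` bytes to the callback, untranslated.
        ftp.retrbinary(command, payload.write, blocksize=buffer_size)
        return payload.getvalue()
    lines = []
    # ftplib's retrlines uses ASCII transfer type and passes each line to the
    # callback with the trailing line terminator removed.
    ftp.retrlines(command, lines.append)
    return lines
```

With a real server, the ftp argument would be a logged-in ftplib.FTP instance. Binary mode returns the bytes exactly as the server sent them, which matters for data that must not go through character translation.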

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/christeoh/sqoop 3224-4

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/sqoop/pull/46.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #46


commit b5cadeebb6f05df29c6018b391e5965a2caecbdb
Author: Chris Teoh 
Date:   2018-05-23T06:00:43Z

Fixed merge conflict

commit c0de096f9dcb3109b1e8200e9494ee19c4bf203f
Author: Chris Teoh 
Date:   2018-05-23T06:01:50Z

Refactored com.cloudera namespace to org.apache.sqoop

commit 49108cdf6185ad1881a857911351d4cc81fd34dd
Author: Chris Teoh 
Date:   2018-05-23T07:03:33Z

Added --as-binary flag

commit cce81e53b82eb95ac5946d6092ece7178e281726
Author: Chris Teoh 
Date:   2017-11-15T02:36:07Z

Moved mainframe FTP transfermode default setting to initDefaults()

commit 19cb4a11051f7bc094ff13f54f0ecb47a91677cd
Author: Chris Teoh 
Date:   2017-11-15T02:38:02Z

Replaced import java.io.* with single class imports

commit ad54c7caf205ec510feac0708ff130cce3d8970e
Author: Chris Teoh 
Date:   2017-11-16T02:07:07Z

Removed excessive logging per record to improve performance

commit e48820ea21598b44630e4331be4ee04bb2842d5e
Author: Chris Teoh 
Date:   2017-11-16T02:07:42Z

Added comment to document why we need to add custom class for binary 
transfers

commit 288412b7db4d731506b97eb2be2229ba1bcad639
Author: Chris Teoh 
Date:   2017-11-16T03:27:48Z

Converted to use BufferedInputStream instead of InputStream

commit e4a1f3a5a4a6f1fcc562b26eeda109d773b854e1
Author: Chris Teoh 
Date:   2017-11-17T00:57:48Z

Added unit tests for MainframeDatasetFTPRecordReader.getNextBinaryRecord

commit 290d5895b37ef9ca515d14e7e5d4d13730684e15
Author: Chris Teoh 
Date:   2017-11-17T01:19:04Z

Updated unit tests and used helper classes

commit 51e6d75767e56d481467d6d6c7de0bf0c76fba1d
Author: Chris Teoh 
Date:   2017-11-17T01:22:06Z

Updated unit tests to use a method of org.junit.Assert

commit 8e5ea6f8d993a4c479ac20e87f5b4b7cf2e9c8df
Author: Chris Teoh 
Date:   2017-11-17T01:35:04Z

Updated unit test for compilation

commit c737ea28e57c517d8f28a81802978e11e768ec3b
Author: Chris Teoh 
Date:   2017-11-17T05:28:57Z

Used StringUtils to do comparisons and corrected bulk imports

commit 48602eb6e5d70ca86456a30862e583ad82e863e0
Author: Chris Teoh 
Date:   2017-11-28T03:51:07Z

Replaced star import with specific class import

commit e81b400ec2dc4c47d46d1db198ab665c0a85de3c
Author: Chris Teoh 
Date:   2017-11-28T03:51:33Z

Updated to use current class instead of deprecated class

commit 3fd76409108184eace1ae1b60cad0d739af474bd
Author: Chris Teoh 
Date:   2017-11-28T03:52:10Z

Refactored common functionality to another function

commit c50bd2183717f3b2a393c92711ef19fdab4dbbd2
Author: Chris Teoh 
Date:   2017-11-28T03:52:30Z

Adjusted comment

commit 6a66f3e88e7150d0dceba2a6accf120ea4498199
Author: Chris Teoh 
Date:   2017-11-29T04:36:44Z

Moved tests from TestMainframeDatasetFTPRecordReader to separate class

commit c3f1de55bd2dabc1efcdf798cd31c6d981e23c0f
Author: Chris Teoh 
Date:   2017-11-29T04:37:15Z

Adjusted class for unit test support:

commit bd487e9804fd58188f2a48ec5db504a678e7bf8c
Author: Chris Teoh 
Date:   2017-11-29T04:38:43Z

Adjusted exceptions to print full stack

commit 81416f08ae11f67854be89d2abc6a652fd28c3f8
Author: Chris Teoh 
Date:   2017-11-29T04:39:05Z

Moved unit tests to another class

commit 40e77151d65d8f8b3cd9dd983283ef7f53fad73e
Author: Chris Teoh 
Date:   2017-11-29T06:17:48Z

Updated unit tests

commit 2ad7205ad1272b3b5965b8918a2ba69672d8c8d2
Author: Chris Teoh 
Date:   2017-11-29T11:27:58Z

Tidied up unit tests

commit bc7c43338d21eaa67ea5d92ef4a1fff8efd5783f
Author: Chris Teoh 
Date:   2017-11-29T11:43:28Z

Updated getNextBinaryRecord logic to be simpler

commit eed86f87a904697098ccac952efab9ecbd98db84
Author: Chris Teoh 
Date:   2017-12-12T22:42:54Z

Added license information

commit ed7c2d5ed8ee7a2631a453c87a40cdf3a8d194cc
Author: Chris Teoh 
Date:   2017-12-12T22:57:47Z

Refactored BinaryKeyOutputFormat to RawKeyTextOutputFormat

commit 5f032726a36cff473c4456596876a79c8139b187
Author: Chris Teoh 
Date:   2017-12-12T22:58:56Z

Added license information

commit 

[jira] [Commented] (SQOOP-3224) Mainframe FTP transfer should have an option to use binary mode for transfer

2018-05-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-3224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16490248#comment-16490248
 ] 

ASF GitHub Bot commented on SQOOP-3224:
---

GitHub user christeoh opened a pull request:

https://github.com/apache/sqoop/pull/46

SQOOP-3224: Mainframe FTP transfer should have an option to use binary mode 
for transfer

Added --as-binaryfile and --buffersize for FTP binary mode transfers.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/christeoh/sqoop 3224-4

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/sqoop/pull/46.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #46



[jira] [Created] (SQOOP-3327) Mainframe FTP needs to Include "Migrated" datasets when parsing the FTP list

2018-05-24 Thread Chris Teoh (JIRA)
Chris Teoh created SQOOP-3327:
-

 Summary: Mainframe FTP needs to Include "Migrated" datasets when 
parsing the FTP list
 Key: SQOOP-3327
 URL: https://issues.apache.org/jira/browse/SQOOP-3327
 Project: Sqoop
  Issue Type: Improvement
Reporter: Chris Teoh
Assignee: Chris Teoh


The mainframe FTP module needs to include "Migrated" datasets when parsing the FTP list.

 

** This applies to sequential datasets as well as GDG members **

 

Identifying migrated datasets – when performing manual FTP

ftp> open abc.def.ghi.jkl.mno
Connected to abc.def.ghi.jkl.mno (11.22.33.444).
220-TCPFTP01 Some FTP Server at abc.def.ghi.jkl.mno, 22:34:11 on 2018-01-22.
220 Connection will close if idle for more than 10 minutes.
Name (abc.def.ghi.jkl.mno:some_user): some_user
331 Send password please.
Password:
230 some_user is logged on.  Working directory is "some_user.".
Remote system type is MVS.
ftp> dir
227 Entering Passive Mode (33,44,555,66,7,8)
125 List started OK
Volume Unit    Referred Ext Used Recfm Lrecl BlkSz Dsorg Dsname
Migrated                                                DEV.DATA
Migrated                                                DUMMY.DATA
OVR343 3390   2018/01/23  1    1  FB     132 27984  PS  EMPTY
Migrated                                                JCL.CNTL
OVR346 3390   2018/01/22  1    1  FB      80 27920  PS  MIXED.FB80
Migrated                                                PLAIN.FB80
OVR341 3390   2018/01/23  1    9  VA     125   129  PS  PRDA.SPFLOG1.LIST
G20427 Tape                                             UNLOAD.ABCDE.ZZ9UYT.FB.TAPE
SEM352 3390   2018/01/23  1    1  FB     150  1500  PS  USER.BRODCAST
OVR346 3390   2018/01/23  3    3  FB      80  6160  PO  USER.ISPPROF
250 List completed successfully.

 

"Migrated" should be included as one of the regex pattern searches.

Assuming space delimited, first column will be "Migrated", and the second (and 
final) column will contain the dataset name.
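
The two-column rule for "Migrated" entries can be sketched as follows. This is an illustrative Python sketch, not Sqoop's parser: the patterns MIGRATED and REGULAR and the function dataset_names are hypothetical, and the regular-entry pattern is inferred only from the sample listing in this issue. Tape entries are deliberately left unmatched because the issue does not say how they should be handled.

```python
import re

# Hypothetical patterns inferred from the sample listing; not Sqoop's regexes.
# A migrated entry has exactly two whitespace-separated columns.
MIGRATED = re.compile(r"^Migrated\s+(?P<dsname>\S+)\s*$")
# A regular entry: Volume Unit Referred Ext Used Recfm Lrecl BlkSz Dsorg Dsname.
REGULAR = re.compile(
    r"^\S+\s+\S+\s+\d{4}/\d{2}/\d{2}\s+\d+\s+\d+\s+\S+\s+\d+\s+\d+\s+\S+\s+"
    r"(?P<dsname>\S+)\s*$"
)


def dataset_names(listing_lines):
    """Return dataset names from regular and migrated listing lines, in order."""
    names = []
    for line in listing_lines:
        match = MIGRATED.match(line) or REGULAR.match(line)
        if match:
            names.append(match.group("dsname"))
    return names
```

The header line and FTP reply lines match neither pattern, so they are skipped without special handling.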





[jira] [Created] (SQOOP-3326) Mainframe FTP listing for GDG should filter out non-GDG datasets in a heterogeneous listing

2018-05-24 Thread Chris Teoh (JIRA)
Chris Teoh created SQOOP-3326:
-

 Summary: Mainframe FTP listing for GDG should filter out non-GDG 
datasets in a heterogeneous listing
 Key: SQOOP-3326
 URL: https://issues.apache.org/jira/browse/SQOOP-3326
 Project: Sqoop
  Issue Type: Improvement
Reporter: Chris Teoh
Assignee: Chris Teoh


The FTP listing code automatically assumes that the first file in the listing 
is the most recent GDG file. This is a problem when the listing contains mixed 
datasets, because the code does not filter out the non-GDG entries.

 

GDG base is : HLQ.ABC.DEF.AB15HUP

 

The sequential dataset in the middle of the GDG member listing is : 
HLQ.ABC.DEF.AB15HUP.DATA

 

The pattern for listing GDG members should be: <>.G\d{4}V\d{2}

 

 

  Menu  Options  View  Utilities  Compilers  Help
DSLIST - Data Sets Matching HLQ.ABC.DEF.AB15HUP Empty data set or member
Command ===>  Scroll ===> PAGE

Command - Enter "/" to select action  Message   Volume
---
 HLQ.ABC.DEF.AB15HUP           ??
 HLQ.ABC.DEF.AB15HUP.DATA      ZXC344+
 HLQ.ABC.DEF.AB15HUP.G0007V00  H54924
 HLQ.ABC.DEF.AB15HUP.G0008V00  G54837
 HLQ.ABC.DEF.AB15HUP.G0009V00  G53709
 HLQ.ABC.DEF.AB15HUP.G0010V00  G27559
* End of Data Set list
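
The filtering proposed in this issue can be sketched against the names in the listing above. This is a minimal Python sketch, not Sqoop's code. It assumes the empty <> in the pattern stands for the GDG base, and the function name gdg_members is hypothetical.

```python
import re


def gdg_members(gdg_base, dataset_names):
    """Keep only names of the form <gdg_base>.GnnnnVnn, sorted by name.

    Hypothetical helper: the GDG base is escaped so that its dots match
    literally, and fullmatch rejects names with extra qualifiers.
    """
    pattern = re.compile(re.escape(gdg_base) + r"\.G\d{4}V\d{2}")
    return sorted(name for name in dataset_names if pattern.fullmatch(name))
```

Taking the last element of the result as the most recent generation is a further assumption: it relies on the zero-padded generation numbers sorting lexicographically and ignores generation-number wraparound.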





Re: Review Request 66300: Upgrade to Hadoop 3.0.0

2018-05-24 Thread Boglarka Egyed

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66300/#review203766
---



Hi Daniel,

This is great news! Thanks for the patch update.

Compilation works well for me; however, I have failing unit test cases with 
various error messages in these test classes:
org.apache.sqoop.hive.TestHiveImport
org.apache.sqoop.hive.TestHiveMiniCluster
org.apache.sqoop.hive.TestHiveServer2TextImport

Could you please check these on your side too?

Many thanks,
Bogi

- Boglarka Egyed


On March 27, 2018, 8:50 a.m., daniel voros wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66300/
> ---
> 
> (Updated March 27, 2018, 8:50 a.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Bugs: SQOOP-3305
> https://issues.apache.org/jira/browse/SQOOP-3305
> 
> 
> Repository: sqoop-trunk
> 
> 
> Description
> ---
> 
> To be able to eventually support the latest versions of Hive, HBase and 
> Accumulo, we should start by upgrading our Hadoop dependencies to 3.0.0. See 
> https://hadoop.apache.org/docs/r3.0.0/index.html
> 
> 
> Diffs
> -
> 
>   ivy.xml 1f587f3e 
>   ivy/libraries.properties 565a8bf5 
>   src/java/org/apache/sqoop/SqoopOptions.java d9984af3 
>   src/java/org/apache/sqoop/config/ConfigurationHelper.java fb2ab031 
>   src/java/org/apache/sqoop/hive/HiveImport.java 5da00a74 
>   src/java/org/apache/sqoop/mapreduce/JobBase.java 6d1e0499 
>   src/java/org/apache/sqoop/mapreduce/ParquetJob.java 46047733 
>   src/java/org/apache/sqoop/mapreduce/hcat/DerbyPolicy.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java 784b5f2a 
>   src/java/org/apache/sqoop/util/SqoopJsonUtil.java adf186b7 
>   src/test/org/apache/sqoop/TestSqoopOptions.java bb7c20dd 
>   src/test/org/apache/sqoop/hive/minicluster/KerberosAuthenticationConfiguration.java 549a8c6c 
>   src/test/org/apache/sqoop/hive/minicluster/PasswordAuthenticationConfiguration.java 79881f7b 
>   src/test/org/apache/sqoop/util/TestSqoopJsonUtil.java fdf972c1 
>   testdata/hcatalog/conf/hive-site.xml edac7aa9 
> 
> 
> Diff: https://reviews.apache.org/r/66300/diff/5/
> 
> 
> Testing
> ---
> 
> Normal and third-party unit tests.
> 
> 
> Thanks,
> 
> daniel voros
> 
>



Re: Review Request 62492: SQOOP-3224: Mainframe FTP transfer should have an option to use binary mode for transfer

2018-05-24 Thread Chris Teoh

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62492/
---

(Updated May 24, 2018, 2:33 p.m.)


Review request for Sqoop.


Changes
---

Reduced code duplication


Bugs: SQOOP-3224
https://issues.apache.org/jira/browse/SQOOP-3224


Repository: sqoop-trunk


Description
---

Added --as-binaryfile and --buffersize to support FTP transfer mode switching.
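
One way to picture what a buffer size means for a binary transfer is to read the stream in fixed-size chunks and emit each chunk as one record. The sketch below is in Python and is not the code under review. The name iter_binary_records is hypothetical, and how Sqoop actually applies --buffersize is an assumption here.

```python
import io


def iter_binary_records(stream, buffer_size):
    """Yield successive chunks of at most buffer_size bytes from a binary stream.

    Hypothetical helper for illustration; the final chunk may be shorter
    than buffer_size when the stream length is not an exact multiple.
    """
    if buffer_size <= 0:
        raise ValueError("buffer_size must be positive")
    while True:
        chunk = stream.read(buffer_size)
        if not chunk:  # an empty bytes object signals end of stream
            return
        yield chunk


# Usage with an in-memory stream standing in for the FTP data connection.
records = list(iter_binary_records(io.BytesIO(b"abcdefgh"), 3))
```

No byte is translated or dropped, so concatenating the records reproduces the original stream.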


Diffs (updated)
-

  src/java/org/apache/sqoop/SqoopOptions.java d9984af3 
  src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java a5962ba4 
  src/java/org/apache/sqoop/mapreduce/KeyRecordWriters.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/RawKeyTextOutputFormat.java fec34f21 
  src/java/org/apache/sqoop/mapreduce/mainframe/AbstractMainframeDatasetImportMapper.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/mainframe/MainframeConfiguration.java ea54b07f 
  src/java/org/apache/sqoop/mapreduce/mainframe/MainframeDatasetBinaryImportMapper.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/mainframe/MainframeDatasetBinaryRecord.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/mainframe/MainframeDatasetFTPRecordReader.java 1f78384b 
  src/java/org/apache/sqoop/mapreduce/mainframe/MainframeDatasetImportMapper.java 0b7b5b85 
  src/java/org/apache/sqoop/mapreduce/mainframe/MainframeImportJob.java 7e975c7b 
  src/java/org/apache/sqoop/tool/BaseSqoopTool.java 783651a4 
  src/java/org/apache/sqoop/tool/ImportTool.java ee79d8b7 
  src/java/org/apache/sqoop/tool/MainframeImportTool.java 8883301d 
  src/java/org/apache/sqoop/util/MainframeFTPClientUtils.java 95bc0ecb 
  src/test/org/apache/sqoop/mapreduce/mainframe/TestMainframeDatasetBinaryRecord.java PRE-CREATION 
  src/test/org/apache/sqoop/mapreduce/mainframe/TestMainframeDatasetFTPRecordReader.java 3547294f 
  src/test/org/apache/sqoop/tool/TestMainframeImportTool.java 0b0c6c34 
  src/test/org/apache/sqoop/util/TestMainframeFTPClientUtils.java 90a85194 


Diff: https://reviews.apache.org/r/62492/diff/10/

Changes: https://reviews.apache.org/r/62492/diff/9-10/


Testing
---

Unit tests.

Functional testing on mainframe.


Thanks,

Chris Teoh



Re: Review Request 62492: SQOOP-3224: Mainframe FTP transfer should have an option to use binary mode for transfer

2018-05-24 Thread Chris Teoh

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62492/
---

(Updated May 24, 2018, 2:12 p.m.)


Review request for Sqoop.


Changes
---

Rebased from trunk.


Bugs: SQOOP-3224
https://issues.apache.org/jira/browse/SQOOP-3224


Repository: sqoop-trunk


Description (updated)
---

Added --as-binaryfile and --buffersize to support FTP transfer mode switching.


Diffs (updated)
-

  src/java/org/apache/sqoop/SqoopOptions.java d9984af3 
  src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java a5962ba4 
  src/java/org/apache/sqoop/mapreduce/KeyRecordWriters.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/RawKeyTextOutputFormat.java fec34f21 
  src/java/org/apache/sqoop/mapreduce/mainframe/AbstractMainframeDatasetImportMapper.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/mainframe/MainframeConfiguration.java ea54b07f 
  src/java/org/apache/sqoop/mapreduce/mainframe/MainframeDatasetBinaryImportMapper.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/mainframe/MainframeDatasetBinaryRecord.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/mainframe/MainframeDatasetFTPRecordReader.java 1f78384b 
  src/java/org/apache/sqoop/mapreduce/mainframe/MainframeDatasetImportMapper.java 0b7b5b85 
  src/java/org/apache/sqoop/mapreduce/mainframe/MainframeImportJob.java 7e975c7b 
  src/java/org/apache/sqoop/tool/BaseSqoopTool.java 783651a4 
  src/java/org/apache/sqoop/tool/ImportTool.java ee79d8b7 
  src/java/org/apache/sqoop/tool/MainframeImportTool.java 8883301d 
  src/java/org/apache/sqoop/util/MainframeFTPClientUtils.java 95bc0ecb 
  src/test/org/apache/sqoop/mapreduce/mainframe/TestMainframeDatasetBinaryRecord.java PRE-CREATION 
  src/test/org/apache/sqoop/mapreduce/mainframe/TestMainframeDatasetFTPRecordReader.java 3547294f 
  src/test/org/apache/sqoop/tool/TestMainframeImportTool.java 0b0c6c34 
  src/test/org/apache/sqoop/util/TestMainframeFTPClientUtils.java 90a85194 


Diff: https://reviews.apache.org/r/62492/diff/9/

Changes: https://reviews.apache.org/r/62492/diff/8-9/


Testing
---

Unit tests.

Functional testing on mainframe.


Thanks,

Chris Teoh



Re: Review Request 66548: Importing as ORC file to support full ACID Hive tables

2018-05-24 Thread Szabolcs Vasas

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66548/#review203762
---



Hi Dani,

Thank you for submitting this new feature and sorry for the late review.
I have left some minor comments inline and I suggest adding some more 
documentation explaining what exactly is supported with the ORC files.
The implementation suggests that we only support HDFS and Hive import at this 
moment, so export is not covered yet. If this is true I think we should 
emphasize it in the documentation.


src/java/org/apache/sqoop/hive/TableDefWriter.java
Lines 194 (patched)


I am not that familiar with the Hive CREATE TABLE statement, but as far as I 
understand, 'STORED AS ORC' basically means that we will use 
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat, 
org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat, and 
org.apache.hadoop.hive.ql.io.orc.OrcSerde. Is that correct?



src/java/org/apache/sqoop/hive/TableDefWriter.java
Line 197 (original), 200 (patched)


Does ORC support different compression codecs? If yes, I think we should 
emphasize in the documentation (and/or implement a fail fast) that Sqoop does 
not support the compression-codec option with ORC files at the moment.



src/java/org/apache/sqoop/mapreduce/OrcImportMapper.java
Lines 41 (patched)


Can we just use NullWritable.get() instead of introducing a field called 
"nada"?



src/java/org/apache/sqoop/mapreduce/OrcImportMapper.java
Lines 42 (patched)


Can we make this field private?



src/java/org/apache/sqoop/util/OrcConversionContext.java
Lines 98 (patched)


typo: tinyiny



src/java/org/apache/sqoop/util/OrcUtil.java
Lines 55 (patched)


We use Hive types as ORC schema types here. Is this always going to be 
correct?
I am not too familiar with the ORC types. Do they cover all the Hive data 
types?



src/test/org/apache/sqoop/TestOrcImport.java
Lines 50 (patched)


It seems that this test case only covers the HDFS import, can we add test 
cases which cover the Hive import too?


- Szabolcs Vasas


On May 2, 2018, 12:12 p.m., daniel voros wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66548/
> ---
> 
> (Updated May 2, 2018, 12:12 p.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Bugs: SQOOP-3311
> https://issues.apache.org/jira/browse/SQOOP-3311
> 
> 
> Repository: sqoop-trunk
> 
> 
> Description
> ---
> 
> Hive 3 will introduce a switch (HIVE-18294) to create eligible tables as ACID 
> by default. This will probably result in increased usage of ACID tables and 
> the need to support importing into ACID tables with Sqoop.
> 
> Currently the only table format supporting full ACID tables is ORC.
> 
> The easiest and most effective way to support importing into these tables 
> would be to write out files as ORC and keep using LOAD DATA as we do for all 
> other Hive tables (supported since HIVE-17361).
> 
> A workaround could be to create the table as a textfile (as before) and then 
> CTAS from that. This would push the responsibility of creating the ORC format 
> to Hive. However, it would result in writing every record twice: once in text 
> format and once in ORC.
> 
> Note that ORC is only necessary for full ACID tables. Insert-only (aka. 
> micromanaged) ACID tables can use arbitrary file format.
> 
> Supporting full ACID tables would also be the first step in making 
> "lastmodified" incremental imports work with Hive.
> 
> 
> Diffs
> -
> 
>   ivy.xml 1f587f3e 
>   ivy/libraries.properties 565a8bf5 
>   src/docs/man/import-common-args.txt 22e3448e 
>   src/docs/man/sqoop-import-all-tables.txt 6db38ad8 
>   src/docs/user/hcatalog.txt 2ae1d54d 
>   src/docs/user/help.txt 8a0d1477 
>   src/docs/user/import-all-tables.txt fbad47b2 
>   src/docs/user/import.txt 2d074f49 
>   src/java/org/apache/sqoop/SqoopOptions.java d9984af3 
>   src/java/org/apache/sqoop/hive/TableDefWriter.java 27d988c5 
>   src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java a5962ba4 
>   src/java/org/apache/sqoop/mapreduce/OrcImportMapper.java PRE-CREATION 
>   src/java/org/apache/sqoop/tool/BaseSqoopTool.java 783651a4 
>   src/java/org/apache/sqoop/tool/ExportTool.java 060f2c07 
>   src/java/org/apache/sqoop/tool/ImportTool.java ee79d8b7 
>   src/java/org/apache/sqoop/util/OrcConversionContext.java PRE-CREATION 
>   src/java/org/apache/sqoop/util/OrcUtil.java PRE-CREATION 
>