[jira] [Updated] (SQOOP-3339) Netezza export doesn't work on ORC tables

2018-06-26 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/SQOOP-3339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frédéric ESCANDELL updated SQOOP-3339:
--
Description: 
While executing a Sqoop export on an ORC table, the following exception is thrown:
{code:java}
Error: java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot 
be cast to org.apache.hadoop.io.LongWritable
    at 
org.apache.sqoop.mapreduce.db.netezza.NetezzaExternalTableHCatExportMapper.map(NetezzaExternalTableHCatExportMapper.java:34)
    at 
org.apache.sqoop.mapreduce.db.netezza.NetezzaExternalTableExportMapper.run(NetezzaExternalTableExportMapper.java:233)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:170)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:164)
{code}
If the exported table is stored as TextFile, the mapper class receives a LongWritable key, but if it is an ORC table, the mapper class receives a NullWritable key.

 

 

  was:
While executing a Sqoop export on an ORC table, the following exception is thrown:
{code:java}
Error: java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot 
be cast to org.apache.hadoop.io.LongWritable
    at 
org.apache.sqoop.mapreduce.db.netezza.NetezzaExternalTableHCatExportMapper.map(NetezzaExternalTableHCatExportMapper.java:34)
    at 
org.apache.sqoop.mapreduce.db.netezza.NetezzaExternalTableExportMapper.run(NetezzaExternalTableExportMapper.java:233)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:170)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:164)
{code}
If the exported table is stored as TextFile, the mapper class receives a LongWritable key, but if it is an ORC table, the mapper class receives a NullWritable key.

The attached patch proposes changing the signature of the map function:
{code:java}
public void map(LongWritable key, HCatRecord hcr, Context context){code}
to
{code:java}
public void map(Object key, HCatRecord hcr, Context context){code}
 

 

 

 


> Netezza export doesn't work on ORC tables
> -
>
> Key: SQOOP-3339
> URL: https://issues.apache.org/jira/browse/SQOOP-3339
> Project: Sqoop
>  Issue Type: Bug
>Affects Versions: 1.4.7
>Reporter: Frédéric ESCANDELL
>Priority: Blocker
> Attachments: patch-file.patch
>
>
> While executing a Sqoop export on an ORC table, the following exception is thrown:
> {code:java}
> Error: java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot 
> be cast to org.apache.hadoop.io.LongWritable
>     at 
> org.apache.sqoop.mapreduce.db.netezza.NetezzaExternalTableHCatExportMapper.map(NetezzaExternalTableHCatExportMapper.java:34)
>     at 
> org.apache.sqoop.mapreduce.db.netezza.NetezzaExternalTableExportMapper.run(NetezzaExternalTableExportMapper.java:233)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:170)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
>     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:164)
> {code}
> If the exported table is stored as TextFile, the mapper class receives a LongWritable key, but if it is an ORC table, the mapper class receives a NullWritable key.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (SQOOP-3339) Netezza export doesn't work on ORC tables

2018-06-26 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/SQOOP-3339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frédéric ESCANDELL updated SQOOP-3339:
--
Description: 
While executing a Sqoop export on an ORC table, the following exception is thrown:
{code:java}
Error: java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot 
be cast to org.apache.hadoop.io.LongWritable
    at 
org.apache.sqoop.mapreduce.db.netezza.NetezzaExternalTableHCatExportMapper.map(NetezzaExternalTableHCatExportMapper.java:34)
    at 
org.apache.sqoop.mapreduce.db.netezza.NetezzaExternalTableExportMapper.run(NetezzaExternalTableExportMapper.java:233)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:170)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:164)
{code}
If the exported table is stored as TextFile, the mapper class receives a LongWritable key, but if it is an ORC table, the mapper class receives a NullWritable key.

The attached patch proposes changing the signature of the map function:
{code:java}
public void map(LongWritable key, HCatRecord hcr, Context context){code}
to
{code:java}
public void map(Object key, HCatRecord hcr, Context context){code}
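For illustration, a minimal, hypothetical mapper sketch of why the widened signature is safe (the class name and output types below are placeholders, not the actual NetezzaExternalTableHCatExportMapper):
{code:java}
import java.io.IOException;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hive.hcatalog.data.HCatRecord;

// Hypothetical, simplified mapper: widening the key type to Object lets the same
// code accept a LongWritable key (TextFile input) and a NullWritable key
// (ORC/HCatalog input), because the key is never used by the export logic.
public class KeyAgnosticHCatExportMapper
    extends Mapper<Object, HCatRecord, NullWritable, NullWritable> {
  @Override
  public void map(Object key, HCatRecord hcr, Context context)
      throws IOException, InterruptedException {
    // Only the HCatRecord matters here; writing it to the Netezza external table
    // is omitted from this sketch.
    context.progress();
  }
}
{code}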
 

 

 

 

  was:
While executing a Sqoop export on an ORC table, the following exception is thrown:

 
{code:java}
Error: java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot 
be cast to org.apache.hadoop.io.LongWritable
    at 
org.apache.sqoop.mapreduce.db.netezza.NetezzaExternalTableHCatExportMapper.map(NetezzaExternalTableHCatExportMapper.java:34)
    at 
org.apache.sqoop.mapreduce.db.netezza.NetezzaExternalTableExportMapper.run(NetezzaExternalTableExportMapper.java:233)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:170)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:164)
{code}
If the exported table is stored as TextFile, the mapper class receives a LongWritable key, but if it is an ORC table, the mapper class receives a NullWritable key.

The attached patch proposes changing the signature of the map function:

 
{code:java}
public void map(LongWritable key, HCatRecord hcr, Context context){code}
 

to 

 
{code:java}
public void map(Object key, HCatRecord hcr, Context context){code}
 

 

 

 


> Netezza export doesn't work on ORC tables
> -
>
> Key: SQOOP-3339
> URL: https://issues.apache.org/jira/browse/SQOOP-3339
> Project: Sqoop
>  Issue Type: Bug
>Affects Versions: 1.4.7
>Reporter: Frédéric ESCANDELL
>Priority: Blocker
>
> While executing a Sqoop export on an ORC table, the following exception is thrown:
> {code:java}
> Error: java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot 
> be cast to org.apache.hadoop.io.LongWritable
>     at 
> org.apache.sqoop.mapreduce.db.netezza.NetezzaExternalTableHCatExportMapper.map(NetezzaExternalTableHCatExportMapper.java:34)
>     at 
> org.apache.sqoop.mapreduce.db.netezza.NetezzaExternalTableExportMapper.run(NetezzaExternalTableExportMapper.java:233)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:170)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
>     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:164)
> {code}
> If the exported table is stored as TextFile, the mapper class receives a LongWritable key, but if it is an ORC table, the mapper class receives a NullWritable key.
> The attached patch proposes changing the signature of the map function:
> {code:java}
> public void map(LongWritable key, HCatRecord hcr, Context context){code}
> to
> {code:java}
> public void map(Object key, HCatRecord hcr, Context context){code}
>  
>  
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (SQOOP-3339) Netezza export doesn't work on ORC tables

2018-06-26 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/SQOOP-3339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frédéric ESCANDELL updated SQOOP-3339:
--
Attachment: (was: patch-file.patch)

> Netezza export doesn't work on ORC tables
> -
>
> Key: SQOOP-3339
> URL: https://issues.apache.org/jira/browse/SQOOP-3339
> Project: Sqoop
>  Issue Type: Bug
>Affects Versions: 1.4.7
>Reporter: Frédéric ESCANDELL
>Priority: Blocker
>
> While executing a Sqoop export on an ORC table, the following exception is thrown:
> {code:java}
> Error: java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot 
> be cast to org.apache.hadoop.io.LongWritable
>     at 
> org.apache.sqoop.mapreduce.db.netezza.NetezzaExternalTableHCatExportMapper.map(NetezzaExternalTableHCatExportMapper.java:34)
>     at 
> org.apache.sqoop.mapreduce.db.netezza.NetezzaExternalTableExportMapper.run(NetezzaExternalTableExportMapper.java:233)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:170)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
>     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:164)
> {code}
> If the exported table is stored as TextFile, the mapper class receives a LongWritable key, but if it is an ORC table, the mapper class receives a NullWritable key.
> The attached patch proposes changing the signature of the map function:
> {code:java}
> public void map(LongWritable key, HCatRecord hcr, Context context){code}
> to
> {code:java}
> public void map(Object key, HCatRecord hcr, Context context){code}
>  
>  
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (SQOOP-3339) Netezza export doesn't work on ORC tables

2018-06-26 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/SQOOP-3339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frédéric ESCANDELL updated SQOOP-3339:
--
Attachment: patch-file.patch

> Netezza export doesn't work on ORC tables
> -
>
> Key: SQOOP-3339
> URL: https://issues.apache.org/jira/browse/SQOOP-3339
> Project: Sqoop
>  Issue Type: Bug
>Affects Versions: 1.4.7
>Reporter: Frédéric ESCANDELL
>Priority: Blocker
> Attachments: patch-file.patch
>
>
> While executing a Sqoop export on an ORC table, the following exception is thrown:
>  
> {code:java}
> Error: java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot 
> be cast to org.apache.hadoop.io.LongWritable
>     at 
> org.apache.sqoop.mapreduce.db.netezza.NetezzaExternalTableHCatExportMapper.map(NetezzaExternalTableHCatExportMapper.java:34)
>     at 
> org.apache.sqoop.mapreduce.db.netezza.NetezzaExternalTableExportMapper.run(NetezzaExternalTableExportMapper.java:233)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:170)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
>     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:164)
> {code}
> If the exported table is stored as TextFile, the mapper class receives a LongWritable key, but if it is an ORC table, the mapper class receives a NullWritable key.
> The attached patch proposes changing the signature of the map function:
>  
> {code:java}
> public void map(LongWritable key, HCatRecord hcr, Context context){code}
>  
> to 
>  
> {code:java}
> public void map(Object key, HCatRecord hcr, Context context){code}
>  
>  
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (SQOOP-3339) Netezza export doesn't work on ORC tables

2018-06-26 Thread JIRA
Frédéric ESCANDELL created SQOOP-3339:
-

 Summary: Netezza export doesn't work on ORC tables
 Key: SQOOP-3339
 URL: https://issues.apache.org/jira/browse/SQOOP-3339
 Project: Sqoop
  Issue Type: Bug
Affects Versions: 1.4.7
Reporter: Frédéric ESCANDELL


While executing a Sqoop export on an ORC table, the following exception is thrown:

 
{code:java}
Error: java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot 
be cast to org.apache.hadoop.io.LongWritable
    at 
org.apache.sqoop.mapreduce.db.netezza.NetezzaExternalTableHCatExportMapper.map(NetezzaExternalTableHCatExportMapper.java:34)
    at 
org.apache.sqoop.mapreduce.db.netezza.NetezzaExternalTableExportMapper.run(NetezzaExternalTableExportMapper.java:233)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:170)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:164)
{code}
If the exported table is stored as TextFile, the mapper class receives a LongWritable key, but if it is an ORC table, the mapper class receives a NullWritable key.

The attached patch proposes changing the signature of the map function:

 
{code:java}
public void map(LongWritable key, HCatRecord hcr, Context context){code}
 

to 

 
{code:java}
public void map(Object key, HCatRecord hcr, Context context){code}
 

 

 

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Review Request 62492: SQOOP-3224: Mainframe FTP transfer should have an option to use binary mode for transfer

2018-06-26 Thread Szabolcs Vasas

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62492/#review205364
---



Hi Chris,

Since this is quite a big patch, I will review it iteratively.
In the first iteration I would like you to move the mainframe-related code to the mainframe classes, since your changes affect the mainframe connector only; please see my comments below.

I have also seen many unused imports added by your changes; please remove those. I have started to add a separate comment for every unused import, but the list is not complete, so please check the other classes too.


src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java
Lines 40 (patched)


Unused import.



src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java
Lines 43 (patched)


Unnecessary new line.



src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java
Lines 55 (patched)


Unused import.



src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java
Lines 57 (patched)


Unused import.



src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java
Lines 58 (patched)


Unused import.



src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java
Lines 59 (patched)


Unused import.



src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java
Line 91 (original), 100 (patched)


Since the BinaryFile layout is only supported by the mainframe connector, we should move this branch to org.apache.sqoop.mapreduce.mainframe.MainframeImportJob#configureMapper.



src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java
Lines 205 (patched)


Since the BinaryFile layout is only supported by the mainframe connector, we should move this branch to org.apache.sqoop.mapreduce.mainframe.MainframeImportJob#getOutputFormatClass.



src/java/org/apache/sqoop/mapreduce/RawKeyTextOutputFormat.java
Lines 70 (patched)


This behavior should be moved to another class, since it is really unexpected in a class called RawKeyTextOutputFormat.
I think you should introduce a new class for the mainframe output format: if the format is text, you just delegate to an instance of RawKeyTextOutputFormat, otherwise you return an instance of BinaryKeyRecordWriter (see the sketch below).
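
Something along these lines could work (a rough sketch only; the class name, the binary-mode configuration key and the BinaryKeyRecordWriter constructor are my assumptions, not the actual patch):

    import java.io.DataOutputStream;
    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapreduce.RecordWriter;
    import org.apache.hadoop.mapreduce.TaskAttemptContext;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    import org.apache.sqoop.mapreduce.RawKeyTextOutputFormat;

    // Sketch of the suggested delegation; names flagged below are assumptions.
    public class MainframeDatasetOutputFormat<K, V> extends FileOutputFormat<K, V> {
      @Override
      public RecordWriter<K, V> getRecordWriter(TaskAttemptContext context)
          throws IOException, InterruptedException {
        Configuration conf = context.getConfiguration();
        // Assumed switch name: binary transfer mode requested by --as-binaryfile.
        if (conf.getBoolean("mainframe.import.binary", false)) {
          Path file = getDefaultWorkFile(context, "");
          DataOutputStream out = file.getFileSystem(conf).create(file, false);
          // Hypothetical constructor on the writer from the new KeyRecordWriters class;
          // it would write the key bytes untouched.
          return new BinaryKeyRecordWriter<K, V>(out);
        }
        // Text mode: keep today's behavior by delegating to the existing format.
        return new RawKeyTextOutputFormat<K, V>().getRecordWriter(context);
      }
    }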


- Szabolcs Vasas


On June 18, 2018, 1:47 a.m., Chris Teoh wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/62492/
> ---
> 
> (Updated June 18, 2018, 1:47 a.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Bugs: SQOOP-3224
> https://issues.apache.org/jira/browse/SQOOP-3224
> 
> 
> Repository: sqoop-trunk
> 
> 
> Description
> ---
> 
> Added --as-binaryfile and --buffersize to support FTP transfer mode switching.
> 
> 
> Diffs
> -
> 
>   build.xml 0ae729bc 
>   src/docs/user/import-mainframe.txt abeb7cde 
>   src/java/org/apache/sqoop/SqoopOptions.java d9984af3 
>   src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java 3b542102 
>   src/java/org/apache/sqoop/mapreduce/KeyRecordWriters.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/RawKeyTextOutputFormat.java fec34f21 
>   
> src/java/org/apache/sqoop/mapreduce/mainframe/AbstractMainframeDatasetImportMapper.java
>  PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/mainframe/MainframeConfiguration.java 
> ea54b07f 
>   
> src/java/org/apache/sqoop/mapreduce/mainframe/MainframeDatasetBinaryImportMapper.java
>  PRE-CREATION 
>   
> src/java/org/apache/sqoop/mapreduce/mainframe/MainframeDatasetBinaryRecord.java
>  PRE-CREATION 
>   
> src/java/org/apache/sqoop/mapreduce/mainframe/MainframeDatasetFTPRecordReader.java
>  1f78384b 
>   
> src/java/org/apache/sqoop/mapreduce/mainframe/MainframeDatasetImportMapper.java
>  0b7b5b85 
>   src/java/org/apache/sqoop/mapreduce/mainframe/MainframeImportJob.java 
> 8ef30d38 
>   src/java/org/apache/sqoop/tool/BaseSqoopTool.java c62ee98c 
>   src/java/org/apache/sqoop/tool/ImportTool.java 2c474b7e 
>   src/java/org/apache/sqoop/tool/MainframeImportTool.java 8883301d 
>   src/java/org/apache/sqoop/util/MainframeFTPClientUtils.java 95bc0ecb 
>   src/test/org/apache/sqoop/manager/mainframe/MainframeManagerImportTest.java 
> 041dfb78 
>   src/test/org/apache/sqoop/manager/mainframe/MainframeTestUtil.java f28ff36c 
>   
> 

Re: Review Request 67628: Implement an alternative solution for Parquet reading and writing

2018-06-26 Thread Fero Szabo via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67628/#review205359
---


Ship it!




Ship It!

- Fero Szabo


On June 26, 2018, 9:15 a.m., Szabolcs Vasas wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/67628/
> ---
> 
> (Updated June 26, 2018, 9:15 a.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Bugs: SQOOP-3328
> https://issues.apache.org/jira/browse/SQOOP-3328
> 
> 
> Repository: sqoop-trunk
> 
> 
> Description
> ---
> 
> The new implementation uses classes from parquet.hadoop packages.
> TestParquetIncrementalImportMerge has been introduced to cover some gaps we 
> had in the Parquet merge support.
> The test infrastructure is also modified a bit, which was needed because of TestParquetIncrementalImportMerge.
> 
> Note that this JIRA does not cover the Hive Parquet import support; I will create another JIRA for that.
> 
> 
> Diffs
> -
> 
>   src/java/org/apache/sqoop/SqoopOptions.java 
> d9984af369f901c782b1a74294291819e7d13cdd 
>   src/java/org/apache/sqoop/avro/AvroUtil.java 
> 57c2062568778c5bb53cd4118ce4f030e4ff33f2 
>   src/java/org/apache/sqoop/manager/ConnManager.java 
> c80dd5d9cbaa9b114c12b693e9a686d2cbbe51a3 
>   src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java 
> 3b5421028d3006e790ed4b711a06dbdb4035b8a0 
>   src/java/org/apache/sqoop/mapreduce/ImportJobBase.java 
> 17c9ed39b1e613a6df36b54cd5395b80e5f8fb0b 
>   src/java/org/apache/sqoop/mapreduce/parquet/ParquetConstants.java 
> ae53a96bddc523a52384715dd97705dc3d9db607 
>   
> src/java/org/apache/sqoop/mapreduce/parquet/ParquetExportJobConfigurator.java 
> 8d7b87f6d6832ce8d81d995af4c4bd5eeae38e1b 
>   
> src/java/org/apache/sqoop/mapreduce/parquet/ParquetImportJobConfigurator.java 
> fa1bc7d1395fbbbceb3cb72802675aebfdb27898 
>   
> src/java/org/apache/sqoop/mapreduce/parquet/ParquetJobConfiguratorFactory.java
>  ed5103f1d84540ef2fa5de60599e94aa69156abe 
>   
> src/java/org/apache/sqoop/mapreduce/parquet/ParquetJobConfiguratorFactoryProvider.java
>  2286a52030778925349ebb32c165ac062679ff71 
>   
> src/java/org/apache/sqoop/mapreduce/parquet/ParquetJobConfiguratorImplementation.java
>  PRE-CREATION 
>   
> src/java/org/apache/sqoop/mapreduce/parquet/ParquetMergeJobConfigurator.java 
> 67fdf6602bcbc6c091e1e9bf4176e56658ce5222 
>   
> src/java/org/apache/sqoop/mapreduce/parquet/hadoop/HadoopParquetExportJobConfigurator.java
>  PRE-CREATION 
>   
> src/java/org/apache/sqoop/mapreduce/parquet/hadoop/HadoopParquetExportMapper.java
>  PRE-CREATION 
>   
> src/java/org/apache/sqoop/mapreduce/parquet/hadoop/HadoopParquetImportJobConfigurator.java
>  PRE-CREATION 
>   
> src/java/org/apache/sqoop/mapreduce/parquet/hadoop/HadoopParquetImportMapper.java
>  PRE-CREATION 
>   
> src/java/org/apache/sqoop/mapreduce/parquet/hadoop/HadoopParquetJobConfiguratorFactory.java
>  PRE-CREATION 
>   
> src/java/org/apache/sqoop/mapreduce/parquet/hadoop/HadoopParquetMergeJobConfigurator.java
>  PRE-CREATION 
>   
> src/java/org/apache/sqoop/mapreduce/parquet/kite/KiteMergeParquetReducer.java 
> 7f21205e1c4be4200f7248d3f1c8513e0c8e490c 
>   
> src/java/org/apache/sqoop/mapreduce/parquet/kite/KiteParquetExportJobConfigurator.java
>  ca02c7bdcaf2fa981e15a6a96b111dec38ba2b25 
>   
> src/java/org/apache/sqoop/mapreduce/parquet/kite/KiteParquetExportMapper.java 
> 2d88a9c8ea4eb32001e1eb03e636d9386719 
>   
> src/java/org/apache/sqoop/mapreduce/parquet/kite/KiteParquetImportJobConfigurator.java
>  87828d1413eb71761aed44ad3b138535692f9c97 
>   
> src/java/org/apache/sqoop/mapreduce/parquet/kite/KiteParquetImportMapper.java 
> 20adf6e422cc4b661a74c8def114d44a14787fc6 
>   
> src/java/org/apache/sqoop/mapreduce/parquet/kite/KiteParquetJobConfiguratorFactory.java
>  055e1166b07aeef711cd162052791500368c628d 
>   
> src/java/org/apache/sqoop/mapreduce/parquet/kite/KiteParquetMergeJobConfigurator.java
>  9fecf282885f7aeac011a66f7d5d05512624976f 
>   src/java/org/apache/sqoop/mapreduce/parquet/kite/KiteParquetUtils.java 
> e68bba90d8b08ac3978fcc9ccae612bdf02388e8 
>   src/java/org/apache/sqoop/tool/BaseSqoopTool.java 
> c62ee98c2b22d819c9a994884b254f76eb518b6a 
>   src/java/org/apache/sqoop/tool/ImportTool.java 
> 2c474b7eeeff02b59204e4baca8554d668b6c61e 
>   src/java/org/apache/sqoop/tool/MergeTool.java 
> 4c20f7d151514b26a098dafdc1ee265cbde5ad20 
>   src/test/org/apache/sqoop/TestBigDecimalExport.java 
> ccea17345c0c8a2bdb7c8fd141f37e3c822ee41e 
>   src/test/org/apache/sqoop/TestMerge.java 
> 11806fea6c59ea897bc1aa23f6657ed172d093d5 
>   src/test/org/apache/sqoop/TestParquetExport.java 
> 43dabb57b7862b607490369e09b197b6de65a147 
>   src/test/org/apache/sqoop/TestParquetImport.java 
> 

[jira] [Created] (SQOOP-3338) Document the impact of the Kite removal

2018-06-26 Thread Szabolcs Vasas (JIRA)
Szabolcs Vasas created SQOOP-3338:
-

 Summary: Document the impact of the Kite removal
 Key: SQOOP-3338
 URL: https://issues.apache.org/jira/browse/SQOOP-3338
 Project: Sqoop
  Issue Type: Sub-task
Affects Versions: 1.4.7
Reporter: Szabolcs Vasas






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Review Request 67628: Implement an alternative solution for Parquet reading and writing

2018-06-26 Thread Szabolcs Vasas


> On June 25, 2018, 2:03 p.m., Fero Szabo wrote:
> > src/test/org/apache/sqoop/TestParquetExport.java
> > Line 69 (original), 66 (patched)
> > 
> >
> > If the intention here is to run the same test for every implementation, 
> > then it might make sense to reference the enum class again:
> > 
> > ParquetJobConfiguratorImplementation.values()
> > 
> > .. and automagically have a test for future implementations.

My idea here is to decouple the constants used in the tests from the constants/enum used in the production code.
Let's say this goes to production and the users start to use the "kite" and the "hadoop" properties, but somehow the value of the enum changes: these tests will fail and show that the interface has changed. However, if I used ParquetJobConfiguratorImplementation.values(), the tests would succeed but the users could end up with broken jobs.
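
For example, something like this (illustrative only; the test class name and the exact enum constant names are assumptions on my part):

    import static org.junit.Assert.assertEquals;
    import org.junit.Test;
    import org.apache.sqoop.mapreduce.parquet.ParquetJobConfiguratorImplementation;

    public class ParquetImplementationOptionStabilityTest {
      // The expected strings are hardcoded on purpose instead of being derived from
      // ParquetJobConfiguratorImplementation.values(): renaming an enum constant then
      // breaks this test and surfaces the user-facing interface change.
      @Test
      public void userFacingImplementationNamesAreStable() {
        assertEquals("KITE", ParquetJobConfiguratorImplementation.KITE.name());
        assertEquals("HADOOP", ParquetJobConfiguratorImplementation.HADOOP.name());
      }
    }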


> On June 25, 2018, 2:03 p.m., Fero Szabo wrote:
> > src/java/org/apache/sqoop/mapreduce/parquet/hadoop/HadoopParquetMergeJobConfigurator.java
> > Lines 104 (patched)
> > 
> >
> > Is null sometimes OK here?
> > 
> > I think it will cause an NPE later down the call stack in 
> > org.apache.sqoop.avro.AvroUtil#getFileToTest
> > 
> > NPEs are straightforward to fix, though, so it's up to you whether you want to throw something here or not.

I have added some comments here, I hope it clarifies it a bit.


> On June 25, 2018, 2:03 p.m., Fero Szabo wrote:
> > src/java/org/apache/sqoop/tool/BaseSqoopTool.java
> > Line 1912 (original), 1920 (patched)
> > 
> >
> > Where do optionValue and propertyValue come from?
> > 
> > My guess is that option comes from a --flag, and property from a -D... 
> > property. I guess that the user can somehow declare them in the site.xml as 
> > well. (?)
> > 
> > Might be an issue:
> > I see that one overrides the other. Why the duplication? Did you 
> > document the precedence order?

Correct, the optionValue comes from --parquet-configurator-implementation, while the propertyValue comes from -Dparquetjob.configurator.implementation or the site.xml. The -D value overrides the value defined in the site.xml (this is how Hadoop works), and the convention in Sqoop is that the --flag overrides the property value. I will raise a separate JIRA for all the documentation tasks and mention this behavior there.
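
Roughly like this (a simplified sketch of that precedence, not the actual BaseSqoopTool code; the helper name and the fallback default are placeholders):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.sqoop.mapreduce.parquet.ParquetJobConfiguratorImplementation;

    // Precedence: --parquet-configurator-implementation (optionValue) wins over
    // -Dparquetjob.configurator.implementation / site.xml (propertyValue);
    // Hadoop's Configuration already makes -D override site.xml.
    final class ParquetImplementationResolver {
      static ParquetJobConfiguratorImplementation resolve(String optionValue,
          Configuration conf) {
        String propertyValue = conf.get("parquetjob.configurator.implementation");
        String effective = (optionValue != null) ? optionValue : propertyValue;
        if (effective == null) {
          effective = "KITE"; // placeholder default for this sketch only
        }
        return ParquetJobConfiguratorImplementation.valueOf(effective.toUpperCase());
      }
    }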


> On June 25, 2018, 2:03 p.m., Fero Szabo wrote:
> > src/java/org/apache/sqoop/tool/BaseSqoopTool.java
> > Lines 1935 (patched)
> > 
> >
> > To me this looks like a small detail that can be forgotten to be 
> > updated if / when a new implementation is added. 
> > 
> > To avoid it, you might consider using this for supported values:
> > Arrays.toString(ParquetJobConfiguratorImplementation.values())

Nice catch, fixed.


- Szabolcs


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67628/#review205279
---


On June 26, 2018, 9:15 a.m., Szabolcs Vasas wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/67628/
> ---
> 
> (Updated June 26, 2018, 9:15 a.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Bugs: SQOOP-3328
> https://issues.apache.org/jira/browse/SQOOP-3328
> 
> 
> Repository: sqoop-trunk
> 
> 
> Description
> ---
> 
> The new implementation uses classes from parquet.hadoop packages.
> TestParquetIncrementalImportMerge has been introduced to cover some gaps we 
> had in the Parquet merge support.
> The test infrastructure is also modified a bit, which was needed because of TestParquetIncrementalImportMerge.
> 
> Note that this JIRA does not cover the Hive Parquet import support; I will create another JIRA for that.
> 
> 
> Diffs
> -
> 
>   src/java/org/apache/sqoop/SqoopOptions.java 
> d9984af369f901c782b1a74294291819e7d13cdd 
>   src/java/org/apache/sqoop/avro/AvroUtil.java 
> 57c2062568778c5bb53cd4118ce4f030e4ff33f2 
>   src/java/org/apache/sqoop/manager/ConnManager.java 
> c80dd5d9cbaa9b114c12b693e9a686d2cbbe51a3 
>   src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java 
> 3b5421028d3006e790ed4b711a06dbdb4035b8a0 
>   src/java/org/apache/sqoop/mapreduce/ImportJobBase.java 
> 17c9ed39b1e613a6df36b54cd5395b80e5f8fb0b 
>   src/java/org/apache/sqoop/mapreduce/parquet/ParquetConstants.java 
> ae53a96bddc523a52384715dd97705dc3d9db607 
>   
> 

Re: Review Request 67628: Implement an alternative solution for Parquet reading and writing

2018-06-26 Thread Szabolcs Vasas

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67628/
---

(Updated June 26, 2018, 9:15 a.m.)


Review request for Sqoop.


Bugs: SQOOP-3328
https://issues.apache.org/jira/browse/SQOOP-3328


Repository: sqoop-trunk


Description
---

The new implementation uses classes from parquet.hadoop packages.
TestParquetIncrementalImportMerge has been introduced to cover some gaps we had 
in the Parquet merge support.
The test infrastructure is also modified a bit, which was needed because of TestParquetIncrementalImportMerge.

Note that this JIRA does not cover the Hive Parquet import support; I will create another JIRA for that.


Diffs (updated)
-

  src/java/org/apache/sqoop/SqoopOptions.java 
d9984af369f901c782b1a74294291819e7d13cdd 
  src/java/org/apache/sqoop/avro/AvroUtil.java 
57c2062568778c5bb53cd4118ce4f030e4ff33f2 
  src/java/org/apache/sqoop/manager/ConnManager.java 
c80dd5d9cbaa9b114c12b693e9a686d2cbbe51a3 
  src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java 
3b5421028d3006e790ed4b711a06dbdb4035b8a0 
  src/java/org/apache/sqoop/mapreduce/ImportJobBase.java 
17c9ed39b1e613a6df36b54cd5395b80e5f8fb0b 
  src/java/org/apache/sqoop/mapreduce/parquet/ParquetConstants.java 
ae53a96bddc523a52384715dd97705dc3d9db607 
  src/java/org/apache/sqoop/mapreduce/parquet/ParquetExportJobConfigurator.java 
8d7b87f6d6832ce8d81d995af4c4bd5eeae38e1b 
  src/java/org/apache/sqoop/mapreduce/parquet/ParquetImportJobConfigurator.java 
fa1bc7d1395fbbbceb3cb72802675aebfdb27898 
  
src/java/org/apache/sqoop/mapreduce/parquet/ParquetJobConfiguratorFactory.java 
ed5103f1d84540ef2fa5de60599e94aa69156abe 
  
src/java/org/apache/sqoop/mapreduce/parquet/ParquetJobConfiguratorFactoryProvider.java
 2286a52030778925349ebb32c165ac062679ff71 
  
src/java/org/apache/sqoop/mapreduce/parquet/ParquetJobConfiguratorImplementation.java
 PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/parquet/ParquetMergeJobConfigurator.java 
67fdf6602bcbc6c091e1e9bf4176e56658ce5222 
  
src/java/org/apache/sqoop/mapreduce/parquet/hadoop/HadoopParquetExportJobConfigurator.java
 PRE-CREATION 
  
src/java/org/apache/sqoop/mapreduce/parquet/hadoop/HadoopParquetExportMapper.java
 PRE-CREATION 
  
src/java/org/apache/sqoop/mapreduce/parquet/hadoop/HadoopParquetImportJobConfigurator.java
 PRE-CREATION 
  
src/java/org/apache/sqoop/mapreduce/parquet/hadoop/HadoopParquetImportMapper.java
 PRE-CREATION 
  
src/java/org/apache/sqoop/mapreduce/parquet/hadoop/HadoopParquetJobConfiguratorFactory.java
 PRE-CREATION 
  
src/java/org/apache/sqoop/mapreduce/parquet/hadoop/HadoopParquetMergeJobConfigurator.java
 PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/parquet/kite/KiteMergeParquetReducer.java 
7f21205e1c4be4200f7248d3f1c8513e0c8e490c 
  
src/java/org/apache/sqoop/mapreduce/parquet/kite/KiteParquetExportJobConfigurator.java
 ca02c7bdcaf2fa981e15a6a96b111dec38ba2b25 
  src/java/org/apache/sqoop/mapreduce/parquet/kite/KiteParquetExportMapper.java 
2d88a9c8ea4eb32001e1eb03e636d9386719 
  
src/java/org/apache/sqoop/mapreduce/parquet/kite/KiteParquetImportJobConfigurator.java
 87828d1413eb71761aed44ad3b138535692f9c97 
  src/java/org/apache/sqoop/mapreduce/parquet/kite/KiteParquetImportMapper.java 
20adf6e422cc4b661a74c8def114d44a14787fc6 
  
src/java/org/apache/sqoop/mapreduce/parquet/kite/KiteParquetJobConfiguratorFactory.java
 055e1166b07aeef711cd162052791500368c628d 
  
src/java/org/apache/sqoop/mapreduce/parquet/kite/KiteParquetMergeJobConfigurator.java
 9fecf282885f7aeac011a66f7d5d05512624976f 
  src/java/org/apache/sqoop/mapreduce/parquet/kite/KiteParquetUtils.java 
e68bba90d8b08ac3978fcc9ccae612bdf02388e8 
  src/java/org/apache/sqoop/tool/BaseSqoopTool.java 
c62ee98c2b22d819c9a994884b254f76eb518b6a 
  src/java/org/apache/sqoop/tool/ImportTool.java 
2c474b7eeeff02b59204e4baca8554d668b6c61e 
  src/java/org/apache/sqoop/tool/MergeTool.java 
4c20f7d151514b26a098dafdc1ee265cbde5ad20 
  src/test/org/apache/sqoop/TestBigDecimalExport.java 
ccea17345c0c8a2bdb7c8fd141f37e3c822ee41e 
  src/test/org/apache/sqoop/TestMerge.java 
11806fea6c59ea897bc1aa23f6657ed172d093d5 
  src/test/org/apache/sqoop/TestParquetExport.java 
43dabb57b7862b607490369e09b197b6de65a147 
  src/test/org/apache/sqoop/TestParquetImport.java 
27d407aa3f9f2781f675294fa98431bc46f3dcfa 
  src/test/org/apache/sqoop/TestParquetIncrementalImportMerge.java PRE-CREATION 
  src/test/org/apache/sqoop/TestSqoopOptions.java 
bb7c20ddcb8fb5fc9c3b1edfb73fecb739bba269 
  src/test/org/apache/sqoop/hive/TestHiveServer2TextImport.java 
f6d591b73373fdf33b27202cb8116025fb694ef1 
  src/test/org/apache/sqoop/testutil/BaseSqoopTestCase.java 
a5f85a06ba21b01e99c1655450d36016c2901cc0 
  src/test/org/apache/sqoop/testutil/ImportJobTestCase.java 
dbefe209770885063d1b4d0c3940d078b8d91cad 
  src/test/org/apache/sqoop/tool/TestBaseSqoopTool.java 

[jira] [Updated] (SQOOP-3332) Extend Documentation of --resilient flag and add warning message when detected

2018-06-26 Thread Fero Szabo (JIRA)


 [ 
https://issues.apache.org/jira/browse/SQOOP-3332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fero Szabo updated SQOOP-3332:
--
Description: 
The resilient flag can be used to trigger the retry mechanism in the SQL Server connector. The documentation only states that it can be used in export; however, it can be used in import as well.

Also, the feature itself relies on the implicit assumption that the split-by 
column is unique and sorted in ascending order. The users have to be warned 
about this limitation, at the very least.

  was:
The non-resilient flag can be used to avoid the retry mechanism in the SQL Server connector. The documentation only states that it can be used in export; however, it can be used in import as well.

Also, the feature itself relies on the implicit assumption that the split-by 
column is unique and sorted in ascending order. The users have to be warned 
about this limitation, at the very least.


> Extend Documentation of --resilient flag and add warning message when detected
> --
>
> Key: SQOOP-3332
> URL: https://issues.apache.org/jira/browse/SQOOP-3332
> Project: Sqoop
>  Issue Type: Task
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Major
>
> The resilient flag can be used to trigger the retry mechanism in the SQL Server connector. The documentation only states that it can be used in export; however, it can be used in import as well.
> Also, the feature itself relies on the implicit assumption that the split-by 
> column is unique and sorted in ascending order. The users have to be warned 
> about this limitation, at the very least.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)