[jira] [Commented] (SQOOP-2607) Direct import from Netezza and encoding

2015-10-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14948296#comment-14948296
 ] 

ASF GitHub Bot commented on SQOOP-2607:
---

GitHub user bonnetb opened a pull request:

https://github.com/apache/sqoop/pull/9

[SQOOP-2607] Add a table encoding parameter for Netezza direct import

Direct import creates an external Netezza table using 'internal' encoding
for text columns, then integrates the external table into HDFS, reading
it as a UTF-8 encoded stream.
But if the table contains VARCHAR columns that are not UTF-8 (e.g. ISO-8859-x),
the external table will share the same encoding as the source table, and
reading it as a UTF-8 encoded stream will corrupt non-ASCII characters.
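
To see the corruption mechanism in isolation, here is a minimal, self-contained Java sketch (not Sqoop code): bytes emitted in ISO-8859-15 and decoded as UTF-8 mangle every non-ASCII character.

{code}
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingMismatchDemo {
    public static void main(String[] args) {
        // A VARCHAR value as the external table emits it in its 'internal' encoding.
        byte[] raw = "café".getBytes(Charset.forName("ISO-8859-15"));

        // What the import mapper effectively does: decode the stream as UTF-8.
        String decoded = new String(raw, StandardCharsets.UTF_8);

        // Prints "caf\uFFFD": the lone 0xE9 byte is not a valid UTF-8 sequence.
        System.out.println(decoded);
    }
}
{code}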

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/bonnetb/sqoop trunk

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/sqoop/pull/9.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #9


commit 8b8dc61106eaebcc5c0abb1e0ae9394069f77562
Author: Benjamin BONNET 
Date:   2015-10-05T20:50:58Z

Add a table encoding parameter for Netezza direct import

Direct import creates an external Netezza table using 'internal' encoding
for text columns, then integrates the external table into HDFS, reading
it as a UTF-8 encoded stream.
But if the table contains VARCHAR columns that are not UTF-8 (e.g. ISO-8859-x),
the external table will share the same encoding as the source table, and
reading it as a UTF-8 encoded stream will corrupt non-ASCII characters.




> Direct import from Netezza and encoding
> ---
>
> Key: SQOOP-2607
> URL: https://issues.apache.org/jira/browse/SQOOP-2607
> Project: Sqoop
>  Issue Type: Bug
>  Components: connectors
>Affects Versions: 1.4.6
>Reporter: Benjamin BONNET
>
> Hi,
> I encountered an encoding issue while importing a Netezza table containing 
> ISO-8859-15 encoded VARCHAR. Using direct mode, non-ASCII chars are 
> corrupted. That does not occur using non-direct mode.
> Actually, direct mode uses a Netezza "external table", i.e. it flushes the 
> table into a stream using "internal" encoding (in my case, it is ISO-8859-15).
> But the Sqoop import mapper reads this stream as a UTF-8 one.
> That problem does not occur using non-direct mode since it uses the Netezza JDBC 
> driver to map fields directly to Java types (no stream encoding involved).
> To have that issue fixed in my environment, I modified the Sqoop Netezza 
> connector and added a parameter to specify the Netezza VARCHAR encoding. The 
> default value will be UTF-8, of course. I will make a pull request on GitHub to 
> propose that enhancement.
> Regards



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (SQOOP-2489) Sqoop2: Hive with Parquet in Kite Connector

2015-08-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14718164#comment-14718164
 ] 

ASF GitHub Bot commented on SQOOP-2489:
---

GitHub user sleefd opened a pull request:

https://github.com/apache/sqoop/pull/8

fix the problem of NoClassDefFoundError of HiveOutputFormat when impo…

for jira SQOOP-2489 @abec 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sleefd/sqoop sqoop2

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/sqoop/pull/8.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #8


commit 3baa7430c6fcac8d0847aaa0c49fead4fadce372
Author: slee sle...@gmail.com
Date:   2015-08-28T07:44:50Z

fix the problem of NoClassDefFoundError of HiveOutputFormat when importing or 
exporting to Hive through the Kite connector




 Sqoop2: Hive with Parquet in Kite Connector
 ---

 Key: SQOOP-2489
 URL: https://issues.apache.org/jira/browse/SQOOP-2489
 Project: Sqoop
  Issue Type: Bug
  Components: sqoop2-kite-connector
Affects Versions: 1.99.6
Reporter: Abraham Elmahrek
 Fix For: 1.99.7


 {code}
 java.lang.NoClassDefFoundError: org/apache/hadoop/hive/ql/io/HiveOutputFormat
   at java.lang.ClassLoader.defineClass1(Native Method)
   at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
   at 
 java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
   at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
   at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
   at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
   at java.security.AccessController.doPrivileged(Native Method)
   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
   at java.lang.Class.forName0(Native Method)
   at java.lang.Class.forName(Class.java:190)
   at 
 org.kitesdk.data.spi.hive.HiveUtils.getHiveParquetOutputFormat(HiveUtils.java:446)
   at org.kitesdk.data.spi.hive.HiveUtils.<clinit>(HiveUtils.java:91)
   at 
 org.kitesdk.data.spi.hive.HiveManagedMetadataProvider.create(HiveManagedMetadataProvider.java:83)
   at 
 org.kitesdk.data.spi.hive.HiveManagedDatasetRepository.create(HiveManagedDatasetRepository.java:77)
   at org.kitesdk.data.Datasets.create(Datasets.java:239)
   at org.kitesdk.data.Datasets.create(Datasets.java:307)
   at org.kitesdk.data.Datasets.create(Datasets.java:335)
   at 
 org.apache.sqoop.connector.kite.KiteDatasetExecutor.createDataset(KiteDatasetExecutor.java:70)
   at 
 org.apache.sqoop.connector.kite.KiteLoader.getExecutor(KiteLoader.java:52)
   at org.apache.sqoop.connector.kite.KiteLoader.load(KiteLoader.java:62)
   at org.apache.sqoop.connector.kite.KiteLoader.load(KiteLoader.java:36)
   at 
 org.apache.sqoop.job.mr.SqoopOutputFormatLoadExecutor$ConsumerThread.run(SqoopOutputFormatLoadExecutor.java:250)
   at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:745)
 Caused by: java.lang.ClassNotFoundException: 
 org.apache.hadoop.hive.ql.io.HiveOutputFormat
   at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
   at java.security.AccessController.doPrivileged(Native Method)
   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
   ... 31 more
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (SQOOP-1532) Sqoop2: Support Sqoop on Spark Execution Engine

2015-12-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-1532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15057582#comment-15057582
 ] 

ASF GitHub Bot commented on SQOOP-1532:
---

Github user eruizgar commented on the pull request:

https://github.com/apache/sqoop/pull/11#issuecomment-164680816
  
Hi jarcec, enclosed you can find the link to the complete PR patch. Hope 
this helps:
https://patch-diff.githubusercontent.com/raw/apache/sqoop/pull/11.diff


> Sqoop2: Support Sqoop on Spark Execution Engine
> ---
>
> Key: SQOOP-1532
> URL: https://issues.apache.org/jira/browse/SQOOP-1532
> Project: Sqoop
>  Issue Type: Improvement
>Reporter: Veena Basavaraj
>Assignee: Veena Basavaraj
> Fix For: 2.0.0
>
>
> The current execution engine supported in Sqoop is MR.
> The goal of this ticket is to support Sqoop jobs (map-only and map+reduce) 
> running on a Spark environment.
> It should at the minimum support running on the standalone Spark cluster and 
> then subsequently work with YARN/Mesos.
> High level goals
> 1. Hook up with the connector APIs to provide the basic load/extract to the 
> Spark RDD (a sketch follows this description).
> 2. Implementation of the Sqoop RDD to support extraction from different data 
> sources. The design proposal will discuss the alternatives on how this can 
> be achieved.
> 3. Optimizing the loading/writing (re-use/refactor the consumer thread code 
> to be agnostic of the Hadoop output format)
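
As a rough illustration of goal 1 above, a minimal Spark sketch in Java (the extractor and loader sides are hypothetical stand-ins; this is not the implementation attached to this ticket):

{code}
import java.util.Arrays;
import java.util.List;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class SqoopOnSparkSketch {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("sqoop-on-spark-sketch").setMaster("local[2]");
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            // Stand-in for the FROM side: partitions of records pulled by an extractor.
            List<String> records = Arrays.asList("row1", "row2", "row3");
            JavaRDD<String> extracted = sc.parallelize(records, 2);

            // Stand-in for the TO side: each partition is handed to a loader.
            extracted.foreachPartition(partition -> {
                while (partition.hasNext()) {
                    // A real loader would write to the TO connector here.
                    System.out.println("load: " + partition.next());
                }
            });
        }
    }
}
{code}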



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (SQOOP-1532) Sqoop2: Support Sqoop on Spark Execution Engine

2015-12-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-1532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15057480#comment-15057480
 ] 

ASF GitHub Bot commented on SQOOP-1532:
---

Github user jarcec commented on the pull request:

https://github.com/apache/sqoop/pull/11#issuecomment-164666219
  
The Sqoop project does not accept pull requests at this point. Would you mind 
attaching the patch to the JIRA itself?


> Sqoop2: Support Sqoop on Spark Execution Engine
> ---
>
> Key: SQOOP-1532
> URL: https://issues.apache.org/jira/browse/SQOOP-1532
> Project: Sqoop
>  Issue Type: Improvement
>Reporter: Veena Basavaraj
>Assignee: Veena Basavaraj
> Fix For: 2.0.0
>
>
> The current execution engine supported in Sqoop is MR.
> The goal of this ticket is to support Sqoop jobs (map-only and map+reduce) 
> running on a Spark environment.
> It should at the minimum support running on the standalone Spark cluster and 
> then subsequently work with YARN/Mesos.
> High level goals
> 1. Hook up with the connector APIs to provide the basic load/extract to the 
> Spark RDD.
> 2. Implementation of the Sqoop RDD to support extraction from different data 
> sources. The design proposal will discuss the alternatives on how this can 
> be achieved.
> 3. Optimizing the loading/writing (re-use/refactor the consumer thread code 
> to be agnostic of the Hadoop output format)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (SQOOP-1532) Sqoop2: Support Sqoop on Spark Execution Engine

2015-12-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-1532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15056011#comment-15056011
 ] 

ASF GitHub Bot commented on SQOOP-1532:
---

GitHub user eruizgar opened a pull request:

https://github.com/apache/sqoop/pull/11

[SQOOP-1532] Support Sqoop on Spark Execution Engine

We have implemented SQOOP-1532 to support running Sqoop jobs on a 
Spark environment. You can run them on a standalone Spark cluster, using the 
Sqoop client.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/Stratio/sqoop SQOOP-1532

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/sqoop/pull/11.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #11


commit 6fe1428afd5d98ff97a4e26ad19f34467fe6ddfa
Author: aargoma...@stratio.com 
Date:   2015-11-17T16:50:33Z

Added spark dependencies

commit eff93a0cc9428dcf401894649de0f679cf932fc6
Author: Jarek Jarcec Cecho 
Date:   2015-11-17T16:56:16Z

SQOOP-2684: Sqoop2: Upgrade groovy to 2.4.0

(Dian Fu via Jarek Jarcec Cecho)

commit dc011e12b128853ea45a3c6f8a9704b7285ac16f
Author: Jarek Jarcec Cecho 
Date:   2015-11-17T17:04:01Z

SQOOP-2682: Sqoop2: Add test cases for the object name with special char

(Colin Ma via Jarek Jarcec Cecho)

commit 370ea29c2aa1d1fb0814e56f0afd16d4f66e2e41
Author: Jarek Jarcec Cecho 
Date:   2015-11-17T17:08:35Z

SQOOP-2680: Sqoop2: Remove the id from public interface for connection

(Colin Ma via Jarek Jarcec Cecho)

commit dfe984c14aef83338e1ac68972ba7503fcaa6d0f
Author: Jarek Jarcec Cecho 
Date:   2015-11-18T16:06:49Z

SQOOP-2396: Sqoop2: Race condition in purge/update threads on Server 
shutdown

(Dian Fu via Jarek Jarcec Cecho)

commit 68ca8bc2e1f2ceb8ad0ab14763a146fb0d2682db
Author: Kate Ting 
Date:   2015-11-18T22:34:17Z

SQOOP-2688: Sqoop2: Provide utility method to safely retrieve value from 
JSONObject
(Jarek Jarcec Cecho via Kate Ting)

commit cc3e77b89e653a5f33996d53d3fe4fb7839c16a3
Author: Kate Ting 
Date:   2015-11-18T22:58:57Z

SQOOP-2694: Sqoop2: Doc: Register structure in sphinx for our docs
(Jarek Jarcec Cecho via Kate Ting)

commit 4f6ea567ffd9b5f43614c2c2b632789e9c752422
Author: Jarek Jarcec Cecho 
Date:   2015-11-19T15:07:07Z

SQOOP-2700: Sqoop2: Tests in shell module are in infinite loop

(Dian Fu via Jarek Jarcec Cecho)

commit ee64ec6e2fb856a48f18685daf8459b9ad1da083
Author: Jarek Jarcec Cecho 
Date:   2015-11-20T16:19:01Z

SQOOP-2699: Sqoop2: Oraoop: Improve Oracle parameters

(David Robson via Jarek Jarcec Cecho)

commit bad653c995df426189c67775748ed83321b6ad54
Author: Kate Ting 
Date:   2015-11-20T22:26:21Z

SQOOP-2698: Sqoop2: RESTiliency: Split the InvalidRESTCallsTest into 
independent test cases
 (Jarek Jarcec Cecho via Kate Ting)

commit 2c58a54ec871dc05198023ab6e8a3e2afa1d9343
Author: Enrique ruiz 
Date:   2015-11-17T16:50:33Z

[SQOOP-1532] Support Sqoop on Spark Execution Engine




> Sqoop2: Support Sqoop on Spark Execution Engine
> ---
>
> Key: SQOOP-1532
> URL: https://issues.apache.org/jira/browse/SQOOP-1532
> Project: Sqoop
>  Issue Type: Improvement
>Reporter: Veena Basavaraj
>Assignee: Veena Basavaraj
> Fix For: 2.0.0
>
>
> The current execution engine supported in Sqoop is MR.
> The goal of this ticket is to support Sqoop jobs (map-only and map+reduce) 
> running on a Spark environment.
> It should at the minimum support running on the standalone Spark cluster and 
> then subsequently work with YARN/Mesos.
> High level goals
> 1. Hook up with the connector APIs to provide the basic load/extract to the 
> Spark RDD.
> 2. Implementation of the Sqoop RDD to support extraction from different data 
> sources. The design proposal will discuss the alternatives on how this can 
> be achieved.
> 3. Optimizing the loading/writing (re-use/refactor the consumer thread code 
> to be agnostic of the Hadoop output format)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (SQOOP-2930) Sqoop job exec not overriding the saved job generic properties

2016-05-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15299579#comment-15299579
 ] 

ASF GitHub Bot commented on SQOOP-2930:
---

GitHub user git-rbanerjee opened a pull request:

https://github.com/apache/sqoop/pull/20

SQOOP-2930 & SQOOP-1933 :: Sqoop job exec not overriding the saved pr…

…operties.

SQOOP-2930 & SQOOP-1933 :: Sqoop job exec not overriding the saved 
properties.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/git-rbanerjee/sqoop-1 patch-1

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/sqoop/pull/20.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #20


commit 3e091d1adf378b0ce73c0e63e42693aab1f2158f
Author: CodeR 
Date:   2016-05-25T07:02:09Z

SQOOP-2930 & SQOOP-1933 :: Sqoop job exec not overriding the saved 
properties.

SQOOP-2930 & SQOOP-1933 :: Sqoop job exec not overriding the saved 
properties.




> Sqoop job exec not overriding the saved job generic properties
> --
>
> Key: SQOOP-2930
> URL: https://issues.apache.org/jira/browse/SQOOP-2930
> Project: Sqoop
>  Issue Type: Bug
>Reporter: Rabin Banerjee
> Attachments: fixpatch_v1.patch
>
>
> Sqoop job exec not overriding the saved job generic properties.
> sqoop job -Dorg.apache.sqoop.xyz=xyz --create job1 -- import .. 
> sqoop job -Dorg.apache.sqoop.xyz=abc --exec job1
> exec is not overriding the xyz with abc
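
The expected semantics, as a minimal Java sketch with a hypothetical helper (not Sqoop's actual metastore code): values supplied at --exec time should override those saved at --create time.

{code}
import java.util.Properties;

public class JobPropertyOverrideSketch {
    /** Exec-time properties win over the ones saved with the job. */
    static Properties merge(Properties saved, Properties execTime) {
        Properties effective = new Properties();
        effective.putAll(saved);    // baseline: what --create stored
        effective.putAll(execTime); // overrides: what -D supplied at --exec
        return effective;
    }

    public static void main(String[] args) {
        Properties saved = new Properties();
        saved.setProperty("org.apache.sqoop.xyz", "xyz");

        Properties execTime = new Properties();
        execTime.setProperty("org.apache.sqoop.xyz", "abc");

        // Prints "abc": the exec-time value overrides the saved one.
        System.out.println(merge(saved, execTime).getProperty("org.apache.sqoop.xyz"));
    }
}
{code}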



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (SQOOP-2949) SQL Syntax error when split-by column is of character type and min or max value has single quote inside it

2016-06-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15350402#comment-15350402
 ] 

ASF GitHub Bot commented on SQOOP-2949:
---

GitHub user gireeshp opened a pull request:

https://github.com/apache/sqoop/pull/21

[SQOOP-2949] SQL Syntax error when split-by column is of character type and 
min or max value has single quote inside it

**SQL Syntax error when split-by column is of character type and min or max 
value has single quote inside it**

https://issues.apache.org/jira/browse/SQOOP-2949?jql=project%20%3D%20SQOOP

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gireeshp/sqoop trunk

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/sqoop/pull/21.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #21


commit e6b56c7449b87c1f997b52ab54fa5a7f9c4214ec
Author: Gireesh Puthumana 
Date:   2016-06-27T04:22:14Z

Fix for JIRA ticket SQOOP-2949




> SQL Syntax error when split-by column is of character type and min or max 
> value has single quote inside it
> --
>
> Key: SQOOP-2949
> URL: https://issues.apache.org/jira/browse/SQOOP-2949
> Project: Sqoop
>  Issue Type: Bug
>Affects Versions: 1.4.6
> Environment: Sqoop 1.4.6
> Run on Hadoop 2.6.0
> On Ubuntu
>Reporter: Gireesh Puthumana
>
> Did a sqoop import from mysql table "emp", with split-by column "ename", 
> which is a varchar(100) type.
> +Used below command:+
> sqoop import --connect jdbc:mysql://localhost/testdb --username root 
> --password * --table emp --m 2 --target-dir /sqoopTest/5 --split-by ename;
> +Ename has following records:+
> | ename   |
> | gireesh |
> | aavesh  |
> | shiva'  |
> | jamir   |
> | balu|
> | santosh |
> | sameer  |
> Min value is "aavesh" and max value is "shiva'" (please note the single quote 
> inside max value).
> When run, it tried to execute below query in mapper 2 and failed:
> SELECT `ename`, `eid`, `deptid` FROM `emp` AS `emp` WHERE ( `ename` >= 
> 'jd聯聭聪G耀' ) AND ( `ename` <= 'shiva'' )
> +Stack trace:+
> {quote}
> 2016-06-05 16:54:06,749 ERROR [main] 
> org.apache.sqoop.mapreduce.db.DBRecordReader: Top level exception: 
> com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: You have an error 
> in your SQL syntax; check the manual that corresponds to your MySQL server 
> version for the right syntax to use near ''shiva'' )' at line 1
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
>   at com.mysql.jdbc.Util.handleNewInstance(Util.java:404)
>   at com.mysql.jdbc.Util.getInstance(Util.java:387)
>   at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:942)
>   at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3966)
>   at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3902)
>   at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2526)
>   at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2673)
>   at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2549)
>   at 
> com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:1861)
>   at 
> com.mysql.jdbc.PreparedStatement.executeQuery(PreparedStatement.java:1962)
>   at 
> org.apache.sqoop.mapreduce.db.DBRecordReader.executeQuery(DBRecordReader.java:111)
>   at 
> org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:235)
>   at 
> org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:553)
>   at 
> org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
>   at 
> org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>   at 
> org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> {quote}

[jira] [Commented] (SQOOP-2821) Direct export to Netezza : user/owner confusion

2016-02-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15140644#comment-15140644
 ] 

ASF GitHub Bot commented on SQOOP-2821:
---

GitHub user bonnetb opened a pull request:

https://github.com/apache/sqoop/pull/12

SQOOP-2821

Direct export to Netezza : user/owner confusion

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/bonnetb/sqoop SQOOP-2821

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/sqoop/pull/12.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #12


commit c0e82d67251ac3fc79e9b75215374cebd5d3eed2
Author: Benjamin BONNET 
Date:   2016-02-10T11:14:49Z

SQOOP-2821
Direct export to Netezza : user/owner confusion




> Direct export to Netezza : user/owner confusion
> ---
>
> Key: SQOOP-2821
> URL: https://issues.apache.org/jira/browse/SQOOP-2821
> Project: Sqoop
>  Issue Type: Bug
>  Components: connectors
>Affects Versions: 1.4.6
>Reporter: Benjamin BONNET
>
> Hi,
> when exporting to Netezza, if the connected user (in the Netezza URL) is not the 
> target table owner, things go wrong:
> - if you do not use a qualified table name, the table existence check will fail 
> since Sqoop will assume the table owner is the same as the connected user
> - if you do use a qualified table name, the table existence check will succeed 
> but the table export will fail since Sqoop will try to export to a twice-qualified 
> table (db.owner.owner.table instead of db.owner.table)
> Regards



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (SQOOP-2607) Direct import from Netezza and encoding

2016-02-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15140745#comment-15140745
 ] 

ASF GitHub Bot commented on SQOOP-2607:
---

Github user bonnetb closed the pull request at:

https://github.com/apache/sqoop/pull/9


> Direct import from Netezza and encoding
> ---
>
> Key: SQOOP-2607
> URL: https://issues.apache.org/jira/browse/SQOOP-2607
> Project: Sqoop
>  Issue Type: Bug
>  Components: connectors
>Affects Versions: 1.4.6
>Reporter: Benjamin BONNET
>Assignee: Benjamin BONNET
> Fix For: 1.4.7
>
> Attachments: 
> 0001-Add-a-table-encoding-parameter-for-Netezza-direct-im.patch
>
>
> Hi,
> I encountered an encoding issue while importing a Netezza table containing 
> ISO-8859-15 encoded VARCHAR. Using direct mode, non-ASCII chars are 
> corrupted. That does not occur using non-direct mode.
> Actually, direct mode uses a Netezza "external table", i.e. it flushes the 
> table into a stream using "internal" encoding (in my case, it is ISO-8859-15).
> But the Sqoop import mapper reads this stream as a UTF-8 one.
> That problem does not occur using non-direct mode since it uses the Netezza JDBC 
> driver to map fields directly to Java types (no stream encoding involved).
> To have that issue fixed in my environment, I modified the Sqoop Netezza 
> connector and added a parameter to specify the Netezza VARCHAR encoding. The 
> default value will be UTF-8, of course. I will make a pull request on GitHub to 
> propose that enhancement.
> Regards



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (SQOOP-2839) Sqoop import failure due to data member conflict in ORM code for table

2016-02-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15146988#comment-15146988
 ] 

ASF GitHub Bot commented on SQOOP-2839:
---

GitHub user vishnusn opened a pull request:

https://github.com/apache/sqoop/pull/13

[SQOOP-2839] : Sqoop import failure due to data member conflict in ORM code 
for table

[SQOOP-2839] : Sqoop creates a Java class corresponding to the table to be 
imported, which contains all the column names as data members. It also includes 
a constant named "PROTOCOL_VERSION", and this leads to a conflict between data 
members if one of the columns is named "PROTOCOL_VERSION". So we updated the 
constant name to PROTOCOL_VERSION_SQOOP_.
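
A minimal sketch of the clash and the rename (QueryResult here is a stand-in for the class Sqoop generates per table; the exact generated code differs):

{code}
// Stand-in for the generated ORM class; not actual Sqoop codegen output.
public class QueryResult {
    // Before the fix, the generator emitted a constant literally named
    // PROTOCOL_VERSION, which collides with a column member of the same name:
    //
    //   private final int PROTOCOL_VERSION = 3; // generator constant
    //   private String PROTOCOL_VERSION;        // column member -> "already defined"
    //
    // Renaming the generator's constant removes the clash:
    private final int PROTOCOL_VERSION_SQOOP_ = 3;

    // The column data member keeps its natural name.
    private String PROTOCOL_VERSION;

    public String getPROTOCOL_VERSION() {
        return PROTOCOL_VERSION;
    }
}
{code}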

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vishnusn/sqoop patch-1

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/sqoop/pull/13.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #13


commit 2a3d5e4027ff7118049c72190994c1c0900aa455
Author: vishnusn 
Date:   2016-02-15T06:58:07Z

Update ClassWriter.java

[SQOOP-2839] : Sqoop creates a Java class corresponding to the table to be 
imported, which contains all the column names as data members. It also includes 
a constant named "PROTOCOL_VERSION", and this leads to a conflict between data 
members if one of the columns is named "PROTOCOL_VERSION". So we updated the 
constant name to PROTOCOL_VERSION_SQOOP_.




> Sqoop import failure due to data member conflict in ORM code for table
> --
>
> Key: SQOOP-2839
> URL: https://issues.apache.org/jira/browse/SQOOP-2839
> Project: Sqoop
>  Issue Type: Bug
>  Components: codegen
>Reporter: VISHNU S NAIR
>
> While importing data with Sqoop, if any of the table contains a column named 
> "PROTOCOL_VERSION", Sqoop will fail to import data.
> /tmp/sqoop-user/compile/fd570d817e8323d1135a7f2a6612e321/QueryResult.java:173:
>  error: variable PROTOCOL_VERSION is already defined in class QueryResult
>   private String PROTOCOL_VERSION;
>  ^
> /tmp/sqoop-user/compile/fd570d817e8323d1135a7f2a6612e321/QueryResult.java:175:
>  error: incompatible types
> return PROTOCOL_VERSION;
>^
>   required: String
>   found:int



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (SQOOP-2839) Sqoop import failure due to data member conflict in ORM code for table

2016-02-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15149966#comment-15149966
 ] 

ASF GitHub Bot commented on SQOOP-2839:
---

GitHub user vishnusn opened a pull request:

https://github.com/apache/sqoop/pull/14

[SQOOP-2839]

On investigation we found out this is related to Sqoop's internal 
implementation. Sqoop internally creates a Java class corresponding to the 
table to be imported, which contains all the column names as data members. It 
also includes a constant named "PROTOCOL_VERSION", and this leads to a conflict 
between data members if one of the columns is named "PROTOCOL_VERSION".

So in order to solve this issue, we are adding PROTOCOL_VERSION to the reserved 
words list.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vishnusn/sqoop patch-3

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/sqoop/pull/14.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #14


commit e8887a418ee916546a2fb62475546e294a9a552e
Author: vishnusn 
Date:   2016-02-17T06:16:01Z

[SQOOP-2839]

On investigation we found out this is related to Sqoop's internal 
implementation. Sqoop internally creates a Java class corresponding to the 
table to be imported, which contains all the column names as data members. It 
also includes a constant named "PROTOCOL_VERSION", and this leads to a conflict 
between data members if one of the columns is named "PROTOCOL_VERSION".

So in order to solve this issue, we are adding PROTOCOL_VERSION to the reserved 
words list.




> Sqoop import failure due to data member conflict in ORM code for table
> --
>
> Key: SQOOP-2839
> URL: https://issues.apache.org/jira/browse/SQOOP-2839
> Project: Sqoop
>  Issue Type: Bug
>  Components: codegen
>Affects Versions: 1.4.6
>Reporter: VISHNU S NAIR
>
> While importing data with Sqoop, if any of the table contains a column named 
> "PROTOCOL_VERSION", Sqoop will fail to import data.
> /tmp/sqoop-user/compile/fd570d817e8323d1135a7f2a6612e321/QueryResult.java:173:
>  error: variable PROTOCOL_VERSION is already defined in class QueryResult
>   private String PROTOCOL_VERSION;
>  ^
> /tmp/sqoop-user/compile/fd570d817e8323d1135a7f2a6612e321/QueryResult.java:175:
>  error: incompatible types
> return PROTOCOL_VERSION;
>^
>   required: String
>   found:int



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (SQOOP-2906) Optimization of AvroUtil.toAvroIdentifier

2016-04-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15237026#comment-15237026
 ] 

ASF GitHub Bot commented on SQOOP-2906:
---

Github user JoeriHermans commented on the pull request:

https://github.com/apache/sqoop/pull/18#issuecomment-208859666
  
@stanleyxu2005 The main difference in my implementation is that I only have 
to do a single copy. StringBuilders are inefficient; char arrays are a lot 
faster since everything is pre-allocated already. Furthermore, the overhead 
of constantly allocating a new object, and the fact that internally the 
StringBuilder will do some copying as well, make this an inefficient approach.

For example, I implemented your approach, and instead of a 500% improvement 
with our method, I get a 230% increase in performance with the StringBuilder 
approach. So the char array is definitely a lot faster, and this is what it is 
all about, since this is a very prominent function on the stack.

Kind regards,

Joeri


> Optimization of AvroUtil.toAvroIdentifier
> -
>
> Key: SQOOP-2906
> URL: https://issues.apache.org/jira/browse/SQOOP-2906
> Project: Sqoop
>  Issue Type: Improvement
>Reporter: Joeri Hermans
>  Labels: avro, hadoop, optimization
>
> Hi all
> Our distributed profiler indicated some inefficiencies in the 
> AvroUtil.toAvroIdentifier method, more specifically, the use of Regex 
> patterns. This can be directly observed from the FlameGraph generated by this 
> profiler (https://jhermans.web.cern.ch/jhermans/sqoop_avro_flamegraph.svg). 
> We implemented an optimization, and compared this with the original method. 
> On our testing machine, the optimization by itself is about 500% (on average) 
> more efficient compared to the original implementation. We have yet to test 
> how this optimization will influence the performance of user jobs.
> Any suggestions or remarks are welcome.
> Kind regards,
> Joeri
> https://github.com/apache/sqoop/pull/18
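
For reference, a self-contained Java sketch contrasting the two styles discussed in this thread (simplified from the PR diff quoted later in the thread; the fallback branch for identifiers starting with an invalid character is assumed, since the quoted diff truncates before it):

{code}
public class AvroIdentifierSketch {

    // Original style: regex replacement on every call.
    static String viaRegex(String candidate) {
        String formatted = candidate.replaceAll("\\W+", "_");
        // Fallback branch assumed; the quoted diff truncates before it.
        return formatted.substring(0, 1).matches("[a-zA-Z_]") ? formatted : "_" + formatted;
    }

    // Optimized style: one pass over a char array, no regex machinery.
    static String viaCharArray(String candidate) {
        char[] data = candidate.toCharArray();
        int len = 0;
        for (char c : data) {
            if (Character.isLetterOrDigit(c) || c == '_') {
                data[len++] = c; // compact valid chars in place
            }
        }
        String result = new String(data, 0, len);
        return (len > 0 && (Character.isLetter(data[0]) || data[0] == '_')) ? result : "_" + result;
    }

    public static void main(String[] args) {
        System.out.println(viaRegex("order-id"));     // order_id
        System.out.println(viaCharArray("order-id")); // orderid (drops rather than replaces non-word chars)
    }
}
{code}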



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (SQOOP-2906) Optimization of AvroUtil.toAvroIdentifier

2016-04-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15237015#comment-15237015
 ] 

ASF GitHub Bot commented on SQOOP-2906:
---

Github user stanleyxu2005 commented on a diff in the pull request:

https://github.com/apache/sqoop/pull/18#discussion_r59360059
  
--- Diff: src/java/org/apache/sqoop/avro/AvroUtil.java ---
@@ -114,11 +114,20 @@ public static String toAvroColumn(String column) {
* Format candidate to avro specifics
*/
   public static String toAvroIdentifier(String candidate) {
-    String formattedCandidate = candidate.replaceAll("\\W+", "_");
-    if (formattedCandidate.substring(0,1).matches("[a-zA-Z_]")) {
-      return formattedCandidate;
+    char[] data = candidate.toCharArray();
+    int stringIndex = 0;
+
+    for (char c : data) {
+      if (Character.isLetterOrDigit(c) || c == '_') {
+        data[stringIndex++] = c;
+      }
+    }
+
+    char initial = data[0];
+    if (Character.isLetter(initial) || initial == '_') {
+      return new String(data, 0, stringIndex);
--- End diff --

Your code will first create a char array and then update chars in 
the array. As a result you will create another copy as a new String. Have you 
thought about using a `StringBuilder` directly?
```
  final StringBuilder sb = new StringBuilder();
  for (char c : candidate.toCharArray()) {  // a String cannot be iterated directly
    if (Character.isLetterOrDigit(c) || c == '_') {
      sb.append(c);
    }
  }
  ...
  return sb.toString();
```


> Optimization of AvroUtil.toAvroIdentifier
> -
>
> Key: SQOOP-2906
> URL: https://issues.apache.org/jira/browse/SQOOP-2906
> Project: Sqoop
>  Issue Type: Improvement
>Reporter: Joeri Hermans
>  Labels: avro, hadoop, optimization
>
> Hi all
> Our distributed profiler indicated some inefficiencies in the 
> AvroUtil.toAvroIdentifier method, more specifically, the use of Regex 
> patterns. This can be directly observed from the FlameGraph generated by this 
> profiler (https://jhermans.web.cern.ch/jhermans/sqoop_avro_flamegraph.svg). 
> We implemented an optimization, and compared this with the original method. 
> On our testing machine, the optimization by itself is about 500% (on average) 
> more efficient compared to the original implementation. We have yet to test 
> how this optimization will influence the performance of user jobs.
> Any suggestions or remarks are welcome.
> Kind regards,
> Joeri
> https://github.com/apache/sqoop/pull/18



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (SQOOP-2895) NCLOB import to Hive not supported

2016-03-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15212102#comment-15212102
 ] 

ASF GitHub Bot commented on SQOOP-2895:
---

GitHub user bonnetb opened a pull request:

https://github.com/apache/sqoop/pull/17

SQOOP-2895 : NCLOB import to Hive not supported

see https://issues.apache.org/jira/browse/SQOOP-2895

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/bonnetb/sqoop SQOOP-2895

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/sqoop/pull/17.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #17


commit 42e44ed64b4ec8801d9377111e7eb5460bfef3bc
Author: Benjamin BONNET 
Date:   2016-03-25T17:07:00Z

SQOOP-2895 : NCLOB import to Hive not supported




> NCLOB import to Hive not supported 
> ---
>
> Key: SQOOP-2895
> URL: https://issues.apache.org/jira/browse/SQOOP-2895
> Project: Sqoop
>  Issue Type: Bug
>  Components: hive-integration
>Reporter: Benjamin BONNET
>
> Hi,
> Sqoop cannot import and create a Hive table when the table to be imported 
> contains NCLOB columns.
> Actually, data is correctly imported into HDFS but Hive table creation fails 
> with:
> ERROR tool.ImportTool: Encountered IOException running import job: 
> java.io.IOException: Hive does not support the SQL type for column 
> myNClobColumn
> at 
> org.apache.sqoop.hive.TableDefWriter.getCreateTableStmt(TableDefWriter.java:181)
> at org.apache.sqoop.hive.HiveImport.importTable(HiveImport.java:188)
> at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:514)
> at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
> at org.apache.sqoop.Sqoop.run(Sqoop.java:148)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
> at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:184)
> at org.apache.sqoop.Sqoop.runTool(Sqoop.java:226)
> at org.apache.sqoop.Sqoop.runTool(Sqoop.java:235)
> at org.apache.sqoop.Sqoop.main(Sqoop.java:244)
> That issue is due to the fact Sqoop knows the CLOB type but not NCLOB. See the 
> type mappings in org.apache.sqoop.hive.HiveTypes, where NCLOB is missing.
> Regards
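
A minimal Java sketch of the kind of mapping involved (assumed shape, not the actual HiveTypes method): NCLOB simply needs the same treatment as CLOB.

{code}
import java.sql.Types;

public class HiveTypeSketch {
    static String toHiveType(int sqlType) {
        switch (sqlType) {
            case Types.VARCHAR:
            case Types.CLOB:
            case Types.NCLOB: // the missing case reported in this issue
                return "STRING";
            case Types.INTEGER:
                return "INT";
            default:
                return null; // triggers "Hive does not support the SQL type for column ..."
        }
    }

    public static void main(String[] args) {
        System.out.println(toHiveType(Types.NCLOB)); // STRING
    }
}
{code}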



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (SQOOP-2916) When completing the TO part of a job, people cannot state custom SQL statement by any means

2016-05-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15268340#comment-15268340
 ] 

ASF GitHub Bot commented on SQOOP-2916:
---

Github user FuqiaoWang commented on the pull request:

https://github.com/apache/sqoop/pull/19#issuecomment-216464764
  
Thank you for your reply @afine. I have submitted my patch 
to https://issues.apache.org/jira/browse/SQOOP-2916 . I would be very happy if 
you could review it.


> When completing the TO part of a job, people cannot state custom SQL 
> statement by any means
> ---
>
> Key: SQOOP-2916
> URL: https://issues.apache.org/jira/browse/SQOOP-2916
> Project: Sqoop
>  Issue Type: Bug
>  Components: connectors/generic
>Affects Versions: 1.99.6
>Reporter: FuqiaoWang
> Attachments: 2916.patch
>
>
> 1. When both TableName and SQL statement are provided, it reports the 
> 'GENERIC_JDBC_CONNECTOR_0007("The table name and the table sql cannot be 
> specified together")' exception.
> 2. When only the SQL statement is provided, it reports the 'Both table name 
> and SQL cannot be specified' exception.
> The modification we have made is simply to allow the TableName and the SQL 
> statement to be input together; in this case, the SQL has the higher 
> priority and is the one executed.
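
A minimal Java sketch of the relaxed validation described above (hypothetical helper names; not the actual generic-jdbc-connector code):

{code}
public class ToPartConfigSketch {
    static String effectiveStatement(String tableName, String sql) {
        if (sql != null && !sql.isEmpty()) {
            return sql; // the SQL statement wins when both are given
        }
        if (tableName != null && !tableName.isEmpty()) {
            return "INSERT INTO " + tableName + " VALUES (?)"; // derived placeholder form
        }
        throw new IllegalArgumentException("Either a table name or a SQL statement is required");
    }

    public static void main(String[] args) {
        // With both provided, the SQL statement takes priority.
        System.out.println(effectiveStatement("emp", "INSERT INTO emp VALUES (?, ?)"));
    }
}
{code}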



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (SQOOP-2991) Never ending imports from Netezza

2016-08-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15401822#comment-15401822
 ] 

ASF GitHub Bot commented on SQOOP-2991:
---

GitHub user bonnetb opened a pull request:

https://github.com/apache/sqoop/pull/24

SQOOP-2991 : avoid endless failed Netezza import

see JIRA: https://issues.apache.org/jira/browse/SQOOP-2991

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/bonnetb/sqoop SQOOP-2991

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/sqoop/pull/24.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #24


commit 086b3599d1e7e53dd9f3574c0c3b7697dc3c762a
Author: Benjamin BONNET 
Date:   2016-08-01T10:42:16Z

SQOOP-2991 : avoid endless failed Netezza import




> Never ending imports from Netezza
> -
>
> Key: SQOOP-2991
> URL: https://issues.apache.org/jira/browse/SQOOP-2991
> Project: Sqoop
>  Issue Type: Bug
>  Components: connectors
>Affects Versions: 1.4.6
>Reporter: Benjamin BONNET
>Priority: Critical
>
> Hi,
> there are situations where a Netezza import may fail but never end (i.e. the 
> map reduce job will run for ever).
> That occurs when Sqoop manages to open a connection to the database and 
> executes a query that fails on the Netezza side without writing anything into 
> the connection. For instance, that typically occurs for authroization 
> problems. 
> Then you have to kill the map reuce job by hand if you want to free the 
> resource (memory) kept by the MR container.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (SQOOP-2930) Sqoop job exec not overriding the saved job generic properties

2016-07-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15361977#comment-15361977
 ] 

ASF GitHub Bot commented on SQOOP-2930:
---

Github user jarcec commented on the issue:

https://github.com/apache/sqoop/pull/20
  
Hi @git-rbanerjee,
The Sqoop project currently does not accept pull requests. You will need to 
generate a patch and upload it to the JIRA as per our instructions here:

https://cwiki.apache.org/confluence/display/SQOOP/How+to+Contribute


> Sqoop job exec not overriding the saved job generic properties
> --
>
> Key: SQOOP-2930
> URL: https://issues.apache.org/jira/browse/SQOOP-2930
> Project: Sqoop
>  Issue Type: Bug
>Reporter: Rabin Banerjee
> Attachments: fixpatch_v1.patch
>
>
> Sqoop job exec not overriding the saved job generic properties.
> sqoop job -Dorg.apache.sqoop.xyz=xyz --create job1 -- import .. 
> sqoop job -Dorg.apache.sqoop.xyz=abc --exec job1
> exec is not overriding the xyz with abc



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (SQOOP-2930) Sqoop job exec not overriding the saved job generic properties

2016-07-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15362321#comment-15362321
 ] 

ASF GitHub Bot commented on SQOOP-2930:
---

Github user git-rbanerjee commented on the issue:

https://github.com/apache/sqoop/pull/20
  
Thanks Jarek!!

The patch is also added:
https://issues.apache.org/jira/browse/SQOOP-2930
https://issues.apache.org/jira/secure/attachment/12806082/fixpatch_v1.patch

On Tue, Jul 5, 2016 at 9:45 AM, Jarek Jarcec Cecho  wrote:

> Hi @git-rbanerjee,
> The Sqoop project currently does not accept pull requests. You will need to
> generate a patch and upload it to the JIRA as per our instructions here:
>
> https://cwiki.apache.org/confluence/display/SQOOP/How+to+Contribute



-- 
Rabin Banerjee



> Sqoop job exec not overriding the saved job generic properties
> --
>
> Key: SQOOP-2930
> URL: https://issues.apache.org/jira/browse/SQOOP-2930
> Project: Sqoop
>  Issue Type: Bug
>Reporter: Rabin Banerjee
> Attachments: fixpatch_v1.patch
>
>
> Sqoop job exec not overriding the saved job generic properties.
> sqoop job -Dorg.apache.sqoop.xyz=xyz --create job1 -- import .. 
> sqoop job -Dorg.apache.sqoop.xyz=abc --exec job1
> exec is not overriding the xyz with abc



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (SQOOP-2949) SQL Syntax error when split-by column is of character type and min or max value has single quote inside it

2016-07-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15370391#comment-15370391
 ] 

ASF GitHub Bot commented on SQOOP-2949:
---

Github user liz-z17 commented on the issue:

https://github.com/apache/sqoop/pull/21
  
Hi @gireeshp,
The Sqoop project currently does not accept pull requests. To contribute, you 
will need to generate a patch and upload it to the JIRA.
See the instructions here: 
[https://cwiki.apache.org/confluence/display/SQOOP/How+to+Contribute](https://cwiki.apache.org/confluence/display/SQOOP/How+to+Contribute)
If you need more directions, I'd be happy to help!


> SQL Syntax error when split-by column is of character type and min or max 
> value has single quote inside it
> --
>
> Key: SQOOP-2949
> URL: https://issues.apache.org/jira/browse/SQOOP-2949
> Project: Sqoop
>  Issue Type: Bug
>Affects Versions: 1.4.6
> Environment: Sqoop 1.4.6
> Run on Hadoop 2.6.0
> On Ubuntu
>Reporter: Gireesh Puthumana
>
> Did a sqoop import from mysql table "emp", with split-by column "ename", 
> which is a varchar(100) type.
> +Used below command:+
> sqoop import --connect jdbc:mysql://localhost/testdb --username root 
> --password * --table emp --m 2 --target-dir /sqoopTest/5 --split-by ename;
> +Ename has following records:+
> | ename   |
> | gireesh |
> | aavesh  |
> | shiva'  |
> | jamir   |
> | balu|
> | santosh |
> | sameer  |
> Min value is "aavesh" and max value is "shiva'" (please note the single quote 
> inside max value).
> When run, it tried to execute below query in mapper 2 and failed:
> SELECT `ename`, `eid`, `deptid` FROM `emp` AS `emp` WHERE ( `ename` >= 
> 'jd聯聭聪G耀' ) AND ( `ename` <= 'shiva'' )
> +Stack trace:+
> {quote}
> 2016-06-05 16:54:06,749 ERROR [main] 
> org.apache.sqoop.mapreduce.db.DBRecordReader: Top level exception: 
> com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: You have an error 
> in your SQL syntax; check the manual that corresponds to your MySQL server 
> version for the right syntax to use near ''shiva'' )' at line 1
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
>   at com.mysql.jdbc.Util.handleNewInstance(Util.java:404)
>   at com.mysql.jdbc.Util.getInstance(Util.java:387)
>   at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:942)
>   at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3966)
>   at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3902)
>   at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2526)
>   at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2673)
>   at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2549)
>   at 
> com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:1861)
>   at 
> com.mysql.jdbc.PreparedStatement.executeQuery(PreparedStatement.java:1962)
>   at 
> org.apache.sqoop.mapreduce.db.DBRecordReader.executeQuery(DBRecordReader.java:111)
>   at 
> org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:235)
>   at 
> org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:553)
>   at 
> org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
>   at 
> org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>   at 
> org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> {quote}
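
A minimal Java sketch of the fix idea (assumed helper, not the actual patch): when the split column is textual, any embedded single quote must be doubled before the value is inlined into the boundary predicate.

{code}
public class SplitBoundarySketch {
    static String quoteLiteral(String value) {
        return "'" + value.replace("'", "''") + "'";
    }

    public static void main(String[] args) {
        String max = "shiva'"; // the max value from this report
        // Produces ... AND ( `ename` <= 'shiva''' ), which is now valid SQL.
        System.out.println(
            "WHERE ( `ename` >= 'aavesh' ) AND ( `ename` <= " + quoteLiteral(max) + " )");
    }
}
{code}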



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (SQOOP-2989) throw nullpointerexception

2016-07-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15399593#comment-15399593
 ] 

ASF GitHub Bot commented on SQOOP-2989:
---

Github user happyziqi commented on the issue:

https://github.com/apache/sqoop/pull/23
  
@liz-z17 
Hi, I created the issue:
https://issues.apache.org/jira/browse/SQOOP-2989
What should I do to assign it and upload the patch?


> throw nullpointerexception
> --
>
> Key: SQOOP-2989
> URL: https://issues.apache.org/jira/browse/SQOOP-2989
> Project: Sqoop
>  Issue Type: Bug
>  Components: tools
>Reporter: happyziqi
>  Labels: newbie
> Fix For: no-release
>
>
> when the configuration parameter 'bindir' points at a common directory,
> Sqoop may throw a NullPointerException if a file in that directory is 
> deleted during the jar-building stage
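
A minimal Java sketch of the race described above (assumed code shape, not the actual Sqoop jar-building code): a file listed from 'bindir' can disappear before it is used, so every step after the listing has to be guarded.

{code}
import java.io.File;

public class BindirRaceSketch {
    static long totalSize(File bindir) {
        File[] entries = bindir.listFiles();
        if (entries == null) {
            return 0; // bindir vanished or is not a directory
        }
        long size = 0;
        for (File f : entries) {
            if (!f.exists()) {
                continue; // deleted between listing and use: the NPE-prone window
            }
            size += f.length();
        }
        return size;
    }

    public static void main(String[] args) {
        System.out.println(totalSize(new File(System.getProperty("java.io.tmpdir"))));
    }
}
{code}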



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (SQOOP-3002) Sqoop Merge Tool support composite merge-key

2016-08-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15430912#comment-15430912
 ] 

ASF GitHub Bot commented on SQOOP-3002:
---

GitHub user kevin00chen opened a pull request:

https://github.com/apache/sqoop/pull/26

[SQOOP-3002] sqoop merge tool composite merge-key

JIRA Issue: https://issues.apache.org/jira/browse/SQOOP-3002
The Sqoop Merge Tool can specify just one column using the --merge-key argument.
When I need to specify two or more columns, I need to modify some source code.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kevin00chen/sqoop my_change

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/sqoop/pull/26.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #26


commit cd1e840c8dfb6261aa3be81b9c4881e80bc038bd
Author: KaimingChen 
Date:   2016-08-22T14:34:45Z

sqoop merge tool composite merge-key




> Sqoop Merge Tool support composite merge-key
> 
>
> Key: SQOOP-3002
> URL: https://issues.apache.org/jira/browse/SQOOP-3002
> Project: Sqoop
>  Issue Type: Improvement
>  Components: hive-integration
>Affects Versions: 1.4.5, 1.4.6, 1.99.5, 1.99.7
>Reporter: KaimingChen
>
> When I use the sqoop merge tool, I can specify just one column using the 
> --merge-key argument. 
> But when my table has composite keys and I use --merge-key column1,column2, 
> I get an Exception:
> 16/08/22 15:54:15 INFO mapreduce.Job: Task Id : 
> attempt_1470135750174_2508_m_04_2, Status : FAILED
> Error: java.io.IOException: Cannot join values on null key. Did you specify a 
> key column that exists?
>   at 
> org.apache.sqoop.mapreduce.MergeMapperBase.processRecord(MergeMapperBase.java:79)
>   at 
> org.apache.sqoop.mapreduce.MergeTextMapper.map(MergeTextMapper.java:58)
>   at 
> org.apache.sqoop.mapreduce.MergeTextMapper.map(MergeTextMapper.java:34)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:339)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (SQOOP-3002) Sqoop Merge Tool support composite merge-key

2016-08-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15430980#comment-15430980
 ] 

ASF GitHub Bot commented on SQOOP-3002:
---

Github user kevin00chen commented on a diff in the pull request:

https://github.com/apache/sqoop/pull/26#discussion_r75698166
  
--- Diff: src/java/org/apache/sqoop/mapreduce/MergeMapperBase.java ---
@@ -76,9 +76,10 @@ protected void processRecord(SqoopRecord r, Context c)
     }
     Object keyObj = null;
     if (keyColName.contains(",")) {
+      String connectStr = new String(new byte[]{1});
       StringBuilder keyFieldsSb = new StringBuilder();
       for (String str : keyColName.split(",")) {
-        keyFieldsSb.append("+").append(fieldMap.get(str).toString());
+        keyFieldsSb.append(connectStr).append(fieldMap.get(str).toString());
--- End diff --

For example, suppose a table has two columns, a and b:

Field a | Field b
------- | -------
a+      | b
a       | +b

When "+" is used to connect the two fields, both records end up with the same 
keyObj. To avoid this I use a String containing a single 0x01 byte.
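
A self-contained Java sketch of the separator choice (simplified; not MergeMapperBase itself): joining composite key fields with "+" is ambiguous, while an unprintable \u0001 separator keeps distinct key pairs distinct.

{code}
public class CompositeKeySketch {
    static String join(String sep, String... fields) {
        StringBuilder sb = new StringBuilder();
        for (String f : fields) {
            sb.append(sep).append(f);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String sep = new String(new byte[]{1}); // the one-byte separator from the diff

        // Ambiguous with "+": both rows collapse to "+a++b".
        System.out.println(join("+", "a+", "b").equals(join("+", "a", "+b"))); // true

        // Unambiguous with \u0001, which user data will not normally contain.
        System.out.println(join(sep, "a+", "b").equals(join(sep, "a", "+b"))); // false
    }
}
{code}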


> Sqoop Merge Tool support composite merge-key
> 
>
> Key: SQOOP-3002
> URL: https://issues.apache.org/jira/browse/SQOOP-3002
> Project: Sqoop
>  Issue Type: Improvement
>  Components: hive-integration
>Affects Versions: 1.4.5, 1.4.6, 1.99.5, 1.99.7
>Reporter: KaimingChen
>
> When I use the sqoop merge tool, I can specify just one column using the 
> --merge-key argument. 
> But when my table has composite keys and I use --merge-key column1,column2, 
> I get an Exception:
> 16/08/22 15:54:15 INFO mapreduce.Job: Task Id : 
> attempt_1470135750174_2508_m_04_2, Status : FAILED
> Error: java.io.IOException: Cannot join values on null key. Did you specify a 
> key column that exists?
>   at 
> org.apache.sqoop.mapreduce.MergeMapperBase.processRecord(MergeMapperBase.java:79)
>   at 
> org.apache.sqoop.mapreduce.MergeTextMapper.map(MergeTextMapper.java:58)
>   at 
> org.apache.sqoop.mapreduce.MergeTextMapper.map(MergeTextMapper.java:34)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:339)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (SQOOP-3002) Sqoop Merge Tool support composite merge-key

2016-08-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15430986#comment-15430986
 ] 

ASF GitHub Bot commented on SQOOP-3002:
---

Github user kevin00chen commented on a diff in the pull request:

https://github.com/apache/sqoop/pull/26#discussion_r75698508
  
--- Diff: src/java/org/apache/sqoop/mapreduce/MergeMapperBase.java ---
@@ -76,9 +76,10 @@ protected void processRecord(SqoopRecord r, Context c)
     }
     Object keyObj = null;
     if (keyColName.contains(",")) {
+      String connectStr = new String(new byte[]{1});
       StringBuilder keyFieldsSb = new StringBuilder();
       for (String str : keyColName.split(",")) {
-        keyFieldsSb.append("+").append(fieldMap.get(str).toString());
+        keyFieldsSb.append(connectStr).append(fieldMap.get(str).toString());
--- End diff --

For example, suppose a table has two columns, a and b:

Field a | Field b
------- | -------
a+      | b
a       | +b

When "+" is used to connect the two fields, both records end up with the same 
keyObj. To avoid this I use a String containing a single 0x01 byte.


> Sqoop Merge Tool support composite merge-key
> 
>
> Key: SQOOP-3002
> URL: https://issues.apache.org/jira/browse/SQOOP-3002
> Project: Sqoop
>  Issue Type: Improvement
>  Components: hive-integration
>Affects Versions: 1.4.5, 1.4.6, 1.99.5, 1.99.7
>Reporter: KaimingChen
>
> When I use the sqoop merge tool, I can specify just one column using the 
> --merge-key argument. 
> But when my table has composite keys and I use --merge-key column1,column2, 
> I get an Exception:
> 16/08/22 15:54:15 INFO mapreduce.Job: Task Id : 
> attempt_1470135750174_2508_m_04_2, Status : FAILED
> Error: java.io.IOException: Cannot join values on null key. Did you specify a 
> key column that exists?
>   at 
> org.apache.sqoop.mapreduce.MergeMapperBase.processRecord(MergeMapperBase.java:79)
>   at 
> org.apache.sqoop.mapreduce.MergeTextMapper.map(MergeTextMapper.java:58)
>   at 
> org.apache.sqoop.mapreduce.MergeTextMapper.map(MergeTextMapper.java:34)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:339)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (SQOOP-3010) Sqoop should not allow --as-parquetfile when job type is hcatalog

2016-09-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-3010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15487775#comment-15487775
 ] 

ASF GitHub Bot commented on SQOOP-3010:
---

GitHub user git-rbanerjee opened a pull request:

https://github.com/apache/sqoop/pull/28

[SQOOP-3010] Fix for Sqoop should not allow --as-parquetfile when job…

… type is hcatalog

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/git-rbanerjee/sqoop-1 trunk

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/sqoop/pull/28.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #28


commit b8b986f264d2d5fe5f8c7611477e8e7759e46762
Author: CodeR 
Date:   2016-09-13T17:05:00Z

[SQOOP-3010] Fix for Sqoop should not allow --as-parquetfile when job type 
is hcatalog




> Sqoop should not allow --as-parquetfile when job type is hcatalog
> -
>
> Key: SQOOP-3010
> URL: https://issues.apache.org/jira/browse/SQOOP-3010
> Project: Sqoop
>  Issue Type: Bug
>Affects Versions: 1.4.6
>Reporter: Sowmya Ramesh
>Assignee: Sowmya Ramesh
>
> sqoop import ... --create-hcatalog-table --hcatalog-table --as-parquetfile
> {noformat}
>   Error: java.lang.RuntimeException: Should never be used
>   at 
> org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat.getRecordWriter(MapredParquetOutputFormat.java:76)
>   at 
> org.apache.hive.hcatalog.mapreduce.FileOutputFormatContainer.getRecordWriter(FileOutputFormatContainer.java:103)
> {noformat}
> This should not run: it should fail with a validation error, as it does for 
> both --as-sequencefile and --as-avrodatafile, but instead the job runs and 
> fails later with a RuntimeException.
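
A minimal sketch (hypothetical method and option names, not Sqoop's actual
validation code) of the kind of up-front check this issue asks for:

{code}
public class HCatalogOptionCheck {
  /** Rejects --as-parquetfile for hcatalog jobs before any MR job starts. */
  static void validateFileLayout(boolean isHCatalogJob, String fileLayout) {
    if (isHCatalogJob && "parquetfile".equals(fileLayout)) {
      throw new IllegalArgumentException(
          "--as-parquetfile is not supported with hcatalog jobs");
    }
  }
}
{code}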



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (SQOOP-3002) Sqoop Merge Tool support composite merge-key

2016-09-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15521446#comment-15521446
 ] 

ASF GitHub Bot commented on SQOOP-3002:
---

Github user liz-z17 commented on the issue:

https://github.com/apache/sqoop/pull/26
  
Hi @kevin00chen ,
The Sqoop project currently does not accept pull requests. To contribute, you 
will need to generate a patch and upload it to JIRA (of course, if there's no 
corresponding JIRA issue, you will also need to create one first).
See the instructions here: 
https://cwiki.apache.org/confluence/display/SQOOP/How+to+Contribute
If you need more directions, I'd be happy to help!


> Sqoop Merge Tool support composite merge-key
> 
>
> Key: SQOOP-3002
> URL: https://issues.apache.org/jira/browse/SQOOP-3002
> Project: Sqoop
>  Issue Type: Improvement
>  Components: hive-integration
>Affects Versions: 1.4.5, 1.4.6, 1.99.5, 1.99.7
>Reporter: KaimingChen
>
> When I use the Sqoop merge tool, I can only specify one column using the 
> --merge-key argument. 
> But when my table has composite keys and I use --merge-key column1,column2, 
> I get an exception:
> 16/08/22 15:54:15 INFO mapreduce.Job: Task Id : 
> attempt_1470135750174_2508_m_04_2, Status : FAILED
> Error: java.io.IOException: Cannot join values on null key. Did you specify a 
> key column that exists?
>   at 
> org.apache.sqoop.mapreduce.MergeMapperBase.processRecord(MergeMapperBase.java:79)
>   at 
> org.apache.sqoop.mapreduce.MergeTextMapper.map(MergeTextMapper.java:58)
>   at 
> org.apache.sqoop.mapreduce.MergeTextMapper.map(MergeTextMapper.java:34)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:339)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (SQOOP-3010) Sqoop should not allow --as-parquetfile with hcatalog jobs or when hive import with create-hive-table is used

2016-09-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-3010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15521450#comment-15521450
 ] 

ASF GitHub Bot commented on SQOOP-3010:
---

Github user liz-z17 commented on the issue:

https://github.com/apache/sqoop/pull/28
  
Hi @git-rbanerjee ,
The Sqoop project currently does not accept pull requests. To contribute, you 
will need to generate a patch and upload it to JIRA.
See the instructions here: 
https://cwiki.apache.org/confluence/display/SQOOP/How+to+Contribute
If you need more directions, I'd be happy to help!


> Sqoop should not allow --as-parquetfile with hcatalog jobs or when hive 
> import with create-hive-table is used
> -
>
> Key: SQOOP-3010
> URL: https://issues.apache.org/jira/browse/SQOOP-3010
> Project: Sqoop
>  Issue Type: Bug
>Affects Versions: 1.4.6
>Reporter: Sowmya Ramesh
>Assignee: Sowmya Ramesh
>
> sqoop import ... --create-hcatalog-table --hcatalog-table --as-parquetfile
> {noformat}
>   Error: java.lang.RuntimeException: Should never be used
>   at 
> org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat.getRecordWriter(MapredParquetOutputFormat.java:76)
>   at 
> org.apache.hive.hcatalog.mapreduce.FileOutputFormatContainer.getRecordWriter(FileOutputFormatContainer.java:103)
> {noformat}
> This should not run: it should fail with a validation error, as it does for 
> both --as-sequencefile and --as-avrodatafile, but instead the job runs and 
> fails later with a RuntimeException.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (SQOOP-3043) Sqoop HiveImport fails with Wrong FS while removing the _logs

2016-11-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-3043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15635810#comment-15635810
 ] 

ASF GitHub Bot commented on SQOOP-3043:
---

GitHub user RameshByndoor opened a pull request:

https://github.com/apache/sqoop/pull/29

SQOOP-3043: Sqoop HiveImport fails with Wrong FS while removing the _…



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/RameshByndoor/sqoop trunk

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/sqoop/pull/29.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #29


commit 3220e3029edd9e0c38c2d973764f4f991b7165f3
Author: Ramesh B 
Date:   2016-11-04T09:29:55Z

SQOOP-3043: Sqoop HiveImport fails with Wrong FS while removing the _logs




> Sqoop HiveImport fails with Wrong FS while removing the _logs 
> --
>
> Key: SQOOP-3043
> URL: https://issues.apache.org/jira/browse/SQOOP-3043
> Project: Sqoop
>  Issue Type: Bug
>  Components: hive-integration
>Reporter: Ramesh B
>
> With an s3:// --target-dir and --hive-import provided, Sqoop fails with 
> {code}ERROR tool.ImportTool: Imported Failed: Wrong FS: 
> s3a://dataplatform/sqoop/target/user/_logs, expected: hdfs://nn1
> {code}
> This is due to the removeTempLogs method in HiveImport.java, which expects 
> an HDFS path.
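
A minimal sketch (assumed helper, not the actual HiveImport code) of the usual
fix for this class of "Wrong FS" error - resolve the FileSystem from the Path
itself instead of taking the cluster default:

{code}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class RemoveTempLogsSketch {
  static void removeTempLogs(Configuration conf, Path tablePath) throws IOException {
    Path logsPath = new Path(tablePath, "_logs");
    // FileSystem.get(conf) returns the default FS (e.g. hdfs://nn1) and
    // rejects an s3a:// path; deriving the FS from the path itself works
    // for any scheme.
    FileSystem fs = logsPath.getFileSystem(conf);
    if (fs.exists(logsPath) && !fs.delete(logsPath, true)) {
      throw new IOException("Could not remove temp logs at " + logsPath);
    }
  }
}
{code}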



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (SQOOP-3172) SQOOP - Broken Pipe Error in the Sqoop export

2017-04-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-3172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15967249#comment-15967249
 ] 

ASF GitHub Bot commented on SQOOP-3172:
---

GitHub user shetaksroc opened a pull request:

https://github.com/apache/sqoop/pull/35

SQOOP-3172:checking whether the object is stale or not



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/shetaksroc/sqoop fix_state_connection

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/sqoop/pull/35.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #35


commit b26615cc69885e96fd100e7451e0f5657cd5616b
Author: akshayshet 
Date:   2017-04-13T07:43:09Z

SQOOP-3172:checking whether the object is stale or not




> SQOOP - Broken Pipe Error in the Sqoop export
> -
>
> Key: SQOOP-3172
> URL: https://issues.apache.org/jira/browse/SQOOP-3172
> Project: Sqoop
>  Issue Type: Bug
>Reporter: Akshay
>  Labels: sqoop, testing
>
> Error: java.io.IOException: Broken pipe
> at java.io.FileOutputStream.writeBytes(Native Method)
> at java.io.FileOutputStream.write(FileOutputStream.java:326)
> at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
> at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
> at java.io.FilterOutputStream.close(FilterOutputStream.java:158)
> at 
> org.apache.sqoop.mapreduce.MySQLExportMapper.closeExportHandles(MySQLExportMapper.java:259)
> at 
> org.apache.sqoop.mapreduce.MySQLExportMapper.run(MySQLExportMapper.java:250)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:796)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (SQOOP-3172) SQOOP - Broken Pipe Error in the Sqoop export

2017-04-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-3172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15967277#comment-15967277
 ] 

ASF GitHub Bot commented on SQOOP-3172:
---

Github user ebogi commented on the issue:

https://github.com/apache/sqoop/pull/35
  
Hi,
The Sqoop project currently does not accept pull requests. To contribute, you 
will need to generate a patch and upload it to JIRA. See the instructions here: 
https://cwiki.apache.org/confluence/display/SQOOP/How+to+Contribute
If you need more help, feel free to reach out to the community at 
dev@sqoop.apache.org


> SQOOP - Broken Pipe Error in the Sqoop export
> -
>
> Key: SQOOP-3172
> URL: https://issues.apache.org/jira/browse/SQOOP-3172
> Project: Sqoop
>  Issue Type: Bug
>Affects Versions: 1.4.5
>Reporter: Akshay
>  Labels: sqoop, testing
> Fix For: 1.4.5
>
>
> Error: java.io.IOException: Broken pipe
> at java.io.FileOutputStream.writeBytes(Native Method)
> at java.io.FileOutputStream.write(FileOutputStream.java:326)
> at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
> at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
> at java.io.FilterOutputStream.close(FilterOutputStream.java:158)
> at 
> org.apache.sqoop.mapreduce.MySQLExportMapper.closeExportHandles(MySQLExportMapper.java:259)
> at 
> org.apache.sqoop.mapreduce.MySQLExportMapper.run(MySQLExportMapper.java:250)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:796)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (SQOOP-2903) Add Kudu connector for Sqoop

2017-07-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16077283#comment-16077283
 ] 

ASF GitHub Bot commented on SQOOP-2903:
---

GitHub user cammachusa opened a pull request:

https://github.com/apache/sqoop/pull/37

[SQOOP-2903] - Add Kudu connector for Sqoop



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/InspurUSA/sqoop SQOOP-2903

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/sqoop/pull/37.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #37


commit c9c07b2e8eabfb2066187c42a4d533e29ceded6c
Author: cam 
Date:   2017-07-04T00:04:41Z

[SQOOP-2903] - Add Kudu connector for Sqoop




> Add Kudu connector for Sqoop
> 
>
> Key: SQOOP-2903
> URL: https://issues.apache.org/jira/browse/SQOOP-2903
> Project: Sqoop
>  Issue Type: Improvement
>  Components: connectors
>Reporter: Sameer Abhyankar
>Assignee: Sameer Abhyankar
> Attachments: SQOOP-2903.1.patch, SQOOP-2903.2.patch, SQOOP-2903.patch
>
>
> Sqoop currently does not have a connector for Kudu. We should add the 
> functionality to allow Sqoop to ingest data directly into Kudu.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (SQOOP-3264) Import JDBC SQL date,time,timestamp to Hive as TIMESTAMP, BIGINT and TIMESTAMP

2017-11-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-3264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16269004#comment-16269004
 ] 

ASF GitHub Bot commented on SQOOP-3264:
---

GitHub user michalklempa opened a pull request:

https://github.com/apache/sqoop/pull/40

SQOOP-3264 Hive types for JDBC timestamp, date and time types are 
timestamp, timestamp and bigint respectively

This resolves https://issues.apache.org/jira/browse/SQOOP-3264
Although further testing on different databases is needed (help welcome).

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/triviadata/sqoop 
SQOOP-3264_date_to_hive_timestamp

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/sqoop/pull/40.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #40


commit ceefc33a6c5632f1300bddc73c20f7819959f10f
Author: Michal Klempa 
Date:   2017-11-28T16:30:28Z

SQOOP-3264 Hive types for JDBC timestamp, date and time types are 
timestamp, timestamp and bigint respectively




> Import JDBC SQL date,time,timestamp to Hive as TIMESTAMP, BIGINT and TIMESTAMP
> --
>
> Key: SQOOP-3264
> URL: https://issues.apache.org/jira/browse/SQOOP-3264
> Project: Sqoop
>  Issue Type: Improvement
>  Components: hive-integration
>Affects Versions: 1.4.6
>Reporter: Michal Klempa
>Priority: Minor
> Fix For: 1.4.7
>
>
> When importing JDBC SQL  Types:
> {code}
> public final static int DATE=  91;
> public final static int TIME=  92;
> public final static int TIMESTAMP   =  93;
> {code}
> Sqoop currently uses the org.apache.sqoop.hive.HiveTypes.toHiveType method, 
> where all of these types are mapped to STRING type.
> Given that the JDBC value returned is in fact of type Long, let me propose 
> that we output the Hive types as:
> {code}
> DATE -> TIMESTAMP
> TIME -> BIGINT
> TIMESTAMP -> TIMESTAMP
> {code}
> This is also in line with org.apache.sqoop.manager.ConnManager.toAvroType, 
> where the type is 
> {code}
> case Types.DATE:
> case Types.TIME:
> case Types.TIMESTAMP:
>   return Type.LONG;
> {code}
> Some of the connectors override the toJavaType:
> {code}
> org.apache.sqoop.manager.SQLServerManager
> org.apache.sqoop.manager.oracle.OraOopConnManager
> {code}
> which may indicate different handling.
> The SQLServerManager uses Java String as the output type because of 
> timezones.
> The same holds true for OraOopConnManager, although it has a separate 
> boolean configuration value, 
> 'oraoop.timestamp.string', which controls whether the import respects 
> timezones and converts date types 
> to Java String, or whether timezones are dropped and the import behaves 
> the 'sqoop way'.
> Both of these connectors already handle these types as String by default, 
> so the proposed change would not affect them.
> Other connectors still need to be checked.
> Some of the connectors override the toHiveType:
> {code}
> org.apache.sqoop.manager.oracle.OraOopConnManager
> {code}
> This connector uses the 'sqoop way':
> {code}
> String hiveType = super.toHiveType(sqlType);
> {code}
> and only when not resolved, the type used is decided:
> {code}
> if (hiveType == null) {
>   // http://wiki.apache.org/hadoop/Hive/Tutorial#Primitive_Types
>   if (sqlType == OraOopOracleQueries.getOracleType("BFILE")
>   || sqlType == OraOopOracleQueries.getOracleType("INTERVALYM")
>   || sqlType == OraOopOracleQueries.getOracleType("INTERVALDS")
>   || sqlType == OraOopOracleQueries.getOracleType("NCLOB")
>   || sqlType == OraOopOracleQueries.getOracleType("NCHAR")
>   || sqlType == OraOopOracleQueries.getOracleType("NVARCHAR")
>   || sqlType == OraOopOracleQueries.getOracleType("OTHER")
>   || sqlType == OraOopOracleQueries.getOracleType("ROWID")
>   || sqlType == OraOopOracleQueries.getOracleType("TIMESTAMPTZ")
>   || sqlType == OraOopOracleQueries.getOracleType("TIMESTAMPLTZ")
>   || sqlType == OraOopOracleQueries.getOracleType("STRUCT")) {
> hiveType = "STRING";
>   }
>   if (sqlType == OraOopOracleQueries.getOracleType("BINARY_FLOAT")) {
> hiveType = "FLOAT";
>   }
>   if (sqlType == OraOopOracleQueries.getOracleType("BINARY_DOUBLE")) {
> hiveType = "DOUBLE";
>   }
> }
> {code}
> This code is affected by the proposed change. As the Hive TIMESTAMP is 
> timezone-less, we have to change the handling in this method to respect the 
> property 'oraoop.timestamp.string' - if true, output the STRING hive 
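
A minimal sketch (not Sqoop's actual HiveTypes class) of the mapping proposed
above, replacing the current blanket STRING mapping:

{code}
import java.sql.Types;

public class ProposedHiveTypeMapping {
  static String toHiveType(int sqlType) {
    switch (sqlType) {
      case Types.DATE:      return "TIMESTAMP"; // currently STRING
      case Types.TIME:      return "BIGINT";    // currently STRING
      case Types.TIMESTAMP: return "TIMESTAMP"; // currently STRING
      default:              return null;        // caller falls back as today
    }
  }
}
{code}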

[jira] [Commented] (SQOOP-3224) Mainframe FTP transfer should have an option to use binary mode for transfer

2018-05-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-3224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16490248#comment-16490248
 ] 

ASF GitHub Bot commented on SQOOP-3224:
---

GitHub user christeoh opened a pull request:

https://github.com/apache/sqoop/pull/46

SQOOP-3224: Mainframe FTP transfer should have an option to use binary mode 
for transfer

Added --as-binaryfile and --buffersize for FTP binary mode transfers.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/christeoh/sqoop 3224-4

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/sqoop/pull/46.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #46


commit b5cadeebb6f05df29c6018b391e5965a2caecbdb
Author: Chris Teoh 
Date:   2018-05-23T06:00:43Z

Fixed merge conflict

commit c0de096f9dcb3109b1e8200e9494ee19c4bf203f
Author: Chris Teoh 
Date:   2018-05-23T06:01:50Z

Refactored com.cloudera namespace to org.apache.sqoop

commit 49108cdf6185ad1881a857911351d4cc81fd34dd
Author: Chris Teoh 
Date:   2018-05-23T07:03:33Z

Added --as-binary flag

commit cce81e53b82eb95ac5946d6092ece7178e281726
Author: Chris Teoh 
Date:   2017-11-15T02:36:07Z

Moved mainframe FTP transfermode default setting to initDefaults()

commit 19cb4a11051f7bc094ff13f54f0ecb47a91677cd
Author: Chris Teoh 
Date:   2017-11-15T02:38:02Z

Replaced import java.io.* with single class imports

commit ad54c7caf205ec510feac0708ff130cce3d8970e
Author: Chris Teoh 
Date:   2017-11-16T02:07:07Z

Removed excessive logging per record to improve performance

commit e48820ea21598b44630e4331be4ee04bb2842d5e
Author: Chris Teoh 
Date:   2017-11-16T02:07:42Z

Added comment to document why we need to add custom class for binary 
transfers

commit 288412b7db4d731506b97eb2be2229ba1bcad639
Author: Chris Teoh 
Date:   2017-11-16T03:27:48Z

Converted to use BufferedInputStream instead of InputStream

commit e4a1f3a5a4a6f1fcc562b26eeda109d773b854e1
Author: Chris Teoh 
Date:   2017-11-17T00:57:48Z

Added unit tests for MainframeDatasetFTPRecordReader.getNextBinaryRecord

commit 290d5895b37ef9ca515d14e7e5d4d13730684e15
Author: Chris Teoh 
Date:   2017-11-17T01:19:04Z

Updated unit tests and used helper classes

commit 51e6d75767e56d481467d6d6c7de0bf0c76fba1d
Author: Chris Teoh 
Date:   2017-11-17T01:22:06Z

Updated unit tests to use a method of org.junit.Assert

commit 8e5ea6f8d993a4c479ac20e87f5b4b7cf2e9c8df
Author: Chris Teoh 
Date:   2017-11-17T01:35:04Z

Updated unit test for compilation

commit c737ea28e57c517d8f28a81802978e11e768ec3b
Author: Chris Teoh 
Date:   2017-11-17T05:28:57Z

Used StringUtils to do comparisons and corrected bulk imports

commit 48602eb6e5d70ca86456a30862e583ad82e863e0
Author: Chris Teoh 
Date:   2017-11-28T03:51:07Z

Replaced star import with specific class import

commit e81b400ec2dc4c47d46d1db198ab665c0a85de3c
Author: Chris Teoh 
Date:   2017-11-28T03:51:33Z

Updated to use current class instead of deprecated class

commit 3fd76409108184eace1ae1b60cad0d739af474bd
Author: Chris Teoh 
Date:   2017-11-28T03:52:10Z

Refactored common functionality to another function

commit c50bd2183717f3b2a393c92711ef19fdab4dbbd2
Author: Chris Teoh 
Date:   2017-11-28T03:52:30Z

Adjusted comment

commit 6a66f3e88e7150d0dceba2a6accf120ea4498199
Author: Chris Teoh 
Date:   2017-11-29T04:36:44Z

Moved tests from TestMainframeDatasetFTPRecordReader to separate class

commit c3f1de55bd2dabc1efcdf798cd31c6d981e23c0f
Author: Chris Teoh 
Date:   2017-11-29T04:37:15Z

Adjusted class for unit test support:

commit bd487e9804fd58188f2a48ec5db504a678e7bf8c
Author: Chris Teoh 
Date:   2017-11-29T04:38:43Z

Adjusted exceptions to print full stack

commit 81416f08ae11f67854be89d2abc6a652fd28c3f8
Author: Chris Teoh 
Date:   2017-11-29T04:39:05Z

Moved unit tests to another class

commit 40e77151d65d8f8b3cd9dd983283ef7f53fad73e
Author: Chris Teoh 
Date:   2017-11-29T06:17:48Z

Updated unit tests

commit 2ad7205ad1272b3b5965b8918a2ba69672d8c8d2
Author: Chris Teoh 
Date:   2017-11-29T11:27:58Z

Tidied up unit tests

commit bc7c43338d21eaa67ea5d92ef4a1fff8efd5783f
Author: Chris Teoh 
Date:   2017-11-29T11:43:28Z

Updated getNextBinaryRecord logic to be simpler

commit eed86f87a904697098ccac952efab9ecbd98db84
Author: Chris Teoh 
Date:   2017-12-12T22:42:54Z

Added license information

commit ed7c2d5ed8ee7a2631a453c87a40cdf3a8d194cc
Author: Chris Teoh 
Date:   

[jira] [Commented] (SQOOP-3224) Mainframe FTP transfer should have an option to use binary mode for transfer

2018-06-17 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16515282#comment-16515282
 ] 

ASF GitHub Bot commented on SQOOP-3224:
---

Github user christeoh closed the pull request at:

https://github.com/apache/sqoop/pull/47


> Mainframe FTP transfer should have an option to use binary mode for transfer
> 
>
> Key: SQOOP-3224
> URL: https://issues.apache.org/jira/browse/SQOOP-3224
> Project: Sqoop
>  Issue Type: Improvement
>Reporter: Chris Teoh
>Assignee: Chris Teoh
>Priority: Minor
>
> Currently the mainframe FTP module is hard-coded to use ASCII transfer mode. 
> I propose a mainframe module flag to allow changing modes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3224) Mainframe FTP transfer should have an option to use binary mode for transfer

2018-05-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16495081#comment-16495081
 ] 

ASF GitHub Bot commented on SQOOP-3224:
---

Github user christeoh closed the pull request at:

https://github.com/apache/sqoop/pull/44


> Mainframe FTP transfer should have an option to use binary mode for transfer
> 
>
> Key: SQOOP-3224
> URL: https://issues.apache.org/jira/browse/SQOOP-3224
> Project: Sqoop
>  Issue Type: Improvement
>Reporter: Chris Teoh
>Assignee: Chris Teoh
>Priority: Minor
>
> Currently the mainframe FTP module is hard-coded to use ASCII transfer mode. 
> I propose a mainframe module flag to allow changing modes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3224) Mainframe FTP transfer should have an option to use binary mode for transfer

2018-05-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16495080#comment-16495080
 ] 

ASF GitHub Bot commented on SQOOP-3224:
---

Github user christeoh closed the pull request at:

https://github.com/apache/sqoop/pull/46


> Mainframe FTP transfer should have an option to use binary mode for transfer
> 
>
> Key: SQOOP-3224
> URL: https://issues.apache.org/jira/browse/SQOOP-3224
> Project: Sqoop
>  Issue Type: Improvement
>Reporter: Chris Teoh
>Assignee: Chris Teoh
>Priority: Minor
>
> Currently the mainframe FTP module is hard-coded to use ASCII transfer mode. 
> I propose a mainframe module flag to allow changing modes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3224) Mainframe FTP transfer should have an option to use binary mode for transfer

2018-05-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16495085#comment-16495085
 ] 

ASF GitHub Bot commented on SQOOP-3224:
---

GitHub user christeoh opened a pull request:

https://github.com/apache/sqoop/pull/47

SQOOP-3224: Mainframe Binary File Transfer mode



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/christeoh/sqoop 3224-5

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/sqoop/pull/47.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #47


commit e450d857eb09418463d8e4a312a5acd4866436f4
Author: Chris Teoh 
Date:   2018-05-23T06:00:43Z

Fixed merge conflict

commit f0136a65da4eae8a7f61e411fca17c4ab1be5aca
Author: Chris Teoh 
Date:   2018-05-23T06:01:50Z

Refactored com.cloudera namespace to org.apache.sqoop

commit 16d9821531e55819bf6e1d35a4cca841d991344f
Author: Chris Teoh 
Date:   2018-05-23T07:03:33Z

Added --as-binary flag

commit 0f743e49ee6ed78f3363461797cca794766c6b86
Author: Chris Teoh 
Date:   2017-11-15T02:36:07Z

Moved mainframe FTP transfermode default setting to initDefaults()

commit 348ac8e66d5d44d06fffa0a0ffb0ef0ad1d1a20d
Author: Chris Teoh 
Date:   2017-11-15T02:38:02Z

Replaced import java.io.* with single class imports

commit 45346912ac8a8c0f9b34e80b9b20517e46288e0e
Author: Chris Teoh 
Date:   2017-11-16T02:07:07Z

Removed excessive logging per record to improve performance

commit 71cf80319a0d74e1d376a92bc82ad879881ade28
Author: Chris Teoh 
Date:   2017-11-16T02:07:42Z

Added comment to document why we need to add custom class for binary 
transfers

commit 57c6ece38654c5e92744b88b395f1e6e3d3eaaef
Author: Chris Teoh 
Date:   2017-11-16T03:27:48Z

Converted to use BufferedInputStream instead of InputStream

commit 49382f0272c70a99f9e2b91cbcefda274190e2bb
Author: Chris Teoh 
Date:   2017-11-17T00:57:48Z

Added unit tests for MainframeDatasetFTPRecordReader.getNextBinaryRecord

commit b4ab6218a94b28e76ea2e6670eb60489c41c4dde
Author: Chris Teoh 
Date:   2017-11-17T01:19:04Z

Updated unit tests and used helper classes

commit c10c1ef0f06b40424ac225c6bc95722d7e5f8a90
Author: Chris Teoh 
Date:   2017-11-17T01:22:06Z

Updated unit tests to use a method of org.junit.Assert

commit e03f5b70d6e9e836c7cfaad8ec433dd21d97e965
Author: Chris Teoh 
Date:   2017-11-17T01:35:04Z

Updated unit test for compilation

commit 11cb40a0a397019e9fcd9940941c4e76766f3c77
Author: Chris Teoh 
Date:   2017-11-17T05:28:57Z

Used StringUtils to do comparisons and corrected bulk imports

commit cf26972ebccc673960130dfb33b35b472e8387de
Author: Chris Teoh 
Date:   2017-11-28T03:51:07Z

Replaced star import with specific class import

commit e002e17a9231e4c1f3d8cf4fd4e5a1b3cf517c9c
Author: Chris Teoh 
Date:   2017-11-28T03:52:10Z

Refactored common functionality to another function

commit 07cfa041c9b5e9e4dd4888cad1cf0ae93f28
Author: Chris Teoh 
Date:   2017-11-28T03:52:30Z

Adjusted comment

commit 32aa551a1c42732d871d1849800a512a40f7e5ed
Author: Chris Teoh 
Date:   2017-11-29T04:36:44Z

Moved tests from TestMainframeDatasetFTPRecordReader to separate class

commit 5d6c56bc46a294b97430bb22479d733b6976c4a0
Author: Chris Teoh 
Date:   2017-11-29T04:37:15Z

Adjusted class for unit test support:

commit f52e5c9f23c83318b3cd1d302abcd277167ee51c
Author: Chris Teoh 
Date:   2017-11-29T04:38:43Z

Adjusted exceptions to print full stack

commit b6ab74742c96fa616707238364001c9bf0b63baa
Author: Chris Teoh 
Date:   2017-11-29T04:39:05Z

Moved unit tests to another class

commit 6cd6f091ca2f600656f0386b99a8a28678859b2a
Author: Chris Teoh 
Date:   2017-11-29T06:17:48Z

Updated unit tests

commit 75c98768589c2095fdc8d98442baae06265e7733
Author: Chris Teoh 
Date:   2017-11-29T11:27:58Z

Tidied up unit tests

commit 96a93393bb26b186046df090af3754fa2eea18bb
Author: Chris Teoh 
Date:   2017-11-29T11:43:28Z

Updated getNextBinaryRecord logic to be simpler

commit 8ccbdbf5316e943ba0ee0ba2bfc3873fb14397c7
Author: Chris Teoh 
Date:   2017-12-12T22:42:54Z

Added license information

commit ac0f23decd760c052ebc40d358bc6cd627253517
Author: Chris Teoh 
Date:   2017-10-25T10:20:07Z

Added support for binary ftp transfers

commit 1785a0ce89fb89e06497853d517a910ac8d2b406
Author: Chris Teoh 
Date:   2017-11-15T02:36:07Z

Moved mainframe FTP transfermode default setting to initDefaults()

commit 9165090ae43381e5fc8d2b0023fd12b835aa06a3
Author: Chris Teoh 
Date:   2017-11-15T02:38:02Z

Replaced import java.io.* with single class imports

commit a9e9427d91811cd8e73d2adc1a6f19a8e1be858a
Author: Chris Teoh 
Date:   2017-11-16T02:07:07Z

Removed excessive logging per record to improve performance

commit 

[jira] [Commented] (SQOOP-3273) Removing com.cloudera.sqoop packages

2018-01-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16316138#comment-16316138
 ] 

ASF GitHub Bot commented on SQOOP-3273:
---

GitHub user szvasas opened a pull request:

https://github.com/apache/sqoop/pull/42

Cloudera package removal

This pull request is created for: 
https://issues.apache.org/jira/browse/SQOOP-3273
Corresponding RB review is: https://reviews.apache.org/r/65017

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/szvasas/sqoop cloudera_package_removal

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/sqoop/pull/42.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #42


commit 22122ff17d656455674aaad5dcc0d7d35a338944
Author: Szabolcs Vasas 
Date:   2017-12-18T10:12:36Z

com.cloudera.sqoop.cli.RelatedOptions is not used anymore.

commit dbcca41ef0c51386759c85e1b748e5ed25344752
Author: Szabolcs Vasas 
Date:   2017-12-18T10:14:56Z

com.cloudera.sqoop.cli.SqoopParser is not used anymore.

commit 8874646ec1c13800afc9b67227b443caa01309ff
Author: Szabolcs Vasas 
Date:   2017-12-18T10:18:57Z

com.cloudera.sqoop.cli.ToolOptions is not used anymore.

commit e5dfa077dffd35d16a317e3c94a8e70ece5c5f70
Author: Szabolcs Vasas 
Date:   2017-12-18T10:23:21Z

com.cloudera.sqoop.config.ConfigurationHelper is not used anymore.

commit 7d7438ed3b05e68211342815927a587cd27fe4fb
Author: Szabolcs Vasas 
Date:   2017-12-18T10:24:42Z

com.cloudera.sqoop.hbase.HBasePutProcessor is not used anymore.

commit 49098cf204f409350a5cb943c057379c41b29860
Author: Szabolcs Vasas 
Date:   2017-12-18T10:26:52Z

com.cloudera.sqoop.hbase.HBaseUtil is not used anymore.

commit 5bd9d96c31c3e199e5d614f2b0f8d1fdcc169f96
Author: Szabolcs Vasas 
Date:   2017-12-18T10:27:42Z

com.cloudera.sqoop.hbase.PutTransformer is not used anymore.

commit 7f08298bc5cf1bbe91f4d36e6af7d78461507b97
Author: Szabolcs Vasas 
Date:   2017-12-18T10:30:14Z

com.cloudera.sqoop.hive.HiveImport is not used anymore.

commit f03816d12e9a1dcf4ef4e486fbbd625871307962
Author: Szabolcs Vasas 
Date:   2017-12-18T10:31:14Z

com.cloudera.sqoop.hive.HiveTypes is not used anymore.

commit e02b278dcd6a405c0ab35531ff1fc489edae92c0
Author: Szabolcs Vasas 
Date:   2017-12-18T10:32:54Z

com.cloudera.sqoop.hive.TableDefWriter is not used anymore.

commit db3e3b76a4b33adc2b35df44653fcd3ae54455c9
Author: Szabolcs Vasas 
Date:   2017-12-18T10:35:53Z

com.cloudera.sqoop.io.CodecMap is not used anymore.

commit 38c370e8283b6af0a878f528964bc51631207197
Author: Szabolcs Vasas 
Date:   2017-12-18T10:44:47Z

com.cloudera.sqoop.io.LobFile and its inner classes are not used anymore. 
The class had to be commented out because the interface of its super class is 
modified.

commit 6f25e4acff7c2f807a787a31694a619c89d31af8
Author: Szabolcs Vasas 
Date:   2017-12-18T10:48:01Z

com.cloudera.sqoop.io.NamedFifo is not used anymore.

commit 569fca251023ac9ce67a326f9efa4f271c128dbb
Author: Szabolcs Vasas 
Date:   2017-12-18T10:51:46Z

com.cloudera.sqoop.io.SplittableBufferedWriter is not used anymore. 
TestSplittableBufferedWriter had to be moved to org.apache package because it 
used a protected constructor.

commit 349b8a825cc14da878962c113bb91aba35a15ad7
Author: Szabolcs Vasas 
Date:   2017-12-18T10:53:40Z

com.cloudera.sqoop.io.SplittingOutputStream is now only used in 
com.cloudera.sqoop.io.SplittableBufferedWriter but it is not used anymore.

commit 501cafb2fd0e811e72d07d14cd6b00e9bd24d876
Author: Szabolcs Vasas 
Date:   2017-12-18T10:59:06Z

com.cloudera.sqoop.io.UnsupportedCodecException is not used anymore. 
com.cloudera.sqoop.io.CodecMap had to be commented out because the interface of 
its superclass has changed.

commit ddc9fadf6a4cab66fc21c964c8def20472abe3dc
Author: Szabolcs Vasas 
Date:   2017-12-18T11:10:35Z

com.cloudera.sqoop.lib.BigDecimalSerializer is not used anymore.

commit dd58f7d8cb22cb4adf1a0f94aee6dd53f8460560
Author: Szabolcs Vasas 
Date:   2017-12-18T11:18:38Z

com.cloudera.sqoop.lib.BlobRef is not used anymore. It had to be commented 
out because the super interface has changed.

commit 7198ef4bdcd69bb17668cdd5da08a872c2bfb73c
Author: Szabolcs Vasas 
Date:   2017-12-18T11:20:32Z

com.cloudera.sqoop.lib.BooleanParser is not used anymore.

commit 49ba230efbc594f654d29d2117133230724d8fa2
Author: Szabolcs Vasas 
Date:   2017-12-18T11:26:13Z

com.cloudera.sqoop.lib.ClobRef is not used anymore. It had to be commented 
out because the interface of its 

[jira] [Commented] (SQOOP-3278) Direct export to Netezza and encoding

2018-01-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-3278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16328532#comment-16328532
 ] 

ASF GitHub Bot commented on SQOOP-3278:
---

GitHub user bonnetb opened a pull request:

https://github.com/apache/sqoop/pull/43

[SQOOP-3278]:Direct export to Netezza and encoding

Direct mode used an OutputStream that writes to an export file using UTF-8. 
So, if your target Netezza database uses an ISO encoding, non-ASCII chars will 
be corrupted when Netezza loads the export file.
Using an OutputStreamWriter and relying on the existing 'encoding' extended 
parameter (SQOOP-2607) makes it possible to set the encoding that will be used 
to write the export file.
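
A minimal sketch (hypothetical file name; the charset would come from the
'encoding' parameter) of the idea: write through an OutputStreamWriter with an
explicit charset instead of a raw OutputStream that implies UTF-8:

{code}
import java.io.FileOutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.Charset;

public class EncodedExportWriterSketch {
  public static void main(String[] args) throws Exception {
    Charset netezzaCharset = Charset.forName("ISO-8859-15");
    try (Writer w = new OutputStreamWriter(
        new FileOutputStream("export.dat"), netezzaCharset)) {
      // Non-ASCII chars are now encoded as the target Netezza table expects.
      w.write("café;1\n");
    }
  }
}
{code}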


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/bonnetb/sqoop SQOOP-3278

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/sqoop/pull/43.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #43


commit bb960e587d9d57fdbba3c96b5c1a467166f95016
Author: Benjamin BONNET 
Date:   2018-01-17T09:10:30Z

SQOOP-3278:use writer instead of stream and set encoding

commit e65d324d001d538c2f9bdf946aa7865e879f4222
Author: Benjamin BONNET 
Date:   2018-01-17T09:11:19Z

remove useless imports




> Direct export to Netezza and encoding
> -
>
> Key: SQOOP-3278
> URL: https://issues.apache.org/jira/browse/SQOOP-3278
> Project: Sqoop
>  Issue Type: Bug
>  Components: connectors
>Affects Versions: 1.4.6
> Environment: HDP 2.6
>Reporter: Benjamin BONNET
>Assignee: Benjamin BONNET
>Priority: Major
>
> Hi,
> I encountered an encoding issue while exporting from Hive to a Netezza table 
> containing ISO-8859-15 encoded VARCHAR. Using direct mode, non-ASCII chars 
> are corrupted. That does not occur using non-direct mode.
> This bug is quite similar to 
> [https://issues.apache.org/jira/projects/SQOOP/issues/SQOOP-2607] , except 
> that it concerns export (and not import).
> Regards



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3224) Mainframe FTP transfer should have an option to use binary mode for transfer

2018-03-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-3224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16404590#comment-16404590
 ] 

ASF GitHub Bot commented on SQOOP-3224:
---

GitHub user christeoh opened a pull request:

https://github.com/apache/sqoop/pull/44

[SQOOP-3224] Binary transfer mode



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/christeoh/sqoop 3224-2

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/sqoop/pull/44.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #44






> Mainframe FTP transfer should have an option to use binary mode for transfer
> 
>
> Key: SQOOP-3224
> URL: https://issues.apache.org/jira/browse/SQOOP-3224
> Project: Sqoop
>  Issue Type: Improvement
>Reporter: Chris Teoh
>Assignee: Chris Teoh
>Priority: Minor
>
> Currently the mainframe FTP module is hard-coded to use ASCII transfer mode. 
> I propose a mainframe module flag to allow changing modes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3224) Mainframe FTP transfer should have an option to use binary mode for transfer

2018-03-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-3224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16405020#comment-16405020
 ] 

ASF GitHub Bot commented on SQOOP-3224:
---

Github user fszabo2 commented on a diff in the pull request:

https://github.com/apache/sqoop/pull/44#discussion_r175483531
  
--- Diff: src/java/org/apache/sqoop/util/MainframeFTPClientUtils.java ---
@@ -207,8 +208,18 @@ public static FTPClient getFTPConnection(Configuration 
conf)
 throw new IOException("Could not login to server " + server
 + ":" + ftp.getReplyString());
   }
-  // set ASCII transfer mode
-  ftp.setFileType(FTP.ASCII_FILE_TYPE);
+  // set transfer mode
+  String transferMode = 
conf.get(MainframeConfiguration.MAINFRAME_FTP_TRANSFER_MODE);
--- End diff --

The whole getFTPConnection method is already very long and does a bunch of 
steps. You could consider refactoring it into smaller methods for readability.
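
A minimal sketch (assuming the config key from the diff above; not the actual
patch) of extracting the transfer-mode branch into a small helper, as
suggested:

{code}
import java.io.IOException;
import org.apache.commons.net.ftp.FTP;
import org.apache.commons.net.ftp.FTPClient;
import org.apache.hadoop.conf.Configuration;

public class TransferModeHelperSketch {
  static void applyTransferMode(FTPClient ftp, Configuration conf) throws IOException {
    // Default to the historical ASCII behaviour when the key is unset.
    String mode = conf.get("mainframe.ftp.transfermode", "ascii");
    int fileType = "binary".equals(mode) ? FTP.BINARY_FILE_TYPE : FTP.ASCII_FILE_TYPE;
    if (!ftp.setFileType(fileType)) {
      throw new IOException("Could not set transfer mode: " + ftp.getReplyString());
    }
  }
}
{code}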



> Mainframe FTP transfer should have an option to use binary mode for transfer
> 
>
> Key: SQOOP-3224
> URL: https://issues.apache.org/jira/browse/SQOOP-3224
> Project: Sqoop
>  Issue Type: Improvement
>Reporter: Chris Teoh
>Assignee: Chris Teoh
>Priority: Minor
>
> Currently the mainframe FTP module is hard-coded to use ASCII transfer mode. 
> I propose a mainframe module flag to allow changing modes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3224) Mainframe FTP transfer should have an option to use binary mode for transfer

2018-03-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-3224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16405022#comment-16405022
 ] 

ASF GitHub Bot commented on SQOOP-3224:
---

Github user fszabo2 commented on a diff in the pull request:

https://github.com/apache/sqoop/pull/44#discussion_r175478607
  
--- Diff: 
src/java/org/apache/sqoop/mapreduce/mainframe/MainframeConfiguration.java ---
@@ -33,4 +33,13 @@
   public static final String MAINFRAME_INPUT_DATASET_TAPE = 
"mainframe.input.dataset.tape";
 
   public static final String MAINFRAME_FTP_FILE_ENTRY_PARSER_CLASSNAME = 
"org.apache.sqoop.mapreduce.mainframe.MainframeFTPFileEntryParser";
+
+  public static final String MAINFRAME_FTP_TRANSFER_MODE = 
"mainframe.ftp.transfermode";
+
+  public static final String MAINFRAME_FTP_TRANSFER_MODE_ASCII = "ascii";
+
+  public static final String MAINFRAME_FTP_TRANSFER_MODE_BINARY = "binary";
+
+  // this is the buffer size used when doing binary ftp transfers from 
mainframe
+  public static final Integer MAINFRAME_FTP_TRANSFER_BINARY_BUFFER = 32760;
--- End diff --

Is this value always a good choice? I would make it configurable (if it 
isn't already), and use this value as default if said configuration is not 
present, so the user has a choice here.
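
A minimal sketch (hypothetical key name) of the suggestion - read the buffer
size from the configuration and fall back to 32760 when it is absent:

{code}
import org.apache.hadoop.conf.Configuration;

public class BufferSizeSketch {
  static final int DEFAULT_BINARY_BUFFER = 32760;

  static int resolveBufferSize(Configuration conf) {
    // Configuration.getInt returns the default when the key is unset, so
    // existing jobs keep the old behaviour while users gain a choice.
    return conf.getInt("mainframe.ftp.buffersize", DEFAULT_BINARY_BUFFER);
  }
}
{code}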


> Mainframe FTP transfer should have an option to use binary mode for transfer
> 
>
> Key: SQOOP-3224
> URL: https://issues.apache.org/jira/browse/SQOOP-3224
> Project: Sqoop
>  Issue Type: Improvement
>Reporter: Chris Teoh
>Assignee: Chris Teoh
>Priority: Minor
>
> Currently the mainframe FTP module is hard-coded to use ASCII transfer mode. 
> I propose a mainframe module flag to allow changing modes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3224) Mainframe FTP transfer should have an option to use binary mode for transfer

2018-03-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-3224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16405021#comment-16405021
 ] 

ASF GitHub Bot commented on SQOOP-3224:
---

Github user fszabo2 commented on a diff in the pull request:

https://github.com/apache/sqoop/pull/44#discussion_r175484508
  
--- Diff: src/test/org/apache/sqoop/util/TestMainframeFTPClientUtils.java 
---
@@ -362,4 +363,29 @@ public void 
testPartitionedDatasetsShouldReturnAllFiles() {
   Assert.fail(ioeString);
 }
   }
+  @Test
+  public void testBinaryTransferMode() throws IOException {
+final String EXPECTED_RESPONSE = "200 Representation type is Image";
+final int EXPECTED_RESPONSE_CODE = 200;
+when(mockFTPClient.login("user", "pssword")).thenReturn(true);
--- End diff --

Is the typo "pssword" intentional? Though it won't cause any trouble, I guess...


> Mainframe FTP transfer should have an option to use binary mode for transfer
> 
>
> Key: SQOOP-3224
> URL: https://issues.apache.org/jira/browse/SQOOP-3224
> Project: Sqoop
>  Issue Type: Improvement
>Reporter: Chris Teoh
>Assignee: Chris Teoh
>Priority: Minor
>
> Currently the mainframe FTP module is hard-coded to use ASCII transfer mode. 
> I propose a mainframe module flag to allow changing modes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3224) Mainframe FTP transfer should have an option to use binary mode for transfer

2018-03-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-3224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16411299#comment-16411299
 ] 

ASF GitHub Bot commented on SQOOP-3224:
---

Github user christeoh commented on a diff in the pull request:

https://github.com/apache/sqoop/pull/44#discussion_r176717474
  
--- Diff: src/java/org/apache/sqoop/util/MainframeFTPClientUtils.java ---
@@ -207,8 +208,18 @@ public static FTPClient getFTPConnection(Configuration 
conf)
 throw new IOException("Could not login to server " + server
 + ":" + ftp.getReplyString());
   }
-  // set ASCII transfer mode
-  ftp.setFileType(FTP.ASCII_FILE_TYPE);
+  // set transfer mode
+  String transferMode = 
conf.get(MainframeConfiguration.MAINFRAME_FTP_TRANSFER_MODE);
--- End diff --

I'm not sure how you would refactor it. I can give it a try and see if it 
is what you're looking for.


> Mainframe FTP transfer should have an option to use binary mode for transfer
> 
>
> Key: SQOOP-3224
> URL: https://issues.apache.org/jira/browse/SQOOP-3224
> Project: Sqoop
>  Issue Type: Improvement
>Reporter: Chris Teoh
>Assignee: Chris Teoh
>Priority: Minor
>
> Currently the mainframe FTP module is hard-coded to use ASCII transfer mode. 
> I propose a mainframe module flag to allow changing modes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3387) Include Column-Remarks

2018-11-12 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16683909#comment-16683909
 ] 

ASF GitHub Bot commented on SQOOP-3387:
---

Github user hatala91 closed the pull request at:

https://github.com/apache/sqoop/pull/49


> Include Column-Remarks
> --
>
> Key: SQOOP-3387
> URL: https://issues.apache.org/jira/browse/SQOOP-3387
> Project: Sqoop
>  Issue Type: Wish
>  Components: connectors, metastore
>Affects Versions: 1.4.7
>Reporter: Tomas Sebastian Hätälä
>Assignee: Tomas Sebastian Hätälä
>Priority: Critical
>  Labels: easy-fix, features, pull-request-available
> Fix For: 1.5.0
>
> Attachments: SQOOP_3387.patch
>
>
> In most RDBMS it is possible to enter comments/remarks for table and view 
> columns. That way a user can obtain additional information regarding the data 
> and how to use it.
> With the avro file format it would be possible to store this information in 
> the schema file using the "doc"-tag. At the moment this is, however, left 
> blank.
> Review: https://reviews.apache.org/r/68989/



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3387) Include Column-Remarks

2018-10-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16634032#comment-16634032
 ] 

ASF GitHub Bot commented on SQOOP-3387:
---

GitHub user hatala91 opened a pull request:

https://github.com/apache/sqoop/pull/49

SQOOP-3387: Add column remarks to avro schema file

Needs further work to support different databases.

E.g., to work with Oracle we would need to add the ojdbc jar as a dependency 
and call setRemarksReporting(true) in OracleManager when opening a connection. 
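
A minimal sketch (hypothetical record, field, and remark) of what populating
the "doc"-tag looks like with Avro's SchemaBuilder:

{code}
import org.apache.avro.Schema;
import org.apache.avro.SchemaBuilder;

public class ColumnRemarkSketch {
  public static void main(String[] args) {
    Schema schema = SchemaBuilder.record("customer").fields()
        .name("cust_no").doc("Internal customer number, see DWH wiki")
            .type().intType().noDefault()
        .endRecord();
    // The "doc" attribute is emitted per field in the schema file.
    System.out.println(schema.toString(true));
  }
}
{code}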

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/hatala91/sqoop trunk

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/sqoop/pull/49.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #49


commit c8c8b37cb516ea68c2edf3b5b0a87a429950e41d
Author: hatala91 
Date:   2018-10-01T13:33:56Z

SQOOP-3387: Add column remarks to avro schema file




> Include Column-Remarks
> --
>
> Key: SQOOP-3387
> URL: https://issues.apache.org/jira/browse/SQOOP-3387
> Project: Sqoop
>  Issue Type: Wish
>  Components: connectors, metastore
>Affects Versions: 1.4.7
>Reporter: Tomas Sebastian Hätälä
>Priority: Critical
>  Labels: easy-fix, features
> Fix For: 1.5.0
>
>
> In most RDBMS it is possible to enter comments/remarks for table and view 
> columns. That way a user can obtain additional information regarding the data 
> and how to use it.
> With the avro file format it would be possible to store this information in 
> the schema file using the "doc"-tag. At the moment this is, however, left 
> blank.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3418) Document decimal support in Hive external import into parquet files

2018-12-12 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16718785#comment-16718785
 ] 

ASF GitHub Bot commented on SQOOP-3418:
---

Github user fszabo2 commented on a diff in the pull request:

https://github.com/apache/sqoop/pull/66#discussion_r240968555
  
--- Diff: src/docs/user/import.txt ---
@@ -472,36 +472,48 @@ Enabling Logical Types in Avro and Parquet import for 
numbers
 ^
 
 To enable the use of logical types in Sqoop's avro schema generation,
-i.e. used during both avro and parquet imports, one has to use the
-sqoop.avro.logical_types.decimal.enable flag. This is necessary if one
+i.e. used both during avro and parquet imports, one has to use the
++sqoop.avro.logical_types.decimal.enable+ property. This is necessary if 
one
 wants to store values as decimals in the avro file format.
 
-Padding number types in avro import
-^^^
+In case of a parquet import, one has to use the
++sqoop.parquet.logical_types.decimal.enable+ property.
+
+Padding number types in avro and parquet import

 
 Certain databases, such as Oracle and Postgres store number and decimal
 values without padding. For example 1.5 in a column declared
-as NUMBER (20,5) is stored as is in Oracle, while the equivalent
+as NUMBER (20, 5) is stored as is in Oracle, while the equivalent
 DECIMAL (20, 5) is stored as 1.5 in an SQL server instance.
-This leads to a scale mismatch during avro import.
+This leads to a scale mismatch during the import.
 
-To avoid this error, one can use the sqoop.avro.decimal_padding.enable flag
-to turn on padding with 0s. This flag has to be used together with the
-sqoop.avro.logical_types.decimal.enable flag set to true.
+To avoid this error, one can use the +sqoop.avro.decimal_padding.enable+
+property to turn on padding with 0s during. Naturally, this property is 
used
--- End diff --

yes!
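
A minimal sketch (plain BigDecimal, not Sqoop's padding code) of what "padding
with 0s" means for a value read from a column declared as NUMBER (20, 5):

{code}
import java.math.BigDecimal;
import java.math.RoundingMode;

public class DecimalPaddingSketch {
  public static void main(String[] args) {
    BigDecimal oracleValue = new BigDecimal("1.5"); // Oracle stores it unpadded
    // Scale up to the declared scale; UNNECESSARY asserts no rounding happens.
    BigDecimal padded = oracleValue.setScale(5, RoundingMode.UNNECESSARY);
    System.out.println(padded); // 1.50000 now matches DECIMAL (20, 5)
  }
}
{code}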


> Document decimal support in Hive external import into parquet files
> ---
>
> Key: SQOOP-3418
> URL: https://issues.apache.org/jira/browse/SQOOP-3418
> Project: Sqoop
>  Issue Type: Task
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Major
> Fix For: 1.5.0, 3.0.0
>
>
> Remember to note the limitations in Hive i.e. the max scale and precision is 
> 38 and how it behaves in edge cases.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3418) Document decimal support in Hive external import into parquet files

2018-12-12 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16718788#comment-16718788
 ] 

ASF GitHub Bot commented on SQOOP-3418:
---

Github user fszabo2 commented on a diff in the pull request:

https://github.com/apache/sqoop/pull/66#discussion_r240969163
  
--- Diff: src/docs/user/import.txt ---
@@ -838,20 +850,27 @@ $ sqoop import --connect jdbc:mysql://db.foo.com/corp 
\
 
 
 Enabling logical types in avro import and also turning on padding with 0s:
-
 
 $ sqoop import -Dsqoop.avro.decimal_padding.enable=true 
-Dsqoop.avro.logical_types.decimal.enable=true
---connect $CON --username $USER --password $PASS --query "select * 
from table_name where \$CONDITIONS"
+--connect $MYCONN --username $MYUSER --password $MYPASS --query 
"select * from table_name where \$CONDITIONS"
 --target-dir hdfs://nameservice1//etl/target_path --as-avrodatafile 
--verbose -m 1
 
 
 
 Enabling logical types in avro import and also turning on padding with 0s, 
while specifying default precision and scale as well:
-
 
 $ sqoop import -Dsqoop.avro.decimal_padding.enable=true 
-Dsqoop.avro.logical_types.decimal.enable=true
 -Dsqoop.avro.logical_types.decimal.default.precision=38 
-Dsqoop.avro.logical_types.decimal.default.scale=10
---connect $CON --username $USER --password $PASS --query "select * 
from table_name where \$CONDITIONS"
+--connect $MYCONN --username $MYUSER --password $MYPASS --query 
"select * from table_name where \$CONDITIONS"
 --target-dir hdfs://nameservice1//etl/target_path --as-avrodatafile 
--verbose -m 1
 
 
+
+Enabling logical types in parquet import and also turning on padding with 
0s, while specifying default precision and scale as well:
+
+$ sqoop import -Dsqoop.parquet.decimal_padding.enable=true 
-Dsqoop.avro.logical_types.decimal.enable=true
--- End diff --

Yes!


> Document decimal support in Hive external import into parquet files
> ---
>
> Key: SQOOP-3418
> URL: https://issues.apache.org/jira/browse/SQOOP-3418
> Project: Sqoop
>  Issue Type: Task
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Major
> Fix For: 1.5.0, 3.0.0
>
>
> Remember to note the limitations in Hive i.e. the max scale and precision is 
> 38 and how it behaves in edge cases.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3418) Document decimal support in Hive external import into parquet files

2018-12-12 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16718776#comment-16718776
 ] 

ASF GitHub Bot commented on SQOOP-3418:
---

Github user szvasas commented on a diff in the pull request:

https://github.com/apache/sqoop/pull/66#discussion_r240958193
  
--- Diff: src/docs/user/import.txt ---
@@ -472,36 +472,48 @@ Enabling Logical Types in Avro and Parquet import for 
numbers
 ^
 
 To enable the use of logical types in Sqoop's avro schema generation,
-i.e. used during both avro and parquet imports, one has to use the
-sqoop.avro.logical_types.decimal.enable flag. This is necessary if one
+i.e. used both during avro and parquet imports, one has to use the
++sqoop.avro.logical_types.decimal.enable+ property. This is necessary if 
one
 wants to store values as decimals in the avro file format.
 
-Padding number types in avro import
-^^^
+In case of a parquet import, one has to use the
++sqoop.parquet.logical_types.decimal.enable+ property.
+
+Padding number types in avro and parquet import

 
 Certain databases, such as Oracle and Postgres store number and decimal
 values without padding. For example 1.5 in a column declared
-as NUMBER (20,5) is stored as is in Oracle, while the equivalent
+as NUMBER (20, 5) is stored as is in Oracle, while the equivalent
 DECIMAL (20, 5) is stored as 1.5 in an SQL server instance.
-This leads to a scale mismatch during avro import.
+This leads to a scale mismatch during the import.
 
-To avoid this error, one can use the sqoop.avro.decimal_padding.enable flag
-to turn on padding with 0s. This flag has to be used together with the
-sqoop.avro.logical_types.decimal.enable flag set to true.
+To avoid this error, one can use the +sqoop.avro.decimal_padding.enable+
+property to turn on padding with 0s during. Naturally, this property is 
used
--- End diff --

during import(?)


> Document decimal support in Hive external import into parquet files
> ---
>
> Key: SQOOP-3418
> URL: https://issues.apache.org/jira/browse/SQOOP-3418
> Project: Sqoop
>  Issue Type: Task
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Major
> Fix For: 1.5.0, 3.0.0
>
>
> Remember to note the limitations in Hive i.e. the max scale and precision is 
> 38 and how it behaves in edge cases.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3418) Document decimal support in Hive external import into parquet files

2018-12-12 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16718777#comment-16718777
 ] 

ASF GitHub Bot commented on SQOOP-3418:
---

Github user szvasas commented on a diff in the pull request:

https://github.com/apache/sqoop/pull/66#discussion_r240966974
  
--- Diff: src/docs/user/import.txt ---
@@ -838,20 +850,27 @@ $ sqoop import --connect jdbc:mysql://db.foo.com/corp 
\
 
 
 Enabling logical types in avro import and also turning on padding with 0s:
-
 
 $ sqoop import -Dsqoop.avro.decimal_padding.enable=true 
-Dsqoop.avro.logical_types.decimal.enable=true
---connect $CON --username $USER --password $PASS --query "select * 
from table_name where \$CONDITIONS"
+--connect $MYCONN --username $MYUSER --password $MYPASS --query 
"select * from table_name where \$CONDITIONS"
 --target-dir hdfs://nameservice1//etl/target_path --as-avrodatafile 
--verbose -m 1
 
 
 
 Enabling logical types in avro import and also turning on padding with 0s, 
while specifying default precision and scale as well:
-
 
 $ sqoop import -Dsqoop.avro.decimal_padding.enable=true 
-Dsqoop.avro.logical_types.decimal.enable=true
 -Dsqoop.avro.logical_types.decimal.default.precision=38 
-Dsqoop.avro.logical_types.decimal.default.scale=10
---connect $CON --username $USER --password $PASS --query "select * 
from table_name where \$CONDITIONS"
+--connect $MYCONN --username $MYUSER --password $MYPASS --query 
"select * from table_name where \$CONDITIONS"
 --target-dir hdfs://nameservice1//etl/target_path --as-avrodatafile 
--verbose -m 1
 
 
+
+Enabling logical types in parquet import and also turning on padding with 
0s, while specifying default precision and scale as well:
+
+$ sqoop import -Dsqoop.parquet.decimal_padding.enable=true 
-Dsqoop.avro.logical_types.decimal.enable=true
--- End diff --

I think you wanted to say -Dsqoop.avro.decimal_padding.enable=true and 
-Dsqoop.parquet.logical_types.decimal.enable=true


> Document decimal support in Hive external import into parquet files
> ---
>
> Key: SQOOP-3418
> URL: https://issues.apache.org/jira/browse/SQOOP-3418
> Project: Sqoop
>  Issue Type: Task
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Major
> Fix For: 1.5.0, 3.0.0
>
>
> Remember to note the limitations in Hive, i.e. the max scale and precision are
> 38, and how it behaves in edge cases.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3418) Document decimal support in Hive external import into parquet files

2018-12-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16717170#comment-16717170
 ] 

ASF GitHub Bot commented on SQOOP-3418:
---

GitHub user fszabo2 opened a pull request:

https://github.com/apache/sqoop/pull/66

SQOOP-3418: Document decimal support in Hive external import into parquet 
files

This is the documentation part of SQOOP-3382 and SQOOP-3396

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/fszabo2/sqoop SQOOP-3418

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/sqoop/pull/66.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #66


commit 821cb6bfd3d7d32671d55c8fe5ccd5a92ca9a39a
Author: Fero Szabo 
Date:   2018-12-05T16:37:23Z

Documentation for parquet decimal support, and parquet decimal support in 
Hive import

commit 936116ff07d9ba8f19f16f07c3ae4f9c2dabaf01
Author: Fero Szabo 
Date:   2018-12-06T15:13:33Z

rephrased new lines a bit

commit 47535ee3dee24170db3303b71855456a8d142711
Author: Fero Szabo 
Date:   2018-12-11T13:31:34Z

various fixes and reformats




> Document decimal support in Hive external import into parquet files
> ---
>
> Key: SQOOP-3418
> URL: https://issues.apache.org/jira/browse/SQOOP-3418
> Project: Sqoop
>  Issue Type: Task
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Major
> Fix For: 1.5.0, 3.0.0
>
>
> Remember to note the limitations in Hive, i.e. the max scale and precision are
> 38, and how it behaves in edge cases.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3396) Add parquet numeric support for Parquet in Hive import

2018-11-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16704895#comment-16704895
 ] 

ASF GitHub Bot commented on SQOOP-3396:
---

Github user fszabo2 commented on a diff in the pull request:

https://github.com/apache/sqoop/pull/60#discussion_r237894381
  
--- Diff: src/java/org/apache/sqoop/hive/HiveTypes.java ---
@@ -37,16 +42,28 @@
   private static final String HIVE_TYPE_STRING = "STRING";
   private static final String HIVE_TYPE_BOOLEAN = "BOOLEAN";
   private static final String HIVE_TYPE_BINARY = "BINARY";
+  private static final String HIVE_TYPE_DECIMAL = "DECIMAL";
 
   public static final Log LOG = 
LogFactory.getLog(HiveTypes.class.getName());
 
   private HiveTypes() { }
 
+
+  public static String toHiveType(int sqlType, SqoopOptions options) {
+
+if 
(options.getConf().getBoolean(ConfigurationConstants.PROP_ENABLE_PARQUET_LOGICAL_TYPE_DECIMAL,
 false)
+&& (sqlType == Types.NUMERIC || sqlType == Types.DECIMAL)){
+  return HIVE_TYPE_DECIMAL;
+}
+return toHiveType(sqlType);
+  }
+
+
   /**
* Given JDBC SQL types coming from another database, what is the best
* mapping to a Hive-specific type?
*/
-  public static String toHiveType(int sqlType) {
+  private static String toHiveType(int sqlType) {
--- End diff --

Actually, there is a method that defines extra mappings for hive types in 
OraOop:
org.apache.sqoop.manager.oracle.OraOopConnManager#toHiveType 

So we need to keep the "null" return value here, in order to allow for that 
to work. 

I think it still makes sense to log a warning though.

In the long run: it might make sense to refactor the OraOop connector, but 
I think it's out of the scope of this change.
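
A minimal sketch of that shape, assuming the constants and commons-logging
logger already present in HiveTypes (the warning text is an assumption):

```java
import java.sql.Types;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

final class HiveTypeMappingSketch {
  private static final Log LOG = LogFactory.getLog(HiveTypeMappingSketch.class);
  private static final String HIVE_TYPE_INT = "INT";

  // Keep the null fall-through so connector-specific mappings such as
  // OraOopConnManager#toHiveType can still supply a type, but warn about it.
  static String toHiveType(int sqlType) {
    switch (sqlType) {
      case Types.INTEGER:
      case Types.SMALLINT:
        return HIVE_TYPE_INT;
      // ... remaining JDBC-to-Hive mappings as in HiveTypes ...
      default:
        LOG.warn("No Hive type mapping for SQL type " + sqlType
            + "; deferring to the connector-specific mapping, if any.");
        return null;
    }
  }
}
```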


> Add parquet numeric support for Parquet in Hive import
> --
>
> Key: SQOOP-3396
> URL: https://issues.apache.org/jira/browse/SQOOP-3396
> Project: Sqoop
>  Issue Type: Sub-task
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3415) Fix gradle test+build when clean applied as the first command + warning issue fixes

2018-11-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16704902#comment-16704902
 ] 

ASF GitHub Bot commented on SQOOP-3415:
---

Github user szvasas commented on a diff in the pull request:

https://github.com/apache/sqoop/pull/62#discussion_r237896212
  
--- Diff: build.gradle ---
@@ -356,6 +359,15 @@ tasks.withType(Test) {
 ignoreFailures ignoreTestFailures
 }
 
+project.tasks.each {
+  if ( it.name.toLowerCase().endsWith('test') ) {
+it.doFirst({
--- End diff --

Could we move this piece of code to the tasks.withType(Test) block so we do 
not have to iterate over the tasks:
```
tasks.withType(Test) {
...
 doFirst {
   project.mkdir(testBuildDir)
   project.mkdir(testBuildDirData)
 }
...
}
```


> Fix gradle test+build when clean applied as the first command + warning issue 
> fixes
> ---
>
> Key: SQOOP-3415
> URL: https://issues.apache.org/jira/browse/SQOOP-3415
> Project: Sqoop
>  Issue Type: Bug
>Affects Versions: 1.5.0
>Reporter: Attila Szabo
>Assignee: Attila Szabo
>Priority: Major
> Fix For: 1.5.0
>
>
> If the user wants to build using the following command:
> gradlew clean unittest
> the gradle process ends up in an exception and the whole process is left
> hanging forever. The root cause of this is the following:
> tasks.withType runs in the configuration part of the build, where we ensure
> the necessary directories exist.
> After that, clean is executed and all of the dirs get deleted.
> Proposed fix:
> Apply directory creation as the first step of the test tasks.
> On top of that:
> there are some missing options b/c of JUnit annotation processors, and
> Xlint information is also swallowed currently. We aim to fix these things as well



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3273) Removing com.cloudera.sqoop packages

2018-11-28 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16701516#comment-16701516
 ] 

ASF GitHub Bot commented on SQOOP-3273:
---

Github user szvasas closed the pull request at:

https://github.com/apache/sqoop/pull/42


> Removing com.cloudera.sqoop packages
> 
>
> Key: SQOOP-3273
> URL: https://issues.apache.org/jira/browse/SQOOP-3273
> Project: Sqoop
>  Issue Type: Improvement
>Reporter: Szabolcs Vasas
>Assignee: Szabolcs Vasas
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: SQOOP-3273-final.patch, SQOOP-3273.patch, 
> SQOOP-3273.patch, SQOOP-3273.patch
>
>
> Sqoop has dozens of classes in com.cloudera.sqoop packages which in most
> cases just extend their corresponding class in the org.apache.sqoop package
> without adding extra functionality.
> These classes make the code harder to read and navigate; they are already
> deprecated, but because of backward compatibility considerations we have not
> removed them yet.
> The task is to make sure that all the functionality from com.cloudera.sqoop 
> classes is available in org.apache.sqoop classes and remove the classes from 
> com.cloudera.sqoop packages.
> The tests defined in com.cloudera.sqoop packages should also be migrated to 
> org.apache.sqoop package.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3414) Introduce a Gradle build parameter to set the ignoreTestFailures of the test tasks

2018-11-28 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16701634#comment-16701634
 ] 

ASF GitHub Bot commented on SQOOP-3414:
---

Github user asfgit closed the pull request at:

https://github.com/apache/sqoop/pull/59


> Introduce a Gradle build parameter to set the ignoreTestFailures of the test 
> tasks
> --
>
> Key: SQOOP-3414
> URL: https://issues.apache.org/jira/browse/SQOOP-3414
> Project: Sqoop
>  Issue Type: Test
>Affects Versions: 1.4.7
>Reporter: Szabolcs Vasas
>Assignee: Szabolcs Vasas
>Priority: Major
> Attachments: test_with_ignoreTestFailures=true.txt, 
> test_without_ignoreTestFailures.txt
>
>
> The 
> [ignoreFailures|https://docs.gradle.org/current/dsl/org.gradle.api.tasks.testing.Test.html#org.gradle.api.tasks.testing.Test:ignoreFailures]
>  parameter of the Gradle test tasks is set to false, which means that if a
> Gradle test task fails, the gradle
> process returns with non-zero. In some CI tools (e.g. Jenkins) this will make
> the status of the job red and not yellow,
> which usually means a more serious issue than a test failure.
> I would like to introduce a parameter to be able to set this parameter of the
> test tasks.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3415) Fix gradle test+build when clean applied as the first command + warning issue fixes

2018-11-29 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16704149#comment-16704149
 ] 

ASF GitHub Bot commented on SQOOP-3415:
---

GitHub user maugly24 opened a pull request:

https://github.com/apache/sqoop/pull/61

SQOOP-3415: Fixing gradle clean unittest and gradle warning issues

(Attila Szabo)

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/maugly24/sqoop SQOOP-3415

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/sqoop/pull/61.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #61


commit 12d1484f1f5da030331e3ccbe74b3d05ad64
Author: Attila Szabo 
Date:   2018-11-30T01:58:59Z

SQOOP-3415: Fixing gradle clean unittest and gradle warning issues

(Attila Szabo)




> Fix gradle test+build when clean applied as the first command + warning issue 
> fixes
> ---
>
> Key: SQOOP-3415
> URL: https://issues.apache.org/jira/browse/SQOOP-3415
> Project: Sqoop
>  Issue Type: Bug
>Affects Versions: 1.5.0
>Reporter: Attila Szabo
>Assignee: Attila Szabo
>Priority: Major
> Fix For: 1.5.0
>
>
> If the user wants to build using the following command:
> gradlew clean unittest
> the gradle process ends up in an exception and the whole process is left
> hanging forever. The root cause of this is the following:
> tasks.withType runs in the configuration part of the build, where we ensure
> the necessary directories exist.
> After that, clean is executed and all of the dirs get deleted.
> Proposed fix:
> Apply directory creation as the first step of the test tasks.
> On top of that:
> there are some missing options b/c of JUnit annotation processors, and
> Xlint information is also swallowed currently. We aim to fix these things as well



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3415) Fix gradle test+build when clean applied as the first command + warning issue fixes

2018-11-29 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16704158#comment-16704158
 ] 

ASF GitHub Bot commented on SQOOP-3415:
---

GitHub user maugly24 opened a pull request:

https://github.com/apache/sqoop/pull/62

SQOOP-3415: Fixing gradle clean unittest and gradle warning issues

(Attila Szabo)

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/maugly24/sqoop SQOOP-3415

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/sqoop/pull/62.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #62


commit 0ba740647994f5a7d0967a6404d88c06713ff8d2
Author: Attila Szabo 
Date:   2018-11-30T01:58:59Z

SQOOP-3415: Fixing gradle clean unittest and gradle warning issues

(Attila Szabo)




> Fix gradle test+build when clean applied as the first command + warning issue 
> fixes
> ---
>
> Key: SQOOP-3415
> URL: https://issues.apache.org/jira/browse/SQOOP-3415
> Project: Sqoop
>  Issue Type: Bug
>Affects Versions: 1.5.0
>Reporter: Attila Szabo
>Assignee: Attila Szabo
>Priority: Major
> Fix For: 1.5.0
>
>
> If the user wants to build using the following command:
> gradlew clean unittest
> the gradle process ends up in an exception and the whole process is left
> hanging forever. The root cause of this is the following:
> tasks.withType runs in the configuration part of the build, where we ensure
> the necessary directories exist.
> After that, clean is executed and all of the dirs get deleted.
> Proposed fix:
> Apply directory creation as the first step of the test tasks.
> On top of that:
> there are some missing options b/c of JUnit annotation processors, and
> Xlint information is also swallowed currently. We aim to fix these things as well



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3415) Fix gradle test+build when clean applied as the first command + warning issue fixes

2018-11-29 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16704156#comment-16704156
 ] 

ASF GitHub Bot commented on SQOOP-3415:
---

Github user maugly24 closed the pull request at:

https://github.com/apache/sqoop/pull/61


> Fix gradle test+build when clean applied as the first command + warning issue 
> fixes
> ---
>
> Key: SQOOP-3415
> URL: https://issues.apache.org/jira/browse/SQOOP-3415
> Project: Sqoop
>  Issue Type: Bug
>Affects Versions: 1.5.0
>Reporter: Attila Szabo
>Assignee: Attila Szabo
>Priority: Major
> Fix For: 1.5.0
>
>
> If the user wants to build using the following command:
> gradlew clean unittest
> the gradle process ends up in an exception and the whole process is left
> hanging forever. The root cause of this is the following:
> tasks.withType runs in the configuration part of the build, where we ensure
> the necessary directories exist.
> After that, clean is executed and all of the dirs get deleted.
> Proposed fix:
> Apply directory creation as the first step of the test tasks.
> On top of that:
> there are some missing options b/c of JUnit annotation processors, and
> Xlint information is also swallowed currently. We aim to fix these things as well



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3396) Add parquet numeric support for Parquet in Hive import

2018-11-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16704873#comment-16704873
 ] 

ASF GitHub Bot commented on SQOOP-3396:
---

Github user fszabo2 commented on a diff in the pull request:

https://github.com/apache/sqoop/pull/60#discussion_r237890806
  
--- Diff: src/java/org/apache/sqoop/hive/HiveTypes.java ---
@@ -37,16 +42,28 @@
   private static final String HIVE_TYPE_STRING = "STRING";
   private static final String HIVE_TYPE_BOOLEAN = "BOOLEAN";
   private static final String HIVE_TYPE_BINARY = "BINARY";
+  private static final String HIVE_TYPE_DECIMAL = "DECIMAL";
 
   public static final Log LOG = 
LogFactory.getLog(HiveTypes.class.getName());
 
   private HiveTypes() { }
 
+
+  public static String toHiveType(int sqlType, SqoopOptions options) {
+
+if 
(options.getConf().getBoolean(ConfigurationConstants.PROP_ENABLE_PARQUET_LOGICAL_TYPE_DECIMAL,
 false)
--- End diff --

I refactored this file.


> Add parquet numeric support for Parquet in Hive import
> --
>
> Key: SQOOP-3396
> URL: https://issues.apache.org/jira/browse/SQOOP-3396
> Project: Sqoop
>  Issue Type: Sub-task
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3415) Fix gradle test+build when clean applied as the first command + warning issue fixes

2018-11-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16705159#comment-16705159
 ] 

ASF GitHub Bot commented on SQOOP-3415:
---

Github user maugly24 commented on a diff in the pull request:

https://github.com/apache/sqoop/pull/62#discussion_r237962514
  
--- Diff: build.gradle ---
@@ -356,6 +359,15 @@ tasks.withType(Test) {
 ignoreFailures ignoreTestFailures
 }
 
+project.tasks.each {
+  if ( it.name.toLowerCase().endsWith('test') ) {
+it.doFirst({
--- End diff --

Hi Szabi,

Thanks for the suggestion!

I've modified the build.gradle file accordingly!


> Fix gradle test+build when clean applied as the first command + warning issue 
> fixes
> ---
>
> Key: SQOOP-3415
> URL: https://issues.apache.org/jira/browse/SQOOP-3415
> Project: Sqoop
>  Issue Type: Bug
>Affects Versions: 1.5.0
>Reporter: Attila Szabo
>Assignee: Attila Szabo
>Priority: Major
> Fix For: 1.5.0
>
>
> If the user wants to build using the following command:
> gradlew clean unittest
> the gradle process ends up in an exception and the whole process is left
> hanging forever. The root cause of this is the following:
> tasks.withType runs in the configuration part of the build, where we ensure
> the necessary directories exist.
> After that, clean is executed and all of the dirs get deleted.
> Proposed fix:
> Apply directory creation as the first step of the test tasks.
> On top of that:
> there are some missing options b/c of JUnit annotation processors, and
> Xlint information is also swallowed currently. We aim to fix these things as well



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3396) Add parquet numeric support for Parquet in Hive import

2018-11-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16705314#comment-16705314
 ] 

ASF GitHub Bot commented on SQOOP-3396:
---

Github user maugly24 commented on a diff in the pull request:

https://github.com/apache/sqoop/pull/60#discussion_r238013139
  
--- Diff: 
src/test/org/apache/sqoop/importjob/configuration/MysqlImportJobTestConfiguration.java
 ---
@@ -65,4 +66,21 @@
   public String toString() {
 return getClass().getSimpleName();
   }
+
+  @Override
+  public Object[] getExpectedResultsForHive() {
--- End diff --

Hi @fszabo2 ,

If that's the case:
Could you please at least add some information about the intention behind 
the values (A.K.A. explanatory comments by Uncle Bob), b/c right now, for 
example, if this test failed I would not have much clue about what values 
we have here, and why.

I'm thinking about something like:
this value is an int, XXX precision, so it should be Y in Parquet/Hive

or something like
this value represents the biggest/longest possible value stored as 
type ZZZ, thus it contains N digits, and should be  in Hive.

Could we add this information, and thus put these "Magic numbers" into 
context?
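
Something along these lines, perhaps (the values and comments below are
hypothetical, purely to illustrate the requested style, not taken from the
actual test):

```java
@Override
public Object[] getExpectedResultsForHive() {
  return new Object[]{
      // DECIMAL(20, 5): 1.5 is zero-padded on import, so Hive reads back 1.50000
      "1.50000",
      // Hive caps DECIMAL precision at 38 digits, so this is the longest
      // value that can round-trip without truncation
      "99999999999999999999999999999999999999"
  };
}
```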


> Add parquet numeric support for Parquet in Hive import
> --
>
> Key: SQOOP-3396
> URL: https://issues.apache.org/jira/browse/SQOOP-3396
> Project: Sqoop
>  Issue Type: Sub-task
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3396) Add parquet numeric support for Parquet in Hive import

2018-11-29 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16704039#comment-16704039
 ] 

ASF GitHub Bot commented on SQOOP-3396:
---

Github user maugly24 commented on a diff in the pull request:

https://github.com/apache/sqoop/pull/60#discussion_r237698797
  
--- Diff: 
src/test/org/apache/sqoop/hive/numerictypes/NumericTypesHiveImportTest.java ---
@@ -0,0 +1,202 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.sqoop.hive.numerictypes;
+
+import org.apache.sqoop.hive.minicluster.HiveMiniCluster;
+import org.apache.sqoop.hive.minicluster.NoAuthenticationConfiguration;
+import org.apache.sqoop.importjob.configuration.HiveTestConfiguration;
+import 
org.apache.sqoop.importjob.configuration.MysqlImportJobTestConfiguration;
+import 
org.apache.sqoop.importjob.configuration.OracleImportJobTestConfiguration;
+import 
org.apache.sqoop.importjob.configuration.OracleImportJobTestConfigurationForNumber;
+import 
org.apache.sqoop.importjob.configuration.PostgresqlImportJobTestConfigurationForNumeric;
+import 
org.apache.sqoop.importjob.configuration.PostgresqlImportJobTestConfigurationPaddingShouldSucceed;
+import 
org.apache.sqoop.importjob.configuration.SqlServerImportJobTestConfiguration;
+import org.apache.sqoop.testcategories.sqooptest.IntegrationTest;
+import org.apache.sqoop.testcategories.thirdpartytest.MysqlTest;
+import org.apache.sqoop.testcategories.thirdpartytest.OracleTest;
+import org.apache.sqoop.testcategories.thirdpartytest.PostgresqlTest;
+import org.apache.sqoop.testcategories.thirdpartytest.SqlServerTest;
+import org.apache.sqoop.testutil.HiveServer2TestUtil;
+import org.apache.sqoop.testutil.NumericTypesTestUtils;
+import org.apache.sqoop.testutil.adapter.DatabaseAdapter;
+import org.apache.sqoop.testutil.adapter.MysqlDatabaseAdapter;
+import org.apache.sqoop.testutil.adapter.OracleDatabaseAdapter;
+import org.apache.sqoop.testutil.adapter.PostgresDatabaseAdapter;
+import org.apache.sqoop.testutil.adapter.SqlServerDatabaseAdapter;
+import org.apache.sqoop.util.BlockJUnit4ClassRunnerWithParametersFactory;
+import org.junit.AfterClass;
+import org.junit.BeforeClass;
+import org.junit.Rule;
+import org.junit.experimental.categories.Category;
+import org.junit.experimental.runners.Enclosed;
+import org.junit.rules.ExpectedException;
+import org.junit.runner.RunWith;
+import org.junit.runners.Parameterized;
+
+import java.util.Arrays;
+
+import static 
org.apache.sqoop.testutil.NumericTypesTestUtils.FAIL_WITHOUT_EXTRA_ARGS;
+import static 
org.apache.sqoop.testutil.NumericTypesTestUtils.FAIL_WITH_PADDING_ONLY;
+import static 
org.apache.sqoop.testutil.NumericTypesTestUtils.SUCCEED_WITHOUT_EXTRA_ARGS;
+import static 
org.apache.sqoop.testutil.NumericTypesTestUtils.SUCCEED_WITH_PADDING_ONLY;
+
+@RunWith(Enclosed.class)
+@Category(IntegrationTest.class)
+public class NumericTypesHiveImportTest {
+
+  @Rule
+  public ExpectedException expectedException = ExpectedException.none();
+
+  private static HiveMiniCluster hiveMiniCluster;
+
+  private static HiveServer2TestUtil hiveServer2TestUtil;
+
+  @BeforeClass
+  public static void beforeClass() {
+startHiveMiniCluster();
+  }
+
+  @AfterClass
+  public static void afterClass() {
+stopHiveMiniCluster();
+  }
+
+  public static void startHiveMiniCluster() {
+hiveMiniCluster = new HiveMiniCluster(new 
NoAuthenticationConfiguration());
+hiveMiniCluster.start();
+hiveServer2TestUtil = new 
HiveServer2TestUtil(hiveMiniCluster.getUrl());
+  }
+
+  public static void stopHiveMiniCluster() {
+hiveMiniCluster.stop();
+  }
+
+  @Category(MysqlTest.class)
+  public static class MysqlNumericTypesHiveImportTest extends 
NumericTypesHiveImportTestBase {
+
+public 

[jira] [Commented] (SQOOP-3396) Add parquet numeric support for Parquet in Hive import

2018-11-29 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16704040#comment-16704040
 ] 

ASF GitHub Bot commented on SQOOP-3396:
---

Github user maugly24 commented on a diff in the pull request:

https://github.com/apache/sqoop/pull/60#discussion_r237699087
  
--- Diff: 
src/test/org/apache/sqoop/importjob/configuration/OracleImportJobTestConfigurationForNumber.java
 ---
@@ -68,4 +68,14 @@
   public String toString() {
 return getClass().getSimpleName();
   }
+
+  @Override
+  public Object[] getExpectedResultsForHive() {
--- End diff --

Same goes here for the duplication. Extract this part please!


> Add parquet numeric support for Parquet in Hive import
> --
>
> Key: SQOOP-3396
> URL: https://issues.apache.org/jira/browse/SQOOP-3396
> Project: Sqoop
>  Issue Type: Sub-task
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3396) Add parquet numeric support for Parquet in Hive import

2018-11-29 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16704038#comment-16704038
 ] 

ASF GitHub Bot commented on SQOOP-3396:
---

Github user maugly24 commented on a diff in the pull request:

https://github.com/apache/sqoop/pull/60#discussion_r237698979
  
--- Diff: 
src/test/org/apache/sqoop/importjob/configuration/MysqlImportJobTestConfiguration.java
 ---
@@ -65,4 +66,21 @@
   public String toString() {
 return getClass().getSimpleName();
   }
+
+  @Override
+  public Object[] getExpectedResultsForHive() {
--- End diff --

Source of code duplication again!
I would move these expected values into a helper class or a super class 
(choice is up to you).


> Add parquet numeric support for Parquet in Hive import
> --
>
> Key: SQOOP-3396
> URL: https://issues.apache.org/jira/browse/SQOOP-3396
> Project: Sqoop
>  Issue Type: Sub-task
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3396) Add parquet numeric support for Parquet in Hive import

2018-11-29 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16704041#comment-16704041
 ] 

ASF GitHub Bot commented on SQOOP-3396:
---

Github user maugly24 commented on a diff in the pull request:

https://github.com/apache/sqoop/pull/60#discussion_r237701227
  
--- Diff: src/java/org/apache/sqoop/hive/HiveTypes.java ---
@@ -37,16 +42,28 @@
   private static final String HIVE_TYPE_STRING = "STRING";
   private static final String HIVE_TYPE_BOOLEAN = "BOOLEAN";
   private static final String HIVE_TYPE_BINARY = "BINARY";
+  private static final String HIVE_TYPE_DECIMAL = "DECIMAL";
 
   public static final Log LOG = 
LogFactory.getLog(HiveTypes.class.getName());
 
   private HiveTypes() { }
 
+
+  public static String toHiveType(int sqlType, SqoopOptions options) {
+
+if 
(options.getConf().getBoolean(ConfigurationConstants.PROP_ENABLE_PARQUET_LOGICAL_TYPE_DECIMAL,
 false)
+&& (sqlType == Types.NUMERIC || sqlType == Types.DECIMAL)){
+  return HIVE_TYPE_DECIMAL;
+}
+return toHiveType(sqlType);
+  }
+
+
   /**
* Given JDBC SQL types coming from another database, what is the best
* mapping to a Hive-specific type?
*/
-  public static String toHiveType(int sqlType) {
+  private static String toHiveType(int sqlType) {
--- End diff --

If you touch this class+method anyway, would you mind modifying the 
underlying design so that it does not return a null value?

It would be quite error prone: e.g. in the case of TableDefWriter, if the 
FileLayout is Parquet and by some mistake the type of a given column is not 
mapped, the execution path would not end up in a "not supported" exception, 
but TableDefWriter would try to create the table with a "NULL" datatype.

IMHO this would be a great improvement here.

I would also standardize it in a way not to throw IOException (e.g. 
getHiveColumnTypeForTextTable) and RuntimeException for the same problem, but 
go only with RuntimeException.
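
A sketch of the proposed fail-fast shape (elsewhere in this thread it is
pointed out that the null return has to stay for the OraOop connector, so
this remained a suggestion; toHiveType below stands for the existing
mapping):

```java
// Never hand a null type to TableDefWriter, so a missing mapping cannot
// end up as a "NULL" column type in a CREATE TABLE statement.
static String toHiveTypeOrFail(int sqlType) {
  String hiveType = toHiveType(sqlType); // existing JDBC-to-Hive mapping
  if (hiveType == null) {
    throw new RuntimeException("No Hive type mapping defined for SQL type: " + sqlType);
  }
  return hiveType;
}
```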


> Add parquet numeric support for Parquet in Hive import
> --
>
> Key: SQOOP-3396
> URL: https://issues.apache.org/jira/browse/SQOOP-3396
> Project: Sqoop
>  Issue Type: Sub-task
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3393) TestNetezzaExternalTableExportMapper hangs

2018-12-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16705752#comment-16705752
 ] 

ASF GitHub Bot commented on SQOOP-3393:
---

Github user dvoros closed the pull request at:

https://github.com/apache/sqoop/pull/63


> TestNetezzaExternalTableExportMapper hangs
> --
>
> Key: SQOOP-3393
> URL: https://issues.apache.org/jira/browse/SQOOP-3393
> Project: Sqoop
>  Issue Type: Bug
>  Components: test
>Affects Versions: 1.5.0, 3.0.0
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
> Fix For: 1.5.0, 3.0.0
>
>
> Introduced in SQOOP-3378, spotted by [~vasas].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3393) TestNetezzaExternalTableExportMapper hangs

2018-12-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16705757#comment-16705757
 ] 

ASF GitHub Bot commented on SQOOP-3393:
---

Github user dvoros commented on the issue:

https://github.com/apache/sqoop/pull/63
  
Thanks for the review @szvasas and for the reopening tip, just did that.


> TestNetezzaExternalTableExportMapper hangs
> --
>
> Key: SQOOP-3393
> URL: https://issues.apache.org/jira/browse/SQOOP-3393
> Project: Sqoop
>  Issue Type: Bug
>  Components: test
>Affects Versions: 1.5.0, 3.0.0
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
> Fix For: 1.5.0, 3.0.0
>
>
> Introduced in SQOOP-3378, spotted by [~vasas].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3417) Execute Oracle XE tests on Travis CI

2018-12-04 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16708504#comment-16708504
 ] 

ASF GitHub Bot commented on SQOOP-3417:
---

GitHub user szvasas opened a pull request:

https://github.com/apache/sqoop/pull/65

SQOOP-3417: Execute Oracle XE tests on Travis CI



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/szvasas/sqoop SQOOP-3417

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/sqoop/pull/65.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #65


commit 1ea1e35324b9cde28703b03c94ba5166897eeecb
Author: Szabolcs Vasas 
Date:   2018-12-03T15:17:44Z

Oracle JDBC driver is now downloaded from a Maven repository.




> Execute Oracle XE tests on Travis CI
> 
>
> Key: SQOOP-3417
> URL: https://issues.apache.org/jira/browse/SQOOP-3417
> Project: Sqoop
>  Issue Type: Test
>Affects Versions: 1.4.7
>Reporter: Szabolcs Vasas
>Assignee: Szabolcs Vasas
>Priority: Major
>
> The task is to enable the Travis CI to execute Oracle XE tests too 
> automatically.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3396) Add parquet numeric support for Parquet in Hive import

2018-12-05 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709883#comment-16709883
 ] 

ASF GitHub Bot commented on SQOOP-3396:
---

Github user fszabo2 commented on a diff in the pull request:

https://github.com/apache/sqoop/pull/60#discussion_r239008265
  
--- Diff: src/java/org/apache/sqoop/hive/HiveTypes.java ---
@@ -83,27 +89,58 @@ public static String toHiveType(int sqlType) {
   }
   }
 
-  public static String toHiveType(Schema.Type avroType) {
-  switch (avroType) {
-case BOOLEAN:
-  return HIVE_TYPE_BOOLEAN;
-case INT:
-  return HIVE_TYPE_INT;
-case LONG:
-  return HIVE_TYPE_BIGINT;
-case FLOAT:
-  return HIVE_TYPE_FLOAT;
-case DOUBLE:
-  return HIVE_TYPE_DOUBLE;
-case STRING:
-case ENUM:
-  return HIVE_TYPE_STRING;
-case BYTES:
-case FIXED:
-  return HIVE_TYPE_BINARY;
-default:
-  return null;
+  public static String toHiveType(Schema schema, SqoopOptions options) {
+if (schema.getType() == Schema.Type.UNION) {
+  for (Schema subSchema : schema.getTypes()) {
+if (subSchema.getType() != Schema.Type.NULL) {
+  return toHiveType(subSchema, options);
+}
+  }
+}
+
+Schema.Type avroType = schema.getType();
+switch (avroType) {
+  case BOOLEAN:
+return HIVE_TYPE_BOOLEAN;
+  case INT:
+return HIVE_TYPE_INT;
+  case LONG:
+return HIVE_TYPE_BIGINT;
+  case FLOAT:
+return HIVE_TYPE_FLOAT;
+  case DOUBLE:
+return HIVE_TYPE_DOUBLE;
+  case STRING:
+  case ENUM:
+return HIVE_TYPE_STRING;
+  case BYTES:
+return mapToDecimalOrBinary(schema, options);
+  case FIXED:
+return HIVE_TYPE_BINARY;
+  default:
+throw new RuntimeException(String.format("There is no Hive type 
mapping defined for the Avro type of: %s ", avroType.getName()));
+}
+  }
+
+  private static String mapToDecimalOrBinary(Schema schema, SqoopOptions 
options) {
+boolean logicalTypesEnabled = 
options.getConf().getBoolean(ConfigurationConstants.PROP_ENABLE_PARQUET_LOGICAL_TYPE_DECIMAL,
 false);
+if (logicalTypesEnabled && schema.getLogicalType() != null && 
schema.getLogicalType() instanceof Decimal) {
+  Decimal decimal = (Decimal) schema.getLogicalType();
+
+  // trimming precision and scale to Hive's maximum values.
+  int precision = Math.min(HiveDecimal.MAX_PRECISION, 
decimal.getPrecision());
+  if (precision < decimal.getPrecision()) {
+LOG.warn("Warning! Precision in the Hive table definition will be 
smaller than the actual precision of the column on storage! Hive may not be 
able to read data from this column.");
--- End diff --

I created SQOOP-3418 to cover the documentation part.
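
For reference, a worked example of the trimming above (the source precision
of 50 is hypothetical):

```java
// A NUMBER(50, 10) source column arrives with Avro decimal precision 50;
// HiveDecimal.MAX_PRECISION is 38, so the Hive column is created as
// DECIMAL(38, 10) and the warning above is logged.
int precision = Math.min(HiveDecimal.MAX_PRECISION, 50); // -> 38
```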


> Add parquet numeric support for Parquet in Hive import
> --
>
> Key: SQOOP-3396
> URL: https://issues.apache.org/jira/browse/SQOOP-3396
> Project: Sqoop
>  Issue Type: Sub-task
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3396) Add parquet numeric support for Parquet in Hive import

2018-12-05 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709822#comment-16709822
 ] 

ASF GitHub Bot commented on SQOOP-3396:
---

Github user szvasas commented on a diff in the pull request:

https://github.com/apache/sqoop/pull/60#discussion_r238984324
  
--- Diff: 
src/test/org/apache/sqoop/hive/TestHiveTypesForAvroTypeMapping.java ---
@@ -41,30 +44,49 @@
 
   private final String hiveType;
   private final Schema schema;
+  private final SqoopOptions options;
 
-  @Parameters(name = "hiveType = {0}, schema = {1}")
+  @Parameters(name = "hiveType = {0}, schema = {1}, options = {2}")
   public static Iterable parameters() {
 return Arrays.asList(
-new Object[]{"BOOLEAN", Schema.create(Schema.Type.BOOLEAN)},
-new Object[]{"INT", Schema.create(Schema.Type.INT)},
-new Object[]{"BIGINT", Schema.create(Schema.Type.LONG)},
-new Object[]{"FLOAT", Schema.create(Schema.Type.FLOAT)},
-new Object[]{"DOUBLE", Schema.create(Schema.Type.DOUBLE)},
-new Object[]{"STRING", Schema.createEnum("ENUM", "doc", 
"namespce", new ArrayList<>())}, // Schema.Type.ENUM
-new Object[]{"STRING", Schema.create(Schema.Type.STRING)},
-new Object[]{"BINARY", Schema.create(Schema.Type.BYTES)},
-new Object[]{"BINARY", Schema.createFixed("Fixed", "doc", "space", 
1) }
-//, new Object[]{"DECIMAL", Schema.create(Schema.Type.UNION).}
+new Object[]{"BOOLEAN", Schema.create(Schema.Type.BOOLEAN), new 
SqoopOptions()},
--- End diff --

We could static import the create methods from Schema so that would make 
these lines shorter.
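
For instance (a sketch of the suggested static imports):

```java
import static org.apache.avro.Schema.create;
import static org.apache.avro.Schema.createFixed;

// ...
new Object[]{"BOOLEAN", create(Schema.Type.BOOLEAN), new SqoopOptions()},
new Object[]{"BINARY", createFixed("Fixed", "doc", "space", 1), new SqoopOptions()},
```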


> Add parquet numeric support for Parquet in Hive import
> --
>
> Key: SQOOP-3396
> URL: https://issues.apache.org/jira/browse/SQOOP-3396
> Project: Sqoop
>  Issue Type: Sub-task
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3396) Add parquet numeric support for Parquet in Hive import

2018-12-05 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709824#comment-16709824
 ] 

ASF GitHub Bot commented on SQOOP-3396:
---

Github user szvasas commented on a diff in the pull request:

https://github.com/apache/sqoop/pull/60#discussion_r238984033
  
--- Diff: 
src/test/org/apache/sqoop/hive/TestHiveTypesForAvroTypeMapping.java ---
@@ -41,30 +44,49 @@
 
   private final String hiveType;
   private final Schema schema;
+  private final SqoopOptions options;
 
-  @Parameters(name = "hiveType = {0}, schema = {1}")
+  @Parameters(name = "hiveType = {0}, schema = {1}, options = {2}")
--- End diff --

I would not add options here, because SqoopOptions does not have a nice 
toString() method so we won't see anything meaningful in the test output.
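
I.e. keeping the annotation as it was:

```java
@Parameters(name = "hiveType = {0}, schema = {1}")
```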


> Add parquet numeric support for Parquet in Hive import
> --
>
> Key: SQOOP-3396
> URL: https://issues.apache.org/jira/browse/SQOOP-3396
> Project: Sqoop
>  Issue Type: Sub-task
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3396) Add parquet numeric support for Parquet in Hive import

2018-12-05 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16709821#comment-16709821
 ] 

ASF GitHub Bot commented on SQOOP-3396:
---

Github user szvasas commented on a diff in the pull request:

https://github.com/apache/sqoop/pull/60#discussion_r238984816
  
--- Diff: 
src/test/org/apache/sqoop/hive/TestHiveTypesForAvroTypeMapping.java ---
@@ -41,30 +44,49 @@
 
   private final String hiveType;
   private final Schema schema;
+  private final SqoopOptions options;
 
-  @Parameters(name = "hiveType = {0}, schema = {1}")
+  @Parameters(name = "hiveType = {0}, schema = {1}, options = {2}")
   public static Iterable parameters() {
 return Arrays.asList(
-new Object[]{"BOOLEAN", Schema.create(Schema.Type.BOOLEAN)},
-new Object[]{"INT", Schema.create(Schema.Type.INT)},
-new Object[]{"BIGINT", Schema.create(Schema.Type.LONG)},
-new Object[]{"FLOAT", Schema.create(Schema.Type.FLOAT)},
-new Object[]{"DOUBLE", Schema.create(Schema.Type.DOUBLE)},
-new Object[]{"STRING", Schema.createEnum("ENUM", "doc", 
"namespce", new ArrayList<>())}, // Schema.Type.ENUM
-new Object[]{"STRING", Schema.create(Schema.Type.STRING)},
-new Object[]{"BINARY", Schema.create(Schema.Type.BYTES)},
-new Object[]{"BINARY", Schema.createFixed("Fixed", "doc", "space", 
1) }
-//, new Object[]{"DECIMAL", Schema.create(Schema.Type.UNION).}
+new Object[]{"BOOLEAN", Schema.create(Schema.Type.BOOLEAN), new 
SqoopOptions()},
+new Object[]{"INT", Schema.create(Schema.Type.INT), new 
SqoopOptions()},
+new Object[]{"BIGINT", Schema.create(Schema.Type.LONG), new 
SqoopOptions()},
+new Object[]{"FLOAT", Schema.create(Schema.Type.FLOAT), new 
SqoopOptions()},
+new Object[]{"DOUBLE", Schema.create(Schema.Type.DOUBLE), new 
SqoopOptions()},
+new Object[]{"STRING", Schema.createEnum("ENUM", "doc", 
"namespce", new ArrayList<>()), new SqoopOptions()}, // Schema.Type.ENUM
+new Object[]{"STRING", Schema.create(Schema.Type.STRING), new 
SqoopOptions()},
+new Object[]{"BINARY", Schema.create(Schema.Type.BYTES), new 
SqoopOptions()},
--- End diff --

I would add one more test case here:
new Object[]{"BINARY", create(Schema.Type.BYTES), 
createSqoopOptionsWithLogicalTypesEnabled()},

to make sure that we test the if condition 
org/apache/sqoop/hive/HiveTypes.java:127 fully
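
A possible shape for that helper, assuming the configuration key from the
diff (a sketch, not the committed code):

```java
private static SqoopOptions createSqoopOptionsWithLogicalTypesEnabled() {
  SqoopOptions options = new SqoopOptions();
  options.getConf().setBoolean(
      ConfigurationConstants.PROP_ENABLE_PARQUET_LOGICAL_TYPE_DECIMAL, true);
  return options;
}
```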



> Add parquet numeric support for Parquet in Hive import
> --
>
> Key: SQOOP-3396
> URL: https://issues.apache.org/jira/browse/SQOOP-3396
> Project: Sqoop
>  Issue Type: Sub-task
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3396) Add parquet numeric support for Parquet in Hive import

2018-12-05 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709823#comment-16709823
 ] 

ASF GitHub Bot commented on SQOOP-3396:
---

Github user szvasas commented on a diff in the pull request:

https://github.com/apache/sqoop/pull/60#discussion_r238984411
  
--- Diff: 
src/test/org/apache/sqoop/hive/TestHiveTypesForAvroTypeMapping.java ---
@@ -41,30 +44,49 @@
 
   private final String hiveType;
   private final Schema schema;
+  private final SqoopOptions options;
 
-  @Parameters(name = "hiveType = {0}, schema = {1}")
+  @Parameters(name = "hiveType = {0}, schema = {1}, options = {2}")
   public static Iterable parameters() {
 return Arrays.asList(
-new Object[]{"BOOLEAN", Schema.create(Schema.Type.BOOLEAN)},
-new Object[]{"INT", Schema.create(Schema.Type.INT)},
-new Object[]{"BIGINT", Schema.create(Schema.Type.LONG)},
-new Object[]{"FLOAT", Schema.create(Schema.Type.FLOAT)},
-new Object[]{"DOUBLE", Schema.create(Schema.Type.DOUBLE)},
-new Object[]{"STRING", Schema.createEnum("ENUM", "doc", 
"namespce", new ArrayList<>())}, // Schema.Type.ENUM
-new Object[]{"STRING", Schema.create(Schema.Type.STRING)},
-new Object[]{"BINARY", Schema.create(Schema.Type.BYTES)},
-new Object[]{"BINARY", Schema.createFixed("Fixed", "doc", "space", 
1) }
-//, new Object[]{"DECIMAL", Schema.create(Schema.Type.UNION).}
+new Object[]{"BOOLEAN", Schema.create(Schema.Type.BOOLEAN), new 
SqoopOptions()},
+new Object[]{"INT", Schema.create(Schema.Type.INT), new 
SqoopOptions()},
+new Object[]{"BIGINT", Schema.create(Schema.Type.LONG), new 
SqoopOptions()},
+new Object[]{"FLOAT", Schema.create(Schema.Type.FLOAT), new 
SqoopOptions()},
+new Object[]{"DOUBLE", Schema.create(Schema.Type.DOUBLE), new 
SqoopOptions()},
+new Object[]{"STRING", Schema.createEnum("ENUM", "doc", 
"namespce", new ArrayList<>()), new SqoopOptions()}, // Schema.Type.ENUM
--- End diff --

Do we need the comment at the end of the line?


> Add parquet numeric support for Parquet in Hive import
> --
>
> Key: SQOOP-3396
> URL: https://issues.apache.org/jira/browse/SQOOP-3396
> Project: Sqoop
>  Issue Type: Sub-task
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3416) Give the default value of /var/lib/sqoop to the sqoopThirdPartyLib variable in the gradle build

2018-12-05 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709831#comment-16709831
 ] 

ASF GitHub Bot commented on SQOOP-3416:
---

Github user fszabo2 commented on the issue:

https://github.com/apache/sqoop/pull/64
  
This became obsolete


> Give the default value of /var/lib/sqoop to the sqoopThirdPartyLib variable 
> in the gradle build
> ---
>
> Key: SQOOP-3416
> URL: https://issues.apache.org/jira/browse/SQOOP-3416
> Project: Sqoop
>  Issue Type: Improvement
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Minor
>
> Since the sqoopThirdPartyLib doesn't have a default value, if one runs the 
> Oracle tests, one always has to specify the sqoop.thirdparty.lib.dir system
> property.
> With this change, we just have to move the downloaded Oracle driver to
> /var/lib/sqoop and avoid some typing.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3396) Add parquet numeric support for Parquet in Hive import

2018-12-05 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709884#comment-16709884
 ] 

ASF GitHub Bot commented on SQOOP-3396:
---

Github user fszabo2 commented on a diff in the pull request:

https://github.com/apache/sqoop/pull/60#discussion_r239008539
  
--- Diff: 
src/test/org/apache/sqoop/hive/TestHiveTypesForAvroTypeMapping.java ---
@@ -41,30 +44,49 @@
 
   private final String hiveType;
   private final Schema schema;
+  private final SqoopOptions options;
 
-  @Parameters(name = "hiveType = {0}, schema = {1}")
+  @Parameters(name = "hiveType = {0}, schema = {1}, options = {2}")
--- End diff --

removed


> Add parquet numeric support for Parquet in Hive import
> --
>
> Key: SQOOP-3396
> URL: https://issues.apache.org/jira/browse/SQOOP-3396
> Project: Sqoop
>  Issue Type: Sub-task
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3416) Give the default value of /var/lib/sqoop to the sqoopThirdPartyLib variable in the gradle build

2018-12-05 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709832#comment-16709832
 ] 

ASF GitHub Bot commented on SQOOP-3416:
---

Github user fszabo2 closed the pull request at:

https://github.com/apache/sqoop/pull/64


> Give the default value of /var/lib/sqoop to the sqoopThirdPartyLib variable 
> in the gradle build
> ---
>
> Key: SQOOP-3416
> URL: https://issues.apache.org/jira/browse/SQOOP-3416
> Project: Sqoop
>  Issue Type: Improvement
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Minor
>
> Since the sqoopThirdPartyLib doesn't have a default value, if one runs the 
> Oracle tests, one always has to specify the sqoop.thirdparty.lib.dir system
> property.
> With this change, we just have to move the downloaded Oracle driver to
> /var/lib/sqoop and avoid some typing.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3396) Add parquet numeric support for Parquet in Hive import

2018-12-04 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16708806#comment-16708806
 ] 

ASF GitHub Bot commented on SQOOP-3396:
---

Github user fszabo2 commented on a diff in the pull request:

https://github.com/apache/sqoop/pull/60#discussion_r238693735
  
--- Diff: src/java/org/apache/sqoop/hive/HiveTypes.java ---
@@ -83,27 +89,58 @@ public static String toHiveType(int sqlType) {
   }
   }
 
-  public static String toHiveType(Schema.Type avroType) {
-  switch (avroType) {
-case BOOLEAN:
-  return HIVE_TYPE_BOOLEAN;
-case INT:
-  return HIVE_TYPE_INT;
-case LONG:
-  return HIVE_TYPE_BIGINT;
-case FLOAT:
-  return HIVE_TYPE_FLOAT;
-case DOUBLE:
-  return HIVE_TYPE_DOUBLE;
-case STRING:
-case ENUM:
-  return HIVE_TYPE_STRING;
-case BYTES:
-case FIXED:
-  return HIVE_TYPE_BINARY;
-default:
-  return null;
+  public static String toHiveType(Schema schema, SqoopOptions options) {
+if (schema.getType() == Schema.Type.UNION) {
+  for (Schema subSchema : schema.getTypes()) {
+if (subSchema.getType() != Schema.Type.NULL) {
+  return toHiveType(subSchema, options);
+}
+  }
+}
+
+Schema.Type avroType = schema.getType();
+switch (avroType) {
+  case BOOLEAN:
+return HIVE_TYPE_BOOLEAN;
+  case INT:
+return HIVE_TYPE_INT;
+  case LONG:
+return HIVE_TYPE_BIGINT;
+  case FLOAT:
+return HIVE_TYPE_FLOAT;
+  case DOUBLE:
+return HIVE_TYPE_DOUBLE;
+  case STRING:
+  case ENUM:
+return HIVE_TYPE_STRING;
+  case BYTES:
+return mapToDecimalOrBinary(schema, options);
+  case FIXED:
+return HIVE_TYPE_BINARY;
+  default:
+throw new RuntimeException(String.format("There is no Hive type 
mapping defined for the Avro type of: %s ", avroType.getName()));
+}
+  }
+
+  private static String mapToDecimalOrBinary(Schema schema, SqoopOptions 
options) {
+boolean logicalTypesEnabled = 
options.getConf().getBoolean(ConfigurationConstants.PROP_ENABLE_PARQUET_LOGICAL_TYPE_DECIMAL,
 false);
+if (logicalTypesEnabled && schema.getLogicalType() != null && 
schema.getLogicalType() instanceof Decimal) {
+  Decimal decimal = (Decimal) schema.getLogicalType();
+
+  // trimming precision and scale to Hive's maximum values.
+  int precision = Math.min(HiveDecimal.MAX_PRECISION, 
decimal.getPrecision());
+  if (precision < decimal.getPrecision()) {
+LOG.warn("Warning! Precision in the Hive table definition will be 
smaller than the actual precision of the column on storage! Hive may not be 
able to read data from this column.");
--- End diff --

Do you think we should remove this warning? (I think, even if it's 
redundant, it's useful to write this out.)


> Add parquet numeric support for Parquet in Hive import
> --
>
> Key: SQOOP-3396
> URL: https://issues.apache.org/jira/browse/SQOOP-3396
> Project: Sqoop
>  Issue Type: Sub-task
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3396) Add parquet numeric support for Parquet in Hive import

2018-12-04 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16708798#comment-16708798
 ] 

ASF GitHub Bot commented on SQOOP-3396:
---

Github user fszabo2 commented on a diff in the pull request:

https://github.com/apache/sqoop/pull/60#discussion_r238691269
  
--- Diff: src/java/org/apache/sqoop/hive/HiveTypes.java ---
@@ -79,8 +85,42 @@ public static String toHiveType(int sqlType) {
   default:
 // TODO(aaron): Support BINARY, VARBINARY, LONGVARBINARY, DISTINCT,
 // BLOB, ARRAY, STRUCT, REF, JAVA_OBJECT.
-return null;
+return null;
+  }
+  }
+
+  private static String mapDecimalsToHiveType(int sqlType, SqoopOptions 
options) {
+if 
(options.getConf().getBoolean(ConfigurationConstants.PROP_ENABLE_PARQUET_LOGICAL_TYPE_DECIMAL,
 false)
+&& (sqlType == Types.NUMERIC || sqlType == Types.DECIMAL)){
--- End diff --

This piece of code was reverted. 


> Add parquet numeric support for Parquet in Hive import
> --
>
> Key: SQOOP-3396
> URL: https://issues.apache.org/jira/browse/SQOOP-3396
> Project: Sqoop
>  Issue Type: Sub-task
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3396) Add parquet numeric support for Parquet in Hive import

2018-12-04 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16708801#comment-16708801
 ] 

ASF GitHub Bot commented on SQOOP-3396:
---

Github user fszabo2 commented on a diff in the pull request:

https://github.com/apache/sqoop/pull/60#discussion_r238693021
  
--- Diff: src/java/org/apache/sqoop/hive/HiveTypes.java ---
@@ -79,8 +85,42 @@ public static String toHiveType(int sqlType) {
   default:
 // TODO(aaron): Support BINARY, VARBINARY, LONGVARBINARY, DISTINCT,
 // BLOB, ARRAY, STRUCT, REF, JAVA_OBJECT.
-return null;
+return null;
+  }
+  }
+
+  private static String mapDecimalsToHiveType(int sqlType, SqoopOptions 
options) {
+if 
(options.getConf().getBoolean(ConfigurationConstants.PROP_ENABLE_PARQUET_LOGICAL_TYPE_DECIMAL,
 false)
+&& (sqlType == Types.NUMERIC || sqlType == Types.DECIMAL)){
+  return HIVE_TYPE_DECIMAL;
+}
+return HIVE_TYPE_DOUBLE;
+  }
+
+
+  public static String toHiveType(Schema schema) {
+if (schema.getType() == Schema.Type.UNION) {
+  for (Schema subSchema : schema.getTypes()) {
+if (subSchema.getLogicalType() != null && 
subSchema.getLogicalType() instanceof Decimal) {
--- End diff --

I learn something new every day :), removed.
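
The simplification being referenced, for clarity:

```java
// `instanceof` already evaluates to false for a null reference, so the
// explicit null check can be dropped:
if (subSchema.getLogicalType() instanceof Decimal) {
  // ...
}
```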


> Add parquet numeric support for Parquet in Hive import
> --
>
> Key: SQOOP-3396
> URL: https://issues.apache.org/jira/browse/SQOOP-3396
> Project: Sqoop
>  Issue Type: Sub-task
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3396) Add parquet numeric support for Parquet in Hive import

2018-12-04 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16708812#comment-16708812
 ] 

ASF GitHub Bot commented on SQOOP-3396:
---

Github user fszabo2 commented on a diff in the pull request:

https://github.com/apache/sqoop/pull/60#discussion_r238694607
  
--- Diff: src/test/org/apache/sqoop/importjob/numerictypes/NumericTypesImportTestBase.java ---
@@ -65,240 +46,79 @@
  * 2. Decimal padding during avro or parquet import
  * In case of Oracle and Postgres, Sqoop has to pad the values with 0s to avoid errors.
  */
-public abstract class NumericTypesImportTestBase<T> extends ImportJobTestCase implements DatabaseAdapterFactory {
+public abstract class NumericTypesImportTestBase<T> extends ThirdPartyTestBase<T> {
 
   public static final Log LOG = LogFactory.getLog(NumericTypesImportTestBase.class.getName());
 
-  private Configuration conf = new Configuration();
-
-  private final T configuration;
-  private final DatabaseAdapter adapter;
   private final boolean failWithoutExtraArgs;
   private final boolean failWithPadding;
 
-  // Constants for the basic test case, that doesn't use extra arguments
-  // that are required to avoid errors, i.e. padding and default precision and scale.
-  protected final static boolean SUCCEED_WITHOUT_EXTRA_ARGS = false;
-  protected final static boolean FAIL_WITHOUT_EXTRA_ARGS = true;
-
-  // Constants for the test case that has padding specified but not default precision and scale.
-  protected final static boolean SUCCEED_WITH_PADDING_ONLY = false;
-  protected final static boolean FAIL_WITH_PADDING_ONLY = true;
-
-  private Path tableDirPath;
-
   public NumericTypesImportTestBase(T configuration, boolean failWithoutExtraArgs, boolean failWithPaddingOnly) {
-    this.adapter = createAdapter();
-    this.configuration = configuration;
+    super(configuration);
     this.failWithoutExtraArgs = failWithoutExtraArgs;
     this.failWithPadding = failWithPaddingOnly;
   }
 
-  @Rule
-  public ExpectedException thrown = ExpectedException.none();
-
-  @Override
-  protected Configuration getConf() {
-    return conf;
-  }
-
-  @Override
-  protected boolean useHsqldbTestServer() {
-    return false;
-  }
-
-  @Override
-  protected String getConnectString() {
-    return adapter.getConnectionString();
-  }
-
-  @Override
-  protected SqoopOptions getSqoopOptions(Configuration conf) {
-    SqoopOptions opts = new SqoopOptions(conf);
-    adapter.injectConnectionParameters(opts);
-    return opts;
-  }
-
-  @Override
-  protected void dropTableIfExists(String table) throws SQLException {
-    adapter.dropTableIfExists(table, getManager());
-  }
-
   @Before
   public void setUp() {
     super.setUp();
-    String[] names = configuration.getNames();
-    String[] types = configuration.getTypes();
-    createTableWithColTypesAndNames(names, types, new String[0]);
-    List<String[]> inputData = configuration.getSampleData();
-    for (String[] input : inputData) {
-      insertIntoTable(names, types, input);
-    }
     tableDirPath = new Path(getWarehouseDir() + "/" + getTableName());
   }
 
-  @After
-  public void tearDown() {
-    try {
-      dropTableIfExists(getTableName());
-    } catch (SQLException e) {
-      LOG.warn("Error trying to drop table on tearDown: " + e);
-    }
-    super.tearDown();
-  }
+  public Path tableDirPath;
--- End diff --

Makes sense


> Add parquet numeric support for Parquet in Hive import
> --
>
> Key: SQOOP-3396
> URL: https://issues.apache.org/jira/browse/SQOOP-3396
> Project: Sqoop
>  Issue Type: Sub-task
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3417) Execute Oracle XE tests on Travis CI

2018-12-04 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16708771#comment-16708771
 ] 

ASF GitHub Bot commented on SQOOP-3417:
---

Github user asfgit closed the pull request at:

https://github.com/apache/sqoop/pull/65


> Execute Oracle XE tests on Travis CI
> 
>
> Key: SQOOP-3417
> URL: https://issues.apache.org/jira/browse/SQOOP-3417
> Project: Sqoop
>  Issue Type: Test
>Affects Versions: 1.4.7
>Reporter: Szabolcs Vasas
>Assignee: Szabolcs Vasas
>Priority: Major
>
> The task is to enable the Travis CI to execute Oracle XE tests too 
> automatically.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3396) Add parquet numeric support for Parquet in Hive import

2018-12-05 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16710047#comment-16710047
 ] 

ASF GitHub Bot commented on SQOOP-3396:
---

Github user fszabo2 commented on a diff in the pull request:

https://github.com/apache/sqoop/pull/60#discussion_r239057381
  
--- Diff: src/test/org/apache/sqoop/hive/TestHiveTypesForAvroTypeMapping.java ---
@@ -41,30 +44,49 @@
 
   private final String hiveType;
   private final Schema schema;
+  private final SqoopOptions options;
 
-  @Parameters(name = "hiveType = {0}, schema = {1}")
+  @Parameters(name = "hiveType = {0}, schema = {1}, options = {2}")
   public static Iterable<Object[]> parameters() {
     return Arrays.asList(
-        new Object[]{"BOOLEAN", Schema.create(Schema.Type.BOOLEAN)},
-        new Object[]{"INT", Schema.create(Schema.Type.INT)},
-        new Object[]{"BIGINT", Schema.create(Schema.Type.LONG)},
-        new Object[]{"FLOAT", Schema.create(Schema.Type.FLOAT)},
-        new Object[]{"DOUBLE", Schema.create(Schema.Type.DOUBLE)},
-        new Object[]{"STRING", Schema.createEnum("ENUM", "doc", "namespce", new ArrayList<>())}, // Schema.Type.ENUM
-        new Object[]{"STRING", Schema.create(Schema.Type.STRING)},
-        new Object[]{"BINARY", Schema.create(Schema.Type.BYTES)},
-        new Object[]{"BINARY", Schema.createFixed("Fixed", "doc", "space", 1) }
-        //, new Object[]{"DECIMAL", Schema.create(Schema.Type.UNION).}
+        new Object[]{"BOOLEAN", Schema.create(Schema.Type.BOOLEAN), new SqoopOptions()},
+        new Object[]{"INT", Schema.create(Schema.Type.INT), new SqoopOptions()},
+        new Object[]{"BIGINT", Schema.create(Schema.Type.LONG), new SqoopOptions()},
+        new Object[]{"FLOAT", Schema.create(Schema.Type.FLOAT), new SqoopOptions()},
+        new Object[]{"DOUBLE", Schema.create(Schema.Type.DOUBLE), new SqoopOptions()},
+        new Object[]{"STRING", Schema.createEnum("ENUM", "doc", "namespce", new ArrayList<>()), new SqoopOptions()}, // Schema.Type.ENUM
+        new Object[]{"STRING", Schema.create(Schema.Type.STRING), new SqoopOptions()},
+        new Object[]{"BINARY", Schema.create(Schema.Type.BYTES), new SqoopOptions()},
--- End diff --

Yeah, makes sense


> Add parquet numeric support for Parquet in Hive import
> --
>
> Key: SQOOP-3396
> URL: https://issues.apache.org/jira/browse/SQOOP-3396
> Project: Sqoop
>  Issue Type: Sub-task
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3396) Add parquet numeric support for Parquet in Hive import

2018-12-05 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16710044#comment-16710044
 ] 

ASF GitHub Bot commented on SQOOP-3396:
---

Github user fszabo2 commented on a diff in the pull request:

https://github.com/apache/sqoop/pull/60#discussion_r239057124
  
--- Diff: src/test/org/apache/sqoop/hive/TestHiveTypesForAvroTypeMapping.java ---
@@ -41,30 +44,49 @@
 
   private final String hiveType;
   private final Schema schema;
+  private final SqoopOptions options;
 
-  @Parameters(name = "hiveType = {0}, schema = {1}")
+  @Parameters(name = "hiveType = {0}, schema = {1}, options = {2}")
   public static Iterable<Object[]> parameters() {
     return Arrays.asList(
-        new Object[]{"BOOLEAN", Schema.create(Schema.Type.BOOLEAN)},
-        new Object[]{"INT", Schema.create(Schema.Type.INT)},
-        new Object[]{"BIGINT", Schema.create(Schema.Type.LONG)},
-        new Object[]{"FLOAT", Schema.create(Schema.Type.FLOAT)},
-        new Object[]{"DOUBLE", Schema.create(Schema.Type.DOUBLE)},
-        new Object[]{"STRING", Schema.createEnum("ENUM", "doc", "namespce", new ArrayList<>())}, // Schema.Type.ENUM
-        new Object[]{"STRING", Schema.create(Schema.Type.STRING)},
-        new Object[]{"BINARY", Schema.create(Schema.Type.BYTES)},
-        new Object[]{"BINARY", Schema.createFixed("Fixed", "doc", "space", 1) }
-        //, new Object[]{"DECIMAL", Schema.create(Schema.Type.UNION).}
+        new Object[]{"BOOLEAN", Schema.create(Schema.Type.BOOLEAN), new SqoopOptions()},
+        new Object[]{"INT", Schema.create(Schema.Type.INT), new SqoopOptions()},
+        new Object[]{"BIGINT", Schema.create(Schema.Type.LONG), new SqoopOptions()},
+        new Object[]{"FLOAT", Schema.create(Schema.Type.FLOAT), new SqoopOptions()},
+        new Object[]{"DOUBLE", Schema.create(Schema.Type.DOUBLE), new SqoopOptions()},
+        new Object[]{"STRING", Schema.createEnum("ENUM", "doc", "namespce", new ArrayList<>()), new SqoopOptions()}, // Schema.Type.ENUM
--- End diff --

removed


> Add parquet numeric support for Parquet in Hive import
> --
>
> Key: SQOOP-3396
> URL: https://issues.apache.org/jira/browse/SQOOP-3396
> Project: Sqoop
>  Issue Type: Sub-task
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3396) Add parquet numeric support for Parquet in Hive import

2018-12-05 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16710046#comment-16710046
 ] 

ASF GitHub Bot commented on SQOOP-3396:
---

Github user fszabo2 commented on a diff in the pull request:

https://github.com/apache/sqoop/pull/60#discussion_r239057294
  
--- Diff: src/test/org/apache/sqoop/hive/TestHiveTypesForAvroTypeMapping.java ---
@@ -41,30 +44,49 @@
 
   private final String hiveType;
   private final Schema schema;
+  private final SqoopOptions options;
 
-  @Parameters(name = "hiveType = {0}, schema = {1}")
+  @Parameters(name = "hiveType = {0}, schema = {1}, options = {2}")
   public static Iterable<Object[]> parameters() {
     return Arrays.asList(
-        new Object[]{"BOOLEAN", Schema.create(Schema.Type.BOOLEAN)},
-        new Object[]{"INT", Schema.create(Schema.Type.INT)},
-        new Object[]{"BIGINT", Schema.create(Schema.Type.LONG)},
-        new Object[]{"FLOAT", Schema.create(Schema.Type.FLOAT)},
-        new Object[]{"DOUBLE", Schema.create(Schema.Type.DOUBLE)},
-        new Object[]{"STRING", Schema.createEnum("ENUM", "doc", "namespce", new ArrayList<>())}, // Schema.Type.ENUM
-        new Object[]{"STRING", Schema.create(Schema.Type.STRING)},
-        new Object[]{"BINARY", Schema.create(Schema.Type.BYTES)},
-        new Object[]{"BINARY", Schema.createFixed("Fixed", "doc", "space", 1) }
-        //, new Object[]{"DECIMAL", Schema.create(Schema.Type.UNION).}
+        new Object[]{"BOOLEAN", Schema.create(Schema.Type.BOOLEAN), new SqoopOptions()},
--- End diff --

I've added a bunch of static imports (also for Schema.Type) so now the 
whole thing is a lot more readable.
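
As a rough sketch of the readability gain (assuming static imports of Schema.create and the Schema.Type constants; the exact import set in the final patch may differ):

    import org.apache.sqoop.SqoopOptions;

    import static org.apache.avro.Schema.create;
    import static org.apache.avro.Schema.Type.LONG;

    public class ParameterRowSketch {
      public static void main(String[] args) {
        // Before: new Object[]{"BIGINT", Schema.create(Schema.Type.LONG), new SqoopOptions()}
        // After the static imports, a row reads much closer to the mapping it asserts:
        Object[] bigintRow = new Object[]{"BIGINT", create(LONG), new SqoopOptions()};
        System.out.println(bigintRow[0] + " <- " + bigintRow[1]);
      }
    }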


> Add parquet numeric support for Parquet in Hive import
> --
>
> Key: SQOOP-3396
> URL: https://issues.apache.org/jira/browse/SQOOP-3396
> Project: Sqoop
>  Issue Type: Sub-task
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3396) Add parquet numeric support for Parquet in Hive import

2018-12-05 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16710230#comment-16710230
 ] 

ASF GitHub Bot commented on SQOOP-3396:
---

Github user szvasas commented on a diff in the pull request:

https://github.com/apache/sqoop/pull/60#discussion_r239108068
  
--- Diff: src/test/org/apache/sqoop/importjob/numerictypes/NumericTypesParquetImportTestBase.java ---
@@ -0,0 +1,83 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.sqoop.importjob.numerictypes;
+
+import org.apache.commons.logging.Log;
+import org.apache.commons.logging.LogFactory;
+import org.apache.hadoop.fs.Path;
--- End diff --

Unused import


> Add parquet numeric support for Parquet in Hive import
> --
>
> Key: SQOOP-3396
> URL: https://issues.apache.org/jira/browse/SQOOP-3396
> Project: Sqoop
>  Issue Type: Sub-task
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3396) Add parquet numeric support for Parquet in Hive import

2018-12-05 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16710229#comment-16710229
 ] 

ASF GitHub Bot commented on SQOOP-3396:
---

Github user szvasas commented on a diff in the pull request:

https://github.com/apache/sqoop/pull/60#discussion_r239107983
  
--- Diff: src/test/org/apache/sqoop/importjob/numerictypes/NumericTypesParquetImportTestBase.java ---
@@ -0,0 +1,83 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.sqoop.importjob.numerictypes;
+
+import org.apache.commons.logging.Log;
+import org.apache.commons.logging.LogFactory;
+import org.apache.hadoop.fs.Path;
+import org.apache.parquet.schema.MessageType;
+import org.apache.parquet.schema.OriginalType;
+import org.apache.sqoop.importjob.configuration.ParquetTestConfiguration;
+import org.apache.sqoop.testutil.ArgumentArrayBuilder;
+import org.apache.sqoop.testutil.NumericTypesTestUtils;
+import org.apache.sqoop.util.ParquetReader;
+import org.junit.Before;
--- End diff --

Unused import


> Add parquet numeric support for Parquet in Hive import
> --
>
> Key: SQOOP-3396
> URL: https://issues.apache.org/jira/browse/SQOOP-3396
> Project: Sqoop
>  Issue Type: Sub-task
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3396) Add parquet numeric support for Parquet in Hive import

2018-12-04 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16708865#comment-16708865
 ] 

ASF GitHub Bot commented on SQOOP-3396:
---

Github user fszabo2 commented on a diff in the pull request:

https://github.com/apache/sqoop/pull/60#discussion_r238707355
  
--- Diff: src/java/org/apache/sqoop/hive/HiveTypes.java ---
@@ -83,27 +89,58 @@ public static String toHiveType(int sqlType) {
     }
   }
 
-  public static String toHiveType(Schema.Type avroType) {
-    switch (avroType) {
-      case BOOLEAN:
-        return HIVE_TYPE_BOOLEAN;
-      case INT:
-        return HIVE_TYPE_INT;
-      case LONG:
-        return HIVE_TYPE_BIGINT;
-      case FLOAT:
-        return HIVE_TYPE_FLOAT;
-      case DOUBLE:
-        return HIVE_TYPE_DOUBLE;
-      case STRING:
-      case ENUM:
-        return HIVE_TYPE_STRING;
-      case BYTES:
-      case FIXED:
-        return HIVE_TYPE_BINARY;
-      default:
-        return null;
+  public static String toHiveType(Schema schema, SqoopOptions options) {
+    if (schema.getType() == Schema.Type.UNION) {
+      for (Schema subSchema : schema.getTypes()) {
+        if (subSchema.getType() != Schema.Type.NULL) {
+          return toHiveType(subSchema, options);
+        }
+      }
+    }
+
+    Schema.Type avroType = schema.getType();
+    switch (avroType) {
+      case BOOLEAN:
+        return HIVE_TYPE_BOOLEAN;
+      case INT:
+        return HIVE_TYPE_INT;
+      case LONG:
+        return HIVE_TYPE_BIGINT;
+      case FLOAT:
+        return HIVE_TYPE_FLOAT;
+      case DOUBLE:
+        return HIVE_TYPE_DOUBLE;
+      case STRING:
+      case ENUM:
+        return HIVE_TYPE_STRING;
+      case BYTES:
+        return mapToDecimalOrBinary(schema, options);
+      case FIXED:
+        return HIVE_TYPE_BINARY;
+      default:
+        throw new RuntimeException(String.format("There is no Hive type mapping defined for the Avro type of: %s ", avroType.getName()));
+    }
+  }
+
+  private static String mapToDecimalOrBinary(Schema schema, SqoopOptions options) {
+    boolean logicalTypesEnabled = options.getConf().getBoolean(ConfigurationConstants.PROP_ENABLE_PARQUET_LOGICAL_TYPE_DECIMAL, false);
+    if (logicalTypesEnabled && schema.getLogicalType() != null && schema.getLogicalType() instanceof Decimal) {
--- End diff --

I'm learning something new every day! :) removed 
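
To make the union handling in the quoted hunk concrete: Avro models a nullable column as a union of NULL and the value type, and toHiveType recurses into the first non-NULL branch before mapping it. A minimal usage sketch (the precision and scale values are illustrative):

    import java.util.Arrays;

    import org.apache.avro.LogicalTypes;
    import org.apache.avro.Schema;

    public class UnionUnwrapSketch {
      public static void main(String[] args) {
        // A nullable decimal column, modeled the way Sqoop's Avro schema generation does:
        Schema decimalBytes = LogicalTypes.decimal(20, 5)
            .addToSchema(Schema.create(Schema.Type.BYTES));
        Schema nullableDecimal = Schema.createUnion(
            Arrays.asList(Schema.create(Schema.Type.NULL), decimalBytes));
        // toHiveType(nullableDecimal, options) first unwraps the union to the BYTES
        // branch, then maps it to DECIMAL or BINARY depending on whether the
        // parquet logical-type property is enabled in the options.
        System.out.println(nullableDecimal.getTypes());
      }
    }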


> Add parquet numeric support for Parquet in Hive import
> --
>
> Key: SQOOP-3396
> URL: https://issues.apache.org/jira/browse/SQOOP-3396
> Project: Sqoop
>  Issue Type: Sub-task
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3396) Add parquet numeric support for Parquet in Hive import

2018-12-04 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16708864#comment-16708864
 ] 

ASF GitHub Bot commented on SQOOP-3396:
---

Github user fszabo2 commented on a diff in the pull request:

https://github.com/apache/sqoop/pull/60#discussion_r238707092
  
--- Diff: src/test/org/apache/sqoop/importjob/numerictypes/NumericTypesImportTestBase.java ---
@@ -65,240 +46,79 @@
  * 2. Decimal padding during avro or parquet import
  * In case of Oracle and Postgres, Sqoop has to pad the values with 0s to avoid errors.
  */
-public abstract class NumericTypesImportTestBase<T> extends ImportJobTestCase implements DatabaseAdapterFactory {
+public abstract class NumericTypesImportTestBase<T> extends ThirdPartyTestBase<T> {
 
   public static final Log LOG = LogFactory.getLog(NumericTypesImportTestBase.class.getName());
 
-  private Configuration conf = new Configuration();
-
-  private final T configuration;
-  private final DatabaseAdapter adapter;
   private final boolean failWithoutExtraArgs;
   private final boolean failWithPadding;
 
-  // Constants for the basic test case, that doesn't use extra arguments
-  // that are required to avoid errors, i.e. padding and default precision and scale.
-  protected final static boolean SUCCEED_WITHOUT_EXTRA_ARGS = false;
-  protected final static boolean FAIL_WITHOUT_EXTRA_ARGS = true;
-
-  // Constants for the test case that has padding specified but not default precision and scale.
-  protected final static boolean SUCCEED_WITH_PADDING_ONLY = false;
-  protected final static boolean FAIL_WITH_PADDING_ONLY = true;
-
-  private Path tableDirPath;
-
   public NumericTypesImportTestBase(T configuration, boolean failWithoutExtraArgs, boolean failWithPaddingOnly) {
-    this.adapter = createAdapter();
-    this.configuration = configuration;
+    super(configuration);
     this.failWithoutExtraArgs = failWithoutExtraArgs;
     this.failWithPadding = failWithPaddingOnly;
   }
 
-  @Rule
-  public ExpectedException thrown = ExpectedException.none();
-
-  @Override
-  protected Configuration getConf() {
-    return conf;
-  }
-
-  @Override
-  protected boolean useHsqldbTestServer() {
-    return false;
-  }
-
-  @Override
-  protected String getConnectString() {
-    return adapter.getConnectionString();
-  }
-
-  @Override
-  protected SqoopOptions getSqoopOptions(Configuration conf) {
-    SqoopOptions opts = new SqoopOptions(conf);
-    adapter.injectConnectionParameters(opts);
-    return opts;
-  }
-
-  @Override
-  protected void dropTableIfExists(String table) throws SQLException {
-    adapter.dropTableIfExists(table, getManager());
-  }
-
   @Before
   public void setUp() {
     super.setUp();
-    String[] names = configuration.getNames();
-    String[] types = configuration.getTypes();
-    createTableWithColTypesAndNames(names, types, new String[0]);
-    List<String[]> inputData = configuration.getSampleData();
-    for (String[] input : inputData) {
-      insertIntoTable(names, types, input);
-    }
     tableDirPath = new Path(getWarehouseDir() + "/" + getTableName());
   }
 
-  @After
-  public void tearDown() {
-    try {
-      dropTableIfExists(getTableName());
-    } catch (SQLException e) {
-      LOG.warn("Error trying to drop table on tearDown: " + e);
-    }
-    super.tearDown();
-  }
+  public Path tableDirPath;
 
-  private ArgumentArrayBuilder getArgsBuilder(SqoopOptions.FileLayout fileLayout) {
-    ArgumentArrayBuilder builder = new ArgumentArrayBuilder();
-    if (AvroDataFile.equals(fileLayout)) {
-      builder.withOption("as-avrodatafile");
-    }
-    else if (ParquetFile.equals(fileLayout)) {
-      builder.withOption("as-parquetfile");
-    }
+  @Rule
+  public ExpectedException thrown = ExpectedException.none();
+
+  abstract public ArgumentArrayBuilder getArgsBuilder();
+  abstract public void verify();
 
+  public ArgumentArrayBuilder includeCommonOptions(ArgumentArrayBuilder builder) {
     return builder.withCommonHadoopFlags(true)
         .withOption("warehouse-dir", getWarehouseDir())
         .withOption("num-mappers", "1")
         .withOption("table", getTableName())
         .withOption("connect", getConnectString());
   }
 
-  /**
-   * Adds properties to the given arg builder for decimal precision and scale.
-   * @param builder
-   */
-  private void addPrecisionAndScale(ArgumentArrayBuilder builder) {
-

[jira] [Commented] (SQOOP-3396) Add parquet numeric support for Parquet in Hive import

2018-12-04 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16708813#comment-16708813
 ] 

ASF GitHub Bot commented on SQOOP-3396:
---

Github user fszabo2 commented on a diff in the pull request:

https://github.com/apache/sqoop/pull/60#discussion_r238694723
  
--- Diff: src/test/org/apache/sqoop/importjob/numerictypes/NumericTypesParquetImportTestBase.java ---
@@ -0,0 +1,89 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.sqoop.importjob.numerictypes;
+
+import org.apache.commons.logging.Log;
+import org.apache.commons.logging.LogFactory;
+import org.apache.hadoop.fs.Path;
+import org.apache.parquet.schema.MessageType;
+import org.apache.parquet.schema.OriginalType;
+import org.apache.sqoop.importjob.configuration.ParquetTestConfiguration;
+import org.apache.sqoop.testutil.ArgumentArrayBuilder;
+import org.apache.sqoop.testutil.NumericTypesTestUtils;
+import org.apache.sqoop.util.ParquetReader;
+import org.junit.Before;
+
+import java.util.Arrays;
+
+import static org.junit.Assert.assertEquals;
+
+public abstract class NumericTypesParquetImportTestBase<T> extends NumericTypesImportTestBase<T> {
+
+  public static final Log LOG = LogFactory.getLog(NumericTypesParquetImportTestBase.class.getName());
+
+  public NumericTypesParquetImportTestBase(T configuration, boolean failWithoutExtraArgs, boolean failWithPaddingOnly) {
+    super(configuration, failWithoutExtraArgs, failWithPaddingOnly);
+  }
+
+  @Before
+  public void setUp() {
+    super.setUp();
+    tableDirPath = new Path(getWarehouseDir() + "/" + getTableName());
--- End diff --

nope.


> Add parquet numeric support for Parquet in Hive import
> --
>
> Key: SQOOP-3396
> URL: https://issues.apache.org/jira/browse/SQOOP-3396
> Project: Sqoop
>  Issue Type: Sub-task
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3396) Add parquet numeric support for Parquet in Hive import

2018-12-04 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16708811#comment-16708811
 ] 

ASF GitHub Bot commented on SQOOP-3396:
---

Github user fszabo2 commented on a diff in the pull request:

https://github.com/apache/sqoop/pull/60#discussion_r238694488
  
--- Diff: src/test/org/apache/sqoop/importjob/numerictypes/NumericTypesAvroImportTestBase.java ---
@@ -0,0 +1,59 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.sqoop.importjob.numerictypes;
+
+import org.apache.commons.logging.Log;
+import org.apache.commons.logging.LogFactory;
+import org.apache.hadoop.fs.Path;
+import org.apache.sqoop.importjob.configuration.AvroTestConfiguration;
+import org.apache.sqoop.testutil.ArgumentArrayBuilder;
+import org.apache.sqoop.testutil.AvroTestUtils;
+import org.apache.sqoop.testutil.NumericTypesTestUtils;
+import org.junit.Before;
+
+public abstract class NumericTypesAvroImportTestBase<T> extends NumericTypesImportTestBase<T> {
+
+  public static final Log LOG = LogFactory.getLog(NumericTypesAvroImportTestBase.class.getName());
+
+  public NumericTypesAvroImportTestBase(T configuration, boolean failWithoutExtraArgs, boolean failWithPaddingOnly) {
+    super(configuration, failWithoutExtraArgs, failWithPaddingOnly);
+  }
+
+  @Before
+  public void setUp() {
+    super.setUp();
+    tableDirPath = new Path(getWarehouseDir() + "/" + getTableName());
--- End diff --

Nope.


> Add parquet numeric support for Parquet in Hive import
> --
>
> Key: SQOOP-3396
> URL: https://issues.apache.org/jira/browse/SQOOP-3396
> Project: Sqoop
>  Issue Type: Sub-task
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3396) Add parquet numeric support for Parquet in Hive import

2018-12-04 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16708940#comment-16708940
 ] 

ASF GitHub Bot commented on SQOOP-3396:
---

Github user szvasas commented on a diff in the pull request:

https://github.com/apache/sqoop/pull/60#discussion_r238731504
  
--- Diff: src/java/org/apache/sqoop/hive/HiveTypes.java ---
@@ -83,27 +89,58 @@ public static String toHiveType(int sqlType) {
     }
   }
 
-  public static String toHiveType(Schema.Type avroType) {
-    switch (avroType) {
-      case BOOLEAN:
-        return HIVE_TYPE_BOOLEAN;
-      case INT:
-        return HIVE_TYPE_INT;
-      case LONG:
-        return HIVE_TYPE_BIGINT;
-      case FLOAT:
-        return HIVE_TYPE_FLOAT;
-      case DOUBLE:
-        return HIVE_TYPE_DOUBLE;
-      case STRING:
-      case ENUM:
-        return HIVE_TYPE_STRING;
-      case BYTES:
-      case FIXED:
-        return HIVE_TYPE_BINARY;
-      default:
-        return null;
+  public static String toHiveType(Schema schema, SqoopOptions options) {
+    if (schema.getType() == Schema.Type.UNION) {
+      for (Schema subSchema : schema.getTypes()) {
+        if (subSchema.getType() != Schema.Type.NULL) {
+          return toHiveType(subSchema, options);
+        }
+      }
+    }
+
+    Schema.Type avroType = schema.getType();
+    switch (avroType) {
+      case BOOLEAN:
+        return HIVE_TYPE_BOOLEAN;
+      case INT:
+        return HIVE_TYPE_INT;
+      case LONG:
+        return HIVE_TYPE_BIGINT;
+      case FLOAT:
+        return HIVE_TYPE_FLOAT;
+      case DOUBLE:
+        return HIVE_TYPE_DOUBLE;
+      case STRING:
+      case ENUM:
+        return HIVE_TYPE_STRING;
+      case BYTES:
+        return mapToDecimalOrBinary(schema, options);
+      case FIXED:
+        return HIVE_TYPE_BINARY;
+      default:
+        throw new RuntimeException(String.format("There is no Hive type mapping defined for the Avro type of: %s ", avroType.getName()));
+    }
+  }
+
+  private static String mapToDecimalOrBinary(Schema schema, SqoopOptions options) {
+    boolean logicalTypesEnabled = options.getConf().getBoolean(ConfigurationConstants.PROP_ENABLE_PARQUET_LOGICAL_TYPE_DECIMAL, false);
+    if (logicalTypesEnabled && schema.getLogicalType() != null && schema.getLogicalType() instanceof Decimal) {
+      Decimal decimal = (Decimal) schema.getLogicalType();
+
+      // trimming precision and scale to Hive's maximum values.
+      int precision = Math.min(HiveDecimal.MAX_PRECISION, decimal.getPrecision());
+      if (precision < decimal.getPrecision()) {
+        LOG.warn("Warning! Precision in the Hive table definition will be smaller than the actual precision of the column on storage! Hive may not be able to read data from this column.");
--- End diff --

Sorry, I meant that apart from the warning messages here we should mention 
it in the documentation too.
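
For context, the trimming that both the warning and the proposed documentation note concern clamps the source precision and scale to Hive's limits (HiveDecimal.MAX_PRECISION and MAX_SCALE are both 38). A condensed sketch of the logic in the hunk above; the method name and DDL string format here are illustrative:

    import org.apache.avro.LogicalTypes;
    import org.apache.avro.LogicalTypes.Decimal;
    import org.apache.hadoop.hive.common.type.HiveDecimal;

    public class DecimalTrimSketch {
      // Clamp an Avro decimal's precision and scale to what Hive can represent.
      // When the source column exceeds these limits, the Hive table definition is
      // narrower than the data on storage, hence the warning (and the doc note).
      static String toHiveDecimalDdl(Decimal decimal) {
        int precision = Math.min(HiveDecimal.MAX_PRECISION, decimal.getPrecision());
        int scale = Math.min(HiveDecimal.MAX_SCALE, decimal.getScale());
        return String.format("DECIMAL(%d,%d)", precision, scale);
      }

      public static void main(String[] args) {
        // A NUMBER(45,10) source column gets trimmed to DECIMAL(38,10):
        System.out.println(toHiveDecimalDdl(LogicalTypes.decimal(45, 10)));
      }
    }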


> Add parquet numeric support for Parquet in Hive import
> --
>
> Key: SQOOP-3396
> URL: https://issues.apache.org/jira/browse/SQOOP-3396
> Project: Sqoop
>  Issue Type: Sub-task
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3396) Add parquet numeric support for Parquet in Hive import

2018-12-04 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16708863#comment-16708863
 ] 

ASF GitHub Bot commented on SQOOP-3396:
---

Github user fszabo2 commented on a diff in the pull request:

https://github.com/apache/sqoop/pull/60#discussion_r238706982
  
--- Diff: src/test/org/apache/sqoop/importjob/numerictypes/NumericTypesImportTestBase.java ---
@@ -65,240 +46,79 @@
  * 2. Decimal padding during avro or parquet import
  * In case of Oracle and Postgres, Sqoop has to pad the values with 0s to avoid errors.
  */
-public abstract class NumericTypesImportTestBase<T> extends ImportJobTestCase implements DatabaseAdapterFactory {
+public abstract class NumericTypesImportTestBase<T> extends ThirdPartyTestBase<T> {
 
   public static final Log LOG = LogFactory.getLog(NumericTypesImportTestBase.class.getName());
 
-  private Configuration conf = new Configuration();
-
-  private final T configuration;
-  private final DatabaseAdapter adapter;
   private final boolean failWithoutExtraArgs;
   private final boolean failWithPadding;
 
-  // Constants for the basic test case, that doesn't use extra arguments
-  // that are required to avoid errors, i.e. padding and default precision and scale.
-  protected final static boolean SUCCEED_WITHOUT_EXTRA_ARGS = false;
-  protected final static boolean FAIL_WITHOUT_EXTRA_ARGS = true;
-
-  // Constants for the test case that has padding specified but not default precision and scale.
-  protected final static boolean SUCCEED_WITH_PADDING_ONLY = false;
-  protected final static boolean FAIL_WITH_PADDING_ONLY = true;
-
-  private Path tableDirPath;
-
   public NumericTypesImportTestBase(T configuration, boolean failWithoutExtraArgs, boolean failWithPaddingOnly) {
-    this.adapter = createAdapter();
-    this.configuration = configuration;
+    super(configuration);
     this.failWithoutExtraArgs = failWithoutExtraArgs;
     this.failWithPadding = failWithPaddingOnly;
   }
 
-  @Rule
-  public ExpectedException thrown = ExpectedException.none();
-
-  @Override
-  protected Configuration getConf() {
-    return conf;
-  }
-
-  @Override
-  protected boolean useHsqldbTestServer() {
-    return false;
-  }
-
-  @Override
-  protected String getConnectString() {
-    return adapter.getConnectionString();
-  }
-
-  @Override
-  protected SqoopOptions getSqoopOptions(Configuration conf) {
-    SqoopOptions opts = new SqoopOptions(conf);
-    adapter.injectConnectionParameters(opts);
-    return opts;
-  }
-
-  @Override
-  protected void dropTableIfExists(String table) throws SQLException {
-    adapter.dropTableIfExists(table, getManager());
-  }
-
   @Before
   public void setUp() {
     super.setUp();
-    String[] names = configuration.getNames();
-    String[] types = configuration.getTypes();
-    createTableWithColTypesAndNames(names, types, new String[0]);
-    List<String[]> inputData = configuration.getSampleData();
-    for (String[] input : inputData) {
-      insertIntoTable(names, types, input);
-    }
     tableDirPath = new Path(getWarehouseDir() + "/" + getTableName());
   }
 
-  @After
-  public void tearDown() {
-    try {
-      dropTableIfExists(getTableName());
-    } catch (SQLException e) {
-      LOG.warn("Error trying to drop table on tearDown: " + e);
-    }
-    super.tearDown();
-  }
+  public Path tableDirPath;
 
-  private ArgumentArrayBuilder getArgsBuilder(SqoopOptions.FileLayout fileLayout) {
-    ArgumentArrayBuilder builder = new ArgumentArrayBuilder();
-    if (AvroDataFile.equals(fileLayout)) {
-      builder.withOption("as-avrodatafile");
-    }
-    else if (ParquetFile.equals(fileLayout)) {
-      builder.withOption("as-parquetfile");
-    }
+  @Rule
+  public ExpectedException thrown = ExpectedException.none();
+
+  abstract public ArgumentArrayBuilder getArgsBuilder();
+  abstract public void verify();
 
+  public ArgumentArrayBuilder includeCommonOptions(ArgumentArrayBuilder builder) {
     return builder.withCommonHadoopFlags(true)
         .withOption("warehouse-dir", getWarehouseDir())
        .withOption("num-mappers", "1")
         .withOption("table", getTableName())
         .withOption("connect", getConnectString());
   }
 
-  /**
-   * Adds properties to the given arg builder for decimal precision and scale.
-   * @param builder
-   */
-  private void addPrecisionAndScale(ArgumentArrayBuilder builder) {
-

[jira] [Commented] (SQOOP-3396) Add parquet numeric support for Parquet in Hive import

2018-12-07 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16712783#comment-16712783
 ] 

ASF GitHub Bot commented on SQOOP-3396:
---

Github user asfgit closed the pull request at:

https://github.com/apache/sqoop/pull/60


> Add parquet numeric support for Parquet in Hive import
> --
>
> Key: SQOOP-3396
> URL: https://issues.apache.org/jira/browse/SQOOP-3396
> Project: Sqoop
>  Issue Type: Sub-task
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3415) Fix gradle test+build when clean applied as the first command + warning issue fixes

2018-12-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16706974#comment-16706974
 ] 

ASF GitHub Bot commented on SQOOP-3415:
---

Github user asfgit closed the pull request at:

https://github.com/apache/sqoop/pull/62


> Fix gradle test+build when clean applied as the first command + warning issue 
> fixes
> ---
>
> Key: SQOOP-3415
> URL: https://issues.apache.org/jira/browse/SQOOP-3415
> Project: Sqoop
>  Issue Type: Bug
>Affects Versions: 1.5.0
>Reporter: Attila Szabo
>Assignee: Attila Szabo
>Priority: Major
> Fix For: 1.5.0
>
>
> If the user builds with a command like:
> gradlew clean unittest
> the gradle process ends up in an exception and the whole build is left 
> hanging forever. The root cause is the following:
> tasks.withType runs in the configuration phase of the build, where we ensure 
> that the necessary directories exist.
> After that, clean is executed and all of those directories get deleted.
> Proposed fix:
> Apply directory creation as the first step of the test tasks.
> On top of that:
> some options are missing because of the JUnit annotation processors, and 
> Xlint information is currently swallowed. We aim to fix these things as well.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3393) TestNetezzaExternalTableExportMapper hangs

2018-12-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16706985#comment-16706985
 ] 

ASF GitHub Bot commented on SQOOP-3393:
---

Github user asfgit closed the pull request at:

https://github.com/apache/sqoop/pull/63


> TestNetezzaExternalTableExportMapper hangs
> --
>
> Key: SQOOP-3393
> URL: https://issues.apache.org/jira/browse/SQOOP-3393
> Project: Sqoop
>  Issue Type: Bug
>  Components: test
>Affects Versions: 1.5.0, 3.0.0
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
> Fix For: 1.5.0, 3.0.0
>
>
> Introduced in SQOOP-3378, spotted by [~vasas].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3396) Add parquet numeric support for Parquet in Hive import

2018-12-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16707126#comment-16707126
 ] 

ASF GitHub Bot commented on SQOOP-3396:
---

Github user fszabo2 commented on a diff in the pull request:

https://github.com/apache/sqoop/pull/60#discussion_r238251323
  
--- Diff: src/test/org/apache/sqoop/importjob/configuration/MysqlImportJobTestConfiguration.java ---
@@ -65,4 +66,21 @@
   public String toString() {
 return getClass().getSimpleName();
   }
+
+  @Override
+  public Object[] getExpectedResultsForHive() {
--- End diff --

Fair point!

I added a couple of comments; hope they clarify the magic numbers.


> Add parquet numeric support for Parquet in Hive import
> --
>
> Key: SQOOP-3396
> URL: https://issues.apache.org/jira/browse/SQOOP-3396
> Project: Sqoop
>  Issue Type: Sub-task
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3396) Add parquet numeric support for Parquet in Hive import

2018-12-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16707129#comment-16707129
 ] 

ASF GitHub Bot commented on SQOOP-3396:
---

Github user fszabo2 commented on a diff in the pull request:

https://github.com/apache/sqoop/pull/60#discussion_r238251774
  
--- Diff: src/java/org/apache/sqoop/hive/HiveTypes.java ---
@@ -37,16 +42,28 @@
   private static final String HIVE_TYPE_STRING = "STRING";
   private static final String HIVE_TYPE_BOOLEAN = "BOOLEAN";
   private static final String HIVE_TYPE_BINARY = "BINARY";
+  private static final String HIVE_TYPE_DECIMAL = "DECIMAL";
 
   public static final Log LOG = LogFactory.getLog(HiveTypes.class.getName());
 
   private HiveTypes() { }
 
+
+  public static String toHiveType(int sqlType, SqoopOptions options) {
+
+    if (options.getConf().getBoolean(ConfigurationConstants.PROP_ENABLE_PARQUET_LOGICAL_TYPE_DECIMAL, false)
+        && (sqlType == Types.NUMERIC || sqlType == Types.DECIMAL)) {
+      return HIVE_TYPE_DECIMAL;
+    }
+    return toHiveType(sqlType);
+  }
+
+
   /**
    * Given JDBC SQL types coming from another database, what is the best
    * mapping to a Hive-specific type?
    */
-  public static String toHiveType(int sqlType) {
+  private static String toHiveType(int sqlType) {
--- End diff --

After taking another look at this file, it turns out that this method is only called during textfile import. Since my only intended use case is parquetfile, I reverted this part of the change.

The implementation will throw a RuntimeException (for mapping errors) and log warnings in the case of parquet files, as you've suggested.
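
Schematically, the contract described above splits by file layout: the textfile path keeps the old lenient behavior (unmapped types yield null), while the parquet path fails fast. A sketch, with method names and shapes illustrative rather than the exact patch:

    import org.apache.avro.Schema;

    public class MappingContractSketch {
      // Parquet path: an unmapped Avro type is a hard error.
      static String toHiveTypeForParquet(Schema.Type avroType) {
        switch (avroType) {
          case BOOLEAN: return "BOOLEAN";
          case LONG:    return "BIGINT";
          // ...remaining mappings elided...
          default:
            throw new RuntimeException(
                "There is no Hive type mapping defined for the Avro type of: " + avroType.getName());
        }
      }

      // Textfile path: unmapped JDBC types still map to null, as before the patch.
      static String toHiveTypeForTextfile(int sqlType) {
        return sqlType == java.sql.Types.INTEGER ? "INT" : null; // sketch only
      }
    }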


> Add parquet numeric support for Parquet in Hive import
> --
>
> Key: SQOOP-3396
> URL: https://issues.apache.org/jira/browse/SQOOP-3396
> Project: Sqoop
>  Issue Type: Sub-task
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3416) Give the default value of /var/lib/sqoop to the sqoopThirdPartyLib variable in the gradle build

2018-12-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16707152#comment-16707152
 ] 

ASF GitHub Bot commented on SQOOP-3416:
---

GitHub user fszabo2 opened a pull request:

https://github.com/apache/sqoop/pull/64

SQOOP-3416 Give the default value of /var/lib/sqoop to the sqoopThirdPartyLib variable in the gradle build



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/fszabo2/sqoop SQOOP-3416

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/sqoop/pull/64.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #64


commit f439cb31674f662b6f51172cc329b64e95aea13b
Author: Fero Szabo 
Date:   2018-11-30T15:46:28Z

adding default value to sqoopThirdPartyLib




> Give the default value of /var/lib/sqoop to the sqoopThirdPartyLib variable 
> in the gradle build
> ---
>
> Key: SQOOP-3416
> URL: https://issues.apache.org/jira/browse/SQOOP-3416
> Project: Sqoop
>  Issue Type: Improvement
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Minor
>
> Since sqoopThirdPartyLib doesn't have a default value, if one runs the 
> Oracle tests, one always has to specify the sqoop.thirdparty.lib.dir system 
> property.
> With this change, we just have to move the downloaded Oracle driver to 
> /var/lib/sqoop and avoid some typing.
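
A minimal sketch of the defaulting behavior from the consumer's point of view (the property name is from the issue text; the Java lookup shown here is illustrative, since the actual change lives in the gradle build script):

    public class ThirdPartyLibDefault {
      public static void main(String[] args) {
        // With the change, /var/lib/sqoop is used whenever the system property
        // is not supplied, so driver jars dropped there are found without extra flags.
        String thirdPartyLibDir =
            System.getProperty("sqoop.thirdparty.lib.dir", "/var/lib/sqoop");
        System.out.println("Loading third-party JDBC drivers from " + thirdPartyLibDir);
      }
    }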



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3396) Add parquet numeric support for Parquet in Hive import

2018-12-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16707472#comment-16707472
 ] 

ASF GitHub Bot commented on SQOOP-3396:
---

Github user szvasas commented on a diff in the pull request:

https://github.com/apache/sqoop/pull/60#discussion_r238326484
  
--- Diff: src/java/org/apache/sqoop/hive/HiveTypes.java ---
@@ -83,27 +89,58 @@ public static String toHiveType(int sqlType) {
     }
   }
 
-  public static String toHiveType(Schema.Type avroType) {
-    switch (avroType) {
-      case BOOLEAN:
-        return HIVE_TYPE_BOOLEAN;
-      case INT:
-        return HIVE_TYPE_INT;
-      case LONG:
-        return HIVE_TYPE_BIGINT;
-      case FLOAT:
-        return HIVE_TYPE_FLOAT;
-      case DOUBLE:
-        return HIVE_TYPE_DOUBLE;
-      case STRING:
-      case ENUM:
-        return HIVE_TYPE_STRING;
-      case BYTES:
-      case FIXED:
-        return HIVE_TYPE_BINARY;
-      default:
-        return null;
+  public static String toHiveType(Schema schema, SqoopOptions options) {
+    if (schema.getType() == Schema.Type.UNION) {
+      for (Schema subSchema : schema.getTypes()) {
+        if (subSchema.getType() != Schema.Type.NULL) {
+          return toHiveType(subSchema, options);
+        }
+      }
+    }
+
+    Schema.Type avroType = schema.getType();
+    switch (avroType) {
+      case BOOLEAN:
+        return HIVE_TYPE_BOOLEAN;
+      case INT:
+        return HIVE_TYPE_INT;
+      case LONG:
+        return HIVE_TYPE_BIGINT;
+      case FLOAT:
+        return HIVE_TYPE_FLOAT;
+      case DOUBLE:
+        return HIVE_TYPE_DOUBLE;
+      case STRING:
+      case ENUM:
+        return HIVE_TYPE_STRING;
+      case BYTES:
+        return mapToDecimalOrBinary(schema, options);
+      case FIXED:
+        return HIVE_TYPE_BINARY;
+      default:
+        throw new RuntimeException(String.format("There is no Hive type mapping defined for the Avro type of: %s ", avroType.getName()));
+    }
+  }
+
+  private static String mapToDecimalOrBinary(Schema schema, SqoopOptions options) {
+    boolean logicalTypesEnabled = options.getConf().getBoolean(ConfigurationConstants.PROP_ENABLE_PARQUET_LOGICAL_TYPE_DECIMAL, false);
+    if (logicalTypesEnabled && schema.getLogicalType() != null && schema.getLogicalType() instanceof Decimal) {
+      Decimal decimal = (Decimal) schema.getLogicalType();
+
+      // trimming precision and scale to Hive's maximum values.
+      int precision = Math.min(HiveDecimal.MAX_PRECISION, decimal.getPrecision());
+      if (precision < decimal.getPrecision()) {
+        LOG.warn("Warning! Precision in the Hive table definition will be smaller than the actual precision of the column on storage! Hive may not be able to read data from this column.");
--- End diff --

I think we mention this potential problem in the documentation somewhere.


> Add parquet numeric support for Parquet in Hive import
> --
>
> Key: SQOOP-3396
> URL: https://issues.apache.org/jira/browse/SQOOP-3396
> Project: Sqoop
>  Issue Type: Sub-task
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3396) Add parquet numeric support for Parquet in Hive import

2018-12-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16707471#comment-16707471
 ] 

ASF GitHub Bot commented on SQOOP-3396:
---

Github user szvasas commented on a diff in the pull request:

https://github.com/apache/sqoop/pull/60#discussion_r237907664
  
--- Diff: src/java/org/apache/sqoop/hive/HiveTypes.java ---
@@ -79,8 +85,42 @@ public static String toHiveType(int sqlType) {
      default:
        // TODO(aaron): Support BINARY, VARBINARY, LONGVARBINARY, DISTINCT,
        // BLOB, ARRAY, STRUCT, REF, JAVA_OBJECT.
-       return null;
+       return null;
+    }
+  }
+
+  private static String mapDecimalsToHiveType(int sqlType, SqoopOptions options) {
+    if (options.getConf().getBoolean(ConfigurationConstants.PROP_ENABLE_PARQUET_LOGICAL_TYPE_DECIMAL, false)
+        && (sqlType == Types.NUMERIC || sqlType == Types.DECIMAL)) {
--- End diff --

Do we need to test for NUMERIC and DECIMAL types again here?


> Add parquet numeric support for Parquet in Hive import
> --
>
> Key: SQOOP-3396
> URL: https://issues.apache.org/jira/browse/SQOOP-3396
> Project: Sqoop
>  Issue Type: Sub-task
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3396) Add parquet numeric support for Parquet in Hive import

2018-12-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16707478#comment-16707478
 ] 

ASF GitHub Bot commented on SQOOP-3396:
---

Github user szvasas commented on a diff in the pull request:

https://github.com/apache/sqoop/pull/60#discussion_r238317375
  
--- Diff: src/test/org/apache/sqoop/hive/TestHiveTypesForAvroTypeMapping.java ---
@@ -38,29 +40,30 @@
 public class TestHiveTypesForAvroTypeMapping {
--- End diff --

Can we add more parameters to this test to cover the changes in 
org.apache.sqoop.hive.HiveTypes#mapToDecimalOrBinary?


> Add parquet numeric support for Parquet in Hive import
> --
>
> Key: SQOOP-3396
> URL: https://issues.apache.org/jira/browse/SQOOP-3396
> Project: Sqoop
>  Issue Type: Sub-task
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

