[jira] [Commented] (SQOOP-3014) Sqoop with HCatalog import loose precision for large numbers that does not fit into double

2017-09-12 Thread Boglarka Egyed (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-3014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16163097#comment-16163097
 ] 

Boglarka Egyed commented on SQOOP-3014:
---

Thank you [~Swank] for this fix! Please feel free to close the related Review 
Request.

> Sqoop with HCatalog import loose precision for large numbers that does not 
> fit into double
> --
>
> Key: SQOOP-3014
> URL: https://issues.apache.org/jira/browse/SQOOP-3014
> Project: Sqoop
>  Issue Type: Bug
>  Components: hive-integration
>Affects Versions: 1.4.6
>Reporter: Pavel Benes
>Assignee: Zoltán Tóth
>Priority: Critical
> Fix For: 1.5.0
>
> Attachments: oracle-sqoop-error.png, SQOOP-3014.patch
>
>
> When using Sqoop with HCatalog to import data over JDBC (I have tried 
> Oracle 11), all numbers that do not fit into a double lose precision 
> or are distorted.
> Steps to reproduce:
> 1) Create test table in Oracle and fill it with test data
> {code}
> CREATE TABLE TEST_SQOOP_ERROR(ID VARCHAR(10), TYPE_NUMBER DECIMAL(22,5))
>  
> INSERT INTO MMDINGEST.TEST_SQOOP_ERROR(ID, TYPE_NUMBER) VALUES ( 'row-1', 
> 454018528782.42006329)
> INSERT INTO MMDINGEST.TEST_SQOOP_ERROR(ID, TYPE_NUMBER) VALUES ( 'row-2', 
> 87658675864540185.123456789123456789) 
> INSERT INTO MMDINGEST.TEST_SQOOP_ERROR(ID, TYPE_NUMBER) VALUES ( 'row-3', 
> 87658675864540185.12345)
> INSERT INTO MMDINGEST.TEST_SQOOP_ERROR(ID, TYPE_NUMBER) VALUES ( 'row-4', 
> 87658675864540185.123)
> INSERT INTO MMDINGEST.TEST_SQOOP_ERROR(ID, TYPE_NUMBER) VALUES ( 'row-5', 
> 7658675864540185.12345) 
> INSERT INTO MMDINGEST.TEST_SQOOP_ERROR(ID, TYPE_NUMBER) VALUES ( 'row-6', 
> 7658675864540185.123456789) 
> INSERT INTO MMDINGEST.TEST_SQOOP_ERROR(ID, TYPE_NUMBER) VALUES ( 'row-7', 
> 658675864540185.12345)
> INSERT INTO MMDINGEST.TEST_SQOOP_ERROR(ID, TYPE_NUMBER) VALUES ( 'row-8', 
> 58675864540185.12345)
> INSERT INTO MMDINGEST.TEST_SQOOP_ERROR(ID, TYPE_NUMBER) VALUES ( 'row-9', 
> 8675864540185.12345) 
> INSERT INTO MMDINGEST.TEST_SQOOP_ERROR(ID, TYPE_NUMBER) VALUES ( 'row-10', 
> 675864540185.12345) 
> INSERT INTO MMDINGEST.TEST_SQOOP_ERROR(ID, TYPE_NUMBER) VALUES ( 'row-11', 
> 75864540185.12345) 
> INSERT INTO MMDINGEST.TEST_SQOOP_ERROR(ID, TYPE_NUMBER) VALUES ( 'row-12', 
> 35864540185.12345) 
> INSERT INTO MMDINGEST.TEST_SQOOP_ERROR(ID, TYPE_NUMBER) VALUES ( 'row-13', 
> 5864540185.12345) 
> {code}
> 2) Create table in Hive database
> {code}
> CREATE TABLE pbe_test_sqoop_error(id string, type_number decimal(22,5)) 
> STORED AS ORC;
> {code}
> 3) Import data from Oracle to Hive using sqoop
> {code}
> export HADOOP_CLASSPATH=/opt/mantis/jdbc/oracle-11.2/ojdbc6.jar
> sqoop import --connect jdbc:oracle:thin:@//1.1.1.1:1521/XE --username XXX 
> --password XXX --hcatalog-database default --hcatalog-table 
> pbe_test_sqoop_error --driver oracle.jdbc.OracleDriver --query 'SELECT id, 
> type_number FROM MMDINGEST.TEST_SQOOP_ERROR WHERE $CONDITIONS' -m 1
> {code}
> 4) Display data from Hive table
> {code}
> hive> select * from pbe_test_sqoop_error;
> OK
> row-1   454018528782.42004
> row-2   87658675864540192
> row-3   87658675864540192
> row-4   87658675864540192
> row-5   7658675864540185
> row-6   7658675864540185
> row-7   658675864540185.125
> row-8   58675864540185.125
> row-9   8675864540185.12305
> row-10  675864540185.12341
> row-11  75864540185.12344
> row-12  35864540185.12345
> row-13  5864540185.12345
> Time taken: 0.455 seconds, Fetched: 13 row(s)
> {code}
> Only the values in rows 1, 12, and 13 are correct. In rows 2-4 even the part 
> of the number before the decimal point is wrong. Everything looks correct in 
> Oracle, as can be seen in the attached screenshot.
> The problem seems to be in the Java class 
> https://www.codatlas.com/github.com/apache/sqoop/branch-1.4.6/src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatImportHelper.java
>  starting at line 437.
> {code}
> } else if (hfsType == HCatFieldSchema.Type.DECIMAL) {
>   BigDecimal bd = new BigDecimal(n.doubleValue(),
> MathContext.DECIMAL128);
>   return HiveDecimal.create(bd);
> }
> {code}
> All numbers, even those already stored as BigDecimal, are squeezed through a 
> double, which leads to the precision loss. The same issue may exist in other 
> places that work with large numbers.
> The following code fixes this issue:
> {code}
> } else if (hfsType == HCatFieldSchema.Type.DECIMAL) {
>   BigDecimal bd = val instanceof BigDecimal ? (BigDecimal) val
>     : new BigDecimal(n.doubleValue(), MathContext.DECIMAL128);
>   return HiveDecimal.create(bd);
> }
> {code}
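The root cause can be demonstrated in isolation. The sketch below is plain Java, independent of Sqoop; the class and variable names are illustrative. It shows how routing an exact BigDecimal through a double, as the original code path did, corrupts values whose digits exceed what a double can carry (integers above 2^53 are no longer all exactly representable):

```java
import java.math.BigDecimal;
import java.math.MathContext;

public class DecimalRoundTripDemo {
    public static void main(String[] args) {
        // Integers above 2^53 = 9007199254740992 are not all representable
        // as doubles, so adding 1 can be silently lost:
        long limit = 1L << 53;
        System.out.println((double) (limit + 1) == (double) limit); // true

        // row-3 from the reproduction above, as the JDBC driver delivers it:
        BigDecimal exact = new BigDecimal("87658675864540185.12345");

        // Old code path: squeeze the value through double first.
        // The nearest representable double is 87658675864540192,
        // which matches the corrupted Hive output in step 4:
        BigDecimal viaDouble =
            new BigDecimal(exact.doubleValue(), MathContext.DECIMAL128);
        System.out.println(viaDouble);

        // Fixed code path: keep the BigDecimal as-is, nothing is lost:
        System.out.println(exact);
    }
}
```

This also explains why only the smaller values (rows 12 and 13) survived intact: their digit count still fits within a double's roughly 15-17 significant decimal digits.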



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (SQOOP-3014) Sqoop with HCatalog import loose precision for large numbers that does not fit into double

2017-09-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-3014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16163027#comment-16163027
 ] 

Hudson commented on SQOOP-3014:
---

SUCCESS: Integrated in Jenkins build Sqoop-hadoop200 #1129 (See 
[https://builds.apache.org/job/Sqoop-hadoop200/1129/])
SQOOP-3014: Sqoop with HCatalog import loose precision for large numbers (bogi: 
[https://git-wip-us.apache.org/repos/asf?p=sqoop.git;a=commit;h=f3783284217a75b5313e91b73bd183df4cdddff4])
* (edit) src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatImportHelper.java
* (edit) src/test/org/apache/sqoop/hcat/HCatalogImportTest.java
* (add) src/test/org/apache/sqoop/mapreduce/hcat/TestSqoopHCatImportHelper.java



[jira] [Commented] (SQOOP-3014) Sqoop with HCatalog import loose precision for large numbers that does not fit into double

2017-09-12 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-3014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16162998#comment-16162998
 ] 

ASF subversion and git services commented on SQOOP-3014:


Commit f3783284217a75b5313e91b73bd183df4cdddff4 in sqoop's branch 
refs/heads/trunk from [~BoglarkaEgyed]
[ https://git-wip-us.apache.org/repos/asf?p=sqoop.git;h=f378328 ]

SQOOP-3014: Sqoop with HCatalog import loose precision for large numbers that 
does not fit into double

(Zoltan Toth via Boglarka Egyed)



Re: Review Request 62057: SQOOP-3014 Sqoop with HCatalog import loose precision for large numbers that does not fit into double

2017-09-12 Thread Boglarka Egyed

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62057/#review185180
---


Ship it!




Ran unit and 3rd party tests successfully for the updated patch too.

- Boglarka Egyed


On Sept. 12, 2017, 12:19 p.m., Zoltán Tóth wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/62057/
> ---
> 
> (Updated Sept. 12, 2017, 12:19 p.m.)
> 
> 
> Review request for Sqoop, Boglarka Egyed and Anna Szonyi.
> 
> 
> Bugs: SQOOP-3014
> https://issues.apache.org/jira/browse/SQOOP-3014
> 
> 
> Repository: sqoop-trunk
> 
> 
> Description
> ---
> 
> HCatalog rounded BigDecimal values, but that should not happen. Sqoop's 
> HCatalog import no longer changes BigDecimals.
> 
> 
> Diffs
> -
> 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatImportHelper.java 
> aba2458e 
>   src/test/org/apache/sqoop/hcat/HCatalogImportTest.java d784a205 
>   src/test/org/apache/sqoop/mapreduce/hcat/TestSqoopHCatImportHelper.java 
> PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/62057/diff/3/
> 
> 
> Testing
> ---
> 
> I ran unit tests and integration tests as well. New test cases were added to 
> test the change.
> 
> 
> Thanks,
> 
> Zoltán Tóth
> 
>



[jira] [Updated] (SQOOP-3014) Sqoop with HCatalog import loose precision for large numbers that does not fit into double

2017-09-12 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/SQOOP-3014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltán Tóth updated SQOOP-3014:
---
Attachment: SQOOP-3014.patch



[jira] [Updated] (SQOOP-3014) Sqoop with HCatalog import loose precision for large numbers that does not fit into double

2017-09-12 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/SQOOP-3014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltán Tóth updated SQOOP-3014:
---
Attachment: (was: SQOOP-3014)



[jira] [Updated] (SQOOP-3014) Sqoop with HCatalog import loose precision for large numbers that does not fit into double

2017-09-12 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/SQOOP-3014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltán Tóth updated SQOOP-3014:
---
Attachment: SQOOP-3014



Re: Review Request 62057: SQOOP-3014 Sqoop with HCatalog import loose precision for large numbers that does not fit into double

2017-09-12 Thread Szabolcs Vasas

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62057/#review185166
---


Ship it!




Ship It!

- Szabolcs Vasas





Re: Review Request 62057: SQOOP-3014 Sqoop with HCatalog import loose precision for large numbers that does not fit into double

2017-09-12 Thread Zoltán Tóth

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62057/
---

(Updated Sept. 12, 2017, 12:19 p.m.)


Review request for Sqoop, Boglarka Egyed and Anna Szonyi.


Changes
---

Updated the change based on review feedback.


Bugs: SQOOP-3014
https://issues.apache.org/jira/browse/SQOOP-3014


Repository: sqoop-trunk


Description
---

HCatalog rounded BigDecimal values, but that should not happen. Sqoop's 
HCatalog import no longer changes BigDecimals.


Diffs (updated)
-

  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatImportHelper.java aba2458e 
  src/test/org/apache/sqoop/hcat/HCatalogImportTest.java d784a205 
  src/test/org/apache/sqoop/mapreduce/hcat/TestSqoopHCatImportHelper.java 
PRE-CREATION 


Diff: https://reviews.apache.org/r/62057/diff/3/

Changes: https://reviews.apache.org/r/62057/diff/2-3/


Testing
---

I ran both the unit tests and the integration tests. New test cases were
added to cover the change.


Thanks,

Zoltán Tóth



Re: Review Request 61372: Expanded Metastore support for MySql, Oracle, Postgresql, MSSql, and DB2

2017-09-12 Thread Szabolcs Vasas

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61372/#review185159
---


Ship it!




Hi Zach,

Thank you for implementing this patch! I have reviewed it, run the unit and
third-party test suites, and executed some manual testing as well; everything
seems to be fine.

Regards,
Szabolcs

- Szabolcs Vasas


On Aug. 25, 2017, 11:53 a.m., Zach Berkowitz wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/61372/
> ---
> 
> (Updated Aug. 25, 2017, 11:53 a.m.)
> 
> 
> Review request for Sqoop, Boglarka Egyed, Jarek Cecho, and Anna Szonyi.
> 
> 
> Bugs: SQOOP-3216
> https://issues.apache.org/jira/browse/SQOOP-3216
> 
> 
> Repository: sqoop-trunk
> 
> 
> Description
> ---
> 
> Expanded Metastore support for MySql, Oracle, Postgresql, MSSql, and DB2
> 
> 
> Diffs
> ---
> 
>   src/docs/user/metastore-purpose.txt e7eb23d3 
>   src/docs/user/saved-jobs.txt e8757801 
>   src/java/com/cloudera/sqoop/metastore/hsqldb/AutoHsqldbStorage.java 
> 259d9f63 
>   src/java/com/cloudera/sqoop/metastore/hsqldb/HsqldbJobStorage.java 083e2a37 
>   src/java/org/apache/sqoop/SqoopOptions.java 2eb3d8a4 
>   src/java/org/apache/sqoop/manager/CubridManager.java 5a1a0e83 
>   src/java/org/apache/sqoop/manager/Db2Manager.java 61b6868d 
>   src/java/org/apache/sqoop/manager/HsqldbManager.java 9b9c5822 
>   src/java/org/apache/sqoop/manager/JdbcDrivers.java PRE-CREATION 
>   src/java/org/apache/sqoop/manager/MySQLManager.java 3c2276fe 
>   src/java/org/apache/sqoop/manager/NetezzaManager.java 0ac77175 
>   src/java/org/apache/sqoop/manager/OracleManager.java 2f4585cc 
>   src/java/org/apache/sqoop/manager/PostgresqlManager.java 44e041ad 
>   src/java/org/apache/sqoop/manager/SQLServerManager.java 9a3d9183 
>   src/java/org/apache/sqoop/manager/SupportedManagers.java 8a6037af 
>   src/java/org/apache/sqoop/metastore/JobStorageFactory.java 2edc33b8 
>   src/java/org/apache/sqoop/metastore/hsqldb/AutoHsqldbStorage.java 49e30319 
>   src/java/org/apache/sqoop/metastore/hsqldb/HsqldbJobStorage.java a0f29fd0 
>   src/java/org/apache/sqoop/tool/BaseSqoopTool.java 1564bdcb 
>   src/java/org/apache/sqoop/tool/JobTool.java 054e274f 
>   src/test/com/cloudera/sqoop/TestIncrementalImport.java 52a55b78 
>   src/test/com/cloudera/sqoop/metastore/JobToolTestBase.java PRE-CREATION 
>   
> src/test/com/cloudera/sqoop/metastore/MetaConnectIncrementalImportTestBase.java
>  PRE-CREATION 
>   src/test/com/cloudera/sqoop/metastore/SavedJobsTestBase.java PRE-CREATION 
>   src/test/com/cloudera/sqoop/metastore/TestSavedJobs.java 61d8c97d 
>   src/test/com/cloudera/sqoop/metastore/db2/DB2JobToolTest.java PRE-CREATION 
>   
> src/test/com/cloudera/sqoop/metastore/db2/DB2MetaConnectIncrementalImportTest.java
>  PRE-CREATION 
>   src/test/com/cloudera/sqoop/metastore/db2/DB2SavedJobsTest.java 
> PRE-CREATION 
>   
> src/test/com/cloudera/sqoop/metastore/hsqldb/HsqldbMetaConnectIncrementalImportTest.java
>  PRE-CREATION 
>   src/test/com/cloudera/sqoop/metastore/hsqldb/HsqldbSavedJobsTest.java 
> PRE-CREATION 
>   src/test/com/cloudera/sqoop/metastore/mysql/MySqlJobToolTest.java 
> PRE-CREATION 
>   
> src/test/com/cloudera/sqoop/metastore/mysql/MySqlMetaConnectIncrementalImportTest.java
>  PRE-CREATION 
>   src/test/com/cloudera/sqoop/metastore/mysql/MySqlSavedJobsTest.java 
> PRE-CREATION 
>   src/test/com/cloudera/sqoop/metastore/oracle/OracleJobToolTest.java 
> PRE-CREATION 
>   
> src/test/com/cloudera/sqoop/metastore/oracle/OracleMetaConnectIncrementalImportTest.java
>  PRE-CREATION 
>   src/test/com/cloudera/sqoop/metastore/oracle/OracleSavedJobsTest.java 
> PRE-CREATION 
>   src/test/com/cloudera/sqoop/metastore/postgres/PostgresJobToolTest.java 
> PRE-CREATION 
>   
> src/test/com/cloudera/sqoop/metastore/postgres/PostgresMetaConnectIncrementalImportTest.java
>  PRE-CREATION 
>   src/test/com/cloudera/sqoop/metastore/postgres/PostgresSavedJobsTest.java 
> PRE-CREATION 
>   src/test/com/cloudera/sqoop/metastore/sqlserver/SqlServerJobToolTest.java 
> PRE-CREATION 
>   
> src/test/com/cloudera/sqoop/metastore/sqlserver/SqlServerMetaConnectIncrementalImportTest.java
>  PRE-CREATION 
>   src/test/com/cloudera/sqoop/metastore/sqlserver/SqlServerSavedJobsTest.java 
> PRE-CREATION 
>   src/test/findbugsExcludeFile.xml a27ec378 
> 
> 
> Diff: https://reviews.apache.org/r/61372/diff/5/
> 
> 
> Testing
> ---
> 
> Three test classes SavedJobsTest, MetaConnectIncrementalImportTest, and 
> JobToolTest all pass for all supported databases.
> 
> 
> Thanks,
> 
> Zach Berkowitz
> 
>