[jira] [Commented] (SQOOP-3014) Sqoop with HCatalog import loose precision for large numbers that does not fit into double
[ https://issues.apache.org/jira/browse/SQOOP-3014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16163097#comment-16163097 ]

Boglarka Egyed commented on SQOOP-3014:
---------------------------------------

Thank you [~Swank] for this fix! Please feel free to close the related Review Request.

> Sqoop with HCatalog import loose precision for large numbers that does not
> fit into double
> --------------------------------------------------------------------------
>
>                 Key: SQOOP-3014
>                 URL: https://issues.apache.org/jira/browse/SQOOP-3014
>             Project: Sqoop
>          Issue Type: Bug
>          Components: hive-integration
>    Affects Versions: 1.4.6
>            Reporter: Pavel Benes
>            Assignee: Zoltán Tóth
>            Priority: Critical
>             Fix For: 1.5.0
>
>         Attachments: oracle-sqoop-error.png, SQOOP-3014.patch
>
>
> When using Sqoop with HCatalog to import data over JDBC (tested with
> Oracle 11), all numbers that do not fit into a double lose precision or
> are distorted.
>
> Steps to reproduce:
>
> 1) Create a test table in Oracle and fill it with test data:
> {code}
> CREATE TABLE TEST_SQOOP_ERROR(ID VARCHAR(10), TYPE_NUMBER DECIMAL(22,5))
>
> INSERT INTO MMDINGEST.TEST_SQOOP_ERROR(ID, TYPE_NUMBER) VALUES ('row-1', 454018528782.42006329)
> INSERT INTO MMDINGEST.TEST_SQOOP_ERROR(ID, TYPE_NUMBER) VALUES ('row-2', 87658675864540185.123456789123456789)
> INSERT INTO MMDINGEST.TEST_SQOOP_ERROR(ID, TYPE_NUMBER) VALUES ('row-3', 87658675864540185.12345)
> INSERT INTO MMDINGEST.TEST_SQOOP_ERROR(ID, TYPE_NUMBER) VALUES ('row-4', 87658675864540185.123)
> INSERT INTO MMDINGEST.TEST_SQOOP_ERROR(ID, TYPE_NUMBER) VALUES ('row-5', 7658675864540185.12345)
> INSERT INTO MMDINGEST.TEST_SQOOP_ERROR(ID, TYPE_NUMBER) VALUES ('row-6', 7658675864540185.123456789)
> INSERT INTO MMDINGEST.TEST_SQOOP_ERROR(ID, TYPE_NUMBER) VALUES ('row-7', 658675864540185.12345)
> INSERT INTO MMDINGEST.TEST_SQOOP_ERROR(ID, TYPE_NUMBER) VALUES ('row-8', 58675864540185.12345)
> INSERT INTO MMDINGEST.TEST_SQOOP_ERROR(ID, TYPE_NUMBER) VALUES ('row-9', 8675864540185.12345)
> INSERT INTO MMDINGEST.TEST_SQOOP_ERROR(ID, TYPE_NUMBER) VALUES ('row-10', 675864540185.12345)
> INSERT INTO MMDINGEST.TEST_SQOOP_ERROR(ID, TYPE_NUMBER) VALUES ('row-11', 75864540185.12345)
> INSERT INTO MMDINGEST.TEST_SQOOP_ERROR(ID, TYPE_NUMBER) VALUES ('row-12', 35864540185.12345)
> INSERT INTO MMDINGEST.TEST_SQOOP_ERROR(ID, TYPE_NUMBER) VALUES ('row-13', 5864540185.12345)
> {code}
>
> 2) Create a table in the Hive database:
> {code}
> CREATE TABLE pbe_test_sqoop_error(id string, type_number decimal(22,5)) STORED AS ORC;
> {code}
>
> 3) Import the data from Oracle to Hive using Sqoop:
> {code}
> export HADOOP_CLASSPATH=/opt/mantis/jdbc/oracle-11.2/ojdbc6.jar
> sqoop import --connect jdbc:oracle:thin:@//1.1.1.1:1521/XE --username XXX --password XXX --hcatalog-database default --hcatalog-table pbe_test_sqoop_error --driver oracle.jdbc.OracleDriver --query 'SELECT id, type_number FROM MMDINGEST.TEST_SQOOP_ERROR WHERE $CONDITIONS' -m 1
> {code}
>
> 4) Display the data from the Hive table:
> {code}
> hive> select * from pbe_test_sqoop_error;
> OK
> row-1   454018528782.42004
> row-2   87658675864540192
> row-3   87658675864540192
> row-4   87658675864540192
> row-5   7658675864540185
> row-6   7658675864540185
> row-7   658675864540185.125
> row-8   58675864540185.125
> row-9   8675864540185.12305
> row-10  675864540185.12341
> row-11  75864540185.12344
> row-12  35864540185.12345
> row-13  5864540185.12345
> Time taken: 0.455 seconds, Fetched: 13 row(s)
> {code}
>
> Only the values in rows 1, 12, and 13 are correct. In rows 2-4 even the
> digits before the decimal point are wrong. Everything looks correct in
> Oracle, as can be seen in the attached screenshot.
>
> The problem seems to be in the Java class
> https://www.codatlas.com/github.com/apache/sqoop/branch-1.4.6/src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatImportHelper.java
> starting at line 437:
> {code}
> } else if (hfsType == HCatFieldSchema.Type.DECIMAL) {
>   BigDecimal bd = new BigDecimal(n.doubleValue(), MathContext.DECIMAL128);
>   return HiveDecimal.create(bd);
> }
> {code}
> All numbers, even those already held in a BigDecimal, are squeezed
> through a double, which causes the precision loss. The same issue may
> exist in other places that handle large numbers.
>
> The following code fixes the issue:
> {code}
> } else if (hfsType == HCatFieldSchema.Type.DECIMAL) {
>   BigDecimal bd = val instanceof BigDecimal
>       ? (BigDecimal) val
>       : new BigDecimal(n.doubleValue(), MathContext.DECIMAL128);
>   return HiveDecimal.create(bd);
> }
> {code}

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
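The loss described above is reproducible in plain Java: a value such as 87658675864540185.12345 needs more significant digits than a double's 53-bit significand can represent (roughly 15-17 decimal digits), so the doubleValue() round-trip in the original DECIMAL branch cannot give the original number back. A minimal sketch of the two paths, with an illustrative class name (not from the Sqoop code base):

```java
import java.math.BigDecimal;
import java.math.MathContext;

public class DecimalPrecisionDemo {
    public static void main(String[] args) {
        // 'row-3' from the reproduction steps: 22 significant digits.
        BigDecimal original = new BigDecimal("87658675864540185.12345");

        // Lossy path, as in the original DECIMAL branch: the value is
        // squeezed through a double before being widened again.
        BigDecimal viaDouble = new BigDecimal(original.doubleValue(),
                                              MathContext.DECIMAL128);

        // Fixed path: the BigDecimal is passed through untouched.
        BigDecimal preserved = original;

        // The two printed values differ; the double round-trip has
        // already distorted digits before the decimal point.
        System.out.println("via double: " + viaDouble.toPlainString());
        System.out.println("preserved:  " + preserved.toPlainString());
    }
}
```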
[jira] [Commented] (SQOOP-3014) Sqoop with HCatalog import loose precision for large numbers that does not fit into double
[ https://issues.apache.org/jira/browse/SQOOP-3014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16163027#comment-16163027 ]

Hudson commented on SQOOP-3014:
-------------------------------

SUCCESS: Integrated in Jenkins build Sqoop-hadoop200 #1129 (See [https://builds.apache.org/job/Sqoop-hadoop200/1129/])
SQOOP-3014: Sqoop with HCatalog import loose precision for large numbers (bogi: [https://git-wip-us.apache.org/repos/asf?p=sqoop.git=commit=f3783284217a75b5313e91b73bd183df4cdddff4])
* (edit) src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatImportHelper.java
* (edit) src/test/org/apache/sqoop/hcat/HCatalogImportTest.java
* (add) src/test/org/apache/sqoop/mapreduce/hcat/TestSqoopHCatImportHelper.java
[jira] [Commented] (SQOOP-3014) Sqoop with HCatalog import loose precision for large numbers that does not fit into double
[ https://issues.apache.org/jira/browse/SQOOP-3014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16162998#comment-16162998 ]

ASF subversion and git services commented on SQOOP-3014:
--------------------------------------------------------

Commit f3783284217a75b5313e91b73bd183df4cdddff4 in sqoop's branch refs/heads/trunk from [~BoglarkaEgyed]
[ https://git-wip-us.apache.org/repos/asf?p=sqoop.git;h=f378328 ]

SQOOP-3014: Sqoop with HCatalog import loose precision for large numbers that does not fit into double (Zoltan Toth via Boglarka Egyed)
Re: Review Request 62057: SQOOP-3014 Sqoop with HCatalog import loose precision for large numbers that does not fit into double
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62057/#review185180
-----------------------------------------------------------

Ship it!

Ran unit and third-party tests successfully for the updated patch too.

- Boglarka Egyed

On Sept. 12, 2017, 12:19 p.m., Zoltán Tóth wrote:
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/62057/
> -----------------------------------------------------------
>
> (Updated Sept. 12, 2017, 12:19 p.m.)
>
> Review request for Sqoop, Boglarka Egyed and Anna Szonyi.
>
> Bugs: SQOOP-3014
>     https://issues.apache.org/jira/browse/SQOOP-3014
>
> Repository: sqoop-trunk
>
> Description
> -----------
>
> HCatalog rounded BigDecimals, but it should not. Sqoop HCatalog now
> leaves BigDecimals unchanged.
>
> Diffs
> -----
>
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatImportHelper.java aba2458e
>   src/test/org/apache/sqoop/hcat/HCatalogImportTest.java d784a205
>   src/test/org/apache/sqoop/mapreduce/hcat/TestSqoopHCatImportHelper.java PRE-CREATION
>
> Diff: https://reviews.apache.org/r/62057/diff/3/
>
> Testing
> -------
>
> I ran unit tests and integration tests as well. New test cases were
> added to cover the change.
>
> Thanks,
>
> Zoltán Tóth
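The guard being reviewed can be exercised in isolation. In the sketch below the helper name toDecimal is illustrative, and plain BigDecimal is returned instead of Hive's HiveDecimal so the example stays self-contained: the lossy double fallback only runs when the JDBC driver supplied some Number subtype other than BigDecimal.

```java
import java.math.BigDecimal;
import java.math.MathContext;

public class DecimalGuardSketch {
    // Illustrative stand-in for the patched DECIMAL branch: a BigDecimal
    // passes through untouched, and only other Number subtypes fall back
    // to the double-based conversion.
    static BigDecimal toDecimal(Number val) {
        if (val instanceof BigDecimal) {
            return (BigDecimal) val;
        }
        return new BigDecimal(val.doubleValue(), MathContext.DECIMAL128);
    }

    public static void main(String[] args) {
        BigDecimal wide = new BigDecimal("87658675864540185.12345");
        // All 22 digits are preserved: no double round-trip happens.
        System.out.println(toDecimal(wide).toPlainString());
        // Small integers survive the double fallback unchanged.
        System.out.println(toDecimal(42).toPlainString());
    }
}
```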
[jira] [Updated] (SQOOP-3014) Sqoop with HCatalog import loose precision for large numbers that does not fit into double
[ https://issues.apache.org/jira/browse/SQOOP-3014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zoltán Tóth updated SQOOP-3014:
-------------------------------

    Attachment: SQOOP-3014.patch
[jira] [Updated] (SQOOP-3014) Sqoop with HCatalog import loose precision for large numbers that does not fit into double
[ https://issues.apache.org/jira/browse/SQOOP-3014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zoltán Tóth updated SQOOP-3014:
-------------------------------

    Attachment: (was: SQOOP-3014)
[jira] [Updated] (SQOOP-3014) Sqoop with HCatalog import loose precision for large numbers that does not fit into double
[ https://issues.apache.org/jira/browse/SQOOP-3014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zoltán Tóth updated SQOOP-3014:
-------------------------------

    Attachment: SQOOP-3014
Re: Review Request 62057: SQOOP-3014 Sqoop with HCatalog import loose precision for large numbers that does not fit into double
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62057/#review185166
-----------------------------------------------------------

Ship it!

Ship It!

- Szabolcs Vasas
Re: Review Request 62057: SQOOP-3014 Sqoop with HCatalog import loose precision for large numbers that does not fit into double
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62057/
-----------------------------------------------------------

(Updated Sept. 12, 2017, 12:19 p.m.)

Review request for Sqoop, Boglarka Egyed and Anna Szonyi.

Changes
-------

Updated based on review feedback.

Bugs: SQOOP-3014
    https://issues.apache.org/jira/browse/SQOOP-3014

Repository: sqoop-trunk

Description
-----------

HCatalog rounded BigDecimals, but it should not. Sqoop HCatalog now
leaves BigDecimals unchanged.

Diffs (updated)
---------------

  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatImportHelper.java aba2458e
  src/test/org/apache/sqoop/hcat/HCatalogImportTest.java d784a205
  src/test/org/apache/sqoop/mapreduce/hcat/TestSqoopHCatImportHelper.java PRE-CREATION

Diff: https://reviews.apache.org/r/62057/diff/3/
Changes: https://reviews.apache.org/r/62057/diff/2-3/

Testing
-------

I ran unit tests and integration tests as well. New test cases were
added to cover the change.

Thanks,

Zoltán Tóth
Re: Review Request 61372: Expanded Metastore support for MySql, Oracle, Postgresql, MSSql, and DB2
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61372/#review185159
-----------------------------------------------------------

Ship it!

Hi Zach,

Thank you for implementing this patch! I have reviewed it, ran the unit
and third-party test suites, executed some manual testing too, and
everything seems to be fine.

Regards,
Szabolcs

- Szabolcs Vasas

On Aug. 25, 2017, 11:53 a.m., Zach Berkowitz wrote:
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/61372/
> -----------------------------------------------------------
>
> (Updated Aug. 25, 2017, 11:53 a.m.)
>
> Review request for Sqoop, Boglarka Egyed, Jarek Cecho, and Anna Szonyi.
>
> Bugs: SQOOP-3216
>     https://issues.apache.org/jira/browse/SQOOP-3216
>
> Repository: sqoop-trunk
>
> Description
> -----------
>
> Expanded Metastore support for MySql, Oracle, Postgresql, MSSql, and DB2
>
> Diffs
> -----
>
>   src/docs/user/metastore-purpose.txt e7eb23d3
>   src/docs/user/saved-jobs.txt e8757801
>   src/java/com/cloudera/sqoop/metastore/hsqldb/AutoHsqldbStorage.java 259d9f63
>   src/java/com/cloudera/sqoop/metastore/hsqldb/HsqldbJobStorage.java 083e2a37
>   src/java/org/apache/sqoop/SqoopOptions.java 2eb3d8a4
>   src/java/org/apache/sqoop/manager/CubridManager.java 5a1a0e83
>   src/java/org/apache/sqoop/manager/Db2Manager.java 61b6868d
>   src/java/org/apache/sqoop/manager/HsqldbManager.java 9b9c5822
>   src/java/org/apache/sqoop/manager/JdbcDrivers.java PRE-CREATION
>   src/java/org/apache/sqoop/manager/MySQLManager.java 3c2276fe
>   src/java/org/apache/sqoop/manager/NetezzaManager.java 0ac77175
>   src/java/org/apache/sqoop/manager/OracleManager.java 2f4585cc
>   src/java/org/apache/sqoop/manager/PostgresqlManager.java 44e041ad
>   src/java/org/apache/sqoop/manager/SQLServerManager.java 9a3d9183
>   src/java/org/apache/sqoop/manager/SupportedManagers.java 8a6037af
>   src/java/org/apache/sqoop/metastore/JobStorageFactory.java 2edc33b8
>   src/java/org/apache/sqoop/metastore/hsqldb/AutoHsqldbStorage.java 49e30319
>   src/java/org/apache/sqoop/metastore/hsqldb/HsqldbJobStorage.java a0f29fd0
>   src/java/org/apache/sqoop/tool/BaseSqoopTool.java 1564bdcb
>   src/java/org/apache/sqoop/tool/JobTool.java 054e274f
>   src/test/com/cloudera/sqoop/TestIncrementalImport.java 52a55b78
>   src/test/com/cloudera/sqoop/metastore/JobToolTestBase.java PRE-CREATION
>   src/test/com/cloudera/sqoop/metastore/MetaConnectIncrementalImportTestBase.java PRE-CREATION
>   src/test/com/cloudera/sqoop/metastore/SavedJobsTestBase.java PRE-CREATION
>   src/test/com/cloudera/sqoop/metastore/TestSavedJobs.java 61d8c97d
>   src/test/com/cloudera/sqoop/metastore/db2/DB2JobToolTest.java PRE-CREATION
>   src/test/com/cloudera/sqoop/metastore/db2/DB2MetaConnectIncrementalImportTest.java PRE-CREATION
>   src/test/com/cloudera/sqoop/metastore/db2/DB2SavedJobsTest.java PRE-CREATION
>   src/test/com/cloudera/sqoop/metastore/hsqldb/HsqldbMetaConnectIncrementalImportTest.java PRE-CREATION
>   src/test/com/cloudera/sqoop/metastore/hsqldb/HsqldbSavedJobsTest.java PRE-CREATION
>   src/test/com/cloudera/sqoop/metastore/mysql/MySqlJobToolTest.java PRE-CREATION
>   src/test/com/cloudera/sqoop/metastore/mysql/MySqlMetaConnectIncrementalImportTest.java PRE-CREATION
>   src/test/com/cloudera/sqoop/metastore/mysql/MySqlSavedJobsTest.java PRE-CREATION
>   src/test/com/cloudera/sqoop/metastore/oracle/OracleJobToolTest.java PRE-CREATION
>   src/test/com/cloudera/sqoop/metastore/oracle/OracleMetaConnectIncrementalImportTest.java PRE-CREATION
>   src/test/com/cloudera/sqoop/metastore/oracle/OracleSavedJobsTest.java PRE-CREATION
>   src/test/com/cloudera/sqoop/metastore/postgres/PostgresJobToolTest.java PRE-CREATION
>   src/test/com/cloudera/sqoop/metastore/postgres/PostgresMetaConnectIncrementalImportTest.java PRE-CREATION
>   src/test/com/cloudera/sqoop/metastore/postgres/PostgresSavedJobsTest.java PRE-CREATION
>   src/test/com/cloudera/sqoop/metastore/sqlserver/SqlServerJobToolTest.java PRE-CREATION
>   src/test/com/cloudera/sqoop/metastore/sqlserver/SqlServerMetaConnectIncrementalImportTest.java PRE-CREATION
>   src/test/com/cloudera/sqoop/metastore/sqlserver/SqlServerSavedJobsTest.java PRE-CREATION
>   src/test/findbugsExcludeFile.xml a27ec378
>
> Diff: https://reviews.apache.org/r/61372/diff/5/
>
> Testing
> -------
>
> Three test classes SavedJobsTest, MetaConnectIncrementalImportTest, and
> JobToolTest all pass for all supported databases.
>
> Thanks,
>
> Zach Berkowitz