[jira] [Commented] (SQOOP-3451) Importing FLOAT from Oracle to Hive results in INTEGER

2020-08-04 Thread Mahesh Balakrishnan (Jira)


[ 
https://issues.apache.org/jira/browse/SQOOP-3451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17171001#comment-17171001
 ] 

Mahesh Balakrishnan commented on SQOOP-3451:


[~dionusos],  Not sure if this is worked upon or not but wanted to bring to 
your attention that we need to address this for all the datatypes.

> Importing FLOAT from Oracle to Hive results in INTEGER
> --
>
> Key: SQOOP-3451
> URL: https://issues.apache.org/jira/browse/SQOOP-3451
> Project: Sqoop
>  Issue Type: Bug
>  Components: codegen, connectors/oracle, hive-integration
>Affects Versions: 1.4.7
>Reporter: Dénes Bodó
>Assignee: Dénes Bodó
>Priority: Major
>
> We ran into an issue where there is a table created in Oracle 11g:
> {noformat}
> create table floattest (column1 float(30), column2 number(30,-127), column3 
> number(30));
> {noformat}
> We want to import date from Oracle to Hive:
> {noformat}
> sqoop import -D 
> mapred.child.java.opts='-Djava.security.egd=file:/dev/../dev/urandom' 
> -Dmapreduce.job.queuename=default --connect 
> "jdbc:oracle:thin:@DBHOST:1521/xe" --username sqoop --password sqoop --table 
> floattest --hcatalog-database default --hcatalog-table floattest 
> --create-hcatalog-table --hcatalog-external-table --hcatalog-storage-stanza 
> "stored as orc" -m 1 --columns COLUMN1,COLUMN2,COLUMN3 --verbose
> {noformat}
> In Sqoop logs we see the following:
> {noformat}
> 19/09/24 13:51:45 INFO manager.SqlManager: Executing SQL statement: SELECT 
> t.* FROM floattest t WHERE 1=0
> 19/09/24 13:51:45 DEBUG manager.SqlManager: Found column COLUMN1 of type [2, 
> 30, -127]
> 19/09/24 13:51:45 DEBUG manager.SqlManager: Found column COLUMN2 of type [2, 
> 30, -84]
> 19/09/24 13:51:45 DEBUG manager.SqlManager: Found column COLUMN3 of type [2, 
> 30, 0]
> 19/09/24 13:51:45 INFO hcat.SqoopHCatUtilities: Database column names 
> projected : [COLUMN1, COLUMN2, COLUMN3]
> 19/09/24 13:51:45 INFO hcat.SqoopHCatUtilities: Database column name - info 
> map :
> COLUMN3 : [Type : 2,Precision : 30,Scale : 0]
> COLUMN2 : [Type : 2,Precision : 30,Scale : -84]
> COLUMN1 : [Type : 2,Precision : 30,Scale : -127]
> 19/09/24 13:51:45 INFO hcat.SqoopHCatUtilities: Creating HCatalog table 
> default.floattest for import
> 19/09/24 13:51:45 INFO hcat.SqoopHCatUtilities: HCatalog Create table 
> statement:
> create external table `default`.`floattest` (
> `column1` decimal(30),
> `column2` decimal(30),
> `column3` decimal(30))
> stored as orc
> {noformat}
> From this output we can see that Oracle states about column1 has Type=2 which 
> is NUMERIC (regarding to 
> https://docs.oracle.com/javase/7/docs/api/constant-values.html#java.sql.Types.FLOAT).
>  Sqoop translates NUMERIC to DECIMAL 
> (https://github.com/apache/sqoop/blob/trunk/src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java#L1050L1107).
>  Due to Oracle uses {{scale=-127}} to sign about a NUMERIC that it is a FLOAT 
> instead of stating {{Type=6}}, Sqoop creates integers (decimal with 0 scale) 
> from NUMBER.
> I think it is the fault of Oracle as it does not use Java Type=6 to sign type 
> of a float. What do you think?
> 
> Thank you for the details and investigation to [~mbalakrishnan] and Andrew 
> Miller



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (SQOOP-3451) Importing FLOAT from Oracle to Hive results in INTEGER

2019-09-26 Thread Denes Bodo (Jira)


[ 
https://issues.apache.org/jira/browse/SQOOP-3451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16938609#comment-16938609
 ] 

Denes Bodo commented on SQOOP-3451:
---

Thanks for you answers [~mbalakrishnan] and [~fero]. I'll create a proposal 
change soon considering your suggestion.

> Importing FLOAT from Oracle to Hive results in INTEGER
> --
>
> Key: SQOOP-3451
> URL: https://issues.apache.org/jira/browse/SQOOP-3451
> Project: Sqoop
>  Issue Type: Bug
>  Components: codegen, connectors/oracle, hive-integration
>Affects Versions: 1.4.7
>Reporter: Denes Bodo
>Priority: Major
>
> We ran into an issue where there is a table created in Oracle 11g:
> {noformat}
> create table floattest (column1 float(30), column2 number(30,-127), column3 
> number(30));
> {noformat}
> We want to import date from Oracle to Hive:
> {noformat}
> sqoop import -D 
> mapred.child.java.opts='-Djava.security.egd=file:/dev/../dev/urandom' 
> -Dmapreduce.job.queuename=default --connect 
> "jdbc:oracle:thin:@DBHOST:1521/xe" --username sqoop --password sqoop --table 
> floattest --hcatalog-database default --hcatalog-table floattest 
> --create-hcatalog-table --hcatalog-external-table --hcatalog-storage-stanza 
> "stored as orc" -m 1 --columns COLUMN1,COLUMN2,COLUMN3 --verbose
> {noformat}
> In Sqoop logs we see the following:
> {noformat}
> 19/09/24 13:51:45 INFO manager.SqlManager: Executing SQL statement: SELECT 
> t.* FROM floattest t WHERE 1=0
> 19/09/24 13:51:45 DEBUG manager.SqlManager: Found column COLUMN1 of type [2, 
> 30, -127]
> 19/09/24 13:51:45 DEBUG manager.SqlManager: Found column COLUMN2 of type [2, 
> 30, -84]
> 19/09/24 13:51:45 DEBUG manager.SqlManager: Found column COLUMN3 of type [2, 
> 30, 0]
> 19/09/24 13:51:45 INFO hcat.SqoopHCatUtilities: Database column names 
> projected : [COLUMN1, COLUMN2, COLUMN3]
> 19/09/24 13:51:45 INFO hcat.SqoopHCatUtilities: Database column name - info 
> map :
> COLUMN3 : [Type : 2,Precision : 30,Scale : 0]
> COLUMN2 : [Type : 2,Precision : 30,Scale : -84]
> COLUMN1 : [Type : 2,Precision : 30,Scale : -127]
> 19/09/24 13:51:45 INFO hcat.SqoopHCatUtilities: Creating HCatalog table 
> default.floattest for import
> 19/09/24 13:51:45 INFO hcat.SqoopHCatUtilities: HCatalog Create table 
> statement:
> create external table `default`.`floattest` (
> `column1` decimal(30),
> `column2` decimal(30),
> `column3` decimal(30))
> stored as orc
> {noformat}
> From this output we can see that Oracle states about column1 has Type=2 which 
> is NUMERIC (regarding to 
> https://docs.oracle.com/javase/7/docs/api/constant-values.html#java.sql.Types.FLOAT).
>  Sqoop translates NUMERIC to DECIMAL 
> (https://github.com/apache/sqoop/blob/trunk/src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java#L1050L1107).
>  Due to Oracle uses {{scale=-127}} to sign about a NUMERIC that it is a FLOAT 
> instead of stating {{Type=6}}, Sqoop creates integers (decimal with 0 scale) 
> from NUMBER.
> I think it is the fault of Oracle as it does not use Java Type=6 to sign type 
> of a float. What do you think?
> 
> Thank you for the details and investigation to [~mbalakrishnan] and Andrew 
> Miller



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (SQOOP-3451) Importing FLOAT from Oracle to Hive results in INTEGER

2019-09-25 Thread Mahesh Balakrishnan (Jira)


[ 
https://issues.apache.org/jira/browse/SQOOP-3451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16937896#comment-16937896
 ] 

Mahesh Balakrishnan commented on SQOOP-3451:


[~dionusos],  Using oracle dump command for float 

CREATE TABLE "SQOOP"."FLOATTEST"
( "IFLOAT" FLOAT(30)
)

SELECT dump(IFLOAT) FROM SQOOP.FLOATTEST
*OUTPUT*  Typ=2 Len=2: 193,13,13

Then to understand more I looked into the Datatypes here: 
https://docs.oracle.com/cd/B28359_01/server.111/b28285/sqlqr06.htm#CHDBBHHE

Where they have mentioned that Code 2 is for both NUMBER and FLOAT but need to 
look at the scale which shows whether its a NUMBER or FLOAT

The NUMERIC and DECIMAL datatypes can specify only fixed-point numbers. For 
those datatypes, the scale (s) defaults to 0.

The FLOAT datatype is a floating-point number with a binary precision b. The 
default precision for this datatype is 126 binary, or 38 decimal.


Since Sqoop is able to get the Scale of the datatype, is it possible to get the 
scale and then map it across as Float?  I know it may take time to fix this but 
we need to document this so the users leveraging sqoop to load data from Oracle 
to Hive know what they need to do.

> Importing FLOAT from Oracle to Hive results in INTEGER
> --
>
> Key: SQOOP-3451
> URL: https://issues.apache.org/jira/browse/SQOOP-3451
> Project: Sqoop
>  Issue Type: Bug
>  Components: codegen, connectors/oracle, hive-integration
>Affects Versions: 1.4.7
>Reporter: Denes Bodo
>Priority: Major
>
> We ran into an issue where there is a table created in Oracle 11g:
> {noformat}
> create table floattest (column1 float(30), column2 number(30,-127), column3 
> number(30));
> {noformat}
> We want to import date from Oracle to Hive:
> {noformat}
> sqoop import -D 
> mapred.child.java.opts='-Djava.security.egd=file:/dev/../dev/urandom' 
> -Dmapreduce.job.queuename=default --connect 
> "jdbc:oracle:thin:@DBHOST:1521/xe" --username sqoop --password sqoop --table 
> floattest --hcatalog-database default --hcatalog-table floattest 
> --create-hcatalog-table --hcatalog-external-table --hcatalog-storage-stanza 
> "stored as orc" -m 1 --columns COLUMN1,COLUMN2,COLUMN3 --verbose
> {noformat}
> In Sqoop logs we see the following:
> {noformat}
> 19/09/24 13:51:45 INFO manager.SqlManager: Executing SQL statement: SELECT 
> t.* FROM floattest t WHERE 1=0
> 19/09/24 13:51:45 DEBUG manager.SqlManager: Found column COLUMN1 of type [2, 
> 30, -127]
> 19/09/24 13:51:45 DEBUG manager.SqlManager: Found column COLUMN2 of type [2, 
> 30, -84]
> 19/09/24 13:51:45 DEBUG manager.SqlManager: Found column COLUMN3 of type [2, 
> 30, 0]
> 19/09/24 13:51:45 INFO hcat.SqoopHCatUtilities: Database column names 
> projected : [COLUMN1, COLUMN2, COLUMN3]
> 19/09/24 13:51:45 INFO hcat.SqoopHCatUtilities: Database column name - info 
> map :
> COLUMN3 : [Type : 2,Precision : 30,Scale : 0]
> COLUMN2 : [Type : 2,Precision : 30,Scale : -84]
> COLUMN1 : [Type : 2,Precision : 30,Scale : -127]
> 19/09/24 13:51:45 INFO hcat.SqoopHCatUtilities: Creating HCatalog table 
> default.floattest for import
> 19/09/24 13:51:45 INFO hcat.SqoopHCatUtilities: HCatalog Create table 
> statement:
> create external table `default`.`floattest` (
> `column1` decimal(30),
> `column2` decimal(30),
> `column3` decimal(30))
> stored as orc
> {noformat}
> From this output we can see that Oracle states about column1 has Type=2 which 
> is NUMERIC (regarding to 
> https://docs.oracle.com/javase/7/docs/api/constant-values.html#java.sql.Types.FLOAT).
>  Sqoop translates NUMERIC to DECIMAL 
> (https://github.com/apache/sqoop/blob/trunk/src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java#L1050L1107).
>  Due to Oracle uses {{scale=-127}} to sign about a NUMERIC that it is a FLOAT 
> instead of stating {{Type=6}}, Sqoop creates integers (decimal with 0 scale) 
> from NUMBER.
> I think it is the fault of Oracle as it does not use Java Type=6 to sign type 
> of a float. What do you think?
> 
> Thank you for the details and investigation to [~mbalakrishnan] and Andrew 
> Miller



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (SQOOP-3451) Importing FLOAT from Oracle to Hive results in INTEGER

2019-09-25 Thread Fero Szabo (Jira)


[ 
https://issues.apache.org/jira/browse/SQOOP-3451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16937767#comment-16937767
 ] 

Fero Szabo commented on SQOOP-3451:
---

Hi [~dionusos],

Yeah, I think you are right and Oracle is a pain in this regard to work with.

I've had the same issue when developing the floating point number support for 
Avro and Parquet, namely that a column defined as NUMBER (without precision and 
scale), comes back with invalid metadata from the database. (I believe 
something like -127 as scale, though please double check this). And under the 
hood, I suspect Oracle is using NUMBER, again, to store the type Float. 

In my case, the only missing thing was a proper scale to be able to pad a 
BigDecimal within sqoop. So, I created a flag to allow this to the user in 
SQOOP-2976. Not sure what to do in your case, as it's neither of those files, 
(it's orc, if I'm seeing this correctly). In any case, I believe you'll need to 
watch out for these "special" values for scale and precision returned by Oracle 
and implement a logic that maps these to proper values.

I used user input for this via properties. Seemed the best at the time, for 
that particular case. I'm not sure if my approach is the right one for you as 
well, though certainly an option.

So, TL;DR:

Track down where the Hive schema gets created and debug whether you can 
identify a Float coming from Oracle based on the precision and scale. You might 
want to check other number types, too.

Hope this helps!

> Importing FLOAT from Oracle to Hive results in INTEGER
> --
>
> Key: SQOOP-3451
> URL: https://issues.apache.org/jira/browse/SQOOP-3451
> Project: Sqoop
>  Issue Type: Bug
>  Components: codegen, connectors/oracle, hive-integration
>Affects Versions: 1.4.7
>Reporter: Denes Bodo
>Priority: Major
>
> We ran into an issue where there is a table created in Oracle 11g:
> {noformat}
> create table floattest (column1 float(30), column2 number(30,-127), column3 
> number(30));
> {noformat}
> We want to import date from Oracle to Hive:
> {noformat}
> sqoop import -D 
> mapred.child.java.opts='-Djava.security.egd=file:/dev/../dev/urandom' 
> -Dmapreduce.job.queuename=default --connect 
> "jdbc:oracle:thin:@DBHOST:1521/xe" --username sqoop --password sqoop --table 
> floattest --hcatalog-database default --hcatalog-table floattest 
> --create-hcatalog-table --hcatalog-external-table --hcatalog-storage-stanza 
> "stored as orc" -m 1 --columns COLUMN1,COLUMN2,COLUMN3 --verbose
> {noformat}
> In Sqoop logs we see the following:
> {noformat}
> 19/09/24 13:51:45 INFO manager.SqlManager: Executing SQL statement: SELECT 
> t.* FROM floattest t WHERE 1=0
> 19/09/24 13:51:45 DEBUG manager.SqlManager: Found column COLUMN1 of type [2, 
> 30, -127]
> 19/09/24 13:51:45 DEBUG manager.SqlManager: Found column COLUMN2 of type [2, 
> 30, -84]
> 19/09/24 13:51:45 DEBUG manager.SqlManager: Found column COLUMN3 of type [2, 
> 30, 0]
> 19/09/24 13:51:45 INFO hcat.SqoopHCatUtilities: Database column names 
> projected : [COLUMN1, COLUMN2, COLUMN3]
> 19/09/24 13:51:45 INFO hcat.SqoopHCatUtilities: Database column name - info 
> map :
> COLUMN3 : [Type : 2,Precision : 30,Scale : 0]
> COLUMN2 : [Type : 2,Precision : 30,Scale : -84]
> COLUMN1 : [Type : 2,Precision : 30,Scale : -127]
> 19/09/24 13:51:45 INFO hcat.SqoopHCatUtilities: Creating HCatalog table 
> default.floattest for import
> 19/09/24 13:51:45 INFO hcat.SqoopHCatUtilities: HCatalog Create table 
> statement:
> create external table `default`.`floattest` (
> `column1` decimal(30),
> `column2` decimal(30),
> `column3` decimal(30))
> stored as orc
> {noformat}
> From this output we can see that Oracle states about column1 has Type=2 which 
> is NUMERIC (regarding to 
> https://docs.oracle.com/javase/7/docs/api/constant-values.html#java.sql.Types.FLOAT).
>  Sqoop translates NUMERIC to DECIMAL 
> (https://github.com/apache/sqoop/blob/trunk/src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java#L1050L1107).
>  Due to Oracle uses {{scale=-127}} to sign about a NUMERIC that it is a FLOAT 
> instead of stating {{Type=6}}, Sqoop creates integers (decimal with 0 scale) 
> from NUMBER.
> I think it is the fault of Oracle as it does not use Java Type=6 to sign type 
> of a float. What do you think?
> 
> Thank you for the details and investigation to [~mbalakrishnan] and Andrew 
> Miller



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (SQOOP-3451) Importing FLOAT from Oracle to Hive results in INTEGER

2019-09-25 Thread Denes Bodo (Jira)


[ 
https://issues.apache.org/jira/browse/SQOOP-3451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16937748#comment-16937748
 ] 

Denes Bodo commented on SQOOP-3451:
---

[~jarcec], [~BoglarkaEgyed], [~fero] any idea? Thanks.

> Importing FLOAT from Oracle to Hive results in INTEGER
> --
>
> Key: SQOOP-3451
> URL: https://issues.apache.org/jira/browse/SQOOP-3451
> Project: Sqoop
>  Issue Type: Bug
>  Components: codegen, connectors/oracle, hive-integration
>Affects Versions: 1.4.7
>Reporter: Denes Bodo
>Priority: Major
>
> We ran into an issue where there is a table created in Oracle 11g:
> {noformat}
> create table floattest (column1 float(30), column2 number(30,-127), column3 
> number(30));
> {noformat}
> We want to import date from Oracle to Hive:
> {noformat}
> sqoop import -D 
> mapred.child.java.opts='-Djava.security.egd=file:/dev/../dev/urandom' 
> -Dmapreduce.job.queuename=default --connect 
> "jdbc:oracle:thin:@DBHOST:1521/xe" --username sqoop --password sqoop --table 
> floattest --hcatalog-database default --hcatalog-table floattest 
> --create-hcatalog-table --hcatalog-external-table --hcatalog-storage-stanza 
> "stored as orc" -m 1 --columns COLUMN1,COLUMN2,COLUMN3 --verbose
> {noformat}
> In Sqoop logs we see the following:
> {noformat}
> 19/09/24 13:51:45 INFO manager.SqlManager: Executing SQL statement: SELECT 
> t.* FROM floattest t WHERE 1=0
> 19/09/24 13:51:45 DEBUG manager.SqlManager: Found column COLUMN1 of type [2, 
> 30, -127]
> 19/09/24 13:51:45 DEBUG manager.SqlManager: Found column COLUMN2 of type [2, 
> 30, -84]
> 19/09/24 13:51:45 DEBUG manager.SqlManager: Found column COLUMN3 of type [2, 
> 30, 0]
> 19/09/24 13:51:45 INFO hcat.SqoopHCatUtilities: Database column names 
> projected : [COLUMN1, COLUMN2, COLUMN3]
> 19/09/24 13:51:45 INFO hcat.SqoopHCatUtilities: Database column name - info 
> map :
> COLUMN3 : [Type : 2,Precision : 30,Scale : 0]
> COLUMN2 : [Type : 2,Precision : 30,Scale : -84]
> COLUMN1 : [Type : 2,Precision : 30,Scale : -127]
> 19/09/24 13:51:45 INFO hcat.SqoopHCatUtilities: Creating HCatalog table 
> default.floattest for import
> 19/09/24 13:51:45 INFO hcat.SqoopHCatUtilities: HCatalog Create table 
> statement:
> create external table `default`.`floattest` (
> `column1` decimal(30),
> `column2` decimal(30),
> `column3` decimal(30))
> stored as orc
> {noformat}
> From this output we can see that Oracle states about column1 has Type=2 which 
> is NUMERIC (regarding to 
> https://docs.oracle.com/javase/7/docs/api/constant-values.html#java.sql.Types.FLOAT).
>  Sqoop translates NUMERIC to DECIMAL 
> (https://github.com/apache/sqoop/blob/trunk/src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java#L1050L1107).
>  Due to Oracle uses {{scale=-127}} to sign about a NUMERIC that it is a FLOAT 
> instead of stating {{Type=6}}, Sqoop creates integers (decimal with 0 scale) 
> from NUMBER.
> I think it is the fault of Oracle as it does not use Java Type=6 to sign type 
> of a float. What do you think?
> 
> Thank you for the details and investigation to [~mbalakrishnan] and Andrew 
> Miller



--
This message was sent by Atlassian Jira
(v8.3.4#803005)