[ 
https://issues.apache.org/jira/browse/SQOOP-3280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16337400#comment-16337400
 ] 

Daniel Voros commented on SQOOP-3280:
-------------------------------------

Hi [~nitish.khanna]!

Thank you for reporting this and for the very detailed description!

I'm afraid the issue here is (at least in part) in MySQL Connector/J. Being a 
JDBC driver, it uses {{java.sql.Time}} that was designed to hold time of day 
values and not durations (as MySQL TIME does). The parsing of TIME columns to 
{{java.sql.Time}} has went through some changes in the recent versions of the 
MySQL Connector/J:

Here's a little comparison of different versions:
||MySQL Connector/J version||Max TIME value to parse into java.sql.Time||
|5.0.3|99:59:59|
|5.1.6|23:59:59|
|5.1.45|24:59:59|
|6.0.2|23:59:59|

Note that the exception thrown for unsupported values has changed in 6.x to:
{quote}The value '24:0:0' is an invalid TIME value. JDBC Time objects represent 
a wall-clock time and not a duration as MySQL treats them. If you are treating 
this type as a duration, consider retrieving this value as a string and dealing 
with it according to your requirements.
{quote}
It is possible to retrieve TIME columns as String in 6.x, but it isn't (for 
bigger values) in 5.x. Knowing this, we could move forward with the 6.x 
connector and retrieving TIME columns as Strings. I'm afraid it would have some 
unexpected side-effects, I'll try to take a look.

In the meantime, there seem to be two possible workarounds for the issue:
 - moving back to 5.0.3, if supporting two-digit times is enough for you. (Of 
course, be careful, this might break something else, I'm not familiar with 
differences between 5.x versions)
 - using [Free-form 
import|https://sqoop.apache.org/docs/1.4.6/SqoopUserGuide.html#_free_form_query_imports]
 for this table and casting the TIME column to string as described in the [last 
comment here|https://bugs.mysql.com/bug.php?id=36051]: {{select concat('', 
timevalue) as timevalue from repro_time;}}

> Sqoop import with time data type from mysql is not working as expected.
> -----------------------------------------------------------------------
>
>                 Key: SQOOP-3280
>                 URL: https://issues.apache.org/jira/browse/SQOOP-3280
>             Project: Sqoop
>          Issue Type: Bug
>            Reporter: Niitsh Khanna
>            Priority: Major
>
> Hi Team,
> Hope you are doing good !!!
> ##########################
> Problem Statement
> ##########################
> Sqoop import with time data type from mysql is not working as expected.
> ##########################
> Detailed Problem Statement
> ##########################
> We are trying to import the time datatype from Mysql via Sqoop and it is not 
> working as expected and as mentioned in the Mysql document the value time 
> data type can import.
> If we set the time(hour) more than 24 then it doesn't work fine but if we set 
> the hour less than 24 then it imports well.
> Now if we see the mysql 
> document(https://dev.mysql.com/doc/refman/5.7/en/time.html) for Time data 
> range which is '-838:59:59' to '838:59:59' but Sqoop is not working as per 
> this range set.
> Note:- I am creating 2 scenarios (working and non-working) to give more 
> details on this with replication steps that will help you to replicate this 
> in house.
>  
> ##########################
> Replication Steps ----> Working Scenario
> ##########################
> Step 1:- Create table in Mysql.
> ----------------------------------
> mysql> create table repro_time( timevalue time);
> Query OK, 0 rows affected (0.08 sec)
> mysql> insert into repro_time values('24:24:24');
> Query OK, 1 row affected (0.06 sec)
> mysql> select * from repro_time;
> +-----------+
> | timevalue |
> +-----------+
> | 24:24:24 |
> +-----------+
> 1 row in set (0.01 sec)
> Step 2:- Sqoop import into HDFS
> ---------------------------------
> [root@host-10-17-101-232 ~]# export 
> MYCONN=jdbc:mysql://host-10-17-101-231.coe.cloudera.com/test
> [root@host-10-17-101-232 ~]# export MYUSER=*****
> [root@host-10-17-101-232 ~]# export MYPSWD=*****
> [root@host-10-17-101-232 ~]# sqoop import --connect $MYCONN --username 
> $MYUSER --password $MYPSWD --table repro_time --target-dir 
> '/user/root/repro_time' --delete-target-dir -m 1
> Bytes Read=0
>  File Output Format Counters 
>  Bytes Written=9
> 18/01/22 21:41:57 INFO mapreduce.ImportJobBase: Transferred 9 bytes in 
> 17.5695 seconds (0.5123 bytes/sec)
> 18/01/22 21:41:57 INFO mapreduce.ImportJobBase: Retrieved 1 records.
> [root@host-10-17-101-232 ~]# hadoop fs -cat repro_time/p*
> 00:24:24
> Note:- We set the hour as 24 so that's why it has sett 00 over here which is 
> normal behaviour.
> ##########################
> Replication Steps ----> Non-Working Scenario
> ##########################
> Step1:- Create table in Mysql
> ------------------------------
> mysql> create table repro_time( timevalue time);
> Query OK, 0 rows affected (0.08 sec)
> mysql> insert into repro_time values('24:24:24');
> Query OK, 1 row affected (0.06 sec)
> mysql> select * from repro_time;
> +-----------+
> | timevalue |
> +-----------+
> | 24:24:24 |
> +-----------+
> 1 row in set (0.01 sec)
> mysql> insert into repro_time values('25:24:24');
> Query OK, 1 row affected (0.02 sec)
> mysql> select * from repro_time;
> +-----------+
> | timevalue |
> +-----------+
> | 24:24:24 |
> | 25:24:24 |
> +-----------+
> 2 rows in set (0.00 sec)
> Note:- Here you can see I have inserted a second value of hour as 25 which is 
> more than 24 and it has inserted successfully because of the time range for 
> Mysql is -838:59:59' to '838:59:59'.
> Step 2:- Sqoop import into HDFS
> ---------------------------------
> [root@host-10-17-101-232 ~]# export 
> MYCONN=jdbc:mysql://host-10-17-101-231.coe.cloudera.com/test
> [root@host-10-17-101-232 ~]# export MYUSER=****
> [root@host-10-17-101-232 ~]# export MYPSWD=****
> [root@host-10-17-101-232 ~]# sqoop import --connect $MYCONN --username 
> $MYUSER --password $MYPSWD --table repro_time --target-dir 
> '/user/root/repro_time1' --delete-target-dir -m 1
> 18/01/22 21:42:34 INFO mapreduce.Job: Job job_1516093492107_2868 running in 
> uber mode : false
> 18/01/22 21:42:34 INFO mapreduce.Job: map 0% reduce 0%
> 18/01/22 21:42:40 INFO mapreduce.Job: Task Id : 
> attempt_1516093492107_2868_m_000000_0, Status : FAILED
> Error: java.io.IOException: SQLException in nextKeyValue
>  at 
> org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:277)
>  at 
> org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:556)
>  at 
> org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
>  at 
> org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
>  at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>  at 
> org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
>  at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
>  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>  at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:415)
>  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
>  at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: java.sql.SQLException: Illegal hour value '25' for java.sql.Time 
> type in value '25:24:24.
>  at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:965)
>  at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:898)
>  at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:887)
>  at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:861)
>  at com.mysql.jdbc.TimeUtil.fastTimeCreate(TimeUtil.java:286)
>  at com.mysql.jdbc.ResultSetImpl.fastTimeCreate(ResultSetImpl.java:979)
>  at com.mysql.jdbc.ResultSetRow.getTimeFast(ResultSetRow.java:884)
>  at com.mysql.jdbc.BufferRow.getTimeFast(BufferRow.java:543)
>  at com.mysql.jdbc.ResultSetImpl.getTimeInternal(ResultSetImpl.java:5575)
>  at com.mysql.jdbc.ResultSetImpl.getTime(ResultSetImpl.java:5343)
>  at 
> org.apache.sqoop.lib.JdbcWritableBridge.readTime(JdbcWritableBridge.java:106)
>  at 
> com.cloudera.sqoop.lib.JdbcWritableBridge.readTime(JdbcWritableBridge.java:78)
>  at repro_time.readFields(repro_time.java:90)
>  at 
> org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:244)
> Note:- Here according to Sqoop it is an invalid hour 25 and because of that 
> it has thrown the error.
> I will be attaching the full Sqoop output(Sqoop_time_mysql_output.txt) and 
> Yarn job 
> bundle(212c5296-1ef9-4081-a65a-0fde003b341f.default.20180123-06-00-37.support-bundle.zip)
>  for your reference that will help you to get more details.
> #########################
> Ask From the Team
> ########################
> This looks like a bug to me but I would be needing your help to fix this bug 
> so that Sqoop can take the time which is a range for mysql.
> Please let me know if you have any questions.
> Regards
> Nitish Khanna



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to