zhaohong created SQOOP-801:
------------------------------
Summary: Missing data when Importing data split by the column of
the date type from oracle
Key: SQOOP-801
URL: https://issues.apache.org/jira/browse/SQOOP-801
Project: Sqoop
Issue Type: Bug
Affects Versions: 1.4.2
Environment: source dababase: oracle
os: centos 5.4
hadoop version: 0.20.2-cdh3u2
Reporter: zhaohong
Priority: Critical
The command is below:
============================
sqoop import --connect 'jdbc:oracle:thin:@???.???.???.???:1521:staging1'
--username v_stage --password 'v_stage'
--table shoppingcart
--columns 'trace_id,checkout_status ,entered_price,promotion_id ,permanent_id
,from_platform,points,shop_id ,creation_date,productid ,category_path,data_date
,cust_id ,cart_id ,last_changed_date
,order_from_ip,counts,reference_url,cart_item_id ,special_sale'
--where "last_changed_date >= TO_DATE('2012-12-11','yyyy-mm-dd') and
last_changed_date < TO_DATE('2012-12-12','yyyy-mm-dd')"
--fields-terminated-by ''
--lines-terminated-by '
' --split-by last_changed_date
--target-dir /share/comm/customer/zhao/
--hive-delims-replacement ' '
-m 20
============================
The DataBase is Oracle and the type of last_changed_date is date.
There are 600010 records between 2012-12-11 and 2012-12-12 but I can only pull
about 12 records using this command.
the BoundingValsQuery is select min(last_change_date), max(last_change_date)
from shoppingcart where last_chage_date >=TO_DATA('2012-12-11','yyyy-mm-dd')
and last_chage_date<TO_DATA('2012-12-12','yyyy-mm-dd')
I debuged the source code and found that the min and max val are both equal to
2012-12-11 and The query condition look like :
where last_chage_date>=TO_TIMESTAMP(‘2012-12-11’,‘YYYY-MM-DD’) and
last_chage_date<=TO_TIMESTAMP(‘2012-12-11’,‘YYYY-MM-DD’) that's why I can only
get 12 records whose he last_chage_date is equal to 2012-12-11 00:00:00 exactly.
I do some modified and the result is right.
diff --git a/src/java/org/apache/sqoop/mapreduce/db/DateSplitter.java
b/src/java/org/apache/sqoop/mapreduce/db/DateSplitter.java
index 31e9351..812c6bb 100644
--- a/src/java/org/apache/sqoop/mapreduce/db/DateSplitter.java
+++ b/src/java/org/apache/sqoop/mapreduce/db/DateSplitter.java
@@ -140,7 +144,7 @@ public class DateSplitter extends IntegerSplitter {
try {
switch (sqlDataType) {
case Types.DATE:
- return rs.getDate(colNum).getTime();
+ return rs.getTimestamp(colNum).getTime();
case Types.TIME:
return rs.getTime(colNum).getTime();
case Types.TIMESTAMP:
@@ -160,7 +164,7 @@ public class DateSplitter extends IntegerSplitter {
private Date longToDate(long val, int sqlDataType) {
switch (sqlDataType) {
case Types.DATE:
- return new java.sql.Date(val);
+ return new java.sql.Timestamp(val);
case Types.TIME:
return new java.sql.Time(val);
case Types.TIMESTAMP:
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira