zhaohong created SQOOP-801:
------------------------------

             Summary: Missing data  when Importing data split by  the column of 
the date type from oracle  
                 Key: SQOOP-801
                 URL: https://issues.apache.org/jira/browse/SQOOP-801
             Project: Sqoop
          Issue Type: Bug
    Affects Versions: 1.4.2
         Environment: source dababase: oracle
os: centos 5.4
hadoop version: 0.20.2-cdh3u2
            Reporter: zhaohong
            Priority: Critical


The command is below:
============================
sqoop import --connect 'jdbc:oracle:thin:@???.???.???.???:1521:staging1' 
--username v_stage --password 'v_stage' 
--table shoppingcart 
--columns 'trace_id,checkout_status ,entered_price,promotion_id ,permanent_id 
,from_platform,points,shop_id ,creation_date,productid ,category_path,data_date 
,cust_id ,cart_id ,last_changed_date 
,order_from_ip,counts,reference_url,cart_item_id ,special_sale' 
--where "last_changed_date >= TO_DATE('2012-12-11','yyyy-mm-dd') and 
last_changed_date < TO_DATE('2012-12-12','yyyy-mm-dd')" 
--fields-terminated-by '' 
--lines-terminated-by '
' --split-by last_changed_date 
--target-dir /share/comm/customer/zhao/ 
--hive-delims-replacement ' ' 
-m 20
============================
The DataBase is Oracle and the type of last_changed_date is date. 
There are 600010 records between 2012-12-11 and 2012-12-12  but I can only pull 
about 12 records using this command.

the BoundingValsQuery is select min(last_change_date), max(last_change_date)
from shoppingcart where last_chage_date >=TO_DATA('2012-12-11','yyyy-mm-dd') 
and last_chage_date<TO_DATA('2012-12-12','yyyy-mm-dd')

I debuged the source code and found that the min and max val are both equal to 
2012-12-11 and The query condition look like : 
where last_chage_date>=TO_TIMESTAMP(‘2012-12-11’,‘YYYY-MM-DD’) and 
last_chage_date<=TO_TIMESTAMP(‘2012-12-11’,‘YYYY-MM-DD’) that's why I can only 
get 12 records whose he last_chage_date is equal to 2012-12-11 00:00:00 exactly.

I do some modified and the result is right.


 diff --git a/src/java/org/apache/sqoop/mapreduce/db/DateSplitter.java 
b/src/java/org/apache/sqoop/mapreduce/db/DateSplitter.java
index 31e9351..812c6bb 100644
--- a/src/java/org/apache/sqoop/mapreduce/db/DateSplitter.java
+++ b/src/java/org/apache/sqoop/mapreduce/db/DateSplitter.java
@@ -140,7 +144,7 @@ public class DateSplitter extends IntegerSplitter {
     try {
       switch (sqlDataType) {
       case Types.DATE:
-        return rs.getDate(colNum).getTime();
+       return rs.getTimestamp(colNum).getTime();
       case Types.TIME:
         return rs.getTime(colNum).getTime();
       case Types.TIMESTAMP:
@@ -160,7 +164,7 @@ public class DateSplitter extends IntegerSplitter {
   private Date longToDate(long val, int sqlDataType) {
     switch (sqlDataType) {
     case Types.DATE:
-      return new java.sql.Date(val);
+      return new java.sql.Timestamp(val);
     case Types.TIME:
       return new java.sql.Time(val);
     case Types.TIMESTAMP:






 




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to