Michele Giusto created SQOOP-1420:
-------------------------------------
Summary: Wrong timestamp upper bound due to timezone
Key: SQOOP-1420
URL: https://issues.apache.org/jira/browse/SQOOP-1420
Project: Sqoop
Issue Type: Bug
Components: connectors/oracle
Affects Versions: 1.4.4
Reporter: Michele Giusto
Hi,
I am using Sqoop 1.4.4 to do incremental imports from an Oracle database,
checking a timestamp column representing the instant of last update. The
timestamp column is in GMT, while the server running Sqoop is in GMT+2, and I
am looking for updates every 15 minutes. Due to the different timezones, Sqoop
launches queries filtering on timestamp ranges that are in the future,
resulting in no rows exported.
Looking at the source code of Sqoop, it seems that to calculate the upper bound
(lower bound at next iteration) for the query, the program gets the current
time in millisecond (getCurrentDbTimestamp() in
org.apache.sqoop.manager.ConnManager) and creates a java.sql.Timestamp object
from it. Then, org.apache.sqoop.tool.ImportTool calls
datetimeToQueryString(String datetime, int columnType) of
org.apache.sqoop.manager.OracleManager passing the timestamp as a string, and
get back the sql filter to put in the final query. The problem seems to be that
the toString() method of a java.sql.Timestamp object considers the
user.timezone of the JVM (Europe/Rome in my case), so I get a String
representing a date that is in the future if considered as GMT (as the Oracle
database does with the SQL function TO_TIMESTAMP).
To solve this problem, it may be enough to force the conversion from Timestamp
to String to work in GMT timezone (like setting user.timezone=GMT). Another
solution, when interacting with Oracle, may be to use the TO_TIMESTAMP_TZ
function, that allows to specify the timezone of the String representing the
timestamp.
--
This message was sent by Atlassian JIRA
(v6.2#6252)