Warren Chisholm created SQOOP-2748:
--------------------------------------

             Summary: Remove parentheses from WHERE clause
                 Key: SQOOP-2748
                 URL: https://issues.apache.org/jira/browse/SQOOP-2748
             Project: Sqoop
          Issue Type: Wish
          Components: codegen
    Affects Versions: 1.4.6
         Environment: Redhat 6
Cloudera 5.5.0
            Reporter: Warren Chisholm
            Priority: Minor


When performing a sqoop import with option "-m 1" the following where clause is 
added by sqoop:
WHERE ( 1=1 ) AND ( 1=1 )

This where clause is redundant and creates an incompatibility with the IBM IMS 
Universal JDBC driver.
The IMS JDBC Driver does not allow the use of parentheses in the where clause. 
The below quote was taken from 2nd dot point here: 
https://www-01.ibm.com/support/knowledgecenter/SSEPH2_13.1.0/com.ibm.ims13.doc.apg/ims_odbjdbcusagewhere.htm.
"Do not use parentheses. Qualification statements are evaluated from left to 
right. The order of evaluation for operators is the IMS evaluation order for 
segment search arguments."

I have modified the below commands and output to anonymize the details.

Sample sqoop command:
sqoop import --connect 
'jdbc:ims://myserver:9999/MYDATABASE:currentSchema=MYSCHEMA;' -username 
???????? -password ???????? --null-string '\\N' --null-non-string '\\N' 
--hive-import --create-hive-table --table MYIMSTABLE --hive-table MYHIVETABLE 
--hive-drop-import-delims --driver com.ibm.ims.jdbc.IMSDriver 
--delete-target-dir -m 1

Extract of log produced:
...
15/12/16 17:20:05 INFO mapreduce.Job:  map 0% reduce 0%
15/12/16 17:20:16 INFO mapreduce.Job: Task Id : 
attempt_1449625205240_0003_m_000000_0, Status : FAILED
Error: java.io.IOException: SQLException in nextKeyValue
        at 
org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:277)
        at 
org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:556)
        at 
org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
        at 
org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
        at 
org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.sql.SQLException: Error parsing the following SQL statement: 
SELECT FIELD1 FROM MYIMSTABLE AS MYIMSTABLE WHERE ( 1=1 ) AND ( 1=1 ). 
Encountered " <ONEEQUALONE> "1=1 "" at line 1, column 1167.
Was expecting:
    <ID> ...

        at com.ibm.ims.jdbc.SQLForDLIParser.parse(SQLForDLIParser.java:651)
        at 
com.ibm.ims.jdbc.PreparedStatementImpl.parseSQLQuery(PreparedStatementImpl.java:133)
        at 
com.ibm.ims.jdbc.PreparedStatementImpl.<init>(PreparedStatementImpl.java:87)
        at 
com.ibm.ims.jdbc.ConnectionImpl.prepareStatement(ConnectionImpl.java:978)
        at 
org.apache.sqoop.mapreduce.db.DBRecordReader.executeQuery(DBRecordReader.java:101)
        at 
org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:235)
        ... 12 more


In summary, for sqoop to be compatible with the IBM IMS JDBC driver, I am 
asking for either the where clause "( 1=1 ) AND ( 1=1 )" to be removed or for 
the parentheses to be removed.


Workaround
The only workaround I have been able to come up with is to switch to using the 
--query option instead of --table and hack the $CONDITIONS variable into the 
select part of the query - to get rid of the parentheses from the where clause.

For example:
sqoop import --connect 
'jdbc:ims://myserver:9999/MYDATABASE:currentSchema=MYSCHEMA;' -username 
???????? -password ???????? --null-string '\\N' --null-non-string '\\N' 
--hive-import --create-hive-table --query "select FIELD1,'\$CONDITIONS' as 
CONDITIONS from MYIMSTABLE" --hive-table MYHIVETABLE --hive-drop-import-delims 
--driver com.ibm.ims.jdbc.IMSDriver --delete-target-dir -m 1 --target-dir test

Obviously this is not desirable as the commands to retrieve the metadata (WHERE 
1 = 0) will retrieve the data with it. The sql will look like this:
select FIELD1,' (1 = 0) ' as CONDITIONS from MYIMSTABLE
however it does import the data successfully (albeit inefficiently).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to