Mahesh Balakrishnan created OOZIE-3218:
------------------------------------------

             Summary: Oozie Sqoop action with command splits the select clause 
into multiple parts due to delimiter being space
                 Key: OOZIE-3218
                 URL: https://issues.apache.org/jira/browse/OOZIE-3218
             Project: Oozie
          Issue Type: Bug
          Components: action, workflow
    Affects Versions: 4.3.0, 4.2.0, 4.1.0, 3.3.2
         Environment: Hortonworks Hadoop HDP-2.6.4.x release 

 oozie admin -version: Oozie server build version: 4.2.0.2.6.4.0-91
            Reporter: Mahesh Balakrishnan


When an Oozie Sqoop action uses a <command> element that contains --query, the 
query is split into multiple arguments. Sqoop then reports "Unrecognized 
argument:" errors and the action fails.

<sqoop
 xmlns="uri:oozie:sqoop-action:0.4">
 <job-tracker>${resourceManager}</job-tracker>
 <name-node>${nameNode}</name-node>
 <command>import --verbose --connect jdbc:mysql://test.openstacklocal/db 
--query select * from abc where $CONDITIONS --username test --password test 
--driver com.mysql.jdbc.Driver -m 1 </command>
 </sqoop>
 <ok to="end"/>

 

Oozie Launcher logs:

++++++++++++++++++++++++++++++++
Sqoop command arguments :
 import
 --verbose
 --connect
 jdbc:mysql://test.openstacklocal/db
 --query
 "select
 *
 from
 abc
 where
 $CONDITIONS"
 --username
 hive
 --password
 ********
 --driver
 com.mysql.jdbc.Driver
 -m
 1
Fetching child yarn jobs
tag id : oozie-a1bbe03a0983b9e822d12ae7bb269ee3
2791 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to 
ResourceManager at hdp263-3.openstacklocal/172.26.105.248:8050
Child yarn jobs are found - 
=================================================================

>>> Invoking Sqoop command line now >>>

3172 [main] WARN org.apache.sqoop.tool.SqoopTool - $SQOOP_CONF_DIR has not been 
set in the environment. Cannot check for additional configuration.
3172 [main] WARN org.apache.sqoop.tool.SqoopTool - $SQOOP_CONF_DIR has not been 
set in the environment. Cannot check for additional configuration.
3218 [main] INFO org.apache.sqoop.Sqoop - Running Sqoop version: 
1.4.6.2.6.4.0-91
3218 [main] INFO org.apache.sqoop.Sqoop - Running Sqoop version: 
1.4.6.2.6.4.0-91
3287 [main] DEBUG org.apache.sqoop.tool.BaseSqoopTool - Enabled debug logging.
3287 [main] DEBUG org.apache.sqoop.tool.BaseSqoopTool - Enabled debug logging.
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Error parsing arguments 
for import:
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Error parsing arguments 
for import:
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: *
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: *
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: 
from
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: 
from
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: 
abc
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: 
abc
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: 
where
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: 
where
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: 
$CONDITIONS"
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: 
$CONDITIONS"
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: 
--username
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: 
--username
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: 
abc
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: 
abc
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: 
--password
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: 
--password
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: 
abc
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: 
abc
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: 
--driver
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: 
--driver
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: 
com.mysql.jdbc.Driver
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: 
com.mysql.jdbc.Driver
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: 
-m
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: 
-m
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: 1
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: 1
3289 [main] DEBUG org.apache.sqoop.Sqoop - 
Try --help for usage instructions.

+++++++++++++++++++++++++++++++++++++++++++


The code piece which causes the issue is (SqoopActionExecutor.java):
++++++++++++++++++++++++++++++++++
String[] args;
if (actionXml.getChild("command", ns) != null) {
    String command = actionXml.getChild("command", ns).getTextTrim();
    StringTokenizer st = new StringTokenizer(command, " ");
    List<String> l = new ArrayList<String>();
    while (st.hasMoreTokens()) {
        l.add(st.nextToken());
    }
    args = l.toArray(new String[l.size()]);
}
else {
    List<Element> eArgs = (List<Element>) actionXml.getChildren("arg", ns);
    args = new String[eArgs.size()];
    for (int i = 0; i < eArgs.size(); i++) {
        args[i] = eArgs.get(i).getTextTrim();
    }
}

setSqoopCommand(actionConf, args);
return actionConf;
}
++++++++++++++++++++++++++++++++++

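Because the delimiter is a space, the tokenizer splits a query such as 
select * from table into the tokens select, *, from and table, and adds each 
one to the array separately. Sqoop therefore receives the query as several 
arguments instead of one, which causes the failure.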
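
The splitting described above can be reproduced outside Oozie. The following is a minimal sketch, not Oozie code: the class name SpaceSplitDemo, the method splitOnSpaces and the shortened sample command are made up for illustration, and only the tokenizing loop mirrors the snippet quoted from SqoopActionExecutor.java.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical stand-alone demo; only the loop mirrors the quoted Oozie snippet.
public class SpaceSplitDemo {

    // Splits the command on spaces only, so quote characters get no special treatment.
    static List<String> splitOnSpaces(String command) {
        StringTokenizer st = new StringTokenizer(command, " ");
        List<String> parts = new ArrayList<String>();
        while (st.hasMoreTokens()) {
            parts.add(st.nextToken());
        }
        return parts;
    }

    public static void main(String[] args) {
        // Shortened, assumed sample command with a quoted query.
        String command = "import --query \"select * from abc where $CONDITIONS\" -m 1";
        for (String part : splitOnSpaces(command)) {
            System.out.println(part);
        }
    }
}
```

Each word of the quoted query comes out as its own token, with the quote characters still attached to the first and last word, which matches the "Sqoop command arguments" listing in the launcher log.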

I have made a code change locally to address this issue. My testing so far 
suggests that it works, so I am submitting the change here:
 

String[] args;
if (actionXml.getChild("command", ns) != null) {
    String command = actionXml.getChild("command", ns).getTextTrim();

    // Added: accumulates the words of the select clause until the next option
    String QueryAppendStr = "";

    StringTokenizer st = new StringTokenizer(command, " ");
    List<String> l = new ArrayList<String>();

    while (st.hasMoreTokens()) {
        // Added: inspect each space-delimited token to decide whether it is
        // appended to the pending value or added to the list directly.
        String QueryStr = st.nextToken();

        // A token containing "-" anywhere is treated as an option
        // (this check also covers "--" and "-D").
        if (!QueryStr.contains("-")) {
            QueryAppendStr = QueryAppendStr + QueryStr + " ";
        }
        else {
            if (!QueryAppendStr.trim().isEmpty()) {
                LOG.debug("Append : [{0}]", QueryAppendStr.trim());
                l.add(QueryAppendStr.trim());
                QueryAppendStr = "";
            }
            LOG.debug("Actual : [{0}]", QueryStr);
            l.add(QueryStr);
        }
    }
    // Flush the value that follows the last option, if there is one
    if (!QueryAppendStr.trim().isEmpty()) {
        l.add(QueryAppendStr.trim());
    }
    args = l.toArray(new String[l.size()]);
}
else {
    List<Element> eArgs = (List<Element>) actionXml.getChildren("arg", ns);
    args = new String[eArgs.size()];
    for (int i = 0; i < eArgs.size(); i++) {
        args[i] = eArgs.get(i).getTextTrim();
    }
}

setSqoopCommand(actionConf, args);
return actionConf;
}
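
For comparison, a different way to keep the query together is to honour quote characters while splitting, rather than looking for hyphens. The sketch below is only an illustration under assumptions: the class QuoteAwareSplit and its split method are hypothetical names, it is not the change proposed above and not Oozie's implementation, and it only helps when the query is wrapped in single or double quotes inside <command>.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Oozie: splits on whitespace outside quotes.
public class QuoteAwareSplit {

    static List<String> split(String command) {
        List<String> args = new ArrayList<String>();
        StringBuilder current = new StringBuilder();
        char quote = 0;          // 0 means "not inside a quoted section"
        boolean inToken = false; // true once the current token has started

        for (int i = 0; i < command.length(); i++) {
            char c = command.charAt(i);
            if (quote != 0) {
                // Inside quotes: keep everything until the matching quote character.
                if (c == quote) {
                    quote = 0;
                } else {
                    current.append(c);
                }
            } else if (c == '"' || c == '\'') {
                // Opening quote: start (or continue) a token, drop the quote itself.
                quote = c;
                inToken = true;
            } else if (Character.isWhitespace(c)) {
                // Unquoted whitespace ends the current token, if any.
                if (inToken) {
                    args.add(current.toString());
                    current.setLength(0);
                    inToken = false;
                }
            } else {
                current.append(c);
                inToken = true;
            }
        }
        // An unterminated quote simply runs to the end of the command.
        if (inToken) {
            args.add(current.toString());
        }
        return args;
    }

    public static void main(String[] argv) {
        String command = "import --query \"select * from abc where $CONDITIONS\" -m 1";
        for (String arg : split(command)) {
            System.out.println("[" + arg + "]");
        }
    }
}
```

With this approach the quoted query is passed on as one argument with the surrounding quotes removed, and a hyphen inside a value does not affect how tokens are grouped.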

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
