[ 
https://issues.apache.org/jira/browse/OOZIE-3218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahesh Balakrishnan updated OOZIE-3218:
---------------------------------------
    Attachment: OOZIE-3218.patch

> Oozie Sqoop action with command splits the select clause into multiple parts 
> due to delimiter being space
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: OOZIE-3218
>                 URL: https://issues.apache.org/jira/browse/OOZIE-3218
>             Project: Oozie
>          Issue Type: Bug
>          Components: action, workflow
>    Affects Versions: 3.3.2, 4.1.0, 4.2.0, 4.3.0
>         Environment: Hortonworks Hadoop HDP-2.6.4.x release 
>  oozie admin -version: Oozie server build version: 4.2.0.2.6.4.0-91
>            Reporter: Mahesh Balakrishnan
>            Priority: Major
>         Attachments: OOZIE-3218.patch
>
>
> When running an Oozie Sqoop action whose command contains --query, the 
> query is split into multiple arguments, which causes "Unrecognized argument:" 
> errors and in turn makes the action fail.
> <sqoop
>  xmlns="uri:oozie:sqoop-action:0.4">
>  <job-tracker>${resourceManager}</job-tracker>
>  <name-node>${nameNode}</name-node>
>  <command>import --verbose --connect jdbc:mysql://test.openstacklocal/db 
> --query select * from abc where $CONDITIONS --username test --password test 
> --driver com.mysql.jdbc.Driver -m 1 </command>
>  </sqoop>
>  <ok to="end"/>
>  
> Oozie Launcher logs:
> ++++++++++++++++++++++++++++++++
> Sqoop command arguments :
>  import
>  --verbose
>  --connect
>  jdbc:mysql://test.openstacklocal/db
>  --query
>  "select
>  *
>  from
>  abc
>  where
>  $CONDITIONS"
>  --username
>  hive
>  --password
>  ********
>  --driver
>  com.mysql.jdbc.Driver
>  -m
>  1
> Fetching child yarn jobs
> tag id : oozie-a1bbe03a0983b9e822d12ae7bb269ee3
> 2791 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to 
> ResourceManager at hdp263-3.openstacklocal/172.26.105.248:8050
> Child yarn jobs are found - 
> =================================================================
> >>> Invoking Sqoop command line now >>>
> 3172 [main] WARN org.apache.sqoop.tool.SqoopTool - $SQOOP_CONF_DIR has not 
> been set in the environment. Cannot check for additional configuration.
> 3172 [main] WARN org.apache.sqoop.tool.SqoopTool - $SQOOP_CONF_DIR has not 
> been set in the environment. Cannot check for additional configuration.
> 3218 [main] INFO org.apache.sqoop.Sqoop - Running Sqoop version: 
> 1.4.6.2.6.4.0-91
> 3218 [main] INFO org.apache.sqoop.Sqoop - Running Sqoop version: 
> 1.4.6.2.6.4.0-91
> 3287 [main] DEBUG org.apache.sqoop.tool.BaseSqoopTool - Enabled debug logging.
> 3287 [main] DEBUG org.apache.sqoop.tool.BaseSqoopTool - Enabled debug logging.
> 3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Error parsing 
> arguments for import:
> 3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Error parsing 
> arguments for import:
> 3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized 
> argument: *
> 3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized 
> argument: *
> 3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized 
> argument: from
> 3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized 
> argument: from
> 3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized 
> argument: abc
> 3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized 
> argument: abc
> 3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized 
> argument: where
> 3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized 
> argument: where
> 3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized 
> argument: $CONDITIONS"
> 3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized 
> argument: $CONDITIONS"
> 3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized 
> argument: --username
> 3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized 
> argument: --username
> 3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized 
> argument: abc
> 3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized 
> argument: abc
> 3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized 
> argument: --password
> 3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized 
> argument: --password
> 3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized 
> argument: abc
> 3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized 
> argument: abc
> 3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized 
> argument: --driver
> 3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized 
> argument: --driver
> 3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized 
> argument: com.mysql.jdbc.Driver
> 3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized 
> argument: com.mysql.jdbc.Driver
> 3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized 
> argument: -m
> 3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized 
> argument: -m
> 3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized 
> argument: 1
> 3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized 
> argument: 1
> 3289 [main] DEBUG org.apache.sqoop.Sqoop - 
> Try --help for usage instructions.
> +++++++++++++++++++++++++++++++++++++++++++
> The code that causes the issue is in SqoopActionExecutor.java:
> ++++++++++++++++++++++++++++++++++
>  String[] args;
>  if (actionXml.getChild("command", ns) != null) {
>      String command = actionXml.getChild("command", ns).getTextTrim();
>      StringTokenizer st = new StringTokenizer(command, " ");
>      List<String> l = new ArrayList<String>();
>      while (st.hasMoreTokens()) {
>          l.add(st.nextToken());
>      }
>      args = l.toArray(new String[l.size()]);
>  }
>  else {
>      List<Element> eArgs = (List<Element>) actionXml.getChildren("arg", ns);
>      args = new String[eArgs.size()];
>      for (int i = 0; i < eArgs.size(); i++) {
>          args[i] = eArgs.get(i).getTextTrim();
>      }
>  }
>  setSqoopCommand(actionConf, args);
>  return actionConf;
>  }
> ++++++++++++++++++++++++++++++++++
> Since the delimiter is a space, the code splits a query such as 
> select * from table into the tokens select, *, from, table and adds them 
> separately to the array, causing the issue.
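
The splitting described above can be reproduced outside Oozie. The following is a minimal sketch that assumes only the JDK; the class name TokenizerSplitDemo and the method splitOnSpaces are hypothetical names used for illustration, and the command string is a shortened form of the one in the workflow.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class TokenizerSplitDemo {
    // Mirrors the tokenizing loop quoted from SqoopActionExecutor.java:
    // every space-separated word becomes its own argument.
    static List<String> splitOnSpaces(String command) {
        StringTokenizer st = new StringTokenizer(command, " ");
        List<String> tokens = new ArrayList<>();
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String command = "import --query select * from abc where $CONDITIONS -m 1";
        // Prints one token per line; the query arrives as six separate tokens.
        for (String token : splitOnSpaces(command)) {
            System.out.println(token);
        }
    }
}
```

Only the token that immediately follows --query is consumed as its value, which is consistent with the "Unrecognized argument" errors in the launcher log.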
> I have made a code change locally to address this issue and tested it. It 
> seems to work fine, so I am submitting the code change:
>  
>  String[] args;
>  if (actionXml.getChild("command", ns) != null) {
>      String command = actionXml.getChild("command", ns).getTextTrim();
>      // Accumulates consecutive non-option tokens (for example the select
>      // clause) so that they are added to the list as a single argument.
>      String queryAppendStr = "";
>      StringTokenizer st = new StringTokenizer(command, " ");
>      List<String> l = new ArrayList<String>();
>      while (st.hasMoreTokens()) {
>          // Decide whether the token must be appended to the pending
>          // argument or can be added to the list directly.
>          String queryStr = st.nextToken();
>          // contains("-") also covers the original "--" and "-D" checks.
>          if (!queryStr.contains("-")) {
>              queryAppendStr = queryAppendStr + queryStr + " ";
>          }
>          else {
>              if (!queryAppendStr.trim().isEmpty()) {
>                  LOG.debug("Append : [{0}]", queryAppendStr.trim());
>                  l.add(queryAppendStr.trim());
>                  queryAppendStr = "";
>              }
>              LOG.debug("Actual : [{0}]", queryStr);
>              l.add(queryStr);
>          }
>      }
>      // Add the trailing argument, if any, without adding an empty string.
>      if (!queryAppendStr.trim().isEmpty()) {
>          l.add(queryAppendStr.trim());
>      }
>      args = l.toArray(new String[l.size()]);
>  }
>  else {
>      List<Element> eArgs = (List<Element>) actionXml.getChildren("arg", ns);
>      args = new String[eArgs.size()];
>      for (int i = 0; i < eArgs.size(); i++) {
>          args[i] = eArgs.get(i).getTextTrim();
>      }
>  }
>  setSqoopCommand(actionConf, args);
>  return actionConf;
>  }
>  
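
The regrouping idea in the proposed change can be tried in isolation. This is a sketch under stated assumptions, not the contents of OOZIE-3218.patch: the class name RegroupArgsDemo and the method regroup are hypothetical, logging is omitted, and only the JDK is used.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class RegroupArgsDemo {
    // Hypothetical helper: joins consecutive tokens that contain no '-'
    // into one argument, following the approach of the proposed change.
    static List<String> regroup(String command) {
        StringTokenizer st = new StringTokenizer(command, " ");
        List<String> args = new ArrayList<>();
        StringBuilder pending = new StringBuilder();
        while (st.hasMoreTokens()) {
            String token = st.nextToken();
            if (!token.contains("-")) {
                // Not option-like: append to the pending argument.
                if (pending.length() > 0) {
                    pending.append(' ');
                }
                pending.append(token);
            } else {
                // Option-like: flush the pending argument, then add the token.
                if (pending.length() > 0) {
                    args.add(pending.toString());
                    pending.setLength(0);
                }
                args.add(token);
            }
        }
        // Flush whatever is still pending after the last token.
        if (pending.length() > 0) {
            args.add(pending.toString());
        }
        return args;
    }

    public static void main(String[] args) {
        String command = "import --connect jdbc:mysql://test.openstacklocal/db "
                + "--query select * from abc where $CONDITIONS -m 1";
        for (String arg : regroup(command)) {
            System.out.println("[" + arg + "]");
        }
    }
}
```

With this input the query is kept together as the single argument select * from abc where $CONDITIONS. One limitation of the approach, as sketched here, is that any token containing a hyphen ends the group, so a query with a literal such as 2018-01-01 would still be split. The arg branch of the quoted code passes each element through unchanged, so it is not affected by the tokenizing.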



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
