Shubham created OOZIE-3273:
------------------------------

             Summary: FsAction should fail on retry if destination path exists
                 Key: OOZIE-3273
                 URL: https://issues.apache.org/jira/browse/OOZIE-3273
             Project: Oozie
          Issue Type: Bug
            Reporter: Shubham


This FsAction fails with error code FS008 if the source files already exist in 
the target folder.

The expected behavior is that Oozie retries the action after 1 minute and then 
marks it as failed, because the error is still present.

However, Oozie marks the action as successful on retry, even though we did not 
clean up the target folder.

Logs:

{code}
2018-05-15 00:08:05,187 INFO ActionStartXCommand:520 - SERVER[mn2.sf.priv] 
USER[hive] GROUP[-] TOKEN[] APP[sample-wf] 
JOB[0000061-180514024838863-oozie-oozi-W] 
ACTION[0000061-180514024838863-oozie-oozi-W@loading] Start action 
[0000061-180514024838863-oozie-oozi-W@loading] with user-retry state : 
userRetryCount [0], userRetryMax [2], userRetryInterval [1]
2018-05-15 00:08:05,201 WARN ActionStartXCommand:523 - SERVER[mn2.sf.priv] 
USER[hive] GROUP[-] TOKEN[] APP[sample-wf] 
JOB[0000061-180514024838863-oozie-oozi-W] 
ACTION[0000061-180514024838863-oozie-oozi-W@loading] Error starting action 
[load-staging]. ErrorType [ERROR], ErrorCode [FS008], Message [FS008: move, 
could not move 
[hdfs://nn:8020/user/hive/audit/data/ingestion/USER_ACCOUNT_AF_A/1522284431816-2018-03-28_1747_11.816-PT2M10.096S-TEST.0-19462_24325-67401946-8fcf-4940-91ec-063016a5da48.avro]
 to [hdfs://nn:8020/user/hive/audit/data/staging/USER_ACCOUNT_AF_A]]
org.apache.oozie.action.ActionExecutorException: FS008: move, could not move 
[hdfs://nn:8020/user/hive/audit/data/ingestion/SAMPLE_WF/1522284431816-2018-03-28_1747_11.816-PT2M10.096S-TEST.0-19462_24325-67401946-8fcf-4940-91ec-063016a5da48.avro]
 to [hdfs://nn:8020/user/hive/audit/data/staging/USER_ACCOUNT_AF_A]
 at 
org.apache.oozie.action.hadoop.FsActionExecutor.move(FsActionExecutor.java:509)
 at 
org.apache.oozie.action.hadoop.FsActionExecutor.doOperations(FsActionExecutor.java:199)
 at 
org.apache.oozie.action.hadoop.FsActionExecutor.start(FsActionExecutor.java:609)
 at 
org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:234)
 at 
org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:65)
 at org.apache.oozie.command.XCommand.call(XCommand.java:287)
 at 
org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:331)
 at 
org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:260)
 at java.util.concurrent.FutureTask.run(FutureTask.java:266)
 at 
org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:178)
 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
 at java.lang.Thread.run(Thread.java:745)
2018-05-15 00:08:05,202 WARN ActionStartXCommand:523 - SERVER[mn2.sf.priv] 
USER[hive] GROUP[-] TOKEN[] APP[sample-wf] 
JOB[0000061-180514024838863-oozie-oozi-W] 
ACTION[0000061-180514024838863-oozie-oozi-W@loading] Setting Action Status to 
[DONE]
2018-05-15 00:08:05,202 INFO ActionStartXCommand:520 - SERVER[mn2.sf.priv] 
USER[hive] GROUP[-] TOKEN[] APP[sample-wf] 
JOB[0000061-180514024838863-oozie-oozi-W] 
ACTION[0000061-180514024838863-oozie-oozi-W@loading] Preparing retry this 
action [0000061-180514024838863-oozie-oozi-W@loading], errorCode [FS008], 
userRetryCount [0], userRetryMax [2], userRetryInterval [1]
2018-05-15 00:09:05,254 INFO ActionStartXCommand:520 - SERVER[mn2.sf.priv] 
USER[hive] GROUP[-] TOKEN[] APP[sample-wf] 
JOB[0000061-180514024838863-oozie-oozi-W] 
ACTION[0000061-180514024838863-oozie-oozi-W@loading] Start action 
[0000061-180514024838863-oozie-oozi-W@loading] with user-retry state : 
userRetryCount [1], userRetryMax [2], userRetryInterval [1]
2018-05-15 00:09:05,276 INFO ActionStartXCommand:520 - SERVER[mn2.sf.priv] 
USER[hive] GROUP[-] TOKEN[] APP[sample-wf] 
JOB[0000061-180514024838863-oozie-oozi-W] 
ACTION[0000061-180514024838863-oozie-oozi-W@loading] 
[***0000061-180514024838863-oozie-oozi-W@loading***]Action status=DONE
2018-05-15 00:09:05,277 INFO ActionStartXCommand:520 - SERVER[mn2.sf.priv] 
USER[hive] GROUP[-] TOKEN[] APP[sample-wf] 
JOB[0000061-180514024838863-oozie-oozi-W] 
ACTION[0000061-180514024838863-oozie-oozi-W@loading] 
[***0000061-180514024838863-oozie-oozi-W@loading***]Action updated in DB!
2018-05-15 00:09:05,314 INFO ActionStartXCommand:520 - SERVER[mn2.sf.priv] 
USER[hive] GROUP[-] TOKEN[] APP[sample-wf] 
JOB[0000061-180514024838863-oozie-oozi-W] 
ACTION[0000061-180514024838863-oozie-oozi-W@end] Start action 
[0000061-180514024838863-oozie-oozi-W@end] with user-retry state : 
userRetryCount [0], userRetryMax [0], userRetryInterval [10]
2018-05-15 00:09:05,314 INFO ActionStartXCommand:520 - SERVER[mn2.sf.priv] 
USER[hive] GROUP[-] TOKEN[] APP[sample-wf] 
JOB[0000061-180514024838863-oozie-oozi-W] 
ACTION[0000061-180514024838863-oozie-oozi-W@end] 
[***0000061-180514024838863-oozie-oozi-W@end***]Action status=DONE
2018-05-15 00:09:05,314 INFO ActionStartXCommand:520 - SERVER[mn2.sf.priv] 
USER[hive] GROUP[-] TOKEN[] APP[sample-wf] 
JOB[0000061-180514024838863-oozie-oozi-W] 
ACTION[0000061-180514024838863-oozie-oozi-W@end] 
[***0000061-180514024838863-oozie-oozi-W@end***]Action updated in DB!
{code}


Observations:

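- On the first attempt the `recovery` parameter is `false`, so when 
`fs.rename(p, target)` returns false the condition 
`!fs.rename(p, target) && !recovery` is true and FS008 is thrown.
- On the second attempt `recovery` is `true`. The action still enters the loop 
`for (Path p : pathArr)`, but `!recovery` is false, so the failed rename is 
ignored and no exception is thrown. The action is therefore reported as 
successful the second time.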

{code}

public void move(Context context, XConfiguration fsConf, Path nameNodePath,
        Path source, Path target, boolean recovery)
        throws ActionExecutorException {
    try {
        source = resolveToFullPath(nameNodePath, source, true);
        validateSameNN(source, target);
        FileSystem fs = getFileSystemFor(source, context, fsConf);
        Path[] pathArr = FileUtil.stat2Paths(fs.globStatus(source));
        if ((pathArr == null || pathArr.length == 0)) {
            if (!recovery) {
                throw new ActionExecutorException(ActionExecutorException.ErrorType.ERROR, "FS006",
                        "move, source path [{0}] does not exist", source);
            } else {
                return;
            }
        }
        if (pathArr.length > 1 && (!fs.exists(target) || fs.isFile(target))) {
            if (!recovery) {
                throw new ActionExecutorException(ActionExecutorException.ErrorType.ERROR, "FS012",
                        "move, could not rename multiple sources to the same target name");
            } else {
                return;
            }
        }
        checkGlobMax(pathArr);

        for (Path p : pathArr) {
            if (!fs.rename(p, target) && !recovery) {
                throw new ActionExecutorException(ActionExecutorException.ErrorType.ERROR, "FS008",
                        "move, could not move [{0}] to [{1}]", p, target);
            }
        }
    }
    catch (Exception ex) {
        throw convertException(ex);
    }
}

{code}
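
The effect of the loop condition in `move()` can be reproduced in isolation. 
The sketch below is not Oozie code: `Renamer` and `MoveFailedException` are 
hypothetical stand-ins for `FileSystem.rename()` and `ActionExecutorException`, 
and the paths are made up. It only shows that a rename which always fails 
raises FS008 when `recovery` is false and is silently ignored when `recovery` 
is true.

```java
import java.util.Arrays;
import java.util.List;

public class MoveRetrySketch {

    /** Hypothetical stand-in for the FS008 ActionExecutorException. */
    static class MoveFailedException extends Exception {
        MoveFailedException(String message) {
            super(message);
        }
    }

    /** Hypothetical stand-in for FileSystem.rename(): returns false when the move fails. */
    interface Renamer {
        boolean rename(String source, String target);
    }

    // Same shape as the loop in move(): a failed rename only raises an error
    // when recovery is false.
    static void move(Renamer fs, List<String> sources, String target, boolean recovery)
            throws MoveFailedException {
        for (String p : sources) {
            if (!fs.rename(p, target) && !recovery) {
                throw new MoveFailedException(
                        "FS008: move, could not move [" + p + "] to [" + target + "]");
            }
        }
    }

    /** Runs one attempt and reports whether it was treated as a success. */
    static boolean attempt(Renamer fs, boolean recovery) {
        try {
            move(fs, Arrays.asList("/ingestion/a.avro"), "/staging", recovery);
            return true;
        } catch (MoveFailedException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The destination already exists on both attempts, so rename always fails.
        Renamer alwaysFails = (source, target) -> false;
        System.out.println("first attempt succeeded: " + attempt(alwaysFails, false));
        System.out.println("retry succeeded: " + attempt(alwaysFails, true));
    }
}
```

With these assumptions the first attempt is reported as failed and the retry 
as succeeded, which matches the behavior in the logs above.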

I think it should fail on the second attempt as well.
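
One possible direction, offered only as a sketch and not necessarily the fix 
the project would adopt: treat a missing source as "already moved" during 
recovery, but treat a failed rename of a source that still exists as an error 
regardless of `recovery`. The class `FakeFs`, its methods, and the paths below 
are hypothetical and only imitate the relevant file system behavior in memory.

```java
import java.util.HashSet;
import java.util.Set;

public class StrictMoveSketch {

    /** Hypothetical stand-in for ActionExecutorException. */
    static class MoveFailedException extends Exception {
        MoveFailedException(String message) {
            super(message);
        }
    }

    /** Hypothetical in-memory file system; rename fails if the destination name is taken. */
    static class FakeFs {
        final Set<String> paths = new HashSet<>();

        boolean exists(String path) {
            return paths.contains(path);
        }

        boolean rename(String source, String targetDir) {
            String dest = targetDir + source.substring(source.lastIndexOf('/'));
            if (!paths.contains(source) || paths.contains(dest)) {
                return false;
            }
            paths.remove(source);
            paths.add(dest);
            return true;
        }
    }

    static void move(FakeFs fs, String source, String targetDir, boolean recovery)
            throws MoveFailedException {
        if (!fs.exists(source)) {
            if (recovery) {
                return; // source is gone: assume an earlier attempt already moved it
            }
            throw new MoveFailedException("FS006: move, source path [" + source + "] does not exist");
        }
        if (!fs.rename(source, targetDir)) {
            // The source still exists, so the move did not happen: fail even in recovery.
            throw new MoveFailedException(
                    "FS008: move, could not move [" + source + "] to [" + targetDir + "]");
        }
    }

    /** Runs one attempt and reports whether it was treated as a success. */
    static boolean attempt(FakeFs fs, String source, String targetDir, boolean recovery) {
        try {
            move(fs, source, targetDir, recovery);
            return true;
        } catch (MoveFailedException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        FakeFs fs = new FakeFs();
        fs.paths.add("/ingestion/a.avro");
        fs.paths.add("/staging/a.avro"); // destination already present, as in this report
        System.out.println("first attempt succeeded: " + attempt(fs, "/ingestion/a.avro", "/staging", false));
        System.out.println("retry succeeded: " + attempt(fs, "/ingestion/a.avro", "/staging", true));
    }
}
```

Under these assumptions both the first attempt and the retry fail while the 
destination file is still present, whereas a retry after a move that already 
completed still returns quietly because the source no longer exists.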



