Shubham created OOZIE-3273:
------------------------------
Summary: FsAction should fail on retry if destination path exists
Key: OOZIE-3273
URL: https://issues.apache.org/jira/browse/OOZIE-3273
Project: Oozie
Issue Type: Bug
Reporter: Shubham
This FsAction fails with error code FS008 if the source files already exist in
the target folder.
The expected behavior is that Oozie retries the action after 1 minute and then
marks it as failed, because the error is still there (we didn't clean up the
target folder).
However, Oozie marks the action as successful on retry.
Logs:
{code}
2018-05-15 00:08:05,187 INFO ActionStartXCommand:520 - SERVER[mn2.sf.priv]
USER[hive] GROUP[-] TOKEN[] APP[sample-wf]
JOB[0000061-180514024838863-oozie-oozi-W]
ACTION[0000061-180514024838863-oozie-oozi-W@loading] Start action
[0000061-180514024838863-oozie-oozi-W@loading] with user-retry state :
userRetryCount [0], userRetryMax [2], userRetryInterval [1]
2018-05-15 00:08:05,201 WARN ActionStartXCommand:523 - SERVER[mn2.sf.priv]
USER[hive] GROUP[-] TOKEN[] APP[sample-wf]
JOB[0000061-180514024838863-oozie-oozi-W]
ACTION[0000061-180514024838863-oozie-oozi-W@loading] Error starting action
[load-staging]. ErrorType [ERROR], ErrorCode [FS008], Message [FS008: move,
could not move
[hdfs://nn:8020/user/hive/audit/data/ingestion/USER_ACCOUNT_AF_A/1522284431816-2018-03-28_1747_11.816-PT2M10.096S-TEST.0-19462_24325-67401946-8fcf-4940-91ec-063016a5da48.avro]
to [hdfs://nn:8020/user/hive/audit/data/staging/USER_ACCOUNT_AF_A]]
org.apache.oozie.action.ActionExecutorException: FS008: move, could not move
[hdfs://nn:8020/user/hive/audit/data/ingestion/SAMPLE_WF/1522284431816-2018-03-28_1747_11.816-PT2M10.096S-TEST.0-19462_24325-67401946-8fcf-4940-91ec-063016a5da48.avro]
to [hdfs://nn:8020/user/hive/audit/data/staging/USER_ACCOUNT_AF_A]
    at org.apache.oozie.action.hadoop.FsActionExecutor.move(FsActionExecutor.java:509)
    at org.apache.oozie.action.hadoop.FsActionExecutor.doOperations(FsActionExecutor.java:199)
    at org.apache.oozie.action.hadoop.FsActionExecutor.start(FsActionExecutor.java:609)
    at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:234)
    at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:65)
    at org.apache.oozie.command.XCommand.call(XCommand.java:287)
    at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:331)
    at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:260)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:178)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
2018-05-15 00:08:05,202 WARN ActionStartXCommand:523 - SERVER[mn2.sf.priv]
USER[hive] GROUP[-] TOKEN[] APP[sample-wf]
JOB[0000061-180514024838863-oozie-oozi-W]
ACTION[0000061-180514024838863-oozie-oozi-W@loading] Setting Action Status to
[DONE]
2018-05-15 00:08:05,202 INFO ActionStartXCommand:520 - SERVER[mn2.sf.priv]
USER[hive] GROUP[-] TOKEN[] APP[sample-wf]
JOB[0000061-180514024838863-oozie-oozi-W]
ACTION[0000061-180514024838863-oozie-oozi-W@loading] Preparing retry this
action [0000061-180514024838863-oozie-oozi-W@loading], errorCode [FS008],
userRetryCount [0], userRetryMax [2], userRetryInterval [1]
2018-05-15 00:09:05,254 INFO ActionStartXCommand:520 - SERVER[mn2.sf.priv]
USER[hive] GROUP[-] TOKEN[] APP[sample-wf]
JOB[0000061-180514024838863-oozie-oozi-W]
ACTION[0000061-180514024838863-oozie-oozi-W@loading] Start action
[0000061-180514024838863-oozie-oozi-W@loading] with user-retry state :
userRetryCount [1], userRetryMax [2], userRetryInterval [1]
2018-05-15 00:09:05,276 INFO ActionStartXCommand:520 - SERVER[mn2.sf.priv]
USER[hive] GROUP[-] TOKEN[] APP[sample-wf]
JOB[0000061-180514024838863-oozie-oozi-W]
ACTION[0000061-180514024838863-oozie-oozi-W@loading]
[***0000061-180514024838863-oozie-oozi-W@loading***]Action status=DONE
2018-05-15 00:09:05,277 INFO ActionStartXCommand:520 - SERVER[mn2.sf.priv]
USER[hive] GROUP[-] TOKEN[] APP[sample-wf]
JOB[0000061-180514024838863-oozie-oozi-W]
ACTION[0000061-180514024838863-oozie-oozi-W@loading]
[***0000061-180514024838863-oozie-oozi-W@loading***]Action updated in DB!
2018-05-15 00:09:05,314 INFO ActionStartXCommand:520 - SERVER[mn2.sf.priv]
USER[hive] GROUP[-] TOKEN[] APP[sample-wf]
JOB[0000061-180514024838863-oozie-oozi-W]
ACTION[0000061-180514024838863-oozie-oozi-W@end] Start action
[0000061-180514024838863-oozie-oozi-W@end] with user-retry state :
userRetryCount [0], userRetryMax [0], userRetryInterval [10]
2018-05-15 00:09:05,314 INFO ActionStartXCommand:520 - SERVER[mn2.sf.priv]
USER[hive] GROUP[-] TOKEN[] APP[sample-wf]
JOB[0000061-180514024838863-oozie-oozi-W]
ACTION[0000061-180514024838863-oozie-oozi-W@end]
[***0000061-180514024838863-oozie-oozi-W@end***]Action status=DONE
2018-05-15 00:09:05,314 INFO ActionStartXCommand:520 - SERVER[mn2.sf.priv]
USER[hive] GROUP[-] TOKEN[] APP[sample-wf]
JOB[0000061-180514024838863-oozie-oozi-W]
ACTION[0000061-180514024838863-oozie-oozi-W@end]
[***0000061-180514024838863-oozie-oozi-W@end***]Action updated in DB!
{code}
Observations:
- The first time, the `recovery` parameter is false, so when the rename fails
the check `!fs.rename(p, target) && !recovery` is true and FS008 is thrown.
- The second time, `recovery` is true. It does go into the loop `for (Path p :
pathArr)`, but `!recovery` is false, so a failed rename no longer throws and
the action is reported as successful.
{code}
public void move(Context context, XConfiguration fsConf, Path nameNodePath,
        Path source, Path target, boolean recovery)
        throws ActionExecutorException {
    try {
        source = resolveToFullPath(nameNodePath, source, true);
        validateSameNN(source, target);
        FileSystem fs = getFileSystemFor(source, context, fsConf);
        Path[] pathArr = FileUtil.stat2Paths(fs.globStatus(source));
        if (( pathArr == null || pathArr.length == 0 ) ){
            if (!recovery) {
                throw new ActionExecutorException(ActionExecutorException.ErrorType.ERROR, "FS006",
                        "move, source path [{0}] does not exist", source);
            } else {
                return;
            }
        }
        if (pathArr.length > 1 && (!fs.exists(target) || fs.isFile(target))) {
            if(!recovery) {
                throw new ActionExecutorException(ActionExecutorException.ErrorType.ERROR, "FS012",
                        "move, could not rename multiple sources to the same target name");
            } else {
                return;
            }
        }
        checkGlobMax(pathArr);
        for (Path p : pathArr) {
            if (!fs.rename(p, target) && !recovery) {
                throw new ActionExecutorException(ActionExecutorException.ErrorType.ERROR, "FS008",
                        "move, could not move [{0}] to [{1}]", p, target);
            }
        }
    }
    catch (Exception ex) {
        throw convertException(ex);
    }
}
{code}
I think it should fail the second time as well.
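To make the observation easier to reproduce outside Oozie, here is a minimal, self-contained sketch of the loop condition quoted above. It is not the Oozie code and not a proposed patch. `FakeFs`, `moveAsReported`, `moveStricter` and `fails` are hypothetical names, and the in-memory `rename` only assumes that a rename returns false when the file already exists in the target folder, as the FS008 error in the logs suggests. `moveStricter` shows one possible direction: on recovery, ignore a failed rename only if the source is already gone.

```java
import java.util.HashSet;
import java.util.Set;

public class MoveRecoverySketch {

    /** Hypothetical in-memory stand-in for a file system; this is NOT the Hadoop FileSystem API. */
    static class FakeFs {
        final Set<String> paths = new HashSet<>();

        boolean exists(String path) {
            return paths.contains(path);
        }

        /** Assumption: returns false when the source is missing or the target folder already holds the file. */
        boolean rename(String source, String targetDir) {
            String name = source.substring(source.lastIndexOf('/') + 1);
            String dest = targetDir + "/" + name;
            if (!paths.contains(source) || paths.contains(dest)) {
                return false;
            }
            paths.remove(source);
            paths.add(dest);
            return true;
        }
    }

    /** Mirrors the reported condition: a failed rename is ignored whenever recovery is true. */
    static void moveAsReported(FakeFs fs, String source, String targetDir, boolean recovery) {
        if (!fs.rename(source, targetDir) && !recovery) {
            throw new IllegalStateException("FS008: move, could not move [" + source + "] to [" + targetDir + "]");
        }
    }

    /** Assumed alternative: on recovery, tolerate a failed rename only if the source no longer exists. */
    static void moveStricter(FakeFs fs, String source, String targetDir, boolean recovery) {
        if (!fs.rename(source, targetDir)) {
            boolean alreadyMoved = recovery && !fs.exists(source);
            if (!alreadyMoved) {
                throw new IllegalStateException("FS008: move, could not move [" + source + "] to [" + targetDir + "]");
            }
        }
    }

    /** Helper: true if the given attempt throws, false if it completes. */
    static boolean fails(Runnable attempt) {
        try {
            attempt.run();
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        FakeFs fs = new FakeFs();
        fs.paths.add("/ingestion/a.avro");
        fs.paths.add("/staging/a.avro"); // leftover file that was never cleaned up

        System.out.println("reported check, first attempt fails: "
                + fails(() -> moveAsReported(fs, "/ingestion/a.avro", "/staging", false)));
        System.out.println("reported check, retry fails: "
                + fails(() -> moveAsReported(fs, "/ingestion/a.avro", "/staging", true)));
        System.out.println("stricter check, retry fails: "
                + fails(() -> moveStricter(fs, "/ingestion/a.avro", "/staging", true)));
    }
}
```

With the leftover file still in place, the reported check fails on the first attempt but passes on the retry, while the stricter check keeps failing on the retry. Whether "source already gone" is the right recovery criterion for Oozie is an open question for the actual fix.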