[
https://issues.apache.org/jira/browse/OOZIE-2394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067116#comment-15067116
]
Purshotam Shah commented on OOZIE-2394:
---------------------------------------
{quote}
Removing that will impact performance very badly and we can't do that.
{quote}
None of the workflow start/end commands use eager load or eager verify, so I
don't think there will be any performance issue. In fact, wherever I saw eager
load and eager verify used, the logic was either incorrect or the verify/load
was done more than once.
For example, BundleJobChangeXCommand does its pre-validation in
eagerVerifyPrecondition, which is incorrect.
eagerVerifyPrecondition runs before the lock is acquired, so the status it
checks can change before the command body executes. As a result, it is possible
to pause a bundle that has already SUCCEEDED.
{code BundleJobChangeXCommand.java}
@Override
protected void verifyPrecondition() throws CommandException,
        PreconditionException {
}

@Override
protected void eagerLoadState() throws CommandException {
    try {
        this.bundleJob = BundleJobQueryExecutor.getInstance()
                .get(BundleJobQuery.GET_BUNDLE_JOB_STATUS, jobId);
        LogUtils.setLogInfo(bundleJob);
    }
    catch (JPAExecutorException ex) {
        throw new CommandException(ex);
    }
}

@Override
protected void eagerVerifyPrecondition() throws CommandException,
        PreconditionException {
    validateChangeValue(changeValue);
    if (bundleJob == null) {
        LOG.info("BundleChangeCommand not succeeded - " + "job " + jobId +
                " does not exist");
        throw new PreconditionException(ErrorCode.E1314, jobId);
    }
    if (isChangePauseTime) {
        if (bundleJob.getStatus() == Job.Status.SUCCEEDED
                || bundleJob.getStatus() == Job.Status.FAILED
                || bundleJob.getStatus() == Job.Status.KILLED
                || bundleJob.getStatus() == Job.Status.DONEWITHERROR) {
            LOG.info("BundleChangeCommand not succeeded for changing pausetime- "
                    + "job " + jobId + " finished, status is "
                    + bundleJob.getStatusStr());
            throw new PreconditionException(ErrorCode.E1312, jobId,
                    bundleJob.getStatus().toString());
        }
    }
    else if (isChangeEndTime) {
        if (bundleJob.getStatus() == Job.Status.KILLED) {
            LOG.info("BundleChangeCommand not succeeded for changing endtime- "
                    + "job " + jobId + " finished, status is "
                    + bundleJob.getStatusStr());
            throw new PreconditionException(ErrorCode.E1312, jobId,
                    bundleJob.getStatus().toString());
        }
    }
}
{code}
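To make the ordering problem concrete, here is a minimal standalone sketch of
the check-before-lock pattern. It does not use any Oozie classes: Job, Status,
jobLock and the two pause methods are hypothetical stand-ins, and the
"concurrent" update is simulated with a Runnable so the result is deterministic.

```java
import java.util.concurrent.locks.ReentrantLock;

public class EagerVerifyRace {
    enum Status { RUNNING, SUCCEEDED }

    // Hypothetical stand-in for a bundle job record; not an Oozie class.
    static final class Job {
        Status status = Status.RUNNING;
        boolean pauseTimeChanged = false;
    }

    static final ReentrantLock jobLock = new ReentrantLock();

    // Validates before taking the lock, like an eager precondition check.
    // concurrentUpdate simulates another command finishing the job in between.
    static boolean pauseWithEagerCheckOnly(Job job, Runnable concurrentUpdate) {
        boolean ok = job.status != Status.SUCCEEDED; // checked without the lock
        concurrentUpdate.run();                      // job finishes here
        jobLock.lock();
        try {
            if (ok) {
                job.pauseTimeChanged = true;         // acts on a stale check
            }
            return ok;
        } finally {
            jobLock.unlock();
        }
    }

    // Validates only after the lock is held, so the status cannot be stale.
    static boolean pauseWithCheckUnderLock(Job job, Runnable concurrentUpdate) {
        concurrentUpdate.run();
        jobLock.lock();
        try {
            if (job.status == Status.SUCCEEDED) {
                return false;
            }
            job.pauseTimeChanged = true;
            return true;
        } finally {
            jobLock.unlock();
        }
    }

    public static void main(String[] args) {
        Job a = new Job();
        System.out.println(pauseWithEagerCheckOnly(a, () -> a.status = Status.SUCCEEDED));
        Job b = new Job();
        System.out.println(pauseWithCheckUnderLock(b, () -> b.status = Status.SUCCEEDED));
    }
}
```

In this sketch the first call applies the pause to a job that is already
SUCCEEDED, while the second rejects it because the check happens under the lock.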
bq. Even if locks are re-entrant, acquiring them has minor cost associated with
it and so it was skipping that as well which is good.
No, I don't think so. The re-entrancy check is done on the client side; I guess
the Curator framework doesn't even have to call ZK for it.
Doing this with custom code, when it is already handled by the Java/Curator
lock implementations, might introduce more bugs.
The only issue I see is https://issues.apache.org/jira/browse/OOZIE-1922. I will
fix that as well.
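As a small illustration of the re-entrancy argument, the sketch below uses
java.util.concurrent.locks.ReentrantLock as a stand-in for the job lock. It is
an assumption that Oozie's lock services behave the same way for nested
acquisition; the class and method names here are made up for the example.

```java
import java.util.concurrent.locks.ReentrantLock;

public class ReentrantDemo {
    static final ReentrantLock lock = new ReentrantLock();

    // An outer "command" takes the lock, then a nested one takes it again
    // on the same thread. The second lock() returns immediately.
    static int nestedHoldCount() {
        lock.lock();
        try {
            lock.lock(); // re-entrant acquisition by the owning thread
            try {
                return lock.getHoldCount();
            } finally {
                lock.unlock();
            }
        } finally {
            lock.unlock();
        }
    }

    // While this thread holds the lock, a different thread tries to take it.
    static boolean otherThreadCanAcquire() {
        final boolean[] acquired = new boolean[1];
        lock.lock();
        try {
            Thread t = new Thread(() -> {
                acquired[0] = lock.tryLock(); // non-blocking attempt
                if (acquired[0]) {
                    lock.unlock();
                }
            });
            t.start();
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            lock.unlock();
        }
        return acquired[0];
    }

    public static void main(String[] args) {
        System.out.println(nestedHoldCount());       // 2
        System.out.println(otherThreadCanAcquire()); // false
    }
}
```

The nested acquisition only bumps a per-thread hold count, which is why taking
the lock again from the same thread should be cheap, while any other thread is
still kept out.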
> Oozie can execute command without holding lock
> ----------------------------------------------
>
> Key: OOZIE-2394
> URL: https://issues.apache.org/jira/browse/OOZIE-2394
> Project: Oozie
> Issue Type: Bug
> Reporter: Purshotam Shah
> Assignee: Purshotam Shah
> Priority: Critical
> Attachments: OOZIE-2394-V1.patch
>
>
> To speed up job submission (not the forked actions), we create workflow
> actions synchronously. We call ActionStartXCommand from SignalXCommand by
> setting isSynchronous = true. This bypasses lock acquisition, which is OK
> because SignalXCommand holds the job lock.
> If there is a transient error, the same command is requeued with the
> isSynchronous flag still set to true.
> The requeued command wakes up and starts executing without acquiring the lock.
> If the job submission takes more than 2 minutes, we might have an issue.
> Action recovery is set to 2 minutes (default). The recovery service will run
> and submit a new command. Since the first command didn't acquire any lock,
> recovery will be able to run the new command.
> We will then have two identical commands running in parallel.
> All our commands are reentrant, so we don't have to set the synchronous flag
> to run multiple commands from the same thread.
> Because of reentrancy, a command running in the same thread should be able to
> acquire the same lock.
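
The requeue scenario from the description can be sketched as follows. This is
only a model of the problem, not Oozie code: Command, skipLock and runRequeued
are hypothetical names, and a single-thread executor stands in for the command
queue.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.locks.ReentrantLock;

public class SyncFlagRequeue {
    static final ReentrantLock jobLock = new ReentrantLock();

    // Hypothetical command; skipLock plays the role of the synchronous flag.
    static final class Command implements Callable<Boolean> {
        final boolean skipLock;

        Command(boolean skipLock) {
            this.skipLock = skipLock;
        }

        // Returns true if the body ran while this thread held the job lock.
        @Override
        public Boolean call() {
            if (skipLock) {
                // Relies on the caller already holding the lock.
                return jobLock.isHeldByCurrentThread();
            }
            jobLock.lock();
            try {
                return jobLock.isHeldByCurrentThread();
            } finally {
                jobLock.unlock();
            }
        }
    }

    // Runs the command on a queue thread, as a requeued command would run.
    static boolean runRequeued(Command c) {
        ExecutorService queue = Executors.newSingleThreadExecutor();
        try {
            return queue.submit(c).get();
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            queue.shutdown();
        }
    }

    public static void main(String[] args) {
        // Inline call from a parent that holds the lock: protected.
        jobLock.lock();
        try {
            System.out.println(new Command(true).call());    // true
        } finally {
            jobLock.unlock();
        }
        // Requeued with the flag still set: runs with no lock held.
        System.out.println(runRequeued(new Command(true)));  // false
        // Requeued without the flag: acquires the lock itself.
        System.out.println(runRequeued(new Command(false))); // true
    }
}
```

With a re-entrant lock, the inline case would also work if the command simply
acquired the lock itself, which is the point made in the last two quoted
sentences.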