Mona Chitnis created OOZIE-1911:
-----------------------------------
Summary: SLA calculation in HA mode does wrong bit comparison for
'start' and 'duration'
Key: OOZIE-1911
URL: https://issues.apache.org/jira/browse/OOZIE-1911
Project: Oozie
Issue Type: Bug
Affects Versions: trunk
Reporter: Mona Chitnis
Assignee: Mona Chitnis
Fix For: trunk
In chronological order:
Server 1:
Job's SLA eventProcessed set to 0101 => Start and End sla processed.
Server 2:
Receives the above job's status event and processes the remaining 'duration' SLA.
eventProcessed is now 0111 (7), but it is then set to 1000 (8) due to
{code}
// SLACalculatorMemory.addJobStatus(), line 762
if (slaCalc.getEventProcessed() == 7) {
    slaInfo.setEventProcessed(8);
    slaMap.remove(jobId);
}
{code}
Back to Server 1: (doing periodic SLA checks)
{code}
// SLACalculatorMemory.updateJobSla(), line 483
if ((eventProc & 1) == 0) { // first bit (start-processed) unset
    if (reg.getExpectedStart() != null) {
        if (reg.getExpectedStart().getTime() + jobEventLatency < System.currentTimeMillis()) {
            // goes ahead and enqueues another START_MISS event and
            // DURATION_MET event
{code}
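Conclusion: once eventProcessed is 1000 (8), the 'start' and 'duration' bits read
as unset, so Server 1's periodic check treats them as unprocessed. Need to fix the
checks on the least significant bit ('start') and the bit next to it ('duration')
to avoid duplicate events.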
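
The sequence above can be reproduced with plain integer arithmetic. The following is a minimal sketch, not the Oozie code: the class SlaEventBits, its constants, and the helper methods are hypothetical names introduced for illustration, and the bit assignment (start = 1, duration = 2, end = 4) is inferred from the 0101 and 0111 values in this report. The extra test of the 1000 bit is only one possible guard, not necessarily the fix that should go into trunk.

```java
// Hypothetical illustration class; not part of Oozie.
class SlaEventBits {
    // Bit layout inferred from the report: 0101 = start and end processed.
    static final int START = 1;     // 0001
    static final int DURATION = 2;  // 0010
    static final int END = 4;       // 0100
    static final int ALL = START | DURATION | END; // 0111
    static final int DONE = 8;      // 1000, written once all three are processed

    // The check as quoted from updateJobSla(): looks only at the lowest bit.
    static boolean startPendingAsReported(int eventProc) {
        return (eventProc & START) == 0;
    }

    // Assumed guard: a value with the DONE bit set has nothing pending.
    static boolean startPending(int eventProc) {
        return (eventProc & DONE) == 0 && (eventProc & START) == 0;
    }

    // Same assumed guard applied to the 'duration' bit.
    static boolean durationPending(int eventProc) {
        return (eventProc & DONE) == 0 && (eventProc & DURATION) == 0;
    }

    public static void main(String[] args) {
        int eventProc = START | END;   // Server 1: 0101
        eventProc |= DURATION;         // Server 2: 0111
        if (eventProc == ALL) {
            eventProc = DONE;          // Server 2: bumped to 1000
        }
        // Back on Server 1: the quoted check sees the start bit as unset.
        System.out.println("reported check, start pending: " + startPendingAsReported(eventProc));
        System.out.println("guarded check, start pending: " + startPending(eventProc));
        System.out.println("guarded check, duration pending: " + durationPending(eventProc));
    }
}
```

With eventProcessed at 8, the quoted check evaluates (8 & 1) == 0 as true, which is what lets the duplicate START_MISS through; the guarded variants return false for the same value.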
--
This message was sent by Atlassian JIRA
(v6.2#6252)