Andras Piros created OOZIE-3132:
-----------------------------------
Summary: Instrument SLAService and SLACalculatorMemory
Key: OOZIE-3132
URL: https://issues.apache.org/jira/browse/OOZIE-3132
Project: Oozie
Issue Type: Improvement
Components: core
Affects Versions: 4.3.0
Reporter: Andras Piros
Assignee: Andras Piros
Fix For: 5.0.0b1
When there are lots of {{WorkflowJobBean}} and {{CoordinatorJobBean}} instances
that have to be followed up on creating {{SLASummaryBean}} instances, the following
can occur:
* we set {{oozie.sla.service.SLAService.capacity}} to a sane value like
{{10000}} to preserve heap consumption
* {{SLACalculatorMemory#addRegistration()}} and
{{SLACalculatorMemory#updateRegistration()}} would:
** either emit {{TRACE}} level logs like {{SLA Registration Event - Job:}}
showing the add / update of {{SLARegistrationBean}} was successful
** or emit {{ERROR}} level logs like {{SLACalculator memory capacity reached.
Cannot add or update new SLA Registration entry for job}} showing the add /
update of {{SLARegistrationBean}} was not successful
Since stale or already processed {{SLAEvent}} entries are sometimes removed from
{{SLACalculatorMemory#slaMap}}, it's pretty hard to say what its actual size
is - that is, whether the next add or update command will succeed.
We need an {{Instrumentation.Counter}} instance that gets incremented whenever
{{SLACalculatorMemory#slaMap#put()}} adds a new entry, and gets decremented
whenever {{SLACalculatorMemory#slaMap#remove()}} removes an existing entry.
This counter will be automatically available in the REST interface and the
Oozie client.
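
A minimal sketch of the bookkeeping described above, to make the increment / decrement rules concrete. It does not use Oozie classes: {{SlaMapCounterSketch}}, {{addOrUpdate()}}, {{remove()}} and {{currentEntries()}} are hypothetical names, a plain {{AtomicLong}} stands in for {{Instrumentation.Counter}}, and {{String}} values stand in for the SLA beans.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for the slaMap bookkeeping; not actual Oozie code. */
class SlaMapCounterSketch {
    // Assumption: job id -> registration payload, standing in for slaMap.
    private final ConcurrentMap<String, String> slaMap = new ConcurrentHashMap<>();
    // Assumption: an AtomicLong plays the role of the instrumentation counter.
    private final AtomicLong entryCounter = new AtomicLong();
    private final int capacity;

    SlaMapCounterSketch(int capacity) {
        this.capacity = capacity;
    }

    /**
     * Adds or updates an entry. Returns false when a new entry would exceed
     * the capacity. The capacity check is best-effort under concurrency.
     */
    boolean addOrUpdate(String jobId, String registration) {
        if (!slaMap.containsKey(jobId) && entryCounter.get() >= capacity) {
            return false;
        }
        // put() returns null only when the key was not present before,
        // so the counter grows for new entries and not for updates.
        if (slaMap.put(jobId, registration) == null) {
            entryCounter.incrementAndGet();
        }
        return true;
    }

    /** Removes an entry; the counter shrinks only if something was removed. */
    boolean remove(String jobId) {
        if (slaMap.remove(jobId) != null) {
            entryCounter.decrementAndGet();
            return true;
        }
        return false;
    }

    /** Current number of tracked entries, as the counter sees it. */
    long currentEntries() {
        return entryCounter.get();
    }
}
```

With a counter like this exposed, comparing its value against {{oozie.sla.service.SLAService.capacity}} would show how close the map is to rejecting new registrations.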
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)