[ https://issues.apache.org/jira/browse/OOZIE-1492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13844814#comment-13844814 ]
Robert Kanter commented on OOZIE-1492:
--------------------------------------
{quote}
So we need to ensure that the same hcat server that materialized the action
runs input check and launches job.
{quote}
If the Oozie server that materialized the action goes down, some other Oozie
server would have to pick this up. So I don't think we can do that...
{quote}
We will have to split that between HA servers based on what jobs they will
process.
{quote}
The jobs aren't assigned to specific Oozie servers; they are processed by
whichever Oozie server happens to process them at a given time.
Instead of distributing the HCat and SLA data between the Oozie servers, what
if we have just one server handle them? There's a method that returns true
only for one server (the first one that registered in ZK), so we could have
this server handle the HCat and SLA stuff. If it goes down, the next
registered server would become the first server and we could rebuild the data
structures.
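To make the idea concrete, here is a rough sketch in plain Java. Everything in it is hypothetical: `ServerRegistry` is just an in-memory stand-in for the ZK registration order, and `isFirstServer`, `HCatSlaOwner`, and `checkOwnership` are made-up names, not the actual Oozie method or classes. It only illustrates the "first registered server owns the HCat/SLA state, and the next one rebuilds it on failover" behavior.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical in-memory stand-in for the ZK registration order; not Oozie's API. */
class ServerRegistry {
    // Server ids in registration order, oldest first (assumed to mirror ZK ordering).
    private final List<String> registered = new ArrayList<>();

    synchronized void register(String serverId) {
        if (!registered.contains(serverId)) {
            registered.add(serverId);
        }
    }

    // Simulates a server going down and losing its registration.
    synchronized void unregister(String serverId) {
        registered.remove(serverId);
    }

    // True only for the server that registered first among those still alive.
    synchronized boolean isFirstServer(String serverId) {
        return !registered.isEmpty() && registered.get(0).equals(serverId);
    }
}

/** Hypothetical owner of the in-memory HCat and SLA data structures. */
class HCatSlaOwner {
    private final String serverId;
    private final ServerRegistry registry;
    private boolean dataLoaded = false;
    private int rebuildCount = 0;

    HCatSlaOwner(String serverId, ServerRegistry registry) {
        this.serverId = serverId;
        this.registry = registry;
    }

    /** Meant to be called periodically; returns true if this server should do the HCat/SLA work. */
    boolean checkOwnership() {
        if (!registry.isFirstServer(serverId)) {
            dataLoaded = false; // not the owner, so any local state is considered stale
            return false;
        }
        if (!dataLoaded) {
            rebuildDataStructures(); // just became the owner (startup or failover)
        }
        return true;
    }

    // Placeholder: a real implementation would reload the state from persistent storage.
    private void rebuildDataStructures() {
        rebuildCount++;
        dataLoaded = true;
    }

    int getRebuildCount() {
        return rebuildCount;
    }
}
```

With two servers registered in the order oozie-1, oozie-2, only oozie-1's `checkOwnership()` returns true. After oozie-1 is unregistered, oozie-2's next call returns true and triggers exactly one rebuild.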
Another related question: For the JMS messages (which include SLA
notifications), when we have multiple Oozie servers, I suppose they should all
be able to publish to the same topics in the JMS provider, right? Do you know
if that will work out-of-the-box or will we need to do some work on that too?
> Make sure HA works with HCat and SLA notifications
> --------------------------------------------------
>
> Key: OOZIE-1492
> URL: https://issues.apache.org/jira/browse/OOZIE-1492
> Project: Oozie
> Issue Type: Improvement
> Components: HA
> Affects Versions: trunk
> Reporter: Robert Kanter
>
> We need to make sure HA works with HCat integration and SLA notifications.
> Both have in-memory data structures, and HA will impact them.