[ https://issues.apache.org/jira/browse/OOZIE-1561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13798280#comment-13798280 ]
Mona Chitnis commented on OOZIE-1561:
-------------------------------------

Good elaboration on thoughts. However,
bq. Log messages that contain a job ID go to their own log file (and also to the oozie.log file?) and all other log messages go to the oozie.log file.

Is this the case? I was under the impression that all logs go to oozie.log and are parsed by matching jobid to stream job logs

> When using Oozie HA, the logs should also be HA
> -----------------------------------------------
>
> Key: OOZIE-1561
> URL: https://issues.apache.org/jira/browse/OOZIE-1561
> Project: Oozie
> Issue Type: Improvement
> Components: HA
> Affects Versions: trunk
> Reporter: Robert Kanter
> Assignee: Robert Kanter
> Priority: Critical
>
> Currently, if an Oozie server goes down, the logs from that server become
> unavailable until the server comes back up. In the meantime, the user may or
> may not be aware that log messages could be missing when Oozie streams logs
> to the user.
> We should come up with a way to make the logs HA.
> Some ideas:
> # When rolling the logs, copy them into HDFS; Oozie servers can then read the
> log files directly from HDFS instead of each other
> #- The downside to this is that there will be a window where logs could still
> be missing as they only show up in HDFS after rolling over (default = 1hr)
> and Oozie servers would still have to contact each other for the last hour of
> logs
> #- The upside is that it minimizes the amount of logs that could be missing
> and would be fairly straightforward to implement
> # Log directly to HDFS
> #- The downside is that this may be complicated or tricky to get working
> properly
> #-- This also introduces a strict dependency on HDFS
> #- The upside is that this would completely solve the issue and Oozie servers
> would simply get all logs directly from HDFS
> # Log to ZooKeeper or a database
> #- I think the log files will be too big to do this
> I've assigned this to myself, but if someone wants to tackle this, feel free
> to reassign it. I think idea 2 is the most practical, but I'm also open to
> other ideas on how to do this.

--
This message was sent by Atlassian JIRA
(v6.1#6144)
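
For illustration only, here is a minimal Python sketch of the behaviour described in the comment above: every message lands in oozie.log, and a job's log is streamed by matching the job ID. It also merges the filtered output of several servers, which is roughly what an HA-aware reader would have to do. This is not Oozie's actual code. The function names, the assumption that each line starts with a sortable timestamp, and the assumption that the job ID appears in a `JOB[...]` token are all hypothetical choices made for the sketch.

```python
import heapq
import re
from typing import Iterable, Iterator, List

# Assumed line shape: "<sortable timestamp> <level> ... JOB[<job id>] ... <message>"
JOB_TOKEN = re.compile(r"JOB\[([^\]]*)\]")


def filter_job_lines(lines: Iterable[str], job_id: str) -> Iterator[str]:
    """Yield only the lines whose JOB[...] token equals job_id."""
    for line in lines:
        match = JOB_TOKEN.search(line)
        if match and match.group(1) == job_id:
            yield line


def stream_job_log(server_logs: Iterable[Iterable[str]], job_id: str) -> List[str]:
    """Merge the matching lines from several servers' logs into one list.

    Each server's log is assumed to already be in time order, and each line
    is assumed to begin with a timestamp that sorts correctly as a string.
    """
    filtered = [filter_job_lines(log, job_id) for log in server_logs]
    return list(heapq.merge(*filtered))
```

If one server is down, its log is simply absent from `server_logs`, and the result silently has gaps. That is the problem the issue sets out to solve.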