Niklas Quarfot Nielsen created MESOS-739:
--------------------------------------------

             Summary: Structure framework logging for aggregation
                 Key: MESOS-739
                 URL: https://issues.apache.org/jira/browse/MESOS-739
             Project: Mesos
          Issue Type: Story
          Components: master, slave
            Reporter: Niklas Quarfot Nielsen
            Priority: Minor


It could be useful to structure framework logging such that it can be 
interleaved and presented in fast cluster-wide log views.

One example view could be framework-wide log tailing.

If Mesos were aware of its logging mechanism, one way to go about it could 
be:
 - Redirection of stdout and stderr to a dedicated log script (instead of 
freopen to stderr and stdout files). The log script can be opened with popen, 
such that failing executor processes don't leave orphan processes around. 
This also opens up the opportunity to log to multiple destinations (syslog, 
files, streams and pipes). See: MESOS-588

 - Initially, the log script can be relatively naïve - only augmenting standard 
out/error with timestamps. However, it might be useful to structure logs even 
further and store logs by time intervals for rotation and faster lookups.

 - In a subsequent phase, once a consistent log format has been reached (i.e. 
there is a way to read log data back), interfaces in both master and slaves can 
provide ways to query logs. For example, Scheduler.tailLog(int n), which 
broadcasts log requests to its slaves with a total upper bound of n entries 
(every slave provides n / #Slaves entries). A more general API is probably 
desirable.
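
To make the first two points concrete, here is a minimal sketch of what a 
naive log script could do: read lines from the executor's stdout/stderr and 
prefix each one with a timestamp and a stream name. This is only an 
illustration; the function name stamp_lines, the line layout and the 
injectable clock are assumptions, not anything Mesos provides.

```python
import io
import sys
from datetime import datetime, timezone


def stamp_lines(source, sink, stream_name, now=None):
    """Copy lines from source to sink, prefixing each with a UTC timestamp.

    The "<timestamp> <stream> <message>" layout is an assumed format,
    chosen only so the lines can be parsed and sorted later.
    """
    if now is None:
        now = lambda: datetime.now(timezone.utc)
    for line in source:
        message = line.rstrip("\n")
        stamp = now().isoformat(timespec="milliseconds")
        sink.write(stamp + " " + stream_name + " " + message + "\n")
        sink.flush()  # flush per line so a tailing reader sees output promptly


# Small demonstration with in-memory streams instead of a real pipe.
demo_in = io.StringIO("task started\ntask finished\n")
demo_out = io.StringIO()
stamp_lines(demo_in, demo_out, "stdout")
sys.stdout.write(demo_out.getvalue())
```

In a real setup the source would be the read end of the pipe opened with 
popen, and the sink could be a file, syslog or another pipe.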

The last sub-task could be left out, and we could let another set of dedicated 
log processes run on the slaves which provide interfaces for log retrieval.
There are probably many ways to attack this. More log awareness in Mesos 
introduces complexity but could provide great and fast diagnostics.
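
Whichever component ends up serving the logs, the tail query described above 
boils down to splitting the bound n across the slaves and interleaving the 
returned entries by timestamp. A rough sketch follows, assuming each slave can 
return its entries as (timestamp, line) pairs sorted oldest first. The names 
per_slave_quota and tail_log are hypothetical, and handing the remainder of 
n / #Slaves to the first few slaves is my own assumption.

```python
import heapq


def per_slave_quota(n, slave_count):
    """Split an upper bound of n entries across slaves; the quotas sum to n."""
    base, extra = divmod(n, slave_count)
    # Assumption: the first `extra` slaves each get one additional entry.
    return [base + (1 if i < extra else 0) for i in range(slave_count)]


def tail_log(slave_logs, n):
    """Return at most n entries, interleaved by timestamp.

    slave_logs is a list with one entry list per slave; each entry is a
    (timestamp, line) tuple and each list is sorted oldest first.
    """
    if not slave_logs:
        return []
    quotas = per_slave_quota(n, len(slave_logs))
    # Guard q == 0 explicitly: log[-0:] would return the whole list.
    tails = [log[-q:] if q > 0 else [] for log, q in zip(slave_logs, quotas)]
    # Each tail is already sorted, so a k-way merge interleaves them.
    return list(heapq.merge(*tails))
```

For example, with two slaves and n = 2, each slave contributes its newest 
entry and the result is ordered by timestamp. A slave with fewer entries than 
its quota simply contributes what it has, so the result may hold fewer than n 
entries.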



--
This message was sent by Atlassian JIRA
(v6.1#6144)
