[
https://issues.apache.org/jira/browse/STORM-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15118650#comment-15118650
]
Erik Weathers commented on STORM-1342:
--------------------------------------
STORM-1494 is adding support for the supervisor logs to be linked from the
Nimbus UI. So this will likely be another area to adjust when (if!?) this is
fixed.
> support multiple logviewers per host for container-isolated worker logs
> -----------------------------------------------------------------------
>
> Key: STORM-1342
> URL: https://issues.apache.org/jira/browse/STORM-1342
> Project: Apache Storm
> Issue Type: Improvement
> Components: storm-core
> Reporter: Erik Weathers
> Priority: Minor
>
> h3. Storm-on-Mesos Worker Logs are in varying directories
> When using [storm-on-mesos|https://github.com/mesos/storm] with cgroups, each
> topology's workers are isolated into separate containers. By default the
> worker logs will be saved into container-specific sandbox directories. These
> directories are also topology-specific by definition, because, as just
> stated, the containers are specific to each topology.
> h3. Problem: Storm supports 1-and-only-1 Logviewer per Worker Host
> A challenge with this different way of running Storm is that the [Storm
> logviewer|https://github.com/apache/storm/blob/768a85926373355c15cc139fd86268916abc6850/docs/_posts/2013-12-08-storm090-released.md#log-viewer-ui]
> runs as a single instance on each worker host. This doesn't play well with
> having the topology worker logs in separate per-topology containers. The one
> logviewer doesn't know about the various sandbox directories that the Storm
> Workers are writing to. And if we just spawned new logviewers for each
> container, the problem is that the Storm UI only knows about 1 global port
> for the logviewer, so it cannot direct users to a per-container logviewer.
> These problems are documented (or linked to) from [Issue #6 in the
> storm-on-mesos project|https://github.com/mesos/storm/issues/6]
> h3. Possible Solutions I can envision
> # configure the Storm workers to write to log directories that exist on the
> raw host outside of the container sandbox, and run a single logviewer on a
> host, which serves up the contents of that directory.
> #* violates one of the basic reasons for using containers: isolation.
> #* also prevents a standard use case for Mesos: running more than 1
> instance of a Mesos Framework (e.g., "Storm Cluster") at once on the same
> Mesos Cluster, e.g., for Blue-Green deployments.
> #* a variation on this proposal is to somehow expose the sandbox dirs of all
> storm containers to this singleton logviewer process (still has above
> problems)
> # launch a separate logviewer in each container, and somehow register those
> logviewers with Storm such that Storm knows for a given host which logviewer
> port is assigned to a given topology.
> #* this is the proposed solution
> h3. Storm Changes for the Proposed Solution
> Nimbus or ZooKeeper could serve as a registrar, recording the association
> between a slot (host + worker port) and the logviewer port that is serving
> the worker's logs. And the Storm-on-Mesos framework could update this registry
> when launching a new worker. (This proposal definitely calls for thorough
> vetting and thinking.)
> h3. Storm-on-Mesos Framework Changes for the Proposed Solution
> Along with the interaction with the "registrar" proposed above, the
> storm-on-mesos framework can be enhanced to launch multiple logviewers on a
> given worker host, where each logviewer is dedicated to serving the worker
> logs from a specific topology's container/sandbox directory. This would be
> done by launching a logviewer process within the topology's container, and
> assigning it an arbitrary listening port that has been determined dynamically
> through mesos (which treats ports as one of the schedulable resource
> primitives of a worker host). [Code implementing this
> logviewer-port-allocation logic already
> exists|https://github.com/mesos/storm/commit/af8c49beac04b530c33c1401c829caaa8e368a35],
> but [that specific portion of the code was
> reverted|https://github.com/mesos/storm/commit/dc3eee0f0e9c06f6da7b2fe697a8e4fc05b5227e]
> because of the issues that inspired this ticket.
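
To make the "registrar" idea in the quoted description a bit more concrete,
here is a minimal in-memory sketch in Java. Everything in it is hypothetical:
the class and method names (LogviewerRegistry, register, logUrl,
pickLogviewerPort) do not exist in Storm or storm-on-mesos, the
"/log?file=" URL shape and the fallback port are assumptions passed in or
hard-coded for illustration, and the port picker is not the logic from the
reverted commit. A real version would presumably keep this mapping in Nimbus
or ZooKeeper rather than in a local map.

```java
import java.util.Map;
import java.util.Objects;
import java.util.Optional;
import java.util.OptionalInt;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: none of these names exist in Storm or storm-on-mesos.
final class LogviewerRegistry {

    // A slot is a worker host plus a worker port, as in the issue description.
    static final class Slot {
        final String host;
        final int workerPort;

        Slot(String host, int workerPort) {
            this.host = Objects.requireNonNull(host);
            this.workerPort = workerPort;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Slot)) {
                return false;
            }
            Slot other = (Slot) o;
            return host.equals(other.host) && workerPort == other.workerPort;
        }

        @Override
        public int hashCode() {
            return Objects.hash(host, workerPort);
        }
    }

    // slot -> port of the logviewer serving that slot's worker logs
    private final Map<Slot, Integer> logviewerPorts = new ConcurrentHashMap<>();

    // Called by the framework when it launches a worker and its logviewer.
    public void register(String host, int workerPort, int logviewerPort) {
        logviewerPorts.put(new Slot(host, workerPort), logviewerPort);
    }

    // Called when the worker's container goes away.
    public void unregister(String host, int workerPort) {
        logviewerPorts.remove(new Slot(host, workerPort));
    }

    public Optional<Integer> logviewerPortFor(String host, int workerPort) {
        return Optional.ofNullable(logviewerPorts.get(new Slot(host, workerPort)));
    }

    // Builds the link a UI could use; falls back to the single global port
    // when no per-container logviewer is registered. The URL path is an
    // assumption, and logFile is not URL-encoded here.
    public String logUrl(String host, int workerPort, String logFile, int globalPort) {
        int port = logviewerPortFor(host, workerPort).orElse(globalPort);
        return "http://" + host + ":" + port + "/log?file=" + logFile;
    }

    // Picks the first port in an offered range [first, last] that is not
    // already taken, e.g. by worker slots. Illustrative only.
    public static OptionalInt pickLogviewerPort(int first, int last, Set<Integer> taken) {
        for (int p = first; p <= last; p++) {
            if (!taken.contains(p)) {
                return OptionalInt.of(p);
            }
        }
        return OptionalInt.empty();
    }
}
```

With something like this, the UI would ask the registry for a slot's
logviewer port instead of reading one global port from the config, and slots
with no registered logviewer would keep today's behavior.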