[
https://issues.apache.org/jira/browse/STORM-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15430751#comment-15430751
]
Robert Joseph Evans commented on STORM-1985:
--------------------------------------------
I personally would prefer a tool that pretends to be Nimbus and acts like
Nimbus. Meaning it would connect to zookeeper, blobstore, local caches, etc.
just as if it were the nimbus daemon. We have run into situations in the past
where nimbus is down because of bad state stored somewhere. Having a tool that
can do everything nimbus does is important. Having a separate daemon to do
this feels too complicated, and also exposes a lot more potential for attack.
At the beginning I would say just have a command line tool that will create a
[ClusterState|https://github.com/apache/storm/blob/master/storm-core/src/jvm/org/apache/storm/cluster/IStormClusterState.java],
a
[BlobStore|https://github.com/apache/storm/blob/master/storm-core/src/jvm/org/apache/storm/blobstore/BlobStore.java],
and possibly a
[LocalState|https://github.com/apache/storm/blob/master/storm-core/src/jvm/org/apache/storm/utils/LocalState.java],
like nimbus currently does. Once those are created for this project, we would
then just run through some code very similar to
[cleanup-corrupt-topologies!|https://github.com/apache/storm/pull/1572/files].
In the future we could have it do many more things. Having a UI in the future
would probably need a separate daemon for security reasons, but we use this
type of operation so rarely that I don't see much value in setting up an RPC
daemon for it. If we want a UI, have it be baked into the admin command, so
it would be a web process running with the same privileges as nimbus.
There is no need for RPC at all, just run it locally.
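
As a rough illustration of the check such a tool might run once it has the
cluster state and blobstore open, here is a minimal sketch in Python. It is
not the code from the pull request above. The function name, its inputs
(plain lists of topology ids and blob keys), and the three key suffixes are
all assumptions made for the example; a real tool would read both lists from
ZooKeeper and the blobstore.

```python
# Assumed blob-key suffixes for a topology's code blobs (illustrative only).
CODE_KEY_SUFFIXES = ("-stormjar.jar", "-stormcode.ser", "-stormconf.ser")


def find_corrupt_topologies(active_topology_ids, blob_keys):
    """Return (topology_id, missing_keys) pairs for topologies missing code blobs.

    active_topology_ids: ids the cluster state believes are active (hypothetical input).
    blob_keys: keys currently present in the blobstore (hypothetical input).
    """
    available = set(blob_keys)
    corrupt = []
    for topo_id in sorted(active_topology_ids):
        # A topology counts as corrupt if any of its expected code blobs is absent.
        missing = [topo_id + suffix
                   for suffix in CODE_KEY_SUFFIXES
                   if topo_id + suffix not in available]
        if missing:
            corrupt.append((topo_id, missing))
    return corrupt


if __name__ == "__main__":
    active = ["wordcount-1-100", "etl-2-200"]
    keys = [
        "wordcount-1-100-stormjar.jar",
        "wordcount-1-100-stormcode.ser",
        "wordcount-1-100-stormconf.ser",
        "etl-2-200-stormconf.ser",
    ]
    for topo_id, missing in find_corrupt_topologies(active, keys):
        print(topo_id, "is missing", ", ".join(missing))
```

Listing the missing keys per topology, rather than only the ids, lets an
operator decide whether to kill the topology or try to restore the blobs
first.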
> Provide a tool for showing and killing corrupted topology
> ---------------------------------------------------------
>
> Key: STORM-1985
> URL: https://issues.apache.org/jira/browse/STORM-1985
> Project: Apache Storm
> Issue Type: Bug
> Components: storm-core
> Reporter: Jungtaek Lim
> Assignee: Kamal
> Labels: newbie
> Attachments: proposal_admin_tool_design.docx
>
>
> After STORM-1976, Nimbus doesn't clean up corrupted topologies.
> (A corrupted topology is one whose code is not available in the
> blobstore.)
> Also, after STORM-1977, no Nimbus gains leadership if one or more
> topologies are corrupted, which means all Nimbuses will be no-ops.
> So we should provide a tool to kill a specific topology without accessing
> the leader Nimbus (because there is no leader Nimbus at that time). The tool
> should also determine which topologies are corrupted, and either list them or
> clean them up automatically.