[ 
https://issues.apache.org/jira/browse/STORM-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16068798#comment-16068798
 ] 

Stig Rohde Døssing commented on STORM-2600:
-------------------------------------------

[~pshah] The only new issue I see with it is that users may choose to implement 
their own Subscriptions, and then it wouldn't work. It also has the same issues 
as mentioned previously.

I've thought about alternatives a bunch, and I can't think of any that I like 
that we could implement immediately. Passing offset lag via metrics seems to 
have a bunch of roadblocks and bad side effects, so I no longer think it's a 
good option. Adding support for Pattern subscriptions and maybe adding a 
Windows script to call storm-kafka-monitor feels like a band-aid fix. I think 
it's probably our best option at the moment though.

Kafka seems to be implementing an AdminClient that is supposed to be backwards 
compatible for more than one release:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-117%3A+Add+a+public+AdminClient+API+for+Kafka+admin+operations
I think if that client is updated to include consumer group operations, we can 
just use it directly in Storm UI and solve these problems. It reads to me like 
the plan is for that client to support a lot of the same functionality as the 
Kafka shell scripts. The current kafka-consumer-groups.sh included with a Kafka 
install already shows lag for every topic a group is subscribed to, so we might 
not have to pass along more than the bootstrap servers and consumer group to 
Storm UI.
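
As a rough sketch of the calculation Storm UI would need once it can fetch 
offsets for a consumer group: lag per partition is the log end offset minus the 
committed offset. The class name, method name, and the "topic-partition" string 
keys below are assumptions made for illustration. The sketch does not show how 
the two maps would be fetched, since the AdminClient does not expose consumer 
group operations yet.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Storm or Kafka.
final class OffsetLagSketch {

    /**
     * Returns lag per partition, computed as (log end offset - committed offset).
     * Keys are illustrative "topic-partition" strings such as "orders-0".
     * Partitions without a committed offset are skipped, because their lag is
     * undefined until the group has committed at least once.
     */
    static Map<String, Long> computeLag(Map<String, Long> committed,
                                        Map<String, Long> logEnd) {
        Map<String, Long> lag = new TreeMap<>();
        for (Map.Entry<String, Long> end : logEnd.entrySet()) {
            Long committedOffset = committed.get(end.getKey());
            if (committedOffset == null) {
                continue; // nothing committed for this partition yet
            }
            // Clamp at zero in case the two offsets were read at slightly different times.
            lag.put(end.getKey(), Math.max(0L, end.getValue() - committedOffset));
        }
        return lag;
    }
}
```

For example, a committed offset of 35 and a log end offset of 100 for 
"orders-0" gives a lag of 65 for that partition.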

I'm happy to wait a bit to see where the AdminClient goes before trying to come 
up with a more solid solution here.

> Improve or replace storm-kafka-monitor
> --------------------------------------
>
>                 Key: STORM-2600
>                 URL: https://issues.apache.org/jira/browse/STORM-2600
>             Project: Apache Storm
>          Issue Type: Improvement
>          Components: storm-kafka-monitor
>    Affects Versions: 2.0.0
>            Reporter: Stig Rohde Døssing
>            Priority: Minor
>
> The storm-kafka-monitor module, which is used by Storm UI to show offset lag 
> for topologies with Kafka spouts, has some shortcomings:
> * The Storm UI integration code doesn't seem to be able to support topic 
> subscriptions that change after topology submission. The UI code 
> (https://github.com/apache/storm/blob/64e29f365c9b5d3e15b33f33ab64e200345333e4/storm-core/src/jvm/org/apache/storm/utils/TopologySpoutLag.java#L91)
>  gets the topic list it should request offset lag for via the spout's 
> getComponentConfiguration method, as far as I can tell through this call 
> https://github.com/apache/storm/blob/9e31509d47c4e91c1009f55c7ccf321d7d7e63aa/storm-client/src/jvm/org/apache/storm/topology/TopologyBuilder.java#L541.
>  It seems like the component configuration is intended to be static once the 
> topology has started running. This prevents us from showing the right topic 
> list for subscriptions that are not known at submission time, which is 
> currently the case for Pattern subscriptions. The topic list for that type of 
> subscription isn't known until the spout has started the KafkaConsumer in 
> {{ISpout.open()}}. I don't see a way to fix this, unless there is some way to 
> update the component configuration when the subscription changes.
> * The jar is installed along with the cluster, and depends on the Kafka 
> version specified in Storm's root POM. Kafka guarantees backwards compatible 
> client-server communication for one release only, so there's a potential 
> coupling between Storm cluster version and Kafka version. If users want to 
> update the Kafka version in storm-kafka-monitor, they have to rebuild that 
> module and replace the jar in their Storm install.
> * The UI integration uses the storm-kafka-monitor Bash script to start the 
> monitoring code, in order to avoid a dependency between storm-core and 
> storm-kafka-monitor. This prevents the UI integration from working on 
> Windows. We could supply a Windows script as well, but then we'd need to keep 
> the two in sync.
>
> I am wondering if these problems could be solved by implementing offset lag 
> monitoring via the metrics system instead. The spout could periodically seek 
> to the log end offset and submit a metric for how far behind the committed 
> offset is, then seek back to where it left off.
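
The metrics idea in the description could look roughly like the sketch below: 
remember the current position, seek to the end to read the log end offset, 
compute the lag against the committed offset, then seek back. The 
SeekableConsumer interface is a hypothetical stand-in that loosely mirrors 
KafkaConsumer's position, seekToEnd, seek and committed calls, so the example 
stays self-contained. It is not the real client API, and reporting the value 
through Storm's metrics system is left out.

```java
// Hypothetical sketch, not part of Storm or Kafka.
final class SpoutLagProbe {

    /** Minimal stand-in for the consumer operations the probe needs (an assumption). */
    interface SeekableConsumer {
        long position(String partition);
        void seekToEnd(String partition);
        void seek(String partition, long offset);
        long committed(String partition);
    }

    /**
     * Measures how far the committed offset is behind the log end offset,
     * and restores the consumer's position afterwards so polling resumes
     * where the spout left off.
     */
    static long measureLag(SeekableConsumer consumer, String partition) {
        long resumeAt = consumer.position(partition);
        try {
            consumer.seekToEnd(partition);
            long logEnd = consumer.position(partition);
            return Math.max(0L, logEnd - consumer.committed(partition));
        } finally {
            // Always seek back, even if reading the offsets failed.
            consumer.seek(partition, resumeAt);
        }
    }
}
```

The extra seeks on every measurement are one of the side effects that would 
need to be weighed against running the check outside the spout.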



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
