I debugged the issue further. I added some logging to the KafkaUtils class and found that Storm was not receiving the partition information (it was null). Since partition information is stored in ZooKeeper, I decided to experiment with the order of the ZooKeeper hosts in our properties. I moved the first ZooKeeper host from the start of the list to the end, and the topology then started processing data. I believe that ZooKeeper host somehow did not have the partition information.
However, after this run the topology also worked with the old config. As part of its metadata updates, Storm might have added the information to all ZooKeeper hosts. Hope this analysis is helpful.

From: Sachin Pasalkar <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Wednesday, 6 April 2016 12:22 am
To: "[email protected]" <[email protected]>, Bobby Evans <[email protected]>
Subject: Re: Not enough data to calculate spout lag

I checked the latest code. KafkaUtils is the class that logs this message (same in the previous version too):
https://github.com/apache/storm/blob/master/external/storm-kafka/src/jvm/org/apache/storm/kafka/KafkaUtils.java

From: Sachin Pasalkar <[email protected]>
Date: Wednesday, 6 April 2016 12:15 am
To: "[email protected]" <[email protected]>, Bobby Evans <[email protected]>
Subject: Re: Not enough data to calculate spout lag

Sorry, I missed your mail. We are using version 0.8 of Kafka and version 0.10 of Storm. Yes, these are Trident topologies.

From: Bobby Evans <[email protected]>
Reply-To: "[email protected]" <[email protected]>, Bobby Evans <[email protected]>
Date: Tuesday, 22 March 2016 12:24 am
To: "[email protected]" <[email protected]>
Subject: Re: Not enough data to calculate spout lag

I am not super familiar with that code. What version of the Kafka spout are you using?
Is this with Trident or with regular Storm? The code honestly seems overcomplicated for what it is doing, but I would have to dig more deeply into exactly how the partitions are managed to see why it is doing this. To me, though, the latest code looks like there is no way this should happen.
https://github.com/apache/storm/blob/master/external/storm-kafka/src/jvm/org/apache/storm/kafka/KafkaSpout.java#L98-L104
But I didn't look at Trident.

- Bobby

On Sunday, March 20, 2016 7:43 AM, Sachin Pasalkar <[email protected]> wrote:

Can someone help me out with this?

-----Original Message-----
From: Sachin Pasalkar [mailto:[email protected]]
Sent: Friday, March 18, 2016 9:37 PM
To: [email protected]
Subject: Not enough data to calculate spout lag

Hi,

I found the log message "Metrics Tick: Not enough data to calculate spout lag." in my topology, after which the topology becomes inactive. I checked the source:

    if (_partitions != null && _partitions.size() == _partitionToOffset.size()) {
        ......
    } else {
        LOG.info("Metrics Tick: Not enough data to calculate spout lag.");
    }

What situation will cause _partitions to be null, or _partitions.size() to differ from _partitionToOffset.size()?

Thanks,
Sachin
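
For anyone following the thread, the check quoted in the original message can be modelled in isolation to see which inputs reach the "Not enough data" branch. This is only a minimal sketch and not the storm-kafka source: the class LagCheckSketch, the method canComputeLag, and the use of Integer partition ids with Long offsets are assumptions made for illustration. Only the shape of the condition and the names _partitions and _partitionToOffset come from the quoted snippet.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the metric's bookkeeping; not the storm-kafka class.
final class LagCheckSketch {

    // Same condition as the quoted snippet: lag is computed only when the
    // partition set is known and an offset is recorded for each partition.
    static boolean canComputeLag(Set<Integer> partitions, Map<Integer, Long> partitionToOffset) {
        return partitions != null && partitions.size() == partitionToOffset.size();
    }

    public static void main(String[] args) {
        Map<Integer, Long> offsets = new HashMap<>();

        // Case 1: the partition set was never populated (null).
        System.out.println(canComputeLag(null, offsets)); // false

        // Case 2: two partitions are known, but only one has an offset recorded.
        Set<Integer> partitions = new HashSet<>(Arrays.asList(0, 1));
        offsets.put(0, 42L);
        System.out.println(canComputeLag(partitions, offsets)); // false

        // Case 3: every known partition has an offset recorded.
        offsets.put(1, 7L);
        System.out.println(canComputeLag(partitions, offsets)); // true
    }
}
```

Under this model, the log line appears whenever the partition set is missing or offsets have not yet been recorded for every partition. Case 1 matches the null partition information described in the latest message.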
