Hi - Thanks - it turns out that the JSON parsing is actually fine at HEAD, though the results are inaccurate without the required message format (the comments mention expecting an "s" property holding a timestamp value).
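For reference, a message in the expected format would look something like this (a minimal sketch: only the "s" field name comes from the code comments; the assumption that it holds an epoch timestamp in seconds, and the other field, are placeholders):

    import json
    import time

    # hypothetical message matching the expected format: "s" carries the event timestamp
    msg = '{"s": 1401289930, "event": "pageview"}'

    parsed = json.loads(msg)
    lag_seconds = time.time() - parsed["s"]  # how far behind "now" this message is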
My problem was that I was not specifying the spout root properly, i.e. --spoutroot /transactional/<spout id>/user/ (in my case I had specified a path that was valid, but not a spout). Now I get offset info properly via monitor.py - Thanks!

Tyson

On May 28, 2014, at 10:12 AM, Cody A. Ray <[email protected]> wrote:

Right, it's trying to read your kafka messages and parse them as JSON. See the error:

    simplejson.scanner.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

If you want to use the BrightTag branch, you'll need to go a couple of commits back. Try this:

    git clone https://github.com/BrightTag/stormkafkamon
    cd stormkafkamon
    git checkout 07eede9ec72329fe2cad893d087541b583e11148

-Cody

On Wed, May 28, 2014 at 10:39 AM, Tyson Norris <[email protected]> wrote:

Thanks Cody - I tried the BrightTag fork and still have problems with storm 0.9.1-incubating and kafka 0.8.1. I get an error with my trident topology (haven't tried non-trident yet):

    (venv)tnorris-osx:stormkafkamon tnorris$ ./monitor.py --topology TrendingTagTopology --spoutroot storm --friendly
    Traceback (most recent call last):
      File "./monitor.py", line 112, in <module>
        sys.exit(main())
      File "./monitor.py", line 96, in main
        zk_data = process(zc.spouts(options.spoutroot, options.topology))
      File "/git/github/stormkafkamon/stormkafkamon/zkclient.py", line 76, in spouts
        j = json.loads(self.client.get(self._zjoin([spout_root, c, p]))[0])
      File "/git/github/stormkafkamon/venv/lib/python2.7/site-packages/simplejson/__init__.py", line 501, in loads
        return _default_decoder.decode(s)
      File "/git/github/stormkafkamon/venv/lib/python2.7/site-packages/simplejson/decoder.py", line 370, in decode
        obj, end = self.raw_decode(s)
      File "/git/github/stormkafkamon/venv/lib/python2.7/site-packages/simplejson/decoder.py", line 389, in raw_decode
        return self.scan_once(s, idx=_w(s, idx).end())
    simplejson.scanner.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
    (venv)tnorris-osx:stormkafkamon tnorris$

I'm not too familiar with python but will try to debug it as time allows - let me know if you have advice.

Thanks
Tyson

On May 28, 2014, at 7:20 AM, Cody A. Ray <[email protected]> wrote:

You can also use stormkafkamon to track this stuff. It's not good for historical analysis like graphite/ganglia, but it's good if you just want to see how things currently stand.

The original: https://github.com/otoolep/stormkafkamon

This didn't work for us without some updates (incompatibility with the latest python-kafka dep). Here are those updates: https://github.com/BrightTag/stormkafkamon/commit/07eede9ec72329fe2cad893d087541b583e11148

(Our branch has a couple more things that parse the kafka messages with our format (which embeds a timestamp) to determine how far behind, in time, storm is... planning to clean that up soon so it can be a bit more reusable.) https://github.com/BrightTag/stormkafkamon

-Cody

On Wed, May 28, 2014 at 4:50 AM, Danijel Schiavuzzi <[email protected]> wrote:

Yes, Trident Kafka spouts give you the same metrics. Take a look at the code to find out what's available.

On Wed, May 28, 2014 at 3:55 AM, Tyson Norris <[email protected]> wrote:

Do Trident variants of kafka spouts do something similar?

Thanks
Tyson
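For anyone else who hits the JSONDecodeError in the traceback above: the failing line in zkclient.py boils down to reading a znode and parsing its contents as JSON. A simplified sketch of that step (assuming a kazoo client, which matches the get(...)[0] pattern in the traceback; the zookeeper address and znode path below are placeholders, not the tool's actual wiring):

    from kazoo.client import KazooClient
    import simplejson as json

    zk = KazooClient(hosts='localhost:2181')  # placeholder zookeeper address
    zk.start()

    # monitor.py walks the children under --spoutroot and parses each znode as JSON.
    # If --spoutroot points at a path that exists but isn't a spout root, the znode
    # data isn't JSON and json.loads fails with "Expecting value: ... (char 0)".
    data, _stat = zk.get('/transactional/my-spout/user/partition_0')  # placeholder path
    state = json.loads(data)
    print(state)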
> On May 27, 2014, at 3:19 PM, "Harsha" <[email protected]> wrote:
>
> Raphael,
>     kafka spout sends metrics for kafkaOffset and kafkaPartition; you can look at those by using LoggingMetrics or setting up Ganglia. Kafka uses its own zookeeper to store state info per topic & group.id; you can look at kafka offsets using:
>
>     kafka/bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker
>
> -Harsha
>
>> On Tue, May 27, 2014, at 03:01 PM, Raphael Hsieh wrote:
>> Is there a way to tell where in the kafka stream my topology is starting from?
>> From my understanding, Storm will use zookeeper in order to tell its place in the Kafka stream. Where can I find metrics on this?
>> How can I see how large the stream is? How much data is sitting in the stream, and what are the most recent/oldest positions?
>>
>> Thanks
>>
>> --
>> Raphael Hsieh

--
Danijel Schiavuzzi

E: [email protected]
W: www.schiavuzzi.com
T: +385989035562
Skype: danijels7

--
Cody A. Ray, LEED AP
[email protected]
215.501.7891
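A concrete invocation of the offset checker Harsha mentions would look something like this (flag names per the Kafka 0.8 tool; the zookeeper address, consumer group, and topic are placeholders for your own setup):

    kafka/bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker \
        --zkconnect localhost:2181 --group my-consumer-group --topic my-topic

It prints, per partition, the consumer group's current offset, the log's end offset, and the lag between them.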
