Hi - Thanks - it turns out that the JSON parsing is actually fine at HEAD, though the results are inaccurate without the required message format (the comments mention expecting an "s" property holding a timestamp value).
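For reference, a message in the expected format would look something like this (a minimal sketch: only the "s" field name comes from the code comments; the assumption that it holds an epoch timestamp in seconds, and the other field, are placeholders):

    import json
    import time

    # hypothetical message matching the expected format: "s" carries the event timestamp
    msg = '{"s": 1401289930, "event": "pageview"}'

    parsed = json.loads(msg)
    lag_seconds = time.time() - parsed["s"]  # how far behind "now" this message is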
My problem was that I was not specifying the spout root properly, i.e. --spoutroot /transactional/<spout id>/user/ (in my case I had specified a path that was valid, but not a spout). Now I get offset info properly via monitor.py - Thanks!

Tyson

On May 28, 2014, at 10:12 AM, Cody A. Ray <[email protected]> wrote:

Right, it's trying to read your kafka messages and parse them as JSON. See the error:

    simplejson.scanner.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

If you want to use the BrightTag branch, you'll need to go a couple of commits back. Try this:

    git clone https://github.com/BrightTag/stormkafkamon
    cd stormkafkamon
    git checkout 07eede9ec72329fe2cad893d087541b583e11148

-Cody

On Wed, May 28, 2014 at 10:39 AM, Tyson Norris <[email protected]> wrote:

Thanks Cody - I tried the BrightTag fork and still have problems with storm 0.9.1-incubating and kafka 0.8.1. I get an error with my trident topology (haven't tried non-trident yet):

    (venv)tnorris-osx:stormkafkamon tnorris$ ./monitor.py --topology TrendingTagTopology --spoutroot storm --friendly
    Traceback (most recent call last):
      File "./monitor.py", line 112, in <module>
        sys.exit(main())
      File "./monitor.py", line 96, in main
        zk_data = process(zc.spouts(options.spoutroot, options.topology))
      File "/git/github/stormkafkamon/stormkafkamon/zkclient.py", line 76, in spouts
        j = json.loads(self.client.get(self._zjoin([spout_root, c, p]))[0])
      File "/git/github/stormkafkamon/venv/lib/python2.7/site-packages/simplejson/__init__.py", line 501, in loads
        return _default_decoder.decode(s)
      File "/git/github/stormkafkamon/venv/lib/python2.7/site-packages/simplejson/decoder.py", line 370, in decode
        obj, end = self.raw_decode(s)
      File "/git/github/stormkafkamon/venv/lib/python2.7/site-packages/simplejson/decoder.py", line 389, in raw_decode
        return self.scan_once(s, idx=_w(s, idx).end())
    simplejson.scanner.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
    (venv)tnorris-osx:stormkafkamon tnorris$

I'm not too familiar with python but will try to debug it as time allows - let me know if you have advice.

Thanks
Tyson

On May 28, 2014, at 7:20 AM, Cody A. Ray <[email protected]> wrote:

You can also use stormkafkamon to track this stuff. It's not good for historical analysis like graphite/ganglia, but it's good if you just want to see how things currently stand.

The original: https://github.com/otoolep/stormkafkamon

This didn't work for us without some updates (incompatibility with the latest python-kafka dep). Here are those updates: https://github.com/BrightTag/stormkafkamon/commit/07eede9ec72329fe2cad893d087541b583e11148

(Our branch has a couple more things that parse the kafka messages with our format (which embeds a timestamp) to determine how far behind, in time, storm is... planning to clean that up soon so it can be a bit more reusable.) https://github.com/BrightTag/stormkafkamon

-Cody

On Wed, May 28, 2014 at 4:50 AM, Danijel Schiavuzzi <[email protected]> wrote:

Yes, Trident Kafka spouts give you the same metrics. Take a look at the code to find out what's available.

On Wed, May 28, 2014 at 3:55 AM, Tyson Norris <[email protected]> wrote:

Do Trident variants of kafka spouts do something similar?

Thanks
Tyson
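For anyone else who hits the JSONDecodeError in the traceback above: the failing line in zkclient.py boils down to reading a znode and parsing its contents as JSON. A simplified sketch of that step (assuming a kazoo client, which matches the get(...)[0] pattern in the traceback; the zookeeper address and znode path below are placeholders, not the tool's actual wiring):

    from kazoo.client import KazooClient
    import simplejson as json

    zk = KazooClient(hosts='localhost:2181')  # placeholder zookeeper address
    zk.start()

    # monitor.py walks the children under --spoutroot and parses each znode as JSON.
    # If --spoutroot points at a path that exists but isn't a spout root, the znode
    # data isn't JSON and json.loads fails with "Expecting value: ... (char 0)".
    data, _stat = zk.get('/transactional/my-spout/user/partition_0')  # placeholder path
    state = json.loads(data)
    print(state)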
> On May 27, 2014, at 3:19 PM, "Harsha" <[email protected]> wrote:
>
> Raphael,
>     kafka spout sends metrics for kafkaOffset and kafkaPartition; you can look at those by using LoggingMetrics or setting up Ganglia. Kafka uses its own zookeeper to store state info per topic & group.id; you can look at kafka offsets using:
>
>     kafka/bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker
>
> -Harsha
>
>> On Tue, May 27, 2014, at 03:01 PM, Raphael Hsieh wrote:
>> Is there a way to tell where in the kafka stream my topology is starting from?
>> From my understanding, Storm will use zookeeper in order to tell its place in the Kafka stream. Where can I find metrics on this?
>> How can I see how large the stream is? How much data is sitting in the stream, and what are the most recent/oldest positions?
>>
>> Thanks
>>
>> --
>> Raphael Hsieh

--
Danijel Schiavuzzi

E: [email protected]
W: www.schiavuzzi.com
T: +385989035562
Skype: danijels7

--
Cody A. Ray, LEED AP
[email protected]
215.501.7891
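A concrete invocation of the offset checker Harsha mentions would look something like this (flag names per the Kafka 0.8 tool; the zookeeper address, consumer group, and topic are placeholders for your own setup):

    kafka/bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker \
        --zkconnect localhost:2181 --group my-consumer-group --topic my-topic

It prints, per partition, the consumer group's current offset, the log's end offset, and the lag between them.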
