Thanks Taylor for your response.

In my case, I have seen that 4 of my 15 kafka spout executors do not
process any data. I will check what the kafka # of partitions is, but it
looks like it may be just 11, in which case I should reduce the number of
kafka spout executors.

Around 50 of the 550 mapperBoltExecutors I have do not process anything,
and I am now guessing that is because my maxSpoutPending (500) is low, so
there are not enough in-flight tuples to keep all 550 mapperBoltExecutors
busy.

Do you know if maxSpoutPending is the maximum number of unacked tuples from
a single spout executor or from all spout executors combined? If it is the
latter, then my guess makes sense: with only 500 tuples unacked at a time,
at most 500 bolt executors are needed to process them.
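The back-of-envelope reasoning above can be sketched as follows (a minimal illustration of the counting argument, not Storm code; the numbers are just the ones from this topology):

```java
public class PendingMath {
    // With `pending` unacked tuples in flight at once and `executors`
    // downstream bolt executors, at most min(pending, executors) executors
    // can have a tuple to work on at any instant.
    static int maxBusyExecutors(int pending, int executors) {
        return Math.min(pending, executors);
    }

    public static void main(String[] args) {
        int busy = maxBusyExecutors(500, 550); // maxSpoutPending = 500
        System.out.println(busy + " of 550 mapper executors can be busy; "
                + (550 - busy) + " are necessarily idle at any instant.");
    }
}
```

If maxSpoutPending turns out to be per spout executor rather than global, the effective in-flight count would be larger and this explanation would not hold.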

kafkaSpoutExecutors: 15
mapperBoltExecutors: 550
workers: 9
maxSpoutPending: 500
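For reference, the knobs above are set roughly like this when the topology is built (a sketch only; `MyKafkaSpout`, `MapperBolt`, and the partition count of 11 are assumptions from this thread, not verified names or numbers):

```java
import backtype.storm.Config;
import backtype.storm.topology.TopologyBuilder;

public class TopologySketch {
    public static void main(String[] args) {
        TopologyBuilder builder = new TopologyBuilder();

        // Spout parallelism should match the # of kafka partitions
        // (11 here is the guessed partition count).
        builder.setSpout("kafkaSpout", new MyKafkaSpout(), 11);

        // 550 mapper bolt executors, fed by shuffle grouping.
        builder.setBolt("mapperBolt", new MapperBolt(), 550)
               .shuffleGrouping("kafkaSpout");

        Config conf = new Config();
        conf.setNumWorkers(9);
        conf.setMaxSpoutPending(500); // topology.max.spout.pending
    }
}
```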



On Thu, Mar 19, 2015 at 8:55 PM, P. Taylor Goetz <[email protected]> wrote:

> More information about your topology would help, but..
>
> I’ll assume you’re using a core API topology (spouts/bolts).
>
> On the kafka spout side, does the spout parallelism == the # of kafka
> partitions? (It should.)
>
>  On the bolt side, are you using fields groupings at all, and if so, what
> does the distribution of those fields look like?
>
> To change the logging level, edit the logback config files in
> .//storm/logback; if running locally, add or edit a logback config file in
> your project.
>
> -Taylor
>
> On Mar 19, 2015, at 7:11 PM, Girish Joshi <[email protected]> wrote:
>
> I am trying to troubleshoot an issue with our storm cluster where a worker
> process on one of the machines in the cluster does not perform any work.
> All the counts (emitted/transferred/executed) for all executors in that
> worker are 0 as shown below. Even if I restart the worker, storm supervisor
> starts a new one and that does not process any work either.
>
> Executors [120-120], uptime 26m 17s, host storm6-prod, port 6702
> <http://watson-storm6-prod.lup1:8000/log?file=worker-6702.log>; all
> emitted/transferred/executed counts are 0.
>
> Supervisor logs shows that the worker is started and the worker log just
> has a bunch of zookeeper messages printed every minute.
>
> 2015-03-19 22:25:07 s.k.ZkCoordinator [INFO] Refreshing partition manager
> connections
> 2015-03-19 22:25:07 s.k.ZkCoordinator [INFO] Deleted partition managers: []
> 2015-03-19 22:25:07 s.k.ZkCoordinator [INFO] New partition managers: []
> 2015-03-19 22:25:07 s.k.ZkCoordinator [INFO] Finished refreshing
>
> I am looking for some debugging help and have the following questions. If
> you have any suggestions, I would appreciate them.
>
> - From the storm UI, it looks like the worker process is up and running
> and is assigned tasks from all bolts and spouts in the topology, but it
> does not get any messages to work on. Is there a way I can find out why
> the storm infrastructure is not routing any messages to the bolts running
> in that process? For the spouts, since they are reading from kafka, I
> could understand that there are no partitions left for this worker to
> read from and so it has nothing to read. But I would expect messages from
> other kafka spouts to be routed to the bolts in this worker process.
>
> - Is there a way I can enable debug logging for storm which can tell me
> why a particular worker process is not getting any messages/tuples to
> execute?
>
> Thanks,
>
> Girish.
>
>
>
