Hi Mark,

Ideally it should work once you use the correct (older) version of ZMQ. But
since you say you are getting the error even with version 2.1.7, I would
suggest trying the steps below and letting me know whether that works:

1) Build jzmq using the steps described at:
https://github.com/nathanmarz/jzmq

After that, restart the Storm daemons and see if that resolves the issue (the
small version check sketched after step 2 can help confirm which zeromq the
rebuilt bindings actually load).

2) If the issue still persists, try downgrading to version 2.1.4 of zeromq,
followed by building jzmq again (as in step 1 above), as suggested by Nathan
Marz at https://github.com/nathanmarz/storm/wiki/Setting-up-a-Storm-cluster,
which says:
 *Note that you should not install version 2.1.10, as that version has some
serious bugs that can cause strange issues for a Storm cluster. In some
rare cases, users have reported an "IllegalArgumentException" bubbling up
from the ZeroMQ code when using 2.1.7 - in these cases downgrading to 2.1.4
fixed the problem*
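
In case it helps, a quick way to verify which native zeromq the rebuilt jzmq
bindings actually pick up is a tiny Java check like the one below. This is
only a rough sketch: the class name and paths are made up for illustration,
and it assumes the version helpers on org.zeromq.ZMQ are present in your jzmq
build.

// ZmqVersionCheck.java (hypothetical name) - compile against the jzmq jar and
// run with -Djava.library.path pointing at the directory containing libjzmq,
// the same way your workers are launched.
import org.zeromq.ZMQ;

public class ZmqVersionCheck {
    public static void main(String[] args) {
        // Print the version of the native libzmq that the bindings load.
        System.out.println("zeromq version: "
                + ZMQ.getMajorVersion() + "."
                + ZMQ.getMinorVersion() + "."
                + ZMQ.getPatchVersion());
        // Anything other than 2.1.7 (or 2.1.4 after the downgrade) usually
        // means a stray libzmq is being picked up from the library path.
    }
}

Run it on the supervisor machine with the same java.library.path your workers
use, so it exercises the same libjzmq/libzmq pair the topology would.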

Let me know how it goes.

Thanks
Bijoy


On Fri, Jan 31, 2014 at 8:17 PM, Mark Greene <[email protected]> wrote:

>  Storm uses the internal queuing (through ZMQ) only when communication
>> between two worker processes is required, which is why this error comes up
>> only when you set num_workers>1.
>
>
> I'm a little confused by the answer; are you suggesting that Storm cannot
> run more than one worker even with the correct (older) version of ZMQ?
>
> What's unique about the environment I was having trouble with is that it
> only had 1 supervisor, whereas my prod environment has multiple supervisors
> and I am not seeing a problem there.
>
>
> On Thu, Jan 30, 2014 at 11:59 PM, bijoy deb <[email protected]> wrote:
>
>>    Hi Mark,
>>
>> Storm uses the internal queuing (through ZMQ) only when communication
>> between two worker processes is required, which is why this error comes up
>> only when you set num_workers>1.
>>
>> Though I won't be able to help you with an exact solution for this, I can
>> provide some pointers:
>>
>> a) Regarding the reason for the error, the documentation says that the
>> ZMQ/zeromq version needs to be downgraded to 2.1.7 if it is higher than that.
>> b) In a future version of Storm (I don't recollect the exact version number,
>> or whether it has already been released), they are supposed to remove the
>> ZMQ dependency altogether, so this error should not come up then.
>>
>> Thanks
>> Bijoy
>>
>>
>> On Fri, Jan 31, 2014 at 8:42 AM, Mark Greene <[email protected]> wrote:
>>
>>> Exception in log:
>>>
>>>  2014-01-31 02:58:14 task [INFO] Emitting: change-spout default
>>> [[B@38fc659c]
>>> 2014-01-31 02:58:14 task [INFO] Emitting: change-spout __ack_init
>>> [1863657906985036001 0 2]
>>> 2014-01-31 02:58:14 util [ERROR] Async loop died!
>>>  java.lang.RuntimeException: org.zeromq.ZMQException: Invalid
>>> argument(0x16)
>>> at
>>> backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:87)
>>> at
>>> backtype.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:58)
>>> at
>>> backtype.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:62)
>>> at
>>> backtype.storm.disruptor$consume_loop_STAR_$fn__1619.invoke(disruptor.clj:73)
>>> at backtype.storm.util$async_loop$fn__465.invoke(util.clj:377)
>>> at clojure.lang.AFn.run(AFn.java:24)
>>> at java.lang.Thread.run(Thread.java:744)
>>> Caused by: org.zeromq.ZMQException: Invalid argument(0x16)
>>> at org.zeromq.ZMQ$Socket.send(Native Method)
>>> at zilch.mq$send.invoke(mq.clj:93)
>>> at backtype.storm.messaging.zmq.ZMQConnection.send(zmq.clj:43)
>>> at
>>> backtype.storm.daemon.worker$mk_transfer_tuples_handler$fn__4333$fn__4334.invoke(worker.clj:298)
>>> at
>>> backtype.storm.daemon.worker$mk_transfer_tuples_handler$fn__4333.invoke(worker.clj:287)
>>> at
>>> backtype.storm.disruptor$clojure_handler$reify__1606.onEvent(disruptor.clj:43)
>>> at
>>> backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:84)
>>> ... 6 more
>>> 2014-01-31 02:58:14 util [INFO] Halting process: ("Async loop died!")
>>> 2014-01-31 02:58:24 executor [INFO] Processing received message source:
>>> __system:-1, stream: __tick, id: {}, [30]
>>>
>>> I see the above exception almost immediately after my spout emits the
>>> first tuple from the queue. I have pared down my topology so there is just
>>> one spout and no bolts, to narrow the problem down, but the only time I can
>>> keep the spout running is if I omit the collector.emit call itself.
>>>
>>> I'm not sure if it would make a difference, but the supervisor has three
>>> slots and this topology would occupy two of them; when configured with two
>>> workers I get the above exception, and when configured with one, everything
>>> works fine.
>>>
>>>
>>>
>>
>
