Are you also using the forked version of jzmq?

On Fri, Jan 31, 2014 at 11:26 AM, Mark Greene <[email protected]> wrote:

> We are on 2.1.7 in all environments; they are all managed by Chef.
>
> On Fri, Jan 31, 2014 at 9:51 AM, Nathan Leung <[email protected]> wrote:
>
>> It can work with ZMQ, but you MUST use the version specified (2.1.7).
>> Newer versions change the API, which causes errors; that might be what
>> you are seeing. Is the version of libzmq you installed the same as the
>> one you are using in production?
>>
>> On Fri, Jan 31, 2014 at 9:47 AM, Mark Greene <[email protected]> wrote:
>>
>>>> Storm uses the internal queuing (through ZMQ) only when communication
>>>> between two worker processes is required, which is why this error
>>>> comes up only when you set num_workers > 1.
>>>
>>> I'm a little confused by the answer: are you suggesting that Storm
>>> cannot run more than one worker even with the correct (older) version
>>> of ZMQ?
>>>
>>> What's unique about the environment I was having trouble with is that
>>> it only had one supervisor, whereas my prod environment has multiple
>>> supervisors and I am not seeing a problem there.
>>>
>>> On Thu, Jan 30, 2014 at 11:59 PM, bijoy deb <[email protected]> wrote:
>>>
>>>> Hi Mark,
>>>>
>>>> Storm uses the internal queuing (through ZMQ) only when communication
>>>> between two worker processes is required, which is why this error
>>>> comes up only when you set num_workers > 1.
>>>>
>>>> Though I won't be able to help you with an exact solution for this, I
>>>> can provide some pointers:
>>>>
>>>> a) Regarding the reason for the error, the documentation says that the
>>>> ZMQ/zeromq version needs to be downgraded to 2.1.7 if it's higher than
>>>> that.
>>>> b) In a future version of Storm (I don't recollect the exact version
>>>> number, or whether it has already been released), they are supposed to
>>>> remove the ZMQ dependency altogether, so this error should not occur
>>>> then.
>>>>
>>>> Thanks
>>>> Bijoy
>>>>
>>>> On Fri, Jan 31, 2014 at 8:42 AM, Mark Greene <[email protected]> wrote:
>>>>
>>>>> Exception in log:
>>>>>
>>>>> 2014-01-31 02:58:14 task [INFO] Emitting: change-spout default [[B@38fc659c]
>>>>> 2014-01-31 02:58:14 task [INFO] Emitting: change-spout __ack_init [1863657906985036001 0 2]
>>>>> 2014-01-31 02:58:14 util [ERROR] Async loop died!
>>>>> java.lang.RuntimeException: org.zeromq.ZMQException: Invalid argument(0x16)
>>>>>     at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:87)
>>>>>     at backtype.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:58)
>>>>>     at backtype.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:62)
>>>>>     at backtype.storm.disruptor$consume_loop_STAR_$fn__1619.invoke(disruptor.clj:73)
>>>>>     at backtype.storm.util$async_loop$fn__465.invoke(util.clj:377)
>>>>>     at clojure.lang.AFn.run(AFn.java:24)
>>>>>     at java.lang.Thread.run(Thread.java:744)
>>>>> Caused by: org.zeromq.ZMQException: Invalid argument(0x16)
>>>>>     at org.zeromq.ZMQ$Socket.send(Native Method)
>>>>>     at zilch.mq$send.invoke(mq.clj:93)
>>>>>     at backtype.storm.messaging.zmq.ZMQConnection.send(zmq.clj:43)
>>>>>     at backtype.storm.daemon.worker$mk_transfer_tuples_handler$fn__4333$fn__4334.invoke(worker.clj:298)
>>>>>     at backtype.storm.daemon.worker$mk_transfer_tuples_handler$fn__4333.invoke(worker.clj:287)
>>>>>     at backtype.storm.disruptor$clojure_handler$reify__1606.onEvent(disruptor.clj:43)
>>>>>     at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:84)
>>>>>     ... 6 more
>>>>> 2014-01-31 02:58:14 util [INFO] Halting process: ("Async loop died!")
>>>>> 2014-01-31 02:58:24 executor [INFO] Processing received message source: __system:-1, stream: __tick, id: {}, [30]
>>>>>
>>>>> I see the above exception almost immediately after my spout emits the
>>>>> first tuple from the queue. I have pared down my topology so there is
>>>>> just one spout and no bolts, so as to narrow the problem down, but the
>>>>> only way I can keep the spout running is to omit the collector.emit
>>>>> call itself.
>>>>>
>>>>> I'm not sure if it would make a difference, but the supervisor has
>>>>> three slots and this topology would occupy two of them. However, when
>>>>> configured with two workers I get the above exception; when configured
>>>>> with one, everything works fine.
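
[Editor's note: Nathan's question — whether the installed libzmq matches the one used in production — can be answered from the shell. The sketch below is illustrative and not from the thread; it assumes `pkg-config` and `ldconfig` may or may not be present on the box and degrades gracefully when they are missing.]

```shell
#!/bin/sh
# Sketch: check whether the machine's libzmq is the 2.1.7 that Storm's ZMQ
# transport expects, and look for stray extra copies on the library path.

# True when the given version string is exactly the required 2.1.7.
zmq_version_ok() {
    [ "$1" = "2.1.7" ]
}

# Ask pkg-config for the installed zeromq version, if it knows about one.
installed=$(pkg-config --modversion libzmq 2>/dev/null || true)

if [ -n "$installed" ]; then
    if zmq_version_ok "$installed"; then
        echo "libzmq $installed matches required 2.1.7"
    else
        echo "libzmq $installed does NOT match required 2.1.7"
    fi
else
    echo "pkg-config could not find libzmq"
fi

# Also list every libzmq shared object the dynamic linker knows about;
# a stray newer copy earlier on the library path is a common cause of
# "Invalid argument" errors from the native send call.
libs=$(ldconfig -p 2>/dev/null | grep -o '/[^ ]*libzmq[^ ]*' || true)
if [ -n "$libs" ]; then
    echo "$libs"
else
    echo "no libzmq shared libraries found by ldconfig"
fi
```

Running this on every supervisor (e.g. via Chef, since the hosts are already managed by it) would surface the kind of single-host mismatch described above, where one environment behaves differently from production.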
