Hi Kishor,

Below is the test I performed. Worker reassignments happened with the settings
below. Are there any config settings I missed, or should any of these
combinations have had different values?

*Storm Cluster*:
   4-node cluster (with Pacemaker), with 8 topologies running (1:1 mapping
between topologies and workers).

   I changed the following settings from their default values:

   nimbus.task.timeout.secs: 60 (default: 30)
   supervisor.heartbeat.frequency.secs: 40 (default: 5)
   supervisor.worker.shutdown.sleep.secs: 15 (default: 3)
   task.heartbeat.frequency.secs: 40 (default: 3; I see this field has been
deprecated, but changed it anyway in case it still takes effect)
   worker.heartbeat.frequency.secs: 40 (default: 1)
   supervisor.monitor.frequency.secs: 40 (default: 3)

  (All the above values were copied from the Storm UI, to confirm the config
changes took effect.)
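
  For completeness, here is a sketch of the Pacemaker-related storm.yaml
entries assumed for this cluster, following the standard client setup from
the Storm Pacemaker docs (the hostnames are placeholders, not my exact
config):

   # assumed standard Pacemaker client config; hostnames are placeholders
   storm.cluster.state.store: "org.apache.storm.pacemaker.pacemaker_state_factory"
   pacemaker.servers:
     - "node1"
     - "node2"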

*Sequence of Events*:
 Node1 is the leader; the others are non-leaders.
*Node1 nimbus log*:
2021-05-27 20:59:20.179 o.a.s.n.LeaderListenerCallback [INFO] Accepting
leadership, all active topologies and corresponding dependencies found
locally.
2021-05-27 20:59:20.179 o.a.s.z.LeaderListenerCallbackFactory [INFO] node1
gained leadership.
2021-05-27 20:59:20.180 o.a.s.m.StormMetricsRegistry [INFO] Started
statistics report plugin...
2021-05-27 20:59:20.182 o.a.s.d.n.Nimbus [INFO] Starting nimbus server for
storm version '2.2.0'
...
2021-05-27 20:59:54.206 o.a.s.c.PaceMakerStateStorage [DEBUG] Successful
get_worker_hb_children

Nimbus received heartbeats from the workers through Pacemaker.

*I then stopped nimbus/supervisor/pacemaker on node2 (non-leader). There was
no reassignment. I brought them back up, then stopped
nimbus/supervisor/pacemaker on node1 (leader). Node3 became the leader. Below
are the logs:*
*Node1 nimbus log*:
2021-05-27 21:18:39.569 o.a.s.s.i.n.u.c.D.rejectedExecution [ERROR] Failed
to submit a listener notification task. Event loop shut down?
java.util.concurrent.RejectedExecutionException: event executor terminated

*Node3 logs*:
2021-05-27 21:18:39.818 o.a.s.n.LeaderListenerCallback [INFO] Accepting
leadership, all active topologies and corresponding dependencies found
locally.
2021-05-27 21:18:39.818 o.a.s.z.LeaderListenerCallbackFactory [INFO] node3
gained leadership.
2021-05-27 21:18:39.978 o.a.s.s.o.a.z.ClientCnxn [DEBUG] Got ping response
for sessionid: 0x800d3cd26e37703 after 2ms

As you can see above, the gap between node1 shutting down and node3 gaining
leadership was only a few hundred milliseconds (the two log lines above are
~250 ms apart), well below nimbus.task.timeout.secs (60 secs). Yet there were
worker reassignments.

I also tried keeping the heartbeat frequencies at their default values and
only increasing the timeouts:
  supervisor.heartbeat.frequency.secs: 3
  task.heartbeat.frequency.secs: 3
  worker.heartbeat.frequency.secs: 1
  supervisor.monitor.frequency.secs: 3

The other timeouts were left at their default values, which are >= 60 secs.
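
For reference, a sketch of the nimbus-side timeouts I am assuming govern
reassignment, with what I believe are their Storm 2.2.0 defaults (from
defaults.yaml; treat the exact set as an assumption):

  # assumed relevant timeouts and their 2.2.0 defaults
  nimbus.task.timeout.secs: 30       # raised to 60 in the first test above
  nimbus.supervisor.timeout.secs: 60
  nimbus.task.launch.secs: 120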

Regards
Surajeet

On Wed, May 26, 2021 at 5:48 PM Kishor Patil <[email protected]>
wrote:

> Hello Surajeet,
>
> This is a design change in Storm 2.x as part of reducing the Zookeeper
> dependency. We route worker heartbeats via the Supervisor to nimbus. If the
> supervisor is down, the worker attempts to send heartbeats directly to
> nimbus. In this case, it looks like the change in nimbus leadership could
> have taken longer than the heartbeat timeouts, making the new nimbus think
> these workers required rescheduling on other supervisor nodes.
> The current Storm version allows falling back to Pacemaker instead of
> Zookeeper if you prefer that option, but it requires a Pacemaker setup on
> the cluster to avoid overloading Zookeeper.
>
> Please let me know if you have any further questions.
>
> -Kishor
>
> On 2021/05/26 20:22:36, Surajeet Dev <[email protected]> wrote:
> > We are currently on Storm 1.2.1 and were in the process of upgrading to
> > Storm 2.2.0.
> > We observed the following while upgrading to 2.2.0:
> >
> > 1) In a Storm cluster (4 nodes) with 8 topologies running (with a 1:1
> > mapping between workers and topologies), when I bring down nimbus and
> > supervisor on one of the nodes (let's say Node 1, which is not the nimbus
> > leader), the workers running on that node get reassigned to the other 3,
> > even though they are still running on that node (Node 1). So I have two
> > worker processes for the same topology running at the same time (I saw
> > this behaviour with or without using Pacemaker). The worker processes do
> > get killed when nimbus and supervisor are brought back up on Node 1.
> >
> > 2) I observed from the worker logs that the worker sends heartbeats to the
> > local supervisor and the nimbus leader, which in 1.2.1 used to happen via
> > Zookeeper (I saw this behaviour in 2.2.0 with or without using Pacemaker).
> > If I bring down nimbus and supervisor on the node where nimbus is the
> > leader, it reassigns worker processes and in some cases leads to zombie
> > worker processes (which are not killed when storm kill is executed).
> >
> > The above behaviour (worker reassignment) doesn't happen with Storm
> > 1.2.1.
> >
> > Since this is a fundamental design change between 1.x and 2.x, is there
> > any documentation that describes it in detail? (I couldn't find it in the
> > Release Notes.)
> >
>
