Phil,

When you perform a queue listing, it will only show you data that is sitting in the "active queue" - i.e., the queue that processors can pull from. Data that is queued up to go to another node will not show up there. So if the queue count shows FlowFiles on a load-balanced connection but the queue listing doesn't show them, it likely means that the data is waiting to be pushed to another node.
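The difference Mark describes between the queued count and the queue listing can be seen via the REST API as well as the UI. A minimal sketch, assuming a NiFi 1.x instance on localhost:8080; the connection id is a placeholder for your own load-balanced connection:

```shell
# Hypothetical connection id -- replace with the id of your connection.
CONN_ID=0a1b2c3d-0123-1000-abcd-ef0123456789
NIFI=http://localhost:8080/nifi-api

# Start an asynchronous listing of the queue's *active* FlowFiles.
curl -s -X POST "$NIFI/flowfile-queues/$CONN_ID/listing-requests"
# The response includes a listing-request id; poll that id for results.
# FlowFiles staged to be pushed to another node will NOT appear in the
# listing, even though they are included in the queued count in the UI.
```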
When you choose to offload the node, it *should* push the data to other nodes in the cluster. The other nodes in the cluster will then hold onto the data, waiting for that node to come back; when it does, they will send the data to it. There are a couple of fixes that went into 1.14.0 around offloading nodes [1], so I suspect that updating to 1.14.0 will address any issues where the node sits there without fully offloading its data.

It is important to note, though, that by design the data on the other nodes will remain there, queued up. The nodes will not choose another node to push the data to. That will only happen if the chosen node is removed from the cluster entirely.

If the intent is to completely remove the node from the cluster, in 1.14.0 you can also use "bin/nifi.sh decommission" to shut it down. That will disconnect the node, offload it, remove it from the cluster, and then shut down.

Cheers
-Mark

[1] https://issues.apache.org/jira/issues/?jql=project%20%3D%2012316020%20AND%20fixVersion%20%3D%2012349644%20AND%20summary%20~%20offload%20ORDER%20BY%20priority%20DESC%2C%20key%20ASC

On 2021/09/05 04:50:09, Phil H <[email protected]> wrote:
> Thanks Joe,
>
> So is it possible that the processor is not correctly handling the flow
> files, and that this only becomes apparent when one tries to offload data?
> Because in normal operation they appear to work fine (they have been
> operating outside of a cluster context for some years).
>
> Thanks,
> Phil
>
> On Thu, 2 Sep 2021 at 13:48, Joe Witt <[email protected]> wrote:
>
> > Phil. The behavior you mentioned sounds like that processor pulled flow
> > files from the queue but had not yet transferred them anywhere. If you
> > see that again I strongly recommend you gather a thread dump.
> >
> > Joe
> >
> > On Wed, Sep 1, 2021 at 7:56 PM Phil H <[email protected]> wrote:
> >
> > > Hi Joe,
> > >
> > > It’s a custom one, but it is effectively just a routing filter
> > > component (read the data, send the flow file out on relationship A
> > > or B based on what it finds). Nothing exotic in terms of how it
> > > interacts with the flowfiles.
> > >
> > > After restarting all nodes, the queue worked normally again.
> > >
> > > Phil
> > >
> > > > On 2 Sep 2021, at 12:02 pm, Joe Witt <[email protected]> wrote:
> > > >
> > > > Phil
> > > >
> > > > What processor reads from that queue that appears unmoving?
> > > >
> > > > Thanks
> > > >
> > > > On Wed, Sep 1, 2021 at 3:51 PM Phil H <[email protected]> wrote:
> > > >
> > > >> And once reconnected again, no data passes that queue - it all
> > > >> just piles up there (the queue count matching the number of items
> > > >> sent into the cluster). However, if I try to list the queue, it
> > > >> claims there are no files in it. Very, very confused!
> > > >>
> > > >> On Thu, 2 Sep 2021 at 08:39, Phil H <[email protected]> wrote:
> > > >>
> > > >>> Okay, found the offload, but the data is still stuck on the
> > > >>> “offloaded” node, in a “single node” queue (I am bringing the
> > > >>> data to a single node to deduplicate multiple parallel inputs).
> > > >>>
> > > >>> If I refresh the UI, I can see the missing items numbered in the
> > > >>> queue, but can’t open the queue because the other node is
> > > >>> “currently offloaded”.
> > > >>>
> > > >>> I’m sure I’m just missing something here?
> > > >>>
> > > >>> On Thu, 2 Sep 2021 at 08:20, Shawn Weeks <[email protected]>
> > > >>> wrote:
> > > >>>
> > > >>>> On newer versions there is an option in the UI to offload the
> > > >>>> data if you have NiFi's cluster load balancing set up. Then
> > > >>>> you'd disconnect the node and shut it down.
> > > >>>>
> > > >>>> Thanks
> > > >>>> Shawn
> > > >>>>
> > > >>>> -----Original Message-----
> > > >>>> From: Phil H <[email protected]>
> > > >>>> Sent: Wednesday, September 1, 2021 4:36 PM
> > > >>>> To: [email protected]
> > > >>>> Subject: Primary node vs shutdown
> > > >>>>
> > > >>>> Hi there,
> > > >>>>
> > > >>>> I am noticing a number of situations where shutting down one
> > > >>>> node in a cluster is leaving data stranded in the flows on that
> > > >>>> shut-down server.
> > > >>>>
> > > >>>> Is there any way to tell NiFi to ship data off to other cluster
> > > >>>> members before it shuts down? Note I am restarting via the
> > > >>>> nifi.sh script, not just killing the process/host with no
> > > >>>> notice.
> > > >>>>
> > > >>>> Thanks,
> > > >>>> Phil
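For completeness, the removal sequence discussed above (offload, disconnect, shut down; or the one-step decommission Mark mentions for 1.14.0) can be sketched as commands. `bin/nifi.sh decommission` is the command named in the thread; the REST call is an assumption based on the cluster-nodes API, with the node id and host as placeholders:

```shell
# One-step removal on NiFi 1.14.0+, run from the install dir of the node
# being removed: disconnects the node, offloads its FlowFiles to the rest
# of the cluster, removes it from the cluster, then shuts it down.
bin/nifi.sh decommission

# Alternatively, trigger an offload by hand via the cluster REST API
# (NODE_ID is hypothetical; look up real ids under /controller/cluster).
NODE_ID=11111111-2222-3333-4444-555555555555
curl -s -X PUT "http://localhost:8080/nifi-api/controller/cluster/nodes/$NODE_ID" \
  -H 'Content-Type: application/json' \
  -d "{\"node\": {\"nodeId\": \"$NODE_ID\", \"status\": \"OFFLOADING\"}}"
```

Note that, per Mark's explanation, offloaded data held by the other nodes stays queued for the original node unless that node is removed from the cluster entirely.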
