Re: Restarting NiFi causing SiteToSiteBulletinReportingTask to fail

Woodhead, Chad Thu, 19 Apr 2018 07:23:04 -0700

Hi Pierre,

Thanks for the updates and HCC comments as well.


-Chad

On 4/18/18, 5:36 AM, "Pierre Villard" <[email protected]> wrote:

    I created https://issues.apache.org/jira/browse/NIFI-5092 to track the
    issue. Will submit a fix really soon.
    Current workaround: after a NiFi restart, stop the reporting task, clear
    the state of the reporting task and start the reporting task.
    
    Pierre
    
    2018-04-18 0:04 GMT+02:00 Pierre Villard <[email protected]>:
    
    > Hi Chad,
    >
    > I confirm that I can reproduce the issue on my side with a NiFi 1.5.0
    > cluster and I don't see anything that would fix it in NiFi 1.6.0.
    >
    > I had a closer look and it does not seem related to the Site-to-Site
    > mechanism: the thread in charge of refreshing the peers is correctly
    > running and you should see logs like "Successfully refreshed Peer Status;
    > remote instance consists of X peers".
    >
    > As far as I can see, it sounds related to how we are caching the ID of the
    > last bulletin sent and how we retrieve this value to "restart" the task
    > after the NiFi node restarted. That's why you have to delete the task and
    > create it again: it'll delete the associated cache.
    >
    > That's just an assumption after a quick look, I'll keep digging tomorrow
    > and open a JIRA for that.
    >
    > Thanks for reporting it!
    >
    > Pierre
    >
    >
    > 2018-04-12 23:41 GMT+02:00 Pierre Villard <[email protected]>:
    >
    >> Hi Chad,
    >>
    >> I believe this could have been fixed recently but I've very limited
    >> access right now (and for the next few days) and can't be sure...
    >> I will check next week if no one has given you feedback before then.
    >>
    >> Pierre
    >>
    >> 2018-04-12 19:57 GMT+02:00 Woodhead, Chad <[email protected]>:
    >>
    >>> I am running HDF 3.0.1.1 which comes with NiFi 1.2.0.3.0.1.1-5. We are
    >>> using SiteToSiteBulletinReportingTask to monitor bulletins (for things
    >>> like Disk Usage and Memory Usage). When we restart NiFi via Ambari
    >>> (either with a Restart or Stop and then Start), when NiFi comes back up the
    >>> SiteToSiteBulletinReportingTask no longer works. It throws the
    >>> following error when it is first trying to start up:
    >>>
    >>> SiteToSiteBulletinReportingTask[id=ba6b4499-0162-1000-0000-00003ccd7573]
    >>> org.apache.nifi.remote.client.PeerSelector@34e976af Unable to refresh
    >>> Remote Group's peers due to response code 409:Conflict with explanation:
    >>> null
    >>>
    >>> No matter how long we wait, it never works. The ways I have been able to
    >>> get it to start working again are as follows:
    >>>
    >>>   *   Stop and then Start the Remote Input Port the
    >>> SiteToSiteBulletinReportingTask is using
    >>>   *   Delete the SiteToSiteBulletinReportingTask and create a new one
    >>>   *   Wait a while and stop and start the SiteToSiteBulletinReportingTask
    >>> (however this doesn't work consistently)
    >>>
    >>> I have tested the same flow steps using a process that uses a Remote
    >>> Process Group and a different Remote Input Port, and that RPG throws the
    >>> same error when first coming up but then starts working after a period of
    >>> time. So maybe the SiteToSiteBulletinReportingTask isn't trying enough
    >>> times to connect to the Remote Input Port?
    >>>
    >>> Sincerely,
    >>> Chad Woodhead
    >>>
    >>
    >>
    >
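
The workaround Pierre describes at the top of the thread (stop the reporting task, clear its state, start it again) could be scripted roughly as below. This is only a sketch: it builds the ordered list of requests without sending them, and the endpoint paths and body layout are assumptions about the NiFi REST API that should be checked against the API documentation for the NiFi version in use. The helper names `state_body` and `workaround_plan` are made up for illustration.

```python
from typing import Any, Dict, List, Tuple


def state_body(task_id: str, state: str, version: int) -> Dict[str, Any]:
    """Build the JSON body for a run-state change (assumed entity layout)."""
    return {
        "revision": {"version": version},
        "component": {"id": task_id, "state": state},
    }


def workaround_plan(task_id: str) -> List[Tuple[str, str, str]]:
    """Return (HTTP method, path, purpose) for each step, in order.

    The paths are assumptions, not taken from the thread. The revision is
    re-read before each PUT because it may change between updates.
    """
    base = f"/nifi-api/reporting-tasks/{task_id}"
    return [
        ("GET", base, "read current revision"),
        ("PUT", base, "set state to STOPPED"),
        ("POST", f"{base}/state/clear-requests", "clear the task state"),
        ("GET", base, "re-read revision"),
        ("PUT", base, "set state to RUNNING"),
    ]


if __name__ == "__main__":
    # Task ID copied from the error message quoted in the thread.
    for method, path, purpose in workaround_plan("ba6b4499-0162-1000-0000-00003ccd7573"):
        print(f"{method:4} {path}  # {purpose}")
```

The same three actions are available from the UI under Controller Settings, which is probably simpler for a one-off restart.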

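To illustrate Pierre's hypothesis about the cached ID of the last bulletin sent, here is a toy model, not NiFi's actual code. It assumes (this is not confirmed anywhere in the thread) that bulletin IDs start again from a low number after a restart while the stored "last sent" ID survives. Under that assumption a plain "newer than last sent" filter drops every bulletin, and clearing the state amounts to resetting the stored ID. All function names are hypothetical.

```python
from typing import List


def bulletins_to_send(bulletin_ids: List[int], last_sent_id: int) -> List[int]:
    """Naive filter: only send bulletins with an ID above the stored one."""
    return [b for b in bulletin_ids if b > last_sent_id]


def bulletins_to_send_with_reset(bulletin_ids: List[int], last_sent_id: int) -> List[int]:
    """Same filter, but treat a stored ID above every current ID as stale."""
    if bulletin_ids and max(bulletin_ids) < last_sent_id:
        last_sent_id = -1  # equivalent to clearing the task's state
    return [b for b in bulletin_ids if b > last_sent_id]


if __name__ == "__main__":
    stored = 250          # ID remembered from before the restart (made-up value)
    after_restart = [0, 1, 2]  # IDs assumed to start over after the restart
    print(bulletins_to_send(after_restart, stored))             # []
    print(bulletins_to_send_with_reset(after_restart, stored))  # [0, 1, 2]
```

If the real cause is along these lines, it would also explain why deleting and recreating the task helps: the new task starts with no stored ID.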