Re: Restarting NiFi causing SiteToSiteBulletinReportingTask to fail

2018-04-19 Thread Woodhead, Chad
Hi Pierre,

Thanks for the updates and HCC comments as well.

-Chad

On 4/18/18, 5:36 AM, "Pierre Villard" <pierre.villard...@gmail.com> wrote:

I created https://issues.apache.org/jira/browse/NIFI-5092 to track the
issue and will submit a fix very soon.
Current workaround: after a NiFi restart, stop the reporting task, clear
its state, and start it again.
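
[Editor's sketch] To illustrate why clearing the state might help, here is a
minimal Python model of the hypothesis in the quoted message below (a cached
"last bulletin sent" ID). It is not NiFi code. BulletinRepository,
ReportingTask, and the "last_sent_id" key are hypothetical names, and the
idea that bulletin IDs start over after a restart is an assumption made only
for this illustration.

```python
class BulletinRepository:
    """Hypothetical in-memory bulletin store; IDs start at 0 in each new instance."""

    def __init__(self):
        self._next_id = 0
        self._bulletins = []

    def add(self, message):
        self._bulletins.append((self._next_id, message))
        self._next_id += 1

    def newer_than(self, last_id):
        return [(i, m) for i, m in self._bulletins if i > last_id]


class ReportingTask:
    """Hypothetical task that remembers the last sent ID in persisted state."""

    def __init__(self, state):
        self.state = state  # stands in for state that survives a restart

    def run(self, repo):
        last_id = self.state.get("last_sent_id", -1)
        to_send = repo.newer_than(last_id)
        if to_send:
            self.state["last_sent_id"] = to_send[-1][0]
        return [message for _, message in to_send]


state = {}
task = ReportingTask(state)

# Before the restart: three bulletins get IDs 0, 1, 2 and are all sent.
repo = BulletinRepository()
for msg in ("disk usage high", "memory usage high", "disk usage high"):
    repo.add(msg)
sent_before_restart = task.run(repo)

# Assumed restart behaviour: the repository is new, the task state is not.
repo = BulletinRepository()
repo.add("disk usage high")
sent_after_restart = task.run(repo)  # ID 0 is not newer than the cached ID 2

# Workaround from this message: clear the state, then run again.
state.clear()
sent_after_clear = task.run(repo)
```

In this model the task sends nothing after the simulated restart until its
state is cleared, which matches the observed behaviour. Whether the real
cause is the same is exactly what NIFI-5092 is meant to establish.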

Pierre

2018-04-18 0:04 GMT+02:00 Pierre Villard <pierre.villard...@gmail.com>:

> Hi Chad,
>
> I confirm that I can reproduce the issue on my side with a NiFi 1.5.0
> cluster and I don't see anything that would fix it in NiFi 1.6.0.
>
> I had a closer look and it does not seem related to the Site-to-Site
> mechanism: the thread in charge of refreshing the peers is correctly
> running and you should see logs like "Successfully refreshed Peer Status;
> remote instance consists of X peers".
>
> As far as I can see, it sounds related to how we are caching the ID of the
> last bulletin sent and how we retrieve this value to "restart" the task
> after the NiFi node restarted. That's why you have to delete the task and
> create it again: it'll delete the associated cache.
>
> That's just an assumption after a quick look, I'll keep digging tomorrow
> and open a JIRA for that.
>
> Thanks for reporting it!
>
> Pierre
>
>
> 2018-04-12 23:41 GMT+02:00 Pierre Villard <pierre.villard...@gmail.com>:
>
>> Hi Chad,
>>
>> I believe this could have been fixed recently, but I have very limited
>> access right now (and for the next few days) and can't be sure.
>> I will check next week if no one has given you feedback by then.
>>
>> Pierre
>>
>> 2018-04-12 19:57 GMT+02:00 Woodhead, Chad <chad.woodh...@ncr.com>:
>>
>>> I am running HDF 3.0.1.1 which comes with NiFi 1.2.0.3.0.1.1-5. We are
>>> using SiteToSiteBulletinReportingTask to monitor bulletins (for things
>>> like Disk Usage and Memory Usage). When we restart NiFi via Ambari (either
>>> with a Restart or Stop and then Start), when NiFi comes back up the
>>> SiteToSiteBulletinReportingTask no longer works. It throws the
>>> following error when it is first trying to start up:
>>>
>>> SiteToSiteBulletinReportingTask[id=ba6b4499-0162-1000--3ccd7573]
>>> org.apache.nifi.remote.client.PeerSelector@34e976af Unable to refresh
>>> Remote Group's peers due to response code 409:Conflict with explanation:
>>> null
>>>
>>> No matter how long we wait, it never works. The ways I have been able to
>>> get it to start working again are as follows:
>>>
>>>   *   Stop and then Start the Remote Input Port the
>>> SiteToSiteBulletinReportingTask is using
>>>   *   Delete the SiteToSiteBulletinReportingTask and create a new one
>>>   *   Wait a while and stop and start the SiteToSiteBulletinReportingTask
>>> (however this doesn't work consistently)
>>>
>>> I have tested the same flow steps using a process that uses a Remote
>>> Process Group and a different Remote Input Port, and that RPG throws the
>>> same error when first coming up but then starts working after a period of
>>> time. So maybe the SiteToSiteBulletinReportingTask isn't trying enough
>>> times to connect to the Remote Input Port?
>>>
>>> Sincerely,
>>> Chad Woodhead
>>>
>>
>>
>

Restarting NiFi causing SiteToSiteBulletinReportingTask to fail

2018-04-12 Thread Woodhead, Chad
I am running HDF 3.0.1.1 which comes with NiFi 1.2.0.3.0.1.1-5. We are using 
SiteToSiteBulletinReportingTask to monitor bulletins (for things like Disk 
Usage and Memory Usage). When we restart NiFi via Ambari (either with a Restart 
or Stop and then Start), when NiFi comes back up the 
SiteToSiteBulletinReportingTask no longer works. It throws the following error 
when it is first trying to start up:

SiteToSiteBulletinReportingTask[id=ba6b4499-0162-1000--3ccd7573] 
org.apache.nifi.remote.client.PeerSelector@34e976af Unable to refresh Remote 
Group's peers due to response code 409:Conflict with explanation: null

No matter how long we wait, it never works. The ways I have been able to get it 
to start working again are as follows:

  *   Stop and then Start the Remote Input Port the 
SiteToSiteBulletinReportingTask is using
  *   Delete the SiteToSiteBulletinReportingTask and create a new one
  *   Wait a while and stop and start the SiteToSiteBulletinReportingTask 
(however this doesn't work consistently)

I have tested the same flow steps using a process that uses a Remote Process 
Group and a different Remote Input Port, and that RPG throws the same error 
when first coming up but then starts working after a period of time. So maybe 
the SiteToSiteBulletinReportingTask isn't trying enough times to connect to the 
Remote Input Port?
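
[Editor's sketch] The "isn't trying enough times" hypothesis above can be
pictured as a retry loop with backoff. This is a generic Python illustration,
not how SiteToSiteBulletinReportingTask or PeerSelector is implemented. The
retry helper, flaky_refresh, and the delay values are all hypothetical.

```python
import time


def retry(operation, attempts=5, initial_delay=0.5, backoff=2.0, sleep=time.sleep):
    """Call operation until it succeeds or the attempts are used up."""
    delay = initial_delay
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts:
                raise  # out of attempts: surface the last failure
            sleep(delay)
            delay *= backoff


attempts_made = 0


def flaky_refresh():
    """Hypothetical peer refresh that is rejected twice before succeeding."""
    global attempts_made
    attempts_made += 1
    if attempts_made < 3:
        raise ConnectionError("409: Conflict")
    return "peers refreshed"


delays = []  # record the waits instead of actually sleeping
result = retry(flaky_refresh, sleep=delays.append)
```

A client that gives up after one or two attempts would stay broken in this
model, while one that keeps retrying recovers on its own, as the Remote
Process Group did in the test described above.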

Sincerely,
Chad Woodhead