[jira] [Commented] (NIFI-7716) Received error HTTP_ERROR: HTTP/1.1 420 Enhance Your Calm. Will attempt to reconnect
[ https://issues.apache.org/jira/browse/NIFI-7716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17229264#comment-17229264 ]

Kourge commented on NIFI-7716:
------------------------------

Hi [~V1ncent24],

Are you sure that the same Twitter Consumer Key/Secret and Access Token/Secret are not being used by another GetTwitter processor or by another application? In my experience, the "420 Enhance Your Calm" error often occurs when the very same Twitter API credentials are used simultaneously.

> Received error HTTP_ERROR: HTTP/1.1 420 Enhance Your Calm. Will attempt to reconnect
> ---
>
> Key: NIFI-7716
> URL: https://issues.apache.org/jira/browse/NIFI-7716
> Project: Apache NiFi
> Issue Type: Bug
> Components: NiFi Stateless
> Affects Versions: 1.11.3
> Environment: NiFi on AWS EC2 instance
> Reporter: Vincent Naveen
> Priority: Critical
> Attachments: ErrorOnGetTwitter.JPG
>
> We are using a Python script in the "ExecuteStreamCommand" processor, which takes the Twitter user, read from the database, from the incoming flow file generated by the "QueryTableRecord" process. We are able to stop the processor, update IDs_To_Follow via the parameter context in the GetTwitter processor, and start the GetTwitter processor again.
>
> The issue is that once the GetTwitter processor starts, it throws the following errors:
>
> Received error HTTP_ERROR: HTTP/1.1 420 Enhance Your Calm. Will attempt to reconnect
> Received error STOPPED_BY_ERROR: Retries exhausted due to null. Will not attempt to reconnect
> 12:42:46 IST ERROR be2a0f28-0173-1000-0bbc-4401fbf6d330
>
> We learned that a wait time needs to be added between REST API calls. We have implemented that, but we still get the same issue. We also changed the Max Client Retry count to 50. The issue is still not fixed.
>
> According to the links below, a change is needed in the Java code, but we are using a Python script in ExecuteStreamCommand.
> https://issues.apache.org/jira/browse/NIFI-5953
> [https://github.com/apache/nifi/pull/3276]
>
> Could you please help us handle this scenario? Thanks in advance!

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Commented] (NIFI-6905) GetTwitter processor, configured to run on primary node only, initializes connection to Twitter API from every NiFi cluster node, even on non-primary nodes
[ https://issues.apache.org/jira/browse/NIFI-6905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16985090#comment-16985090 ]

Kourge commented on NIFI-6905:
------------------------------

I have implemented a solution (#2 in the ticket description).

I updated the *GetTwitter* processor's `*onScheduled()*` method to only create a `*clientBuilder*`, without connecting it to the Twitter API. The connection is now initialized by the `*onTrigger()*` method when it is needed (in primary-node-only mode, `*onTrigger()*` never runs on non-primary nodes).

I added `*onPrimaryNodeChange()*` to close the connection on `*PRIMARY_NODE_REVOKED*` events.

Please review the pull request.

> GetTwitter processor, configured to run on primary node only, initializes connection to Twitter API from every NiFi cluster node, even on non-primary nodes
> ---
>
> Key: NIFI-6905
> URL: https://issues.apache.org/jira/browse/NIFI-6905
> Project: Apache NiFi
> Issue Type: Bug
> Components: Extensions
> Affects Versions: 1.0.0
> Reporter: Kourge
> Assignee: Kourge
> Priority: Major
> Labels: getTwitter
> Time Spent: 10m
> Remaining Estimate: 0h
[jira] [Assigned] (NIFI-6905) GetTwitter processor, configured to run on primary node only, initializes connection to Twitter API from every NiFi cluster node, even on non-primary nodes
[ https://issues.apache.org/jira/browse/NIFI-6905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kourge reassigned NIFI-6905:
---------------------------

    Assignee: Kourge
[jira] [Created] (NIFI-6906) GetTwitter processor throws "invalid cookie header" exceptions when connecting to Twitter API
Kourge created NIFI-6906:
-------------------------

    Summary: GetTwitter processor throws "invalid cookie header" exceptions when connecting to Twitter API
    Key: NIFI-6906
    URL: https://issues.apache.org/jira/browse/NIFI-6906
    Project: Apache NiFi
    Issue Type: Bug
    Components: Extensions
    Affects Versions: 1.9.2
    Reporter: Kourge

GetTwitter processor throws "invalid cookie header" exceptions (WARN) when connecting to Twitter API:
{code:java}
org.apache.http.client.protocol.ResponseProcessCookies Invalid cookie header: "set-cookie: guest_id=XYZ; Max-Age=63072000; Expires=Thu, 25 Nov 2021 11:08:52 GMT; Path=/; Domain=.twitter.com". Invalid 'expires' attribute: Thu, 25 Nov 2021 11:08:52 GMT
{code}
HttpClient's CookieSpec may need to be configured.
[https://stackoverflow.com/questions/36473478/fixing-httpclient-warning-invalid-expires-attribute-using-fluent-api/40697322]
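A small check of the date in that warning may help narrow the cause. The sketch below uses only `java.time` and shows that the `Expires` value is a valid RFC 1123 date, while it does not match an older dash-separated cookie date layout. That HttpClient's default cookie handling expects the dashed layout is an assumption here, not something the ticket confirms, and the class names `ExpiresFormatDemo` and `ExpiresFormatCheck` are hypothetical.

```java
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Locale;

/** Demo entry point: prints whether each formatter accepts the value from the warning. */
class ExpiresFormatDemo {
    public static void main(String[] args) {
        String expires = "Thu, 25 Nov 2021 11:08:52 GMT"; // value copied from the log line
        System.out.println("RFC 1123 accepts it:      "
                + ExpiresFormatCheck.parses(DateTimeFormatter.RFC_1123_DATE_TIME, expires));
        System.out.println("Dashed layout accepts it: "
                + ExpiresFormatCheck.parses(ExpiresFormatCheck.DASHED_LEGACY, expires));
    }
}

/** Hypothetical helper, not part of NiFi or HttpClient. */
final class ExpiresFormatCheck {
    /** Assumed legacy cookie date layout with dashes, e.g. "Thu, 25-Nov-21 11:08:52 GMT". */
    static final DateTimeFormatter DASHED_LEGACY =
            DateTimeFormatter.ofPattern("EEE, dd-MMM-yy HH:mm:ss z", Locale.US);

    private ExpiresFormatCheck() {
    }

    /** Returns true if the formatter can parse the whole value, false otherwise. */
    static boolean parses(DateTimeFormatter formatter, String value) {
        try {
            formatter.parse(value);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }
}
```

If that assumption holds, the usual remedy with Apache HttpClient 4.x is to select a standards-based cookie spec (for example `CookieSpecs.STANDARD` on the `RequestConfig`). Whether the GetTwitter processor can reach that setting through the HBC client is not established in this thread.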
[jira] [Created] (NIFI-6905) GetTwitter processor, configured to run on primary node only, initializes connection to Twitter API from every NiFi cluster node, even on non-primary nodes
Kourge created NIFI-6905:
-------------------------

    Summary: GetTwitter processor, configured to run on primary node only, initializes connection to Twitter API from every NiFi cluster node, even on non-primary nodes
    Key: NIFI-6905
    URL: https://issues.apache.org/jira/browse/NIFI-6905
    Project: Apache NiFi
    Issue Type: Bug
    Components: Extensions
    Affects Versions: 1.0.0
    Reporter: Kourge

I have a *GetTwitter* processor running on a 3-node NiFi cluster and configured to execute on the primary node only.

The symptom is an excessively high frequency of HTTP 420 ("Enhance Your Calm") errors when the GetTwitter processor starts.

I ran the following tests:
 * With only 1 NiFi node, I was able to start/stop the GetTwitter processor 10 times in a row without any errors.
 * With 2 NiFi nodes running, HTTP 420 errors occurred after a few start/stop cycles (sometimes even after a single start).

After analyzing the source code, and knowing about https://issues.apache.org/jira/browse/NIFI-2592, I came to the conclusion that the GetTwitter processor initializes the connection to the Twitter API on each node of the cluster, even on non-primary nodes.

The `*onScheduled()*` method runs on every node (see: NIFI-2592) and connects to Twitter with `*client.connect()*`. The `*onTrigger()*` method then consumes the tweets normally from the primary node.

The issue is that having more than one node initializing connections makes the Twitter API raise HTTP 420 errors.
{code:java}
ERROR org.apache.nifi.processors.twitter.GetTwitter GetTwitter[id=XYZ] Received error HTTP_ERROR: HTTP/1.1 420 Enhance Your Calm. Will attempt to reconnect
{code}

+*Proposed solutions:*+
 # Change the behavior of the `*onScheduled()*` method so that it runs only on the primary node (as proposed in NIFI-2592)
 # Update the GetTwitter processor implementation to no longer call `*client.connect()*` from the `*onScheduled()*` method, but only when *PrimaryNodeState* changes to *ELECTED_PRIMARY_NODE* (and when *PrimaryNodeState* changes to *PRIMARY_NODE_REVOKED*, perform a `*client.stop()*`)
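The second proposed solution can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from the pull request: it follows the variant described in the comment on this ticket, where the connection is opened lazily from `onTrigger()` and closed on `PRIMARY_NODE_REVOKED`. `StreamClient`, `CountingClient`, `LazyConnectProcessor` and the local `PrimaryNodeState` enum are stand-ins invented for the example; they are not the real NiFi or com.twitter.hbc types.

```java
import java.util.function.Supplier;

/** Demo entry point: walks through schedule, trigger and revoke. */
class LazyConnectDemo {
    public static void main(String[] args) {
        CountingClient counting = new CountingClient();
        LazyConnectProcessor processor = new LazyConnectProcessor();
        processor.onScheduled(() -> counting);
        System.out.println("connects after onScheduled: " + counting.connects);
        processor.onTrigger();
        processor.onTrigger();
        System.out.println("connects after two triggers: " + counting.connects);
        processor.onPrimaryNodeChange(PrimaryNodeState.PRIMARY_NODE_REVOKED);
        System.out.println("stops after revoke: " + counting.stops);
    }
}

/** Stand-in for the streaming client; NOT the real com.twitter.hbc API. */
interface StreamClient {
    void connect();
    void stop();
}

/** Local enum reusing the event names quoted in the ticket. */
enum PrimaryNodeState { ELECTED_PRIMARY_NODE, PRIMARY_NODE_REVOKED }

/** Test double that counts calls instead of opening a network connection. */
class CountingClient implements StreamClient {
    int connects;
    int stops;
    @Override public void connect() { connects++; }
    @Override public void stop() { stops++; }
}

/** Hypothetical processor skeleton: prepare on every node, connect only when triggered. */
class LazyConnectProcessor {
    private Supplier<StreamClient> clientBuilder; // prepared on every node
    private StreamClient client;                  // non-null only while connected

    /** Runs on every node: remember how to build a client, but do not connect. */
    synchronized void onScheduled(Supplier<StreamClient> builder) {
        this.clientBuilder = builder;
        this.client = null;
    }

    /** In primary-node-only mode this runs on the primary node only. */
    synchronized void onTrigger() {
        if (client == null) {
            client = clientBuilder.get();
            client.connect();
        }
        // A real processor would now poll the client's message queue.
    }

    /** Drops the connection when this node stops being the primary node. */
    synchronized void onPrimaryNodeChange(PrimaryNodeState newState) {
        if (newState == PrimaryNodeState.PRIMARY_NODE_REVOKED && client != null) {
            client.stop();
            client = null;
        }
    }
}
```

Connecting from `onTrigger()` rather than on `ELECTED_PRIMARY_NODE` also covers the node that is already primary when the processor starts, since no election event needs to arrive first.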
[jira] [Updated] (NIFI-5953) GetTwitter processor throws Enhance Your Calm exceptions then fails with Retries exhausted
[ https://issues.apache.org/jira/browse/NIFI-5953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kourge updated NIFI-5953:
------------------------

    Fix Version/s: 1.9.0

> GetTwitter processor throws Enhance Your Calm exceptions then fails with Retries exhausted
> ---
>
> Key: NIFI-5953
> URL: https://issues.apache.org/jira/browse/NIFI-5953
> Project: Apache NiFi
> Issue Type: Bug
> Components: Extensions
> Affects Versions: 1.7.0
> Reporter: Kourge
> Priority: Major
> Fix For: 1.9.0
> Time Spent: 10m
> Remaining Estimate: 0h

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Commented] (NIFI-5953) GetTwitter processor throws Enhance Your Calm exceptions then fails with Retries exhausted
[ https://issues.apache.org/jira/browse/NIFI-5953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16765887#comment-16765887 ]

Kourge commented on NIFI-5953:
------------------------------

Hello,

Is anyone available to review this PR? https://github.com/apache/nifi/pull/3276
[jira] [Created] (NIFI-5953) GetTwitter processor throws Enhance Your Calm exceptions then fails with Retries exhausted
Kourge created NIFI-5953:
-------------------------

    Summary: GetTwitter processor throws Enhance Your Calm exceptions then fails with Retries exhausted
    Key: NIFI-5953
    URL: https://issues.apache.org/jira/browse/NIFI-5953
    Project: Apache NiFi
    Issue Type: Bug
    Components: Extensions
    Affects Versions: 1.7.0
    Reporter: Kourge

Hi,

I am using the GetTwitter processor with the Filter Endpoint.

The issue is that I often get a series of `*Received error HTTP_ERROR: HTTP/1.1 420 Enhance Your Calm. Will attempt to reconnect*` errors. These are followed by one `*Received error STOPPED_BY_ERROR: Retries exhausted due to null. Will not attempt to reconnect*` error, after which the processor no longer receives any tweets from the Twitter endpoint.

I am being rate limited by the Twitter API. I am running a NiFi cluster, so I run the GetTwitter processor on the Primary Node only, to avoid using the same credentials several times in parallel.

I tried to apply the configuration recommendation from this mailing list thread:
[https://lists.apache.org/thread.html/ed397f42a26760280363e9cc1f64f6654c635110005e24ab9486bf19@%3Cdev.nifi.apache.org%3E]

But raising the "run schedule" parameter to 60 seconds does not help in my case, since I aim to read between 100 and 200 tweets per minute. Setting "run schedule" to 60 seconds lets NiFi poll only 1 tweet per minute, which is not enough to drain the Twitter API tweet queue.

+*Proposed solution*+

I analyzed the `*GetTwitter.java*` implementation and noticed that the `*onTrigger()*` method reconnects (`*client.reconnect();*`) to the Twitter endpoint on `*HTTP_ERROR*`.

The issue here is that `*HTTP/1.1 420 Enhance Your Calm*` messages are `*HTTP_ERROR*` events, but the Twitter HBC library client (com.twitter.hbc) already manages reconnection. The Twitter HBC library client retries on its own with an increasing wait delay, with 5 retries by default.

Moreover, it seems that `*client.reconnect();*` does not work in my case, and because that method is called too often, the client gets kicked off the Twitter API even sooner.

My proposed solution is the following (tested in my local development environment):

*1. Let the Twitter HBC library client handle connection retries on `HTTP/1.1 420 Enhance Your Calm` messages.*

The `*onTrigger()*` method should be updated to not try to reconnect in case of an `*HTTP_ERROR*` with a message equal to `*HTTP/1.1 420 Enhance Your Calm*`:
{code:java}
case HTTP_ERROR:
    // Skip the manual reconnect on 420: the HBC client already retries with an increasing delay.
    if (!event.getMessage().equals("HTTP/1.1 420 Enhance Your Calm")) {
        getLogger().error("Received error {}: {}. Will attempt to reconnect", new Object[]{event.getEventType(), event.getMessage()});
        client.reconnect();
    } else {
        getLogger().error("Received error {}: {}. Will not attempt to reconnect", new Object[]{event.getEventType(), event.getMessage()});
    }
    break;
{code}

*2. Parameterize the maximum number of connection retries*

I also noticed that the default number of retries in the Twitter HBC library (5) is sometimes too low. So it would be useful to add a GetTwitter processor property named `*Max Connection Retries*`. In my usage I found that `*10*` is a good value.

Then update the `*onScheduled()*` method with this line (replacing `*10*` with the value of `*Max Connection Retries*`):
{code:java}
clientBuilder.retries(10); // default value is 5
{code}
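The two proposals above boil down to one decision rule and one configuration value, which can be exercised in isolation. The following is a minimal sketch under stated assumptions, not the GetTwitter implementation: `StreamEventType`, `StreamEvent` and `ReconnectPolicy` are hypothetical stand-ins for the HBC event types, and the default of 5 retries is taken from the description above rather than checked against the library.

```java
/** Demo entry point: prints the decision for a few sample events. */
class ReconnectPolicyDemo {
    public static void main(String[] args) {
        StreamEvent calm = new StreamEvent(StreamEventType.HTTP_ERROR, ReconnectPolicy.ENHANCE_YOUR_CALM);
        StreamEvent other = new StreamEvent(StreamEventType.HTTP_ERROR, "HTTP/1.1 503 Service Unavailable");
        System.out.println("reconnect on 420: " + ReconnectPolicy.shouldReconnectManually(calm));
        System.out.println("reconnect on 503: " + ReconnectPolicy.shouldReconnectManually(other));
        System.out.println("retries for \"10\": " + ReconnectPolicy.maxRetries("10"));
    }
}

/** Stand-in for the client's event types; only the two named in the ticket. */
enum StreamEventType { HTTP_ERROR, STOPPED_BY_ERROR }

/** Stand-in for a client event carrying a type and a message. */
final class StreamEvent {
    final StreamEventType type;
    final String message;

    StreamEvent(StreamEventType type, String message) {
        this.type = type;
        this.message = message;
    }
}

/** Hypothetical helper capturing proposals 1 and 2 from the ticket. */
final class ReconnectPolicy {
    static final String ENHANCE_YOUR_CALM = "HTTP/1.1 420 Enhance Your Calm";
    static final int DEFAULT_RETRIES = 5; // default reported in the ticket

    private ReconnectPolicy() {
    }

    /** Proposal 1: reconnect manually on HTTP errors, except on 420 responses. */
    static boolean shouldReconnectManually(StreamEvent event) {
        return event.type == StreamEventType.HTTP_ERROR
                && !ENHANCE_YOUR_CALM.equals(event.message);
    }

    /** Proposal 2: read a "Max Connection Retries" value, falling back to the default. */
    static int maxRetries(String propertyValue) {
        if (propertyValue == null || propertyValue.trim().isEmpty()) {
            return DEFAULT_RETRIES;
        }
        int parsed = Integer.parseInt(propertyValue.trim());
        if (parsed < 0) {
            throw new IllegalArgumentException("Max Connection Retries must not be negative");
        }
        return parsed;
    }
}
```

Keeping the rule in a pure function like this makes it easy to unit test without a Twitter connection; in the processor, the result of `maxRetries(...)` would be the value passed to `clientBuilder.retries(...)`.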