[
https://issues.apache.org/jira/browse/NIFI-6905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16985090#comment-16985090
]
Kourge commented on NIFI-6905:
------------------------------
I have implemented a solution (#2 in the ticket description).
I updated the *GetTwitter* processor's `*onScheduled()*` method to only create a
`*clientBuilder*` without connecting it to the Twitter API.
The connection is now initialized by the `*onTrigger()*` method when it is needed
(in primary-node-only mode, `*onTrigger()*` never runs on non-primary nodes).
I added `*onPrimaryNodeChange()*` to close the connection on
`*PRIMARY_NODE_REVOKED*` events.
Please review the pull request.
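
For readers who do not want to open the pull request, the lifecycle change described above can be sketched roughly as follows. This is a simplified, framework-free illustration and not the code from the pull request: `StreamClient`, `CountingClient`, `LazyTwitterSource`, and the local `PrimaryNodeState` enum are hypothetical stand-ins for the real client and NiFi types, and the method names simply mirror the ones mentioned in this comment.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical stand-in for the streaming client used by the processor.
interface StreamClient {
    void connect();
    void stop();
}

// Hypothetical mirror of the two primary-node events named in the ticket.
enum PrimaryNodeState { ELECTED_PRIMARY_NODE, PRIMARY_NODE_REVOKED }

// Test double that tracks how many connections are currently open.
class CountingClient implements StreamClient {
    private final AtomicInteger open;
    private boolean connected;

    CountingClient(AtomicInteger open) { this.open = open; }

    @Override public void connect() {
        if (!connected) { connected = true; open.incrementAndGet(); }
    }

    @Override public void stop() {
        if (connected) { connected = false; open.decrementAndGet(); }
    }
}

// Sketch of the lazy-connect pattern: build in onScheduled(), connect in onTrigger().
class LazyTwitterSource {
    private Supplier<StreamClient> clientBuilder; // prepared on every node
    private StreamClient client;                  // created only where onTrigger() runs

    // Runs on every node of the cluster: prepare the builder, but do not connect.
    synchronized void onScheduled(Supplier<StreamClient> builder) {
        this.clientBuilder = builder;
    }

    // Runs only on the primary node in primary-node-only mode: connect on first use.
    synchronized void onTrigger() {
        if (clientBuilder == null) {
            throw new IllegalStateException("onScheduled() has not been called");
        }
        if (client == null) {
            client = clientBuilder.get();
            client.connect();
        }
        // ... the real processor would poll queued messages here ...
    }

    // Close the connection when this node stops being the primary node.
    synchronized void onPrimaryNodeChange(PrimaryNodeState newState) {
        if (newState == PrimaryNodeState.PRIMARY_NODE_REVOKED) {
            closeClient();
        }
    }

    // Close the connection when the processor is stopped.
    synchronized void onStopped() {
        closeClient();
    }

    private void closeClient() {
        if (client != null) {
            client.stop();
            client = null;
        }
    }
}
```

With this shape, calling `onScheduled()` on three simulated nodes opens no connections; only the node whose `onTrigger()` runs connects, and a `PRIMARY_NODE_REVOKED` event releases that connection so the newly elected node can open its own.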
> GetTwitter processor, configured to run on primary node only, initializes
> connection to Twitter API from every NiFi cluster node, even on non-primary
> nodes
> -----------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: NIFI-6905
> URL: https://issues.apache.org/jira/browse/NIFI-6905
> Project: Apache NiFi
> Issue Type: Bug
> Components: Extensions
> Affects Versions: 1.0.0
> Reporter: Kourge
> Assignee: Kourge
> Priority: Major
> Labels: getTwitter
> Time Spent: 10m
> Remaining Estimate: 0h
>
> I have a *GetTwitter* processor running on a 3-node NiFi cluster and
> configured to be executed on the primary node only.
> The symptom is an excessive number of HTTP 420 ("Enhance Your
> Calm") exceptions when the GetTwitter processor starts.
> I ran the following tests:
> * With only 1 NiFi node running, I was able to start/stop the GetTwitter
> processor 10 times in a row without any errors.
> * With 2 NiFi nodes running, HTTP 420 errors occurred after a few start/stop
> cycles (sometimes even after a single start).
> After analyzing the source code, and knowing about
> https://issues.apache.org/jira/browse/NIFI-2592, I came to the conclusion that
> the GetTwitter processor initializes the connection to the Twitter API on
> each node of the cluster, even on non-primary nodes.
> The `*onScheduled()*` method runs on every node (see: NIFI-2592), making
> connections to Twitter with `*client.connect()*`. Then the `*onTrigger()*`
> method consumes the tweets normally from the primary node.
> The issue is that having more than one node initializing connections makes
> the Twitter API raise HTTP 420 errors.
> {code:java}
> ERROR
> org.apache.nifi.processors.twitter.GetTwitter
> GetTwitter[id=XYZ] Received error HTTP_ERROR: HTTP/1.1 420 Enhance Your Calm.
> Will attempt to reconnect
> {code}
> +*Proposed solutions:*+
> # Change the behavior of the `*onScheduled()*` method to run only on the
> primary node (as proposed in NIFI-2592)
> # Update the GetTwitter processor implementation to no longer call
> `*client.connect()*` from the `*onScheduled()*` method, but only when
> *PrimaryNodeState* changes to *ELECTED_PRIMARY_NODE* (and when
> *PrimaryNodeState* changes to *PRIMARY_NODE_REVOKED*, perform a
> `*client.stop()*`)