[
https://issues.apache.org/jira/browse/NIFI-5663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16639938#comment-16639938
]
ASF GitHub Bot commented on NIFI-5663:
--------------------------------------
GitHub user markap14 opened a pull request:
https://github.com/apache/nifi/pull/3047
NIFI-5663: Ensure that when sorting Node Identifiers we use both the
node's API Address and API Port, in case 2 nodes are running on the same
host. Also ensure that when the Local Node ID is determined we update all Load
Balancing Partitions, if necessary
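As an illustration of the ordering described above, here is a minimal sketch of sorting node identifiers by API address and then by API port. The `NodeId` class and its field names are hypothetical stand-ins chosen for this example; they are not taken from the patch, and the actual change may be implemented differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class NodeIdOrdering {

    // Hypothetical stand-in for a node identifier; only the two fields
    // mentioned in the PR title are modeled here.
    static final class NodeId {
        final String apiAddress;
        final int apiPort;

        NodeId(String apiAddress, int apiPort) {
            this.apiAddress = apiAddress;
            this.apiPort = apiPort;
        }

        @Override
        public String toString() {
            return apiAddress + ":" + apiPort;
        }
    }

    // Compare by address first; fall back to the port so that two nodes
    // on the same host still get a well-defined relative order.
    static final Comparator<NodeId> BY_ADDRESS_THEN_PORT =
            Comparator.comparing((NodeId n) -> n.apiAddress)
                      .thenComparingInt(n -> n.apiPort);

    // Returns the sorted identifiers as strings, leaving the input untouched.
    static List<String> sorted(List<NodeId> nodes) {
        List<NodeId> copy = new ArrayList<>(nodes);
        copy.sort(BY_ADDRESS_THEN_PORT);
        List<String> out = new ArrayList<>();
        for (NodeId n : copy) {
            out.add(n.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        // Two nodes on the same host, discovered in a different order by each observer.
        List<NodeId> seenByFirst = Arrays.asList(new NodeId("localhost", 8081), new NodeId("localhost", 8080));
        List<NodeId> seenBySecond = Arrays.asList(new NodeId("localhost", 8080), new NodeId("localhost", 8081));
        System.out.println(sorted(seenByFirst));   // [localhost:8080, localhost:8081]
        System.out.println(sorted(seenBySecond));  // [localhost:8080, localhost:8081]
    }
}
```

Comparing on the address alone would treat both nodes in this example as equal, so their relative order would depend on the order in which each observer happened to see them.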
Thank you for submitting a contribution to Apache NiFi.
In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:
### For all changes:
- [ ] Is there a JIRA ticket associated with this PR? Is it referenced
in the commit message?
- [ ] Does your PR title start with NIFI-XXXX where XXXX is the JIRA number
you are trying to resolve? Pay particular attention to the hyphen "-" character.
- [ ] Has your PR been rebased against the latest commit within the target
branch (typically master)?
- [ ] Is your initial contribution a single, squashed commit?
### For code changes:
- [ ] Have you ensured that the full suite of tests is executed via mvn
-Pcontrib-check clean install at the root nifi folder?
- [ ] Have you written or updated unit tests to verify your changes?
- [ ] If adding new dependencies to the code, are these dependencies
licensed in a way that is compatible for inclusion under [ASF
2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [ ] If applicable, have you updated the LICENSE file, including the main
LICENSE file under nifi-assembly?
- [ ] If applicable, have you updated the NOTICE file, including the main
NOTICE file found under nifi-assembly?
- [ ] If adding new Properties, have you added .displayName in addition to
.name (programmatic access) for each of the new properties?
### For documentation related changes:
- [ ] Have you ensured that format looks appropriate for the output in
which it is rendered?
### Note:
Please ensure that once the PR is submitted, you check travis-ci for build
issues and submit an update to your PR as soon as possible.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/markap14/nifi NIFI-5663
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/nifi/pull/3047.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #3047
----
commit 619f1ffe8fbbca61bc5545f13920190a77006e08
Author: Mark Payne <markap14@...>
Date: 2018-06-14T15:57:21Z
NIFI-5516: Implement Load-Balanced Connections
Refactoring StandardFlowFileQueue to have an AbstractFlowFileQueue
Refactored more into AbstractFlowFileQueue
Added documentation, cleaned up code some
Refactored FlowFileQueue so that there is SwappablePriorityQueue
Several unit tests written
Added REST API Endpoint to allow PUT to update connection to use load
balancing or not. When enabling load balancing, though, I saw the queue size go
from 9 to 18. Then was only able to process 9 FlowFiles.
Bug fixes
Code refactoring
Added integration tests, bug fixes
Refactored clients to use NIO
Bug fixes. Appears to finally be working with NIO Client!!!!!
NIFI-5516: Refactored some code from NioAsyncLoadBalanceClient to
LoadBalanceSession
Bug fixes and allowed load balancing socket connections to be reused
Implemented ability to compress Nothing, Attributes, or Content +
Attributes when performing load-balancing
Added flag to ConnectionDTO to indicate Load Balance Status
Updated Diagnostics DTO for connections
Store state about cluster topology in NodeClusterCoordinator so that the
state is known upon restart
Code cleanup
Fixed checkstyle and unit tests
NIFI-5516: Updating logic for Cluster Node Firewall so that the node's
identity comes from its certificate, not from whatever it says it is.
NIFI-5516: Fixed missing License headers
NIFI-5516: Some minor code cleanup
NIFI-5516: Addressed review feedback; Bug fixes; some code cleanup.
Changed dependency on nifi-registry from SNAPSHOT to official 0.3.0 release
NIFI-5516: Take backpressure configuration into account
NIFI-5516: Fixed ConnectionDiagnosticsSnapshot to include node identifier
NIFI-5516: Addressed review feedback
This closes #2947
commit 47bbe20f9ee3c348378adfa86965f34a319057bb
Author: Mark Payne <markap14@...>
Date: 2018-10-05T15:12:49Z
NIFI-5663: Ensure that when sorting Node Identifiers we use both the
node's API Address and API Port, in case 2 nodes are running on the same
host. Also ensure that when the Local Node ID is determined we update all Load
Balancing Partitions, if necessary
----
> FlowFile load balancing keeps re-partitioning
> ---------------------------------------------
>
> Key: NIFI-5663
> URL: https://issues.apache.org/jira/browse/NIFI-5663
> Project: Apache NiFi
> Issue Type: Bug
> Components: Core Framework
> Affects Versions: 1.8.0
> Reporter: Koji Kawamura
> Assignee: Mark Payne
> Priority: Critical
>
> Scenario
> # Start a brand-new cluster with only 1 node (nifi0). An existing
> multi-node cluster does not reproduce the issue.
> # Create GenerateFlowFile -> LogAttribute
> # Then set the 'Partition by attribute' LB strategy on the connection
> # Add a 2nd node, nifi1
> # Generate some FlowFiles. The load-balance activity then never finishes.
> With a 2-node cluster, for some reason, each node ended up with a different
> queuePartitions order in SocketLoadBalancedFlowFileQueue. By adding debug
> logs, I found that each node has the following:
> * nifi0
> ** queuePartitions[0] =
> RemoteQueuePartition[queueId=14ac9634-0166-1000-ffff-ffffd9ae7f4b,
> nodeId=nifi1.example.com:8080]
> ** queuePartitions[1] =
> SwappablePriorityQueueLocalPartition[queueId=14ac9634-0166-1000-ffff-ffffd9ae7f4b]
> * nifi1
> ** queuePartitions[0] =
> RemoteQueuePartition[queueId=14ac9634-0166-1000-ffff-ffffd9ae7f4b,
> nodeId=nifi0.example.com:8080]
> ** queuePartitions[1] =
> SwappablePriorityQueueLocalPartition[queueId=14ac9634-0166-1000-ffff-ffffd9ae7f4b]
> Because of this, with the 'Partition by attribute' LB strategy the two nodes
> keep sending received FlowFiles back and forth to each other whenever the
> calculated attribute value hash points to queuePartitions[0]. The following
> log message is written endlessly:
> {code:java}
> 2018-10-05 07:09:32,372 DEBUG [Load Balance Server Thread-3]
> o.a.n.c.q.c.SocketLoadBalancedFlowFileQueue Received the following FlowFiles
> from Peer: ...offset=7452,
> length=180],offset=162,name=10653317458635,size=18]]. Will re-partition
> FlowFiles to ensure proper balancing across the cluster.
> {code}
> SocketLoadBalancedFlowFileQueue maintains queuePartitions by listening for
> cluster topology changes using ClusterTopologyEventListener. The
> SocketLoadBalancedFlowFileQueueClusterEventListener.onNodeAdded debug log
> shows that the array was empty when the 2nd node (nifi1) was added:
> {code:java}
> ClusterEventListener.onNodeAdded. 2018-10-05 07:06:42,883 DEBUG [Process
> Cluster Protocol Request-10] o.a.n.c.q.c.SocketLoadBalancedFlowFileQueue Node
> Identifier nifi1.example.com:8080 added to cluster. Node ID's changing from
> [] to [nifi1.example.com:8080]
> {code}
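The re-partitioning loop described in the issue can be reproduced with a small model. The sketch below is an illustration only: it assumes a partition is chosen as the attribute hash modulo the number of partitions, and the class and method names are hypothetical rather than taken from SocketLoadBalancedFlowFileQueue. Each list holds the node that owns each partition index, as seen from one node.

```java
import java.util.Arrays;
import java.util.List;

final class PartitionOrderDemo {

    // Assumption for this sketch: the partition index is the attribute hash
    // modulo the partition count. Returns the node owning that partition.
    static String owner(List<String> partitionOwners, int attributeHash) {
        return partitionOwners.get(Math.floorMod(attributeHash, partitionOwners.size()));
    }

    // Follows one FlowFile that starts on nifi0 and counts the transfers needed
    // until the node holding it also considers itself the owner.
    // Returns -1 if the FlowFile is still moving after maxHops transfers.
    static int hopsUntilSettled(List<String> orderOnNifi0, List<String> orderOnNifi1,
                                int attributeHash, int maxHops) {
        String current = "nifi0";
        for (int hops = 0; hops < maxHops; hops++) {
            List<String> view = current.equals("nifi0") ? orderOnNifi0 : orderOnNifi1;
            String target = owner(view, attributeHash);
            if (target.equals(current)) {
                return hops;
            }
            current = target; // the FlowFile is sent to the other node
        }
        return -1;
    }

    public static void main(String[] args) {
        // Orders from the debug output: each node lists its remote peer first.
        List<String> observedOnNifi0 = Arrays.asList("nifi1", "nifi0");
        List<String> observedOnNifi1 = Arrays.asList("nifi0", "nifi1");
        System.out.println(hopsUntilSettled(observedOnNifi0, observedOnNifi1, 0, 10)); // -1

        // Same order on both nodes: the FlowFile settles after at most one transfer.
        List<String> consistent = Arrays.asList("nifi0", "nifi1");
        System.out.println(hopsUntilSettled(consistent, consistent, 0, 10)); // 0
        System.out.println(hopsUntilSettled(consistent, consistent, 1, 10)); // 1
    }
}
```

With the observed orders, index 0 means "the other node" on both sides, so a FlowFile hashing to that index never comes to rest. When both nodes agree on the order, every index refers to the same node everywhere.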