[ 
https://issues.apache.org/jira/browse/NIP-14?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18031426#comment-18031426
 ] 

endzeit commented on NIP-14:
----------------------------

+1

> Support Dynamic Discovery of Data Ingress Ports
> -----------------------------------------------
>
>                 Key: NIP-14
>                 URL: https://issues.apache.org/jira/browse/NIP-14
>             Project: NiFi Improvement Proposal
>          Issue Type: Improvement
>            Reporter: Kevin Doran
>            Assignee: Kevin Doran
>            Priority: High
>
> *Motivation*
> Increasingly, NiFi clusters are being deployed into environments where 
> network traffic is managed through a non-NiFi component, such as a gateway, 
> ingress controller, load balancer, or reverse proxy. These include cloud and 
> containerized deployment environments. NiFi infrastructure is sometimes 
> managed via infrastructure as code (IaC) frameworks as part of a larger 
> system.
> Additionally, many NiFi operators desire to deploy flow definitions 
> programmatically via automated deployment pipelines. Flow versioning and 
> promotion of flow definitions from non-production to production NiFi clusters 
> has become a best practice in the community.
> The combination of these factors leads to situations where it would be 
> advantageous if data ingress ports defined as part of a flow definition (eg, 
> a ListenHTTP processor) could be dynamically discovered by infrastructure and 
> deployment software in order to automatically configure managed networking 
> components responsible for managing ingress traffic to NiFi clusters.
> *Scope and Description*
> There are currently 9 processors and 1 controller service included in the 
> NiFi source code distribution that define Listen Ports:
>  - HandleHTTPRequest
>  - ListenFTP
>  - ListenHTTP
>  - ListenOTLP
>  - ListenSyslog
>  - ListenTCP
>  - ListenTrapSNMP
>  - ListenUDP
>  - ListenUDPRecord
>  - JettyWebSocketServer (CS)
> (Note that ListenSlack uses the Listen* naming convention, but this 
> processor actually initiates a two-way TCP connection with a Slack workspace 
> in order to start receiving events. It does not create a server process to 
> accept inbound connections.)
> Of course, it is possible that community members have developed (or could develop) 
> custom NiFi extensions that create Listen Ports. Apache NiFi components 
> usually serve as example, reference implementations in these cases. The goal 
> of this feature is to introduce standard interfaces for declaring data 
> ingress ports in components as well as framework mechanisms for a standard 
> discovery process of such ports, making it possible to support Apache NiFi 
> provided components as well as external, third-party extensions.
> All of the example components listed above are ConfigurableComponents that 
> define a Port property that establishes the numbered port that is bound to in 
> order to listen to payloads and connections from external clients. A new 
> Property Descriptor field in nifi-api would be a natural place to annotate 
> that a property defines a Listen Port. This would allow discovery of Listen 
> Ports both at runtime by the framework, as well as in flow definitions, which 
> is very advantageous as it allows determining network ingress requirements 
> based solely on the static flow definition before it has even been deployed 
> into a NiFi runtime environment.
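
To make the idea above concrete, here is a minimal sketch of how a Listen Port marker on a property descriptor could be used to discover ingress ports from a static flow definition. It does not use the real nifi-api: `PropertySpec`, `ComponentSpec`, `Transport`, and `discover` are hypothetical stand-ins, and the property names and port values are illustrative only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these types are the real nifi-api classes.
final class ListenPortDiscoverySketch {

    enum Transport { TCP, UDP }

    // Stand-in for a property descriptor. listenTransport is null for ordinary properties.
    static final class PropertySpec {
        final String name;
        final Transport listenTransport;

        PropertySpec(String name, Transport listenTransport) {
            this.name = name;
            this.listenTransport = listenTransport;
        }
    }

    // Stand-in for one component (processor or controller service) in a flow definition.
    static final class ComponentSpec {
        final String type;
        final List<PropertySpec> descriptors;
        final Map<String, String> values;

        ComponentSpec(String type, List<PropertySpec> descriptors, Map<String, String> values) {
            this.type = type;
            this.descriptors = descriptors;
            this.values = values;
        }
    }

    // Report every configured property that is marked as a Listen Port,
    // formatted as "componentType:transport:port".
    static List<String> discover(List<ComponentSpec> flow) {
        List<String> found = new ArrayList<>();
        for (ComponentSpec component : flow) {
            for (PropertySpec property : component.descriptors) {
                String value = component.values.get(property.name);
                if (property.listenTransport != null && value != null) {
                    found.add(component.type + ":" + property.listenTransport + ":" + value);
                }
            }
        }
        return found;
    }

    // Illustrative flow with two listen components; names and values are made up.
    static List<ComponentSpec> exampleFlow() {
        ComponentSpec http = new ComponentSpec(
                "ListenHTTP",
                List.of(new PropertySpec("Listening Port", Transport.TCP),
                        new PropertySpec("Base Path", null)),
                Map.of("Listening Port", "8081", "Base Path", "contentListener"));
        ComponentSpec udp = new ComponentSpec(
                "ListenUDP",
                List.of(new PropertySpec("Port", Transport.UDP)),
                Map.of("Port", "5140"));
        return List.of(http, udp);
    }

    public static void main(String[] args) {
        System.out.println(discover(exampleFlow()));
    }
}
```

Because the transport is attached to the descriptor rather than to a runtime value, the same walk works on a flow definition that has never been deployed.
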
> In addition to the Property Descriptor changes in NiFi API described above, 
> the NiFi Framework and NiFi REST API would be modified to dynamically 
> discover configurable components (ie, processors and controller services) 
> containing Listen Ports and list those components along with their current 
> configuration programmatically.
> When looking at how a Listen Port should be defined, the most important 
> aspect is the layer 4 transport protocol, as that is usually the most 
> relevant information required to automatically configure external networking 
> components such as gateways, ingress controllers, load balancers, and 
> proxies. Of secondary importance is the layer 7 application protocol, if any. 
> Looking at existing projects that have solved similar problems, we can take 
> Kubernetes as an example. Kubernetes allows services, which can be arbitrary 
> containers running any process, to declare exposed ports. For service ports, 
> the Kubernetes network data model just allows for declaring port number and 
> [transport 
> protocol|https://kubernetes.io/docs/reference/networking/service-protocols/] 
> using well-defined enum values. Optionally, [application 
> protocols|https://kubernetes.io/docs/concepts/services-networking/service/#application-protocol]
>  can be provided as a hint using freeform strings and, when available, can be 
> used by the provider for richer support of the app protocol. This data model 
> serves as a good guide for NiFi to model a similar situation, just replacing 
> services with NiFi extensions.
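
A rough sketch of the data model described above, assuming a port number, an enum transport protocol, and an optional freeform application protocol. `ListenPort`, `TransportProtocol`, and `toServicePortYaml` are hypothetical names, not taken from the draft nifi-api branch. The rendering helper only shows how such a value could map onto the `port`, `protocol`, and `appProtocol` fields of a Kubernetes Service port entry.

```java
import java.util.Optional;

// Hypothetical sketch of a Listen Port value modeled after Kubernetes service ports.
final class ListenPortModelSketch {

    // Layer 4 transport, restricted to well-defined values.
    enum TransportProtocol { TCP, UDP }

    static final class ListenPort {
        final int portNumber;
        final TransportProtocol transport;
        // Layer 7 hint as a freeform string; empty when no hint is given.
        final Optional<String> applicationProtocol;

        ListenPort(int portNumber, TransportProtocol transport, String applicationProtocolOrNull) {
            this.portNumber = portNumber;
            this.transport = transport;
            this.applicationProtocol = Optional.ofNullable(applicationProtocolOrNull);
        }
    }

    // Render one entry for the "ports" list of a Kubernetes Service manifest.
    static String toServicePortYaml(String name, ListenPort listenPort) {
        StringBuilder yaml = new StringBuilder();
        yaml.append("- name: ").append(name).append('\n');
        yaml.append("  port: ").append(listenPort.portNumber).append('\n');
        yaml.append("  protocol: ").append(listenPort.transport.name()).append('\n');
        // The application protocol is only a hint, so it is emitted only when present.
        listenPort.applicationProtocol.ifPresent(
                app -> yaml.append("  appProtocol: ").append(app).append('\n'));
        return yaml.toString();
    }

    public static void main(String[] args) {
        ListenPort http = new ListenPort(8081, TransportProtocol.TCP, "http");
        ListenPort syslog = new ListenPort(5140, TransportProtocol.UDP, null);
        System.out.print(toServicePortYaml("listen-http", http));
        System.out.print(toServicePortYaml("listen-syslog", syslog));
    }
}
```
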
> A proposed, draft update to the NiFi API to introduce the concept of Listen 
> Ports is available here:
> [https://github.com/kevdoran/nifi-api/tree/listen-ports] 
> *Compatibility*
> Largely, this is a backwards compatible change.
> The proposed approach is entirely additive and optional: Once implemented and 
> released, extension components can opt in to declaring Listen Ports that they 
> create. The burden to do so is minimal for component authors; in most cases, 
> a few lines of code. Once done, flow registries and operational tools built 
> atop them can dynamically discover flows that require data ingress rules, and 
> the NiFi framework can dynamically discover components that provide data 
> ingress ports and their port configuration to make it available via the REST 
> API for external components. These offer incentives for component authors to 
> opt in to this feature, without any breaking changes that force them to 
> update their components to continue working on newer NiFi versions.
> There is one minor breaking change to the properties for the ListenSyslog 
> processor. Syslog is an application protocol that can work over TCP or UDP. 
> The current ListenSyslog processor allows specifying the port to listen on, with 
> a second property for specifying the transport protocol to accept (TCP or 
> UDP). The proposed design for the NiFi API Property Descriptor would 
> declare a static transport protocol associated with a Listen Port. This 
> allows knowing the transport protocol based on a flow definition without 
> knowing what the runtime configuration will be, which greatly simplifies 
> rules NiFi operators may want to codify, such as whether a flow definition is 
> compatible with a target NiFi Runtime (eg, an operator may by policy block 
> all inbound UDP traffic for security reasons).
> The proposed solution to this is replacing the ListenSyslog processor Port 
> and Protocol properties with TCP Port and UDP Port properties that are 
> mutually exclusive (only one is allowed to be configured at a time). The 
> migrate properties feature that was introduced for in-place flow version 
> changes will allow us to migrate flows from the old configuration to the new 
> configuration automatically for users, and migration guidance for the first 
> NiFi release to include the modified processor can cover the remaining cases.
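
As an illustration of the migration described above, here is a small sketch that moves the old Port and Protocol values onto TCP Port or UDP Port. It deliberately works on a plain `Map` rather than the NiFi property migration API, and `migrate` and `isValid` are hypothetical helpers. Two assumptions are made: configurations with a missing or unrecognized protocol are left untouched for manual migration, and a valid new configuration sets exactly one of the two port properties.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the ListenSyslog property migration, using a plain Map
// in place of the NiFi property migration API.
final class SyslogMigrationSketch {

    // Move the old Port value to "TCP Port" or "UDP Port" based on the old Protocol value.
    static Map<String, String> migrate(Map<String, String> oldProperties) {
        Map<String, String> updated = new HashMap<>(oldProperties);
        String port = oldProperties.get("Port");
        String protocol = oldProperties.get("Protocol");
        if (port == null || protocol == null) {
            return updated; // assumption: nothing to migrate automatically
        }
        String target;
        if ("TCP".equalsIgnoreCase(protocol.trim())) {
            target = "TCP Port";
        } else if ("UDP".equalsIgnoreCase(protocol.trim())) {
            target = "UDP Port";
        } else {
            return updated; // assumption: unknown protocols are left for manual migration
        }
        updated.remove("Port");
        updated.remove("Protocol");
        updated.put(target, port);
        return updated;
    }

    // Assumption: the two new properties are mutually exclusive and one must be set.
    static boolean isValid(Map<String, String> properties) {
        return properties.containsKey("TCP Port") ^ properties.containsKey("UDP Port");
    }

    public static void main(String[] args) {
        Map<String, String> old = new HashMap<>();
        old.put("Port", "514");
        old.put("Protocol", "UDP");
        Map<String, String> migrated = migrate(old);
        System.out.println(migrated + " valid=" + isValid(migrated));
    }
}
```
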
> *Verification*
> The Verification process for this feature would include:
>  - Unit tests for components that define Listen Ports to make sure that they 
> implement the new interfaces, and that when configured, the correct Listen 
> Port definition is discoverable (ie, the return value for new interfaces 
> matches the expected values based on the component configuration)
>  - Integration tests that verify when multiple Listen components exist in a 
> NiFi Cluster, the Listen Ports they create are correctly discoverable via the 
> NiFi REST API.
>  - Instructions for peer-reviewers to manually verify the feature 
> implementation.
>  - Documentation updates, primarily to the NiFi Developer Guide, to add 
> instructions and guidance for implementing new listen components in a manner 
> that is compatible with this new feature.
> *Alternatives*
> The following alternatives were considered:
>  # {_}No changes to NiFi{_}; instead put the responsibility of discovering 
> Listen Ports solely on external components, such as deployment scripts and 
> infrastructure management logic. For example, external code could just "know" 
> (ie, using hardcoded logic) that ListenHTTP defines a data ingress port via a 
> property that accepts HTTP requests and look for instances of that known 
> Processor type. Alternatively, we could go with a convention-based approach 
> such as "processors that have a type name starting with Listen* and a Port 
> property." Both of these are very brittle, lack discoverability by extension 
> component authors, and do not account for the large community of NiFi 
> developers that may use different conventions than those used for Apache NiFi 
> components. For these reasons, this alternative was deemed insufficient.
>  # {_}A larger change that also extends the new port discovery feature 
> to include the various framework-level ingress ports{_}, such as the remote 
> input port used for site-to-site protocol or cluster communication ports. For 
> example, maybe a new framework-level concept of a NiFi Gateway allows 
> operators and flow authors to put all network ingress rules in one place. 
> While this may have some advantages, and could be considered in the future, 
> it was ultimately deemed too large a change at this time. It likely would 
> include breaking changes to configuration files and APIs, and therefore 
> would be more reasonable to implement as a NiFi 3.0 / major version change, 
> should the need ever arise. Additionally, framework-level ingress ports are 
> defined via a different process (usually in nifi.properties) and once set do 
> not change often; therefore, they are already much more manageable by 
> something like IaC logic and less of a problem compared to ports that are 
> part of flow definitions. This means that including them in the scope of this 
> feature offers much less value despite greatly increasing the scope.
> The proposed scope and description offer the most benefit for minimal effort 
> with practically no breaking changes.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
