[ 
https://issues.apache.org/jira/browse/NIFI-13077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17851973#comment-17851973
 ] 

Pierre Villard commented on NIFI-13077:
---------------------------------------

As a side note, I think it is good to also keep a close eye on what [~bbende] 
is doing in NIFI-13343.

I think we all want to go in the same direction. Let's not duplicate efforts :)

> On-demand Extension Provider
> ----------------------------
>
>                 Key: NIFI-13077
>                 URL: https://issues.apache.org/jira/browse/NIFI-13077
>             Project: Apache NiFi
>          Issue Type: Epic
>          Components: Core Framework
>            Reporter: Pierre Villard
>            Priority: Major
>
> We currently have the concept of *ExternalResourceProvider* with two 
> implementations (HDFS and NiFi Registry) that can be configured to list and 
> download all NARs made available in those locations. When configured, those 
> implementations start with NiFi and download ALL of the available NARs; a 
> background thread then checks every five minutes for new NARs to download.
> The proposal here is a similar concept focused on extensions / components. 
> Instead of running a background thread and downloading all of the 
> components, the approach would be to plug this into the *ExtensionBuilder*: 
> when a component cannot be instantiated (when loading a flow definition) 
> with locally available components, then, instead of creating a ghost 
> component, the Extension Providers would be queried with specific 
> coordinates. If a provider makes the component available, the NAR would be 
> downloaded (alongside required dependencies if the NAR depends on another 
> NAR).
> This approach already exists in the *Kafka Connect NiFi plugin* with the 
> class {*}ExtensionClientDefinition{*}. By adopting this approach in NiFi, 
> it’d be much easier to ship a much *smaller version of NiFi* and have NiFi 
> download the required components based on flows that are being instantiated / 
> deployed.
> The operation of downloading the NAR would not be blocking, meaning that we 
> would still create a ghost component but after completion of the NAR(s) 
> download and the loading of the components, the flows would be fully 
> operational.
> It might be possible to show something similar to what we do for the Python 
> extensions, where we indicate that the component is still downloading 
> third-party dependencies.
> While this is a great opportunity to reduce the size of the NiFi binary (and 
> associated container image), it would not be great from a user perspective 
> when designing flows because all of the NARs removed from the default image 
> would no longer be visible in the list of available components when adding, 
> for example, a processor to the canvas.
> Longer term we could imagine that the extension providers can also implement 
> a listing API so that when showing the list of available components, we would 
> show the list of the components available locally as well as the components 
> available through the extension providers. The listing of components could 
> add another column to indicate the source of the component.
> This is something that is exposed for the Extension Bundles in the NiFi 
> Registry (we also have the information about the NiFi API version that has 
> been used for building the components so we could use this information to 
> only list components that should be compatible from an API standpoint - same 
> major version but lower or equal API version).
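
The compatibility rule quoted above (same major version, lower or equal API version) could be checked roughly as follows. This is a minimal sketch: the `ApiCompatibility` class, the `major.minor.patch` parsing, and the choice to ignore qualifiers such as `-SNAPSHOT` are assumptions for illustration, not existing NiFi or NiFi Registry code.

```java
/** Hypothetical helper illustrating the API compatibility rule; not part of NiFi. */
final class ApiCompatibility {
    private ApiCompatibility() {}

    /** Parses "major.minor.patch"; missing parts default to 0, qualifiers after '-' are ignored. */
    static int[] parse(String version) {
        String core = version.split("-", 2)[0];
        String[] parts = core.split("\\.");
        int[] numbers = new int[3];
        for (int i = 0; i < numbers.length && i < parts.length; i++) {
            numbers[i] = Integer.parseInt(parts[i]);
        }
        return numbers;
    }

    /**
     * Returns true when the component was built against the same major API version
     * as the running instance, and against an equal or lower minor/patch version.
     */
    static boolean isCompatible(String componentApiVersion, String runtimeApiVersion) {
        int[] component = parse(componentApiVersion);
        int[] runtime = parse(runtimeApiVersion);
        if (component[0] != runtime[0]) {
            return false; // different major version
        }
        if (component[1] != runtime[1]) {
            return component[1] < runtime[1];
        }
        return component[2] <= runtime[2];
    }
}
```

A listing API could apply such a filter to each bundle before showing it, so that only components expected to load are offered to the user.
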
> The immediate goal though would be to introduce the concept of 
> ExtensionProvider with the following APIs:
> {code:java}
> boolean isAvailableExtension(Coordinates)
> void downloadExtension(Coordinates)
> {code}
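
The two proposed methods could be wired into a provider lookup along these lines. This is only a sketch under stated assumptions: `Coordinates` as a group/artifact/version triple, the `ExtensionResolver` class, the "first provider that has it wins" rule, and the use of `CompletableFuture` for the non-blocking download are all hypothetical, not existing NiFi code.

```java
import java.util.List;
import java.util.Optional;
import java.util.concurrent.CompletableFuture;

/** Hypothetical coordinates type: group, artifact and version of a NAR. */
final class Coordinates {
    final String group;
    final String artifact;
    final String version;

    Coordinates(String group, String artifact, String version) {
        this.group = group;
        this.artifact = artifact;
        this.version = version;
    }
}

/** The two methods proposed in the issue, written out as an interface. */
interface ExtensionProvider {
    boolean isAvailableExtension(Coordinates coordinates);

    void downloadExtension(Coordinates coordinates);
}

/** Hypothetical resolver that queries the configured providers in order. */
final class ExtensionResolver {
    private final List<ExtensionProvider> providers;

    ExtensionResolver(List<ExtensionProvider> providers) {
        this.providers = List.copyOf(providers);
    }

    /** Returns the first provider that reports the extension as available, if any. */
    Optional<ExtensionProvider> findProvider(Coordinates coordinates) {
        return providers.stream()
                .filter(provider -> provider.isAvailableExtension(coordinates))
                .findFirst();
    }

    /**
     * Starts the lookup and download without blocking the caller, which can create
     * a ghost component in the meantime. The future completes with true once a
     * provider has downloaded the NAR, or false when no provider offers it.
     */
    CompletableFuture<Boolean> resolveAsync(Coordinates coordinates) {
        return CompletableFuture.supplyAsync(() -> {
            Optional<ExtensionProvider> provider = findProvider(coordinates);
            provider.ifPresent(found -> found.downloadExtension(coordinates));
            return provider.isPresent();
        });
    }
}
```

The caller could attach a callback to the returned future to replace the ghost component once the download and component loading have completed.
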
> Longer term we could also consider something like:
> {code:java}
> List<Extensions> listExtensions()
> {code}
> But we would need to figure out how a NAR can provide the information about 
> the components that are inside of it. The NiFi Registry provides this 
> information, but that would not be the case for a Maven based implementation 
> for example.
> In nifi.properties we would have something looking like:
> {code:java}
> nifi.nar.extension.provider.<identifier>.<property-name>
> {code}
> And we would loop through all the configured providers to find the 
> appropriate NAR to download based on provided coordinates in the flow 
> definition that is being instantiated (either from flow.json.gz, or an 
> uploaded JSON flow definition, or when checking out a flow from a registry 
> client).
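
Reading the proposed `nifi.nar.extension.provider.<identifier>.<property-name>` keys could look something like the sketch below. The `ExtensionProviderProperties` class is hypothetical, and two choices here are assumptions rather than anything stated in the issue: everything after the first dot following the identifier is treated as the property name, and providers are ordered by identifier.

```java
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

/** Hypothetical helper that groups provider settings by provider identifier. */
final class ExtensionProviderProperties {
    static final String PREFIX = "nifi.nar.extension.provider.";

    private ExtensionProviderProperties() {}

    /** Returns identifier -> (property name -> value), sorted by identifier. */
    static Map<String, Map<String, String>> groupByProvider(Properties properties) {
        Map<String, Map<String, String>> byProvider = new TreeMap<>();
        for (String key : properties.stringPropertyNames()) {
            if (!key.startsWith(PREFIX)) {
                continue; // not an extension provider property
            }
            String remainder = key.substring(PREFIX.length());
            int dot = remainder.indexOf('.');
            if (dot <= 0 || dot == remainder.length() - 1) {
                continue; // need both an identifier and a property name
            }
            String identifier = remainder.substring(0, dot);
            String propertyName = remainder.substring(dot + 1);
            byProvider.computeIfAbsent(identifier, id -> new TreeMap<>())
                    .put(propertyName, properties.getProperty(key));
        }
        return byProvider;
    }
}
```

Each entry of the resulting map would then be used to construct one provider, and the providers would be queried in turn for the coordinates found in the flow definition.
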



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
