BTW, talking about mixin inheritance, shared dependencies, improved
classloading, and module repositories, I feel like OSGi is the
elephant in the room. I can see perfectly good reasons NOT to move to
an OSGi-backed architecture, but it does feel like we'd end up
implementing many of the same features and capabilities. Perhaps a
topic for a separate DISCUSS thread?

Regards,
Matt

On Wed, Jan 17, 2018 at 11:05 AM, Matt Burgess <[email protected]> wrote:
> I'd like to echo many of the comments / discussion points here,
> including the extension registry (#3), NAR packs, and mixins. A couple
> of additional comments and caveats:
>
> NAR package management:
>
> - Grouping NAR packs based on functionality (Hadoop, RDBMS, etc.) is a
> good start but it still seems like we'd want to end up with an a
> la carte capability at the end. An incremental approach might be to
> have a simple graphical tool (in the toolkit?) pointing at your NiFi
> install and some common repository, where you can add and delete NAR
> packs, but also delete individual NARs from your NiFi install. The use
> case here is when you download the Hadoop NAR pack for HBase and
> related components, but don't want things like the Hive NAR (which I
> think is the largest at ~93MB).
>
> - Some NiFi installs will be located on systems that cannot contact an
> outside (or any external) repository. When we consider NAR
> repositories, we should consider providing a repo-to-go or something
> of that sort. At the very least I would think the Extension Registry
> itself would support such a thing; the ability to have an Extension
> Registry anywhere, not just attached to Bintray or Apache repo HTTP
> pages, etc.
>
> - Murphy's Law says as soon as we pick NAR pack boundaries, there will
> be components that don't fit well into one or another, or they fit
> into more than one. For instance, a user might expect the Spark/Livy
> NAR to be in the Hadoop NAR pack but there is no requirement for Spark
> or Livy to run on Hadoop. Perhaps with a "Big Data" NAR pack (versus
> Hadoop) it would encompass the Hadoop and Spark stuff, but then where
> does Cassandra fit in? It certainly handles Big Data, but if there
> were a "NoSQL" NAR pack, which should it belong to (or can it be in
> both?).
>
> - Because NARs are unpacked before use in NiFi, there are two related
> footprints, the footprint of the NARs in the lib/ folder, and the
> footprint of the unpacked NARs. As part of the "duplicate JARs"
> discussion, this also segues into another area, the runtime footprint
> (to include classloader hierarchies, etc.)
>
> Optimized JARs/classloading
>
> - Promoting JARs to the lib/ folder because they are common to many
> processors is not the right solution IMO. With parent-first
> classloaders (which is what NarClassLoaders are), if you had a NAR
> that needed a different version of a library, then it would find the
> parent version first and would likely cause issues.  We could make the
> NarClassLoader self-first (which we might want to do under other
> circumstances anyway), but then care would need to be taken to ensure
> that shared/API dependencies are indeed "provided".
>
> - I do like the idea of "promotion" though, not just for JAR
> deduplication but also for better classloading. Here's an idea for how
> we might achieve this. When unpacking NARs, we would do something
> similar to a Maven install, where we build up a repository of
> artifacts. If two artifacts are the same (we'd likely want to verify
> checksums too, not just Maven coordinates), they'd install to the same
> place. At the end of NAR unpacking, the repo would contain unique
> (de-duplicated) JARs, and each NAR would have a bill-of-materials
> (BOM) from which to build its classloader. A possible runtime
> improvement on top of that is to build a classloader hierarchy, where
> JARs shared by multiple NARs could be in their own classloader, which
> would be the parent of the NARs' classloaders. This way, instead of
> the same classes loaded into each NAR's classloader, they would only
> be loaded once into a shared parent. This "de-dupes" the memory
> footprint of the JARs as well. Hopefully the construction of the
> classloader graph would not be too computationally intensive, but we
> could have a best-effort algorithm rather than an optimal one if that
> were an issue.
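
To make the "Maven-install-style" unpacking idea above a bit more
concrete, here is a minimal sketch. It is only an illustration under
stated assumptions: the class and method names (NarDedupSketch, install,
sharedDigests) are hypothetical and not part of NiFi, JARs are modelled
as in-memory byte arrays rather than files, and identity is decided by
SHA-256 checksum alone (Maven coordinates are not checked here).

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch, not NiFi code: de-duplicate JARs by checksum while "unpacking" NARs.
public class NarDedupSketch {

    /** Content-addressed store: SHA-256 hex digest -> JAR bytes (stored once). */
    final Map<String, byte[]> repo = new LinkedHashMap<>();

    /** Bill of materials: NAR name -> digests of the JARs that NAR needs. */
    final Map<String, List<String>> bom = new LinkedHashMap<>();

    static String sha256(byte[] content) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(content)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-256 must be available on every JVM
        }
    }

    /** Installs each JAR of one NAR by checksum and records the NAR's BOM. */
    void install(String narName, Map<String, byte[]> jars) {
        List<String> digests = new ArrayList<>();
        for (byte[] content : jars.values()) {
            String digest = sha256(content);
            repo.putIfAbsent(digest, content); // identical bytes land in the same place
            digests.add(digest);
        }
        bom.put(narName, digests);
    }

    /** Digests used by more than one NAR: candidates for a shared parent classloader. */
    Set<String> sharedDigests() {
        Map<String, Integer> uses = new LinkedHashMap<>();
        for (List<String> digests : bom.values()) {
            for (String d : new TreeSet<>(digests)) { // count each NAR at most once
                uses.merge(d, 1, Integer::sum);
            }
        }
        Set<String> shared = new TreeSet<>();
        uses.forEach((d, n) -> { if (n > 1) shared.add(d); });
        return shared;
    }

    public static void main(String[] args) {
        byte[] jackson = "jackson-databind bytes".getBytes(StandardCharsets.UTF_8);
        byte[] bcprov = "bcprov bytes".getBytes(StandardCharsets.UTF_8);
        byte[] hive = "hive bytes".getBytes(StandardCharsets.UTF_8);

        NarDedupSketch sketch = new NarDedupSketch();
        sketch.install("nifi-a-nar", Map.of("jackson.jar", jackson, "bcprov.jar", bcprov));
        sketch.install("nifi-b-nar", Map.of("jackson.jar", jackson.clone(), "hive.jar", hive));

        System.out.println("unique JARs in repo: " + sketch.repo.size());
        System.out.println("JARs shared by several NARs: " + sketch.sharedDigests().size());
    }
}
```

Grouping the shared digests into actual parent classloaders is the
harder "classloader graph" part and is deliberately left out of the
sketch.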
>
> Thoughts? Thanks,
> Matt
>
>
>
> On Tue, Jan 16, 2018 at 12:52 PM, Kevin Doran <[email protected]> wrote:
>> Nice discussion on this thread.
>>
>> I'm also in favor of the long-term solution being publishing extension NARs 
>> to an extension registry (#3) and removing them from the NiFi convenience 
>> binary.
>>
>> A few thoughts that build upon what others have said:
>>
>> 1. Many decisions, such as the structure of the project/repo(s) and 
>> mechanics of the release, don't have to be made right away, though it is 
>> probably good to start considering the impacts of various approaches as 
>> people have. There is a lot that has to be done to make progress towards the 
>> long-term goal regardless of those decisions, some of which follows below.
>>
>> 2. We can start adding support for extensions to the Registry project 
>> (obviously).
>>
>> 3. As James W and others have pointed out, start classifying which 
>> components belong in the "core" convenience binary and which ones will be 
>> published separately. For the ones published separately, we can further 
>> classify them down into categories / "packs" to reduce the burden on 
>> end-users.
>>
>> 4. Anticipating that the release cycles of NiFi-core and extensions will 
>> eventually be separated, we should design a way for versioned extensions to 
>> declare which versions of (Mi)NiFi they are compatible with, i.e. as a 
>> semantic version range. There are lots of good examples to pull from; pretty 
>> much any modern package management framework has some concept of support for 
>> >=, ==, =~ syntax that honors semantic versioning. If done well, this should 
>> reduce the burden of managing separate release cycles as minor and patch 
>> releases of NiFi will be backwards compatible w.r.t the public APIs used by 
>> extensions, so in most cases extensions declaring a simple 'NiFi >= 1.x' 
>> should suffice.
>>
>> 4b. Likewise, when defining a versioned flow in NiFi, the user should be 
>> able to fix the version of each processor to a specific extension version.
>>
>> 5. Great work Tony in surfacing some data on NAR size and jar duplication 
>> across NARs. Following up on Bryan's email that explores possible solutions 
>> to this, I think the best approach would be the concept of lib NARs and a 
>> more flexible NAR dependency declaration/evaluation mechanism, e.g., the 
>> "mix-in style" Bryan described vs. the current single-class inheritance 
>> style. I'm not sure what work this would require for making the runtime 
>> classpaths work correctly. For just developing/ publishing/installing NARs 
>> in this style leveraging an extension registry, we are getting pretty close 
>> to describing a full-fledged package manager, both on the server side (NiFi 
>> Registry) and client side (publishing tooling and NiFi for importing flows 
>> that reference processors that declare dependencies). Given that NAR packs 
>> could solve the immediate problem of reducing the size of individual 
>> binaries, I think we should make jar de-duplication a goal for after a 
>> functional extension registry, while keeping it in mind for the design of 
>> the extension registry.
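
As a small illustration of point 4 above (extensions declaring which
NiFi versions they are compatible with), a check along these lines might
be enough to start with. This is a sketch under assumptions, not a
proposal for the actual format: the class name VersionRangeSketch and
the constraint strings (">=1.5.0", "==1.5.0") are hypothetical, only
numeric major.minor.patch versions are handled, and a major-version
upper bound is not enforced.

```java
import java.util.Arrays;

// Hypothetical sketch: does a NiFi version satisfy an extension's declared constraint?
public class VersionRangeSketch {

    /** Parses "1.5.0" into {1, 5, 0}; missing parts default to 0. */
    static int[] parse(String version) {
        String[] parts = version.trim().split("\\.");
        int[] out = new int[3];
        for (int i = 0; i < out.length && i < parts.length; i++) {
            out[i] = Integer.parseInt(parts[i]);
        }
        return out;
    }

    /** Compares two versions numerically, component by component. */
    static int compare(String a, String b) {
        return Arrays.compare(parse(a), parse(b));
    }

    /** Supports only ">=" and "==" constraints; anything else is rejected. */
    static boolean satisfies(String nifiVersion, String constraint) {
        String c = constraint.trim();
        if (c.startsWith(">=")) {
            return compare(nifiVersion, c.substring(2)) >= 0;
        }
        if (c.startsWith("==")) {
            return compare(nifiVersion, c.substring(2)) == 0;
        }
        throw new IllegalArgumentException("Unsupported constraint: " + constraint);
    }

    public static void main(String[] args) {
        System.out.println(satisfies("1.6.0", ">=1.5.0"));
        System.out.println(satisfies("1.4.2", ">=1.5.0"));
        System.out.println(satisfies("1.5.0", "==1.5.0"));
    }
}
```

A real implementation would also need pre-release tags and range
intersections, which existing package managers already solve.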
>>
>> On 1/16/18, 11:08, "Bryan Bende" <[email protected]> wrote:
>>
>>     I still like the "NAR packs" idea even for the single repo approach. I
>>     think if we only provide a "light" binary and then say that everything
>>     else has to be built on your own, it creates a big barrier to entry
>>     for a lot of users. With the NAR packs approach we could provide one
>>     binary that is the actual application, and then multiple zips/tars
>>     that each contain a set of NARs. So someone gets the first binary and
>>     then adds whichever NAR packs to it. This solves the immediate problem
>>     of having any single binary exceed a certain size.
>>
>>     As a side effect of whatever we do, I was also hoping we could make
>>     the build process easier for folks working on the framework. If all we
>>     do is change our current assembly, I think you'd still incur the time
>>     of building all the NARs since they are listed in the modules section
>>     of the nifi-nar-bundles pom, even though most of them wouldn't be included in
>>     the new "light" assembly. We'd have to consider restructuring the git
>>     repo a little bit if this was something we wanted to do. Possibly the
>>     top-level could be divided into "nifi-core" and "nifi-nar-bundles",
>>     where nifi-core produced the light assembly so folks working on the
>>     framework can build this quickly, but if you want to build everything
>>     then you build from the root pom which also builds all the NAR packs.
>>     Just something to think about if we are going to make changes.
>>
>>     Regarding the duplication of many JARs (thanks for putting the data
>>     together Tony!)...
>>
>>     We could try to collapse common dependencies so that we don't end up
>>     with so many duplicate copies of the same JAR, but I don't know
>>     exactly how we'd set this up...
>>
>>     We could promote a JAR to the lib directory which makes it visible to
>>     every single NAR and thus no longer needs to be bundled into each NAR.
>>     That works great for the NARs that already use the dependency, but now
>>     means that a bunch of other NARs have this extra thing on the
>>     classpath, and also means we are forcing the version of that library
>>     upon every NAR which somewhat defeats the purpose of NARs.
>>
>>     We could create "lib" NARs, similar to the original intent of
>>     nifi-hadoop-libraries-nar. For example, we could create
>>     nifi-jackson-libraries-nar, and then any NAR that needs jackson would
>>     have this as their parent. This gets tricky when there is more than
>>     one library in play, for example let's say we also had
>>     nifi-bcprov-libraries-nar, and then some other NAR needs jackson and
>>     bcprov, there can be only one parent NAR so you can only pick one of
>>     them. You could chain things together, but then how do you decide the
>>     order of the chain... nifi-xyz-nar -> nifi-jackson-nar ->
>>     nifi-bcprov-nar  VS. nifi-xyz-nar -> nifi-bcprov-nar ->
>>     nifi-jackson-nar.
>>
>>     Right now having a NAR dependency is like single class inheritance,
>>     and it seems like we would also need a mix-in style NAR dependency to
>>     be able to add multiple lib NARs without getting into this chaining
>>     issue.
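
For what it's worth, the "mix-in style" dependency described above could
look roughly like the following at the classloader level. This is a
sketch only: MixinClassLoader is a hypothetical name, not an existing
NiFi class, and it assumes each lib NAR is already exposed as its own
ClassLoader. Lookup stays parent-first (the default ClassLoader
behaviour), then falls back to the lib loaders in declared order, so
order still matters if two lib NARs provide the same class.

```java
import java.util.List;

// Hypothetical sketch: a NAR classloader with one parent plus several "lib NAR" mix-ins.
public class MixinClassLoader extends ClassLoader {

    private final List<ClassLoader> libLoaders;

    public MixinClassLoader(ClassLoader parent, List<ClassLoader> libLoaders) {
        super(parent);
        this.libLoaders = List.copyOf(libLoaders);
    }

    /**
     * Called by loadClass() only after the parent failed to find the class.
     * Tries each lib loader in declared order and returns the first hit.
     */
    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        for (ClassLoader lib : libLoaders) {
            try {
                return lib.loadClass(name);
            } catch (ClassNotFoundException ignored) {
                // not provided by this lib NAR, try the next one
            }
        }
        throw new ClassNotFoundException(name);
    }

    public static void main(String[] args) throws Exception {
        // Stand-ins for e.g. a jackson lib NAR and a bcprov lib NAR: here both are
        // simply the application classloader, since no real NARs are available.
        ClassLoader jacksonLib = MixinClassLoader.class.getClassLoader();
        ClassLoader bcprovLib = MixinClassLoader.class.getClassLoader();

        // A null parent means only the bootstrap loader sits above the mix-ins.
        MixinClassLoader nar = new MixinClassLoader(null, List.of(jacksonLib, bcprovLib));

        System.out.println(nar.loadClass("java.lang.String"));
        System.out.println(nar.loadClass(MixinClassLoader.class.getName()));
    }
}
```

With this shape a NAR would list nifi-jackson-libraries-nar and
nifi-bcprov-libraries-nar side by side instead of having to chain them.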
>>
>>
>>     On Tue, Jan 16, 2018 at 5:14 AM, Mike Thomsen <[email protected]> 
>> wrote:
>>     > Also maybe #4: Message Queue support (JMS, Kafka, etc.)
>>     >
>>     > On Tue, Jan 16, 2018 at 5:13 AM, Mike Thomsen <[email protected]>
>>     > wrote:
>>     >
>>     >> One possibility: 3 "packs." Such as:
>>     >>
>>     >> 1. Big Data.
>>     >> 2. Search
>>     >> 3. Non-BD NoSQL.
>>     >>
>>     >> Each pack would be an assembly of NARs that correspond to the 
>> category.
>>     >>
>>     >> The core would have JDBC support and all of the data mutator 
>> processors.
>>     >>
>>     >> On Mon, Jan 15, 2018 at 11:54 PM, James Wing <[email protected]> wrote:
>>     >>
>>     >>> I think a reduced build is a good way forward until the extension 
>> registry
>>     >>> is ready.  If we can publish the remaining processors in one or more
>>     >>> additional artifacts, that would be ideal.  The admin burden of more 
>> git
>>     >>> repositories or separate releases does not appeal to me, especially 
>> since
>>     >>> we do not believe it to be our long-term path.
>>     >>>
>>     >>> It's not going to be easy to decide on a "core" build with "extras" 
>> sold
>>     >>> separately. But we will have to confront the division for the 
>> registry
>>     >>> solution in any case, we might as well get started on it.
>>     >>>
>>     >>> On Sun, Jan 14, 2018 at 1:37 PM, Mike Thomsen 
>> <[email protected]>
>>     >>> wrote:
>>     >>>
>>     >>> > Since the limit was bumped to 1.6GB, it might be prudent to not do 
>> too
>>     >>> much
>>     >>> > NiFi 1.X and instead focus on a comprehensive solution that 
>> coincides
>>     >>> with
>>     >>> > 2.0. I think that would be a time when a lot of users might expect 
>> and
>>     >>> be
>>     >>> > tolerant of breaking changes on issues like this.
>>     >>> >
>>     >>> > Also, is there a clear process for deprecating processors? If not, 
>> there
>>     >>> > should be because it would be really helpful for doing cleanup.
>>     >>> >
>>     >>> > On Sat, Jan 13, 2018 at 7:53 PM, Brett Ryan <[email protected]>
>>     >>> wrote:
>>     >>> >
>>     >>> > > Why are core modules not listing everything as provided?
>>     >>> > >
>>     >>> > > IDEs solve this problem with the use of dependency libraries. 
>> As an
>>     >>> > > example NetBeans nbm’s have a single purpose, you must export the
>>     >>> > packages
>>     >>> > > to be exposed.
>>     >>> > >
>>     >>> > > We do the same with confluence modules using felix.
>>     >>> > >
>>     >>> > > Why is NiFi doing things differently just so the person who wants 
>> to
>>     >>> > install
>>     >>> > > many custom nars can be lazy?
>>     >>> > >
>>     >>> > > > On 14 Jan 2018, at 08:59, Tony Kurc <[email protected]> wrote:
>>     >>> > > >
>>     >>> > > > I added some more stats to the wiki page, trying to determine 
>> what
>>     >>> > > > dependencies are included in jars. It seems like there is
>>     >>> opportunity.
>>     >>> > > >
>>     >>> > > > Highlights, 50 copies of what appears to be some version of
>>     >>> > bcprov-jdk15
>>     >>> > > > for a total of 162M. 51 copies of jackson-databind.
>>     >>> > > >
>>     >>> > > > total size       copies  jar
>>     >>> > > >     30.97MB     65     META-INF/bundled-dependencies/commons-lang3-XXX.jar
>>     >>> > > >     32.53MB     50     META-INF/bundled-dependencies/bcpkix-jdk15on-XXX.jar
>>     >>> > > >     33.55MB     16     META-INF/bundled-dependencies/guava-XXX.jar
>>     >>> > > >     39.62MB      1     META-INF/bundled-dependencies/jython-shaded-XXX.jar
>>     >>> > > >     63.06MB     51     META-INF/bundled-dependencies/jackson-databind-XXX.jar
>>     >>> > > >    162.07MB     50     META-INF/bundled-dependencies/bcprov-jdk15on-XXX.jar
>>     >>> > > >
>>     >>> > > >
>>     >>> > > >> On Sat, Jan 13, 2018 at 2:09 PM, Joey Frazee <
>>     >>> [email protected]>
>>     >>> > > wrote:
>>     >>> > > >>
>>     >>> > > >> I tend to have feelings similar to Michael about a multi-repo
>>     >>> > approach.
>>     >>> > > >> I’ve rarely seen it help and more often seen it hurt — it’s
>>     >>> confusing
>>     >>> > > >> (especially to newcomers), stuff gets neglected because it’s
>>     >>> easier to
>>     >>> > > >> ignore, you need another master project or some such to do an
>>     >>> entire
>>     >>> > > build.
>>     >>> > > >>
>>     >>> > > >> Maybe git submodules could help mitigate this, but creating
>>     >>> > independent
>>     >>> > > >> assemblies or using different build profiles to enable 
>> building and
>>     >>> > > >> packaging the binaries in different ways would satisfy 
>> everything
>>     >>> > except
>>     >>> > > >> disentangling the releases.
>>     >>> > > >>
>>     >>> > > >> -joey
>>     >>> > > >>
>>     >>> > > >>> On Jan 13, 2018, 12:40 PM -0600, Brandon DeVries 
>> <[email protected]>,
>>     >>> > wrote:
>>     >>> > > >>> I agree... Long term extension registry, short term one repo 
>> with
>>     >>> > > >> different
>>     >>> > > >>> assemblies (e.g. standard, slim, analytic, etc...).
>>     >>> > > >>>
>>     >>> > > >>> Brandon
>>     >>> > > >>>
>>     >>> > > >>> On Sat, Jan 13, 2018 at 1:35 PM Pierre Villard <
>>     >>> > > >> [email protected]
>>     >>> > > >>> wrote:
>>     >>> > > >>>
>>     >>> > > >>>> Option #3 also has my preference. But it's probably a good 
>> idea
>>     >>> to
>>     >>> > > only
>>     >>> > > >>>> keep one git repo and play with the assembly and Maven 
>> profiles
>>     >>> for
>>     >>> > > the
>>     >>> > > >>>> releases, no? It'd be certainly easier for release 
>> management
>>     >>> > process.
>>     >>> > > >> But
>>     >>> > > >>>> this decision could also depend on how the option #3 is 
>> going to
>>     >>> be
>>     >>> > > >>>> implemented I guess.
>>     >>> > > >>>>
>>     >>> > > >>>> 2018-01-13 6:36 GMT-07:00 Joe Witt <[email protected]>:
>>     >>> > > >>>>
>>     >>> > > >>>>> thanks tony!
>>     >>> > > >>>>>
>>     >>> > > >>>>>> On Jan 12, 2018 10:48 PM, "Tony Kurc" <[email protected]>
>>     >>> wrote:
>>     >>> > > >>>>>>
>>     >>> > > >>>>>> I put some of the data I was working with on the wiki -
>>     >>> > > >>>>>>
>>     >>> > > >>>>>> https://cwiki.apache.org/confluence/display/NIFI/NiFi+1.5.0+nar+files
>>     >>> > > >>>>>>
>>     >>> > > >>>>>> On Fri, Jan 12, 2018 at 10:28 PM, Jeremy Dyer <
>>     >>> [email protected]
>>     >>> > > >>>> wrote:
>>     >>> > > >>>>>>
>>     >>> > > >>>>>>> So my favorite option is Bryan’s option number “three” of
>>     >>> using
>>     >>> > > >> the
>>     >>> > > >>>>>>> extension registry. Now my thought is do we really need 
>> to add
>>     >>> > > >>>>> complexity
>>     >>> > > >>>>>>> and do anything in the mean time or just focus on that?
>>     >>> Meaning
>>     >>> > > >> we
>>     >>> > > >>>> have
>>     >>> > > >>>>>>> roughly 500mb of available capacity today so why don’t we
>>     >>> spend
>>     >>> > > >> those
>>     >>> > > >>>>> man
>>     >>> > > >>>>>>> hours we would spend on getting the second repo up on the
>>     >>> > > >> extension
>>     >>> > > >>>>>>> registry instead?
>>     >>> > > >>>>>>>
>>     >>> > > >>>>>>> @Bryan do you have thoughts about the deployment of 
>> those nars
>>     >>> > > >> in the
>>     >>> > > >>>>>>> extension registry? Since we won’t be able to build the
>>     >>> release
>>     >>> > > >>>> binary
>>     >>> > > >>>>>>> anymore would we still need to create separate repos for 
>> the
>>     >>> > > >> nars or
>>     >>> > > >>>>>> no?? I
>>     >>> > > >>>>>>> have used the registry a little but I’m not 100% sure on 
>> your
>>     >>> > > >> vision
>>     >>> > > >>>>> for
>>     >>> > > >>>>>>> the nars
>>     >>> > > >>>>>>>
>>     >>> > > >>>>>>> - Jeremy Dyer
>>     >>> > > >>>>>>>
>>     >>> > > >>>>>>> Sent from my iPhone
>>     >>> > > >>>>>>>
>>     >>> > > >>>>>>>> On Jan 12, 2018, at 10:18 PM, Tony Kurc 
>> <[email protected]>
>>     >>> > > >> wrote:
>>     >>> > > >>>>>>>>
>>     >>> > > >>>>>>>> I was looking at nar sizes, and thought some data may be
>>     >>> > > >> helpful. I
>>     >>> > > >>>>>> used
>>     >>> > > >>>>>>> my recent RC1 verification as a basis for getting file 
>> sizes,
>>     >>> and
>>     >>> > > >>>> just
>>     >>> > > >>>>>> got
>>     >>> > > >>>>>>> the file size for each file in the assembly named 
>> "*.nar". I
>>     >>> > > >> don't
>>     >>> > > >>>> know
>>     >>> > > >>>>>>> whether the images I pasted in will go through, but I 
>> made
>>     >>> some
>>     >>> > > >>>>> graphs.
>>     >>> > > >>>>>>> The first is a histogram of nar file size in buckets of 
>> 10MB.
>>     >>> The
>>     >>> > > >>>>> second
>>     >>> > > >>>>>>> basically is similar to a cumulative distribution, the x 
>> axis
>>     >>> is
>>     >>> > > >> the
>>     >>> > > >>>>>> "rank"
>>     >>> > > >>>>>>> of the nar (smallest to largest), and the y-axis is 
>> what
>>     >>> > > >> fraction
>>     >>> > > >>>>> of
>>     >>> > > >>>>>>> all the sizes of the nars together are that rank or
>>     >>> lower. In
>>     >>> > > >>>> other
>>     >>> > > >>>>>>> words, on the graph, the dot at 60 and ~27 means that the
>>     >>> > > >> smallest 60
>>     >>> > > >>>>>> nars
>>     >>> > > >>>>>>> contribute only ~27% of the total. Of note, the standard 
>> and
>>     >>> > > >>>> framework
>>     >>> > > >>>>>> nars
>>     >>> > > >>>>>>> are at 83 and 84.
>>     >>> > > >>>>>>>>
>>     >>> > > >>>>>>>>
>>     >>> > > >>>>>>>>
>>     >>> > > >>>>>>>>
>>     >>> > > >>>>>>>>
>>     >>> > > >>>>>>>>> On Fri, Jan 12, 2018 at 5:04 PM, Michael Moser <
>>     >>> > > >>>> [email protected]
>>     >>> > > >>>>>>> wrote:
>>     >>> > > >>>>>>>>> And of course, as I hit <send> I thought of one more 
>> thing.
>>     >>> > > >>>>>>>>>
>>     >>> > > >>>>>>>>> We could keep all of the code in 1 git repo (1 
>> project) but
>>     >>> > > >> the
>>     >>> > > >>>>>>>>> nifi-assembly part of the build could be broken up to 
>> build
>>     >>> > > >> core
>>     >>> > > >>>>> NiFi
>>     >>> > > >>>>>>>>> separately from the tar/zip functional grouping of 
>> other
>>     >>> > > >> NARs.
>>     >>> > > >>>>>>>>>
>>     >>> > > >>>>>>>>> On Fri, Jan 12, 2018 at 5:01 PM, Michael Moser <
>>     >>> > > >>>> [email protected]
>>     >>> > > >>>>>>> wrote:
>>     >>> > > >>>>>>>>>
>>     >>> > > >>>>>>>>>> Long term I would also like to see #3 be the 
>> solution. I
>>     >>> > > >> think
>>     >>> > > >>>>> what
>>     >>> > > >>>>>>>>>> Joseph N described could be part of the capabilities 
>> of #3.
>>     >>> > > >>>>>>>>>>
>>     >>> > > >>>>>>>>>> I would like to add a note of caution with respect to
>>     >>> > > >>>> reorganizing
>>     >>> > > >>>>>> and
>>     >>> > > >>>>>>>>>> releasing extension bundles separately:
>>     >>> > > >>>>>>>>>>
>>     >>> > > >>>>>>>>>> - the burden on release manager expands because many 
>> more
>>     >>> > > >>>>>> projects
>>     >>> > > >>>>>>>>>> have to be released; probably not all on each release 
>> cycle
>>     >>> > > >>>> but
>>     >>> > > >>>>>> it
>>     >>> > > >>>>>>> could
>>     >>> > > >>>>>>>>>> still be many
>>     >>> > > >>>>>>>>>> - the chance of accidentally forgetting to release a
>>     >>> > > >> project
>>     >>> > > >>>>> in a
>>     >>> > > >>>>>>>>>> release cycle becomes non-zero
>>     >>> > > >>>>>>>>>> - sharing code between projects gets a bit harder 
>> because
>>     >>> > > >> you
>>     >>> > > >>>>>> have
>>     >>> > > >>>>>>> to
>>     >>> > > >>>>>>>>>> manage releasing projects in a specific order
>>     >>> > > >>>>>>>>>> - it becomes harder to find all of the projects that 
>> need
>>     >>> > > >> to
>>     >>> > > >>>>>> change
>>     >>> > > >>>>>>>>>> when shared code is added
>>     >>> > > >>>>>>>>>> - the simple act of finding code becomes harder ... in
>>     >>> > > >> which
>>     >>> > > >>>>>>> project
>>     >>> > > >>>>>>>>>> is that class in? (IDEs like IntelliJ can search in 1
>>     >>> > > >>>> project,
>>     >>> > > >>>>>> but
>>     >>> > > >>>>>>> if they
>>     >>> > > >>>>>>>>>> search across multiple projects, then I haven't 
>> learned
>>     >>> > > >> how)
>>     >>> > > >>>>>>>>>>
>>     >>> > > >>>>>>>>>> I used to maintain several nars in separate projects, 
>> and
>>     >>> > > >>>> recently
>>     >>> > > >>>>>>>>>> reorganized them into 1 project (following NiFi's
>>     >>> > > >> multi-module
>>     >>> > > >>>>> maven
>>     >>> > > >>>>>>> build)
>>     >>> > > >>>>>>>>>> and life has become much easier!
>>     >>> > > >>>>>>>>>>
>>     >>> > > >>>>>>>>>> -- Mike
>>     >>> > > >>>>>>>>>>
>>     >>> > > >>>>>>>>>>
>>     >>> > > >>>>>>>>>>
>>     >>> > > >>>>>>>>>> On Fri, Jan 12, 2018 at 4:33 PM, Chris Herrera <
>>     >>> > > >>>>>>> [email protected]
>>     >>> > > >>>>>>>>>> wrote:
>>     >>> > > >>>>>>>>>>
>>     >>> > > >>>>>>>>>>> I very much like the solution proposed by Bryan 
>> below.
>>     >>> > > >> This
>>     >>> > > >>>> would
>>     >>> > > >>>>>>> allow
>>     >>> > > >>>>>>>>>>> for a cleaner docker image as well, while still 
>> providing
>>     >>> > > >> the
>>     >>> > > >>>>>>> functionality
>>     >>> > > >>>>>>>>>>> as needed. For sure, the extension registry will be
>>     >>> > > >> great, but
>>     >>> > > >>>> in
>>     >>> > > >>>>>>> the mean
>>     >>> > > >>>>>>>>>>> time this is an adequate mid step.
>>     >>> > > >>>>>>>>>>>
>>     >>> > > >>>>>>>>>>> Regards,
>>     >>> > > >>>>>>>>>>> Chris
>>     >>> > > >>>>>>>>>>>
>>     >>> > > >>>>>>>>>>> On Jan 12, 2018, 2:52 PM -0600, Bryan Bende <
>>     >>> > > >> [email protected]
>>     >>> > > >>>>> ,
>>     >>> > > >>>>>>> wrote:
>>     >>> > > >>>>>>>>>>>> Long term I'd like to see the extension registry 
>> take
>>     >>> > > >> form
>>     >>> > > >>>> and
>>     >>> > > >>>>>> have
>>     >>> > > >>>>>>>>>>>> that be the solution (#3).
>>     >>> > > >>>>>>>>>>>>
>>     >>> > > >>>>>>>>>>>> In the more near term, we could separate all of the
>>     >>> > > >> NARs,
>>     >>> > > >>>>> except
>>     >>> > > >>>>>>> for
>>     >>> > > >>>>>>>>>>>> framework and maybe standard processors & services,
>>     >>> > > >> into a
>>     >>> > > >>>>>> separate
>>     >>> > > >>>>>>>>>>>> git repo.
>>     >>> > > >>>>>>>>>>>>
>>     >>> > > >>>>>>>>>>>> In that new git repo we could organize things like 
>> Joe
>>     >>> > > >> N just
>>     >>> > > >>>>>>>>>>>> described according to some kind of functional
>>     >>> > > >> grouping. Each
>>     >>> > > >>>>> of
>>     >>> > > >>>>>>> these
>>     >>> > > >>>>>>>>>>>> functional bundles could produce its own tar/zip 
>> which
>>     >>> > > >> we can
>>     >>> > > >>>>>> make
>>     >>> > > >>>>>>>>>>>> available for download.
>>     >>> > > >>>>>>>>>>>>
>>     >>> > > >>>>>>>>>>>> That would separate the release cycles between core
>>     >>> > > >> NiFi and
>>     >>> > > >>>>> the
>>     >>> > > >>>>>>> other
>>     >>> > > >>>>>>>>>>>> NARs, and also avoid having any single binary 
>> artifact
>>     >>> > > >> that
>>     >>> > > >>>>> gets
>>     >>> > > >>>>>>> too
>>     >>> > > >>>>>>>>>>>> large.
>>     >>> > > >>>>>>>>>>>>
>>     >>> > > >>>>>>>>>>>>
>>     >>> > > >>>>>>>>>>>>
>>     >>> > > >>>>>>>>>>>> On Fri, Jan 12, 2018 at 3:43 PM, Joseph Niemiec <
>>     >>> > > >>>>>>> [email protected]
>>     >>> > > >>>>>>>>>>> wrote:
>>     >>> > > >>>>>>>>>>>>> just a random thought.
>>     >>> > > >>>>>>>>>>>>>
>>     >>> > > >>>>>>>>>>>>> Drop In Lib packs... All the Hadoop ones in one
>>     >>> > > >> package for
>>     >>> > > >>>>>>> example
>>     >>> > > >>>>>>>>>>> that
>>     >>> > > >>>>>>>>>>>>> can be added to a slim Nifi install. Another may be
>>     >>> > > >> for
>>     >>> > > >>>>> Cloud,
>>     >>> > > >>>>>> or
>>     >>> > > >>>>>>>>>>> Database
>>     >>> > > >>>>>>>>>>>>> Interactions, Integration (JMS, FTP, etc) of course
>>     >>> > > >>>> defining
>>     >>> > > >>>>>>> these
>>     >>> > > >>>>>>>>>>> groups
>>     >>> > > >>>>>>>>>>>>> would be the tricky part... Or perhaps some type of
>>     >>> > > >>>> installer
>>     >>> > > >>>>>>> which
>>     >>> > > >>>>>>>>>>> allows
>>     >>> > > >>>>>>>>>>>>> you to elect which packages to download to add to
>>     >>> > > >> the slim
>>     >>> > > >>>>>>> install?
>>     >>> > > >>>>>>>>>>>>>
>>     >>> > > >>>>>>>>>>>>>
>>     >>> > > >>>>>>>>>>>>> On Fri, Jan 12, 2018 at 3:10 PM, Joe Witt <
>>     >>> > > >>>>> [email protected]
>>     >>> > > >>>>>>> wrote:
>>     >>> > > >>>>>>>>>>>>>
>>     >>> > > >>>>>>>>>>>>>> Team,
>>     >>> > > >>>>>>>>>>>>>>
>>     >>> > > >>>>>>>>>>>>>> The NiFi convenience binary (tar.gz/zip) size has
>>     >>> > > >> grown
>>     >>> > > >>>> to
>>     >>> > > >>>>>>> 1.1GB now
>>     >>> > > >>>>>>>>>>>>>> in the latest release. Apache infra expanded it to
>>     >>> > > >> 1.6GB
>>     >>> > > >>>>>>> allowance
>>     >>> > > >>>>>>>>>>>>>> for us but has stated this is the last time.
>>     >>> > > >>>>>>>>>>>>>> https://issues.apache.org/jira/browse/INFRA-15816
>>     >>> > > >>>>>>>>>>>>>>
>>     >>> > > >>>>>>>>>>>>>> We need to consider:
>>     >>> > > >>>>>>>>>>>>>> 1) removing old nars/less commonly used nars/or
>>     >>> > > >>>>> particularly
>>     >>> > > >>>>>>> massive
>>     >>> > > >>>>>>>>>>>>>> nars from the assembly we distribute by default.
>>     >>> > > >> Folks
>>     >>> > > >>>> can
>>     >>> > > >>>>>>> still use
>>     >>> > > >>>>>>>>>>>>>> these things if they want just not from our
>>     >>> > > >> convenience
>>     >>> > > >>>>>> binary
>>     >>> > > >>>>>>>>>>>>>> 2) collapsing nars with highly repeating deps
>>     >>> > > >>>>>>>>>>>>>> 3) Getting the extension registry baked into the
>>     >>> > > >> Flow
>>     >>> > > >>>>>> Registry
>>     >>> > > >>>>>>> then
>>     >>> > > >>>>>>>>>>>>>> moving to separate releases for extension bundles.
>>     >>> > > >> The
>>     >>> > > >>>> main
>>     >>> > > >>>>>>> release
>>     >>> > > >>>>>>>>>>>>>> then would be just the NiFi framework.
>>     >>> > > >>>>>>>>>>>>>>
>>     >>> > > >>>>>>>>>>>>>> Any other ideas ?
>>     >>> > > >>>>>>>>>>>>>>
>>     >>> > > >>>>>>>>>>>>>> I'll plan to start identifying candidates for
>>     >>> > > >> removal
>>     >>> > > >>>> soon.
>>     >>> > > >>>>>>>>>>>>>>
>>     >>> > > >>>>>>>>>>>>>> Thanks
>>     >>> > > >>>>>>>>>>>>>> Joe
>>     >>> > > >>>>>>>>>>>>>>
>>     >>> > > >>>>>>>>>>>>>
>>     >>> > > >>>>>>>>>>>>>
>>     >>> > > >>>>>>>>>>>>>
>>     >>> > > >>>>>>>>>>>>> --
>>     >>> > > >>>>>>>>>>>>> Joseph
>>     >>> > > >>>>>>>>>>>
>>     >>> > > >>>>>>>>>>
>>     >>> > > >>>>>>>>>>
>>     >>> > > >>>>>>>>
>>     >>> > > >>>>>>>
>>     >>> > > >>>>>>
>>     >>> > > >>>>>
>>     >>> > > >>>>
>>     >>> > > >>
>>     >>> > >
>>     >>> >
>>     >>>
>>     >>
>>     >>
>>
>>
>>
