BTW, talking about mixin inheritance, shared dependencies, improved classloading, and module repositories, I feel like OSGi is the elephant in the room. I can see perfectly good reasons NOT to move to an OSGi-backed architecture, but it does feel like we'd end up implementing many of the same features and capabilities. Perhaps a topic for a separate DISCUSS thread?
Regards,
Matt

On Wed, Jan 17, 2018 at 11:05 AM, Matt Burgess <[email protected]> wrote:
> I'd like to echo many of the comments / discussion points here, including the extension registry (#3), NAR packs, and mixins. A couple of additional comments and caveats:
>
> NAR package management:
>
> - Grouping NAR packs based on functionality (Hadoop, RDBMS, etc.) is a good first start but it still seems like we'd want to end up with an a la carte capability at the end. An incremental approach might be to have a simple graphical tool (in the toolkit?) pointing at your NiFi install and some common repository, where you can add and delete NAR packs, but also delete individual NARs from your NiFi install. The use case here is when you download the Hadoop NAR pack for HBase and related components, but don't want things like the Hive NAR (which I think is the largest at ~93MB).
>
> - Some NiFi installs will be located on systems that cannot contact an outside (or any external) repository. When we consider NAR repositories, we should consider providing a repo-to-go or something of that sort. At the very least I would think the Extension Registry itself would support such a thing; the ability to have an Extension Registry anywhere, not just attached to Bintray or Apache repo HTTP pages, etc.
>
> - Murphy's Law says as soon as we pick NAR pack boundaries, there will be components that don't fit well into one or another, or they fit into more than one. For instance, a user might expect the Spark/Livy NAR to be in the Hadoop NAR pack but there is no requirement for Spark or Livy to run on Hadoop. Perhaps with a "Big Data" NAR pack (versus Hadoop) it would encompass the Hadoop and Spark stuff, but then where does Cassandra fit in? It certainly handles Big Data, but if there were a "NoSQL" NAR pack, which should it belong to (or can it be in both?).
>
> - Because NARs are unpacked before use in NiFi, there are two related footprints, the footprint of the NARs in the lib/ folder, and the footprint of the unpacked NARs. As part of the "duplicate JARs" discussion, this also segues into another area, the runtime footprint (to include classloader hierarchies, etc.)
>
> Optimized JARs/classloading
>
> - Promoting JARs to the lib/ folder because they are common to many processors is not the right solution IMO. With parent-first classloaders (which is what NarClassLoaders are), if you had a NAR that needed a different version of a library, then it would find the parent version first and would likely cause issues. We could make the NarClassLoader self-first (which we might want to do under other circumstances anyway), but then care would need to be taken to ensure that shared/API dependencies are indeed "provided".
>
> - I do like the idea of "promotion" though, not just for JAR deduplication but also for better classloading. Here's an idea for how we might achieve this. When unpacking NARs, we would do something similar to a Maven install, where we build up a repository of artifacts. If two artifacts are the same (we'd likely want to verify checksums too, not just Maven coordinates), they'd install to the same place. At the end of NAR unpacking, the repo would contain unique (de-duplicated) JARs, and each NAR would have a bill-of-materials (BOM) from which to build its classloader. A possible runtime improvement on top of that is to build a classloader hierarchy, where JARs shared by multiple NARs could be in their own classloader, which would be the parent of the NARs' classloaders. This way, instead of the same classes loaded into each NAR's classloader, they would only be loaded once into a shared parent. This "de-dupes" the memory footprint of the JARs as well.
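
[Editor's sketch] The unpack-time de-duplication idea above can be illustrated roughly as follows. This is a minimal sketch under stated assumptions, not NiFi code: `unpack_into_repo`, the in-memory dictionaries, and the NAR/JAR names in the usage note are hypothetical, and Python is used only for brevity. JARs are keyed by SHA-256 checksum so identical artifacts are stored once, each NAR gets a bill-of-materials, and the checksums used by more than one NAR are the candidates for a shared parent classloader.

```python
import hashlib
from collections import defaultdict


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest used as the repository key."""
    return hashlib.sha256(data).hexdigest()


def unpack_into_repo(nars):
    """Build a de-duplicated JAR repository from unpacked NARs.

    nars: dict mapping a NAR name to a dict of {jar file name: jar bytes}
          (a hypothetical stand-in for reading real NAR archives).
    Returns (repo, boms, shared):
      repo   - checksum -> jar bytes, each distinct JAR stored once
      boms   - NAR name -> sorted list of (jar file name, checksum)
      shared - set of checksums referenced by more than one NAR
    """
    repo = {}
    boms = {}
    users = defaultdict(set)  # checksum -> names of NARs that bundle it
    for nar_name, jars in nars.items():
        bom = []
        for jar_name, content in jars.items():
            digest = sha256_of(content)
            repo.setdefault(digest, content)  # identical bytes install to the same place
            users[digest].add(nar_name)
            bom.append((jar_name, digest))
        boms[nar_name] = sorted(bom)
    shared = {digest for digest, names in users.items() if len(names) > 1}
    return repo, boms, shared
```

For example, two hypothetical NARs that both bundle the same `jackson-databind` bytes would yield a repository with a single copy of that JAR, and its checksum would appear in `shared`. How the shared set is then partitioned into parent classloaders is left open here, as it is in the message above.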
> Hopefully the construction of the classloader graph would not be too computationally intensive, but we could have a best-effort algorithm rather than an optimal one if that were an issue.
>
> Thoughts?
>
> Thanks,
> Matt
>
> On Tue, Jan 16, 2018 at 12:52 PM, Kevin Doran <[email protected]> wrote:
>> Nice discussion on this thread.
>>
>> I'm also in favor of the long-term solution being publishing extension NARs to an extension registry (#3) and removing them from the NiFi convenience binary.
>>
>> A few thoughts that build upon what others have said:
>>
>> 1. Many decisions, such as the structure of the project/repo(s) and mechanics of the release, don't have to be made right away, though it is probably good to start considering the impacts of various approaches as people have. There is a lot that has to be done to make progress towards the long-term goal regardless of those decisions, some of which follows below.
>>
>> 2. We can start adding support for extensions to the Registry project (obviously).
>>
>> 3. As James W and others have pointed out, start classifying which components belong in the "core" convenience binary and which ones will be published separately. For the ones published separately, we can further classify them down into categories / "packs" to reduce the burden on end-users.
>>
>> 4. Anticipating that the release cycles of NiFi-core and extensions will eventually be separated, we should design a way for versioned extensions to declare which versions of (Mi)NiFi they are compatible with, i.e. as a semantic version range. There are lots of good examples to pull from; pretty much any modern package management framework has some concept of support for >=, ==, =~ syntax that honors semantic versioning.
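
[Editor's sketch] As a rough illustration of the compatibility declaration in point 4, the check below evaluates a comma-separated list of comparison clauses against a NiFi version. The clause syntax (`>=1.0,<2.0`) and the function names are assumptions made for this example, not an existing NiFi or Registry format; only `>=`, `<=`, `==`, `>`, and `<` are handled, and wildcard forms such as `1.x` are not.

```python
import operator

# Two-character operators are listed first so ">=" is not mistaken for ">".
_OPERATORS = [
    (">=", operator.ge),
    ("<=", operator.le),
    ("==", operator.eq),
    (">", operator.gt),
    ("<", operator.lt),
]


def parse_version(text):
    """Parse 'MAJOR[.MINOR[.PATCH]]' into a tuple, padding missing parts with 0."""
    parts = [int(p) for p in text.strip().split(".")]
    return tuple(parts) + (0,) * (3 - len(parts))


def satisfies(version, spec):
    """Return True if `version` meets every clause in `spec`.

    spec: hypothetical syntax, comma-separated clauses such as '>=1.0,<2.0'.
    """
    actual = parse_version(version)
    for clause in spec.split(","):
        clause = clause.strip()
        for symbol, compare in _OPERATORS:
            if clause.startswith(symbol):
                required = parse_version(clause[len(symbol):])
                if not compare(actual, required):
                    return False
                break
        else:
            raise ValueError("unsupported clause: " + clause)
    return True
```

Under these assumptions, an extension declaring `>=1.0,<2.0` would accept NiFi 1.5.0 and reject 2.0.0, which matches the expectation that minor and patch releases stay backwards compatible for extensions.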
>> If done well, this should reduce the burden of managing separate release cycles as minor and patch releases of NiFi will be backwards compatible w.r.t the public APIs used by extensions, so in most cases extensions declaring a simple 'NiFi >= 1.x' should suffice.
>>
>> 4b. Likewise, when defining a versioned flow in NiFi, the user should be able to fix the version of each processor to a specific extension version.
>>
>> 5. Great work Tony in surfacing some data on NAR size and jar duplication across NARs. Following up on Bryan's email that explores possible solutions to this, I think the best approach would be the concept of lib NARs and a more flexible NAR dependency declaration/evaluation mechanism, e.g., the "mix-in style" Bryan described vs. the current single-class inheritance style. I'm not sure what work this would require for making the runtime classpaths work correctly. For just developing/publishing/installing NARs in this style leveraging an extension registry, we are getting pretty close to describing a full-fledged package manager, both on the server side (NiFi Registry) and client side (publishing tooling and NiFi for importing flows that reference processors that declare dependencies). Given that NAR packs could solve the immediate problem of reducing the size of individual binaries, I think we should make jar de-duplication a goal for after a functional extension registry, while keeping it in mind for the design of the extension registry.
>>
>> On 1/16/18, 11:08, "Bryan Bende" <[email protected]> wrote:
>>
>>     I still like the "NAR packs" idea even for the single repo approach. I think if we only provide a "light" binary and then say that everything else has to be built on your own, it creates a big barrier to entry for a lot of users.
>>     With the NAR packs approach we could provide one binary that is the actual application, and then multiple zips/tars that each contain a set of NARs. So someone gets the first binary and then adds whichever NAR packs to it. This solves the immediate problem of having any single binary exceed a certain size.
>>
>>     As a side effect of whatever we do, I was also hoping we could make the build process easier for folks working on the framework. If all we do is change our current assembly, I think you'd still incur the time of building all the NARs since they are listed in the modules section of the nifi-nar-bundles pom, even though most of them wouldn't be included in the new "light" assembly. We'd have to consider restructuring the git repo a little bit if this was something we wanted to do. Possibly the top-level could be divided into "nifi-core" and "nifi-nar-bundles", where nifi-core produced the light assembly so folks working on the framework can build this quickly, but if you want to build everything then you build from the root pom which also builds all the NAR packs. Just something to think about if we are going to make changes.
>>
>>     Regarding the duplication of many JARs (thanks for putting the data together Tony!)...
>>
>>     We could try to collapse common dependencies so that we don't end up with so many duplicate copies of the same JAR, but I don't know exactly how we'd set this up...
>>
>>     We could promote a JAR to the lib directory which makes it visible to every single NAR and thus no longer needs to be bundled into each NAR. That works great for the NARs that already use the dependency, but now means that a bunch of other NARs have this extra thing on the classpath, and also means we are forcing the version of that library upon every NAR which somewhat defeats the purpose of NARs.
>>
>>     We could create "lib" NARs, similar to the original intent of nifi-hadoop-libraries-nar.
>>     For example, we could create nifi-jackson-libraries-nar, and then any NAR that needs jackson would have this as their parent. This gets tricky when there is more than one library in play, for example let's say we also had nifi-bcprov-libraries-nar, and then some other NAR needs jackson and bcprov, there can be only one parent NAR so you can only pick one of them. You could chain things together, but then how do you decide the order of the chain... nifi-xyz-nar -> nifi-jackson-nar -> nifi-bcprov-nar VS. nifi-xyz-nar -> nifi-bcprov-nar -> nifi-jackson-nar.
>>
>>     Right now having a NAR dependency is like single class inheritance, and it seems like we would also need a mix-in style NAR dependency to be able to add multiple lib NARs without getting into this chaining issue.
>>
>>
>>     On Tue, Jan 16, 2018 at 5:14 AM, Mike Thomsen <[email protected]> wrote:
>> > Also maybe #4: Message Queue support (JMS, Kafka, etc.)
>> >
>> > On Tue, Jan 16, 2018 at 5:13 AM, Mike Thomsen <[email protected]> wrote:
>> >
>> >> One possibility: 3 "packs." Such as:
>> >>
>> >> 1. Big Data.
>> >> 2. Search
>> >> 3. Non-BD NoSQL.
>> >>
>> >> Each pack would be an assembly of NARs that correspond to the category.
>> >>
>> >> The core would have JDBC support and all of the data mutator processors.
>> >>
>> >> On Mon, Jan 15, 2018 at 11:54 PM, James Wing <[email protected]> wrote:
>> >>
>> >>> I think a reduced build is a good way forward until the extension registry is ready. If we can publish the remaining processors in one or more additional artifacts, that would be ideal. The admin burden of more git repositories or separate releases does not appeal to me, especially since we do not believe it to be our long-term path.
>> >>>
>> >>> It's not going to be easy to decide on a "core" build with "extras" sold separately.
>> >>> But we will have to confront the division for the registry solution in any case, we might as well get started on it.
>> >>>
>> >>> On Sun, Jan 14, 2018 at 1:37 PM, Mike Thomsen <[email protected]> wrote:
>> >>>
>> >>> > Since the limit was bumped to 1.6GB, it might be prudent to not do too much NiFi 1.X and instead focus on a comprehensive solution that coincides with 2.0. I think that would be a time when a lot of users might expect and be tolerant of breaking changes on issues like this.
>> >>> >
>> >>> > Also, is there a clear process for deprecating processors? If not, there should be because it would be really helpful for doing cleanup.
>> >>> >
>> >>> > On Sat, Jan 13, 2018 at 7:53 PM, Brett Ryan <[email protected]> wrote:
>> >>> >
>> >>> > > Why are core modules not listing everything as provided?
>> >>> > >
>> >>> > > IDE’s solve this problem with the use of dependency libraries. As an example NetBeans nbm’s have a single purpose, you must export the packages to be exposed.
>> >>> > >
>> >>> > > We do the same with confluence modules using felix.
>> >>> > >
>> >>> > > Why is NiFi doing things different just so the person who wants to install many custom nars can be lazy?
>> >>> > >
>> >>> > > > On 14 Jan 2018, at 08:59, Tony Kurc <[email protected]> wrote:
>> >>> > > >
>> >>> > > > I added some more stats to the wiki page, trying to determine what dependencies are included in jars. It seems like there is opportunity.
>> >>> > > >
>> >>> > > > Highlights, 50 copies of what appears to be some version of bcprov-jdk15 for a total of 162M. 51 copies of jackson-databind.
>> >>> > > >
>> >>> > > > total size   copies   jar
>> >>> > > > 30.97MB      65       META-INF/bundled-dependencies/commons-lang3-XXX.jar
>> >>> > > > 32.53MB      50       META-INF/bundled-dependencies/bcpkix-jdk15on-XXX.jar
>> >>> > > > 33.55MB      16       META-INF/bundled-dependencies/guava-XXX.jar
>> >>> > > > 39.62MB      1        META-INF/bundled-dependencies/jython-shaded-XXX.jar
>> >>> > > > 63.06MB      51       META-INF/bundled-dependencies/jackson-databind-XXX.jar
>> >>> > > > 162.07MB     50       META-INF/bundled-dependencies/bcprov-jdk15on-XXX.jar
>> >>> > > >
>> >>> > > >> On Sat, Jan 13, 2018 at 2:09 PM, Joey Frazee <[email protected]> wrote:
>> >>> > > >>
>> >>> > > >> I tend to have feelings similar to Michael about a multi-repo approach. I’ve rarely seen it help and more often seen it hurt — it’s confusing (especially to newcomers), stuff gets neglected because it’s easier to ignore, you need another master project or some such to do an entire build.
>> >>> > > >>
>> >>> > > >> Maybe git submodules could help mitigate this, but creating independent assemblies or using different build profiles to enable building and packaging the binaries in different ways would satisfy everything except disentangling the releases.
>> >>> > > >>
>> >>> > > >> -joey
>> >>> > > >>
>> >>> > > >>> On Jan 13, 2018, 12:40 PM -0600, Brandon DeVries <[email protected]>, wrote:
>> >>> > > >>> I agree... Long term extension registry, short term one repo with different assemblies (e.g. standard, slim, analytic, etc...).
>> >>> > > >>>
>> >>> > > >>> Brandon
>> >>> > > >>>
>> >>> > > >>> On Sat, Jan 13, 2018 at 1:35 PM Pierre Villard <[email protected]> wrote:
>> >>> > > >>>
>> >>> > > >>>> Option #3 also has my preference.
>> >>> > > >>>> But it's probably a good idea to only keep one git repo and play with the assembly and Maven profiles for the releases, no? It'd be certainly easier for release management process. But this decision could also depend on how the option #3 is going to be implemented I guess.
>> >>> > > >>>>
>> >>> > > >>>> 2018-01-13 6:36 GMT-07:00 Joe Witt <[email protected]>:
>> >>> > > >>>>
>> >>> > > >>>>> thanks tony!
>> >>> > > >>>>>
>> >>> > > >>>>>> On Jan 12, 2018 10:48 PM, "Tony Kurc" <[email protected]> wrote:
>> >>> > > >>>>>>
>> >>> > > >>>>>> I put some of the data I was working with on the wiki -
>> >>> > > >>>>>>
>> >>> > > >>>>>> https://cwiki.apache.org/confluence/display/NIFI/NiFi+1.5.0+nar+files
>> >>> > > >>>>>>
>> >>> > > >>>>>> On Fri, Jan 12, 2018 at 10:28 PM, Jeremy Dyer <[email protected]> wrote:
>> >>> > > >>>>>>
>> >>> > > >>>>>>> So my favorite option is Bryan’s option number “three” of using the extension registry. Now my thought is do we really need to add complexity and do anything in the meantime or just focus on that? Meaning we have roughly 500mb of available capacity today so why don’t we spend those man hours we would spend on getting the second repo up on the extension registry instead?
>> >>> > > >>>>>>>
>> >>> > > >>>>>>> @Bryan do you have thoughts about the deployment of those nars in the extension registry?
>> >>> > > >>>>>>> Since we won’t be able to build the release binary anymore would we still need to create separate repos for the nars or no?? I have used the registry a little but I’m not 100% sure on your vision for the nars
>> >>> > > >>>>>>>
>> >>> > > >>>>>>> - Jeremy Dyer
>> >>> > > >>>>>>>
>> >>> > > >>>>>>> Sent from my iPhone
>> >>> > > >>>>>>>
>> >>> > > >>>>>>>> On Jan 12, 2018, at 10:18 PM, Tony Kurc <[email protected]> wrote:
>> >>> > > >>>>>>>>
>> >>> > > >>>>>>>> I was looking at nar sizes, and thought some data may be helpful. I used my recent RC1 verification as a basis for getting file sizes, and just got the file size for each file in the assembly named "*.nar". I don't know whether the images I pasted in will go through, but I made some graphs. The first is a histogram of nar file size in buckets of 10MB. The second is similar to a cumulative distribution: the x axis is the "rank" of the nar (smallest to largest), and the y-axis is what fraction of the combined size of all the nars comes from nars at that rank or lower. In other words, on the graph, the dot at 60 and ~27 means that the smallest 60 nars contribute only ~27% of the total. Of note, the standard and framework nars are at 83 and 84.
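
[Editor's sketch] Since the pasted graphs did not survive, the second one can be reproduced from a list of nar file sizes along these lines. This is a minimal sketch of the rank-versus-cumulative-share calculation as described in the message above; the function name is hypothetical and no real NiFi size data is included.

```python
def cumulative_share(sizes_mb):
    """Return (rank, percent) points for a rank-vs-cumulative-size plot.

    sizes_mb: iterable of nar file sizes in any consistent unit.
    Files are ranked smallest to largest; each point gives the percentage
    of the combined size contributed by files at that rank or lower.
    """
    ordered = sorted(sizes_mb)
    total = sum(ordered)
    if total == 0:
        return []
    points = []
    running = 0
    for rank, size in enumerate(ordered, start=1):
        running += size
        points.append((rank, 100.0 * running / total))
    return points
```

Reading the result the same way as the graph: a point `(60, 27.0)` would mean the 60 smallest nars together account for 27% of the total size.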
>> >>> > > >>>>>>>>
>> >>> > > >>>>>>>>> On Fri, Jan 12, 2018 at 5:04 PM, Michael Moser <[email protected]> wrote:
>> >>> > > >>>>>>>>> And of course, as I hit <send> I thought of one more thing.
>> >>> > > >>>>>>>>>
>> >>> > > >>>>>>>>> We could keep all of the code in 1 git repo (1 project) but the nifi-assembly part of the build could be broken up to build core NiFi separately from the tar/zip functional grouping of other NARs.
>> >>> > > >>>>>>>>>
>> >>> > > >>>>>>>>> On Fri, Jan 12, 2018 at 5:01 PM, Michael Moser <[email protected]> wrote:
>> >>> > > >>>>>>>>>
>> >>> > > >>>>>>>>>> Long term I would also like to see #3 be the solution. I think what Joseph N described could be part of the capabilities of #3.
>> >>> > > >>>>>>>>>>
>> >>> > > >>>>>>>>>> I would like to add a note of caution with respect to reorganizing and releasing extension bundles separately:
>> >>> > > >>>>>>>>>>
>> >>> > > >>>>>>>>>> - the burden on release manager expands because many more projects have to be released; probably not all on each release cycle but it could still be many
>> >>> > > >>>>>>>>>> - the chance of accidentally forgetting to release a project in a release cycle becomes non-zero
>> >>> > > >>>>>>>>>> - sharing code between projects gets a bit harder because you have to manage releasing projects in a specific order
>> >>> > > >>>>>>>>>> - it becomes harder to find all of the projects that need to change when shared code is added
>> >>> > > >>>>>>>>>> - the simple act of finding code becomes harder ... in which project is that class in? (IDEs like IntelliJ can search in 1 project, but if they search across multiple projects, then I haven't learned how)
>> >>> > > >>>>>>>>>>
>> >>> > > >>>>>>>>>> I used to maintain several nars in separate projects, and recently reorganized them into 1 project (following NiFi's multi-module maven build) and life has become much easier!
>> >>> > > >>>>>>>>>>
>> >>> > > >>>>>>>>>> -- Mike
>> >>> > > >>>>>>>>>>
>> >>> > > >>>>>>>>>> On Fri, Jan 12, 2018 at 4:33 PM, Chris Herrera <[email protected]> wrote:
>> >>> > > >>>>>>>>>>
>> >>> > > >>>>>>>>>>> I very much like the solution proposed by Bryan below. This would allow for a cleaner docker image as well, while still providing the functionality as needed. For sure, the extension registry will be great, but in the meantime this is an adequate mid step.
>> >>> > > >>>>>>>>>>>
>> >>> > > >>>>>>>>>>> Regards,
>> >>> > > >>>>>>>>>>> Chris
>> >>> > > >>>>>>>>>>>
>> >>> > > >>>>>>>>>>> On Jan 12, 2018, 2:52 PM -0600, Bryan Bende <[email protected]>, wrote:
>> >>> > > >>>>>>>>>>>> Long term I'd like to see the extension registry take form and have that be the solution (#3).
>> >>> > > >>>>>>>>>>>>
>> >>> > > >>>>>>>>>>>> In the more near term, we could separate all of the NARs, except for framework and maybe standard processors & services, into a separate git repo.
>> >>> > > >>>>>>>>>>>>
>> >>> > > >>>>>>>>>>>> In that new git repo we could organize things like Joe N just described according to some kind of functional grouping. Each of these functional bundles could produce its own tar/zip which we can make available for download.
>> >>> > > >>>>>>>>>>>>
>> >>> > > >>>>>>>>>>>> That would separate the release cycles between core NiFi and the other NARs, and also avoid having any single binary artifact that gets too large.
>> >>> > > >>>>>>>>>>>>
>> >>> > > >>>>>>>>>>>> On Fri, Jan 12, 2018 at 3:43 PM, Joseph Niemiec <[email protected]> wrote:
>> >>> > > >>>>>>>>>>>>> just a random thought.
>> >>> > > >>>>>>>>>>>>>
>> >>> > > >>>>>>>>>>>>> Drop In Lib packs... All the Hadoop ones in one package for example that can be added to a slim Nifi install. Another may be for Cloud, or Database Interactions, Integration (JMS, FTP, etc) of course defining these groups would be the tricky part... Or perhaps some type of installer which allows you to elect which packages to download to add to the slim install?
>> >>> > > >>>>>>>>>>>>>
>> >>> > > >>>>>>>>>>>>> On Fri, Jan 12, 2018 at 3:10 PM, Joe Witt <[email protected]> wrote:
>> >>> > > >>>>>>>>>>>>>
>> >>> > > >>>>>>>>>>>>>> Team,
>> >>> > > >>>>>>>>>>>>>>
>> >>> > > >>>>>>>>>>>>>> The NiFi convenience binary (tar.gz/zip) size has grown to 1.1GB now in the latest release.
>> >>> > > >>>>>>>>>>>>>> Apache infra expanded it to 1.6GB allowance for us but has stated this is the last time.
>> >>> > > >>>>>>>>>>>>>> https://issues.apache.org/jira/browse/INFRA-15816
>> >>> > > >>>>>>>>>>>>>>
>> >>> > > >>>>>>>>>>>>>> We need to consider:
>> >>> > > >>>>>>>>>>>>>> 1) removing old nars/less commonly used nars/or particularly massive nars from the assembly we distribute by default. Folks can still use these things if they want just not from our convenience binary
>> >>> > > >>>>>>>>>>>>>> 2) collapsing nars with highly repeating deps
>> >>> > > >>>>>>>>>>>>>> 3) Getting the extension registry baked into the Flow Registry then moving to separate releases for extension bundles. The main release then would be just the NiFi framework.
>> >>> > > >>>>>>>>>>>>>>
>> >>> > > >>>>>>>>>>>>>> Any other ideas ?
>> >>> > > >>>>>>>>>>>>>>
>> >>> > > >>>>>>>>>>>>>> I'll plan to start identifying candidates for removal soon.
>> >>> > > >>>>>>>>>>>>>>
>> >>> > > >>>>>>>>>>>>>> Thanks
>> >>> > > >>>>>>>>>>>>>> Joe
>> >>> > > >>>>>>>>>>>>>
>> >>> > > >>>>>>>>>>>>> --
>> >>> > > >>>>>>>>>>>>> Joseph
