Re: [DISCUSS] Apache NiFi distribution has grown too large

2018-01-19 Thread Adam Taft
I'd also vote for an OSGi backend (in the long term). It's something that has been on my mind (and mentioned) for years now. The Nar classloader ecosystem is trying to implement features of OSGi (and doing it somewhat poorly at that, if you are honest). Not saying that OSGi is the right solution

Re: [DISCUSS] Apache NiFi distribution has grown too large

2018-01-17 Thread Brett Ryan
> On 18 Jan 2018, at 03:07, Matt Burgess wrote: > > BTW, talking about mixin inheritance, shared dependencies, improved > classloading, and module repositories, I feel like OSGi is the > elephant in the room. I can see perfectly good reasons NOT to move to > an OSGi-backed architecture, but it

Re: [DISCUSS] Apache NiFi distribution has grown too large

2018-01-17 Thread Brett Ryan
> On 18 Jan 2018, at 03:05, Matt Burgess wrote: > > - Some NiFi installs will be located on systems that cannot contact an > outside (or any external) repository. When we consider NAR > repositories, we should consider providing a repo-to-go or something > of that sort. At the very least I woul

Re: [DISCUSS] Apache NiFi distribution has grown too large

2018-01-17 Thread Russell Bateman
I'm glad to see the discussion of this topic. I have nothing material to add apart from the caution that making NiFi too hard to install (it has always been super easy, which is especially important for non-engineer folk) will detract from my ability to recommend it. If it becomes a multistep action, adding

Re: [DISCUSS] Apache NiFi distribution has grown too large

2018-01-17 Thread Joe Witt
absolutely yes Sumanth - that is what the extension registry would enable. There are lots of great discussion points and feedback in this thread. I'll try to summarize in a table of pros/cons. We'll want to offer a range of these I suspect. Thanks Joe On Wed, Jan 17, 2018 at 11:21 AM, Sumanth

Re: [DISCUSS] Apache NiFi distribution has grown too large

2018-01-17 Thread Sumanth Chinthagunta
Just an idea: can we also manage NARs and custom NARs in a NiFi repository, and let fresh NiFi installers add required NARs and their dependencies on the fly via the NiFi UI? -Sumanth > On Jan 17, 2018, at 8:05 AM, Matt Burgess wrote: > > I'd like to echo many of the comments / discussion points her

Re: [DISCUSS] Apache NiFi distribution has grown too large

2018-01-17 Thread Matt Burgess
BTW, talking about mixin inheritance, shared dependencies, improved classloading, and module repositories, I feel like OSGi is the elephant in the room. I can see perfectly good reasons NOT to move to an OSGi-backed architecture, but it does feel like we'd end up implementing many of the same featu

Re: [DISCUSS] Apache NiFi distribution has grown too large

2018-01-17 Thread Matt Burgess
I'd like to echo many of the comments / discussion points here, including the extension registry (#3), NAR packs, and mixins. A couple of additional comments and caveats: NAR package management: - Grouping NAR packs based on functionality (Hadoop, RDBMS, etc.) is a good first start but it still s

Re: [DISCUSS] Apache NiFi distribution has grown too large

2018-01-16 Thread Kevin Doran
Nice discussion on this thread. I'm also in favor of the long-term solution being publishing extension NARs to an extension registry (#3) and removing them from the NiFi convenience binary. A few thoughts that build upon what others have said: 1. Many decisions, such as the structure of the pro

Re: [DISCUSS] Apache NiFi distribution has grown too large

2018-01-16 Thread Bryan Bende
I still like the "NAR packs" idea even for the single repo approach. I think if we only provide a "light" binary and then say that everything else has to be built on your own, it creates a big barrier to entry for a lot of users. With the NAR packs approach we could provide one binary that is the a

Re: [DISCUSS] Apache NiFi distribution has grown too large

2018-01-16 Thread Mike Thomsen
Also maybe #4: Message Queue support (JMS, Kafka, etc.) On Tue, Jan 16, 2018 at 5:13 AM, Mike Thomsen wrote: > One possibility: 3 "packs." Such as: > > 1. Big Data. > 2. Search > 3. Non-BD NoSQL. > > Each pack would be an assembly of NARs that correspond to the category. > > The core would have

Re: [DISCUSS] Apache NiFi distribution has grown too large

2018-01-16 Thread Mike Thomsen
One possibility: 3 "packs." Such as: 1. Big Data. 2. Search 3. Non-BD NoSQL. Each pack would be an assembly of NARs that correspond to the category. The core would have JDBC support and all of the data mutator processors. On Mon, Jan 15, 2018 at 11:54 PM, James Wing wrote: > I think a reduced

Re: [DISCUSS] Apache NiFi distribution has grown too large

2018-01-15 Thread James Wing
I think a reduced build is a good way forward until the extension registry is ready. If we can publish the remaining processors in one or more additional artifacts, that would be ideal. The admin burden of more git repositories or separate releases does not appeal to me, especially since we do no

Re: [DISCUSS] Apache NiFi distribution has grown too large

2018-01-14 Thread Mike Thomsen
Since the limit was bumped to 1.6GB, it might be prudent not to do too much in NiFi 1.X and instead focus on a comprehensive solution that coincides with 2.0. I think that would be a time when a lot of users might expect and be tolerant of breaking changes on issues like this. Also, is there a clear

Re: [DISCUSS] Apache NiFi distribution has grown too large

2018-01-13 Thread Brett Ryan
Why are core modules not listing everything as provided? IDEs solve this problem with the use of dependency libraries. As an example, NetBeans NBMs have a single purpose, and you must export the packages to be exposed. We do the same with Confluence modules using Felix. Why is NiFi doing things d

Re: [DISCUSS] Apache NiFi distribution has grown too large

2018-01-13 Thread Tony Kurc
I added some more stats to the wiki page, trying to determine what dependencies are included in jars. It seems like there is opportunity. Highlights: 50 copies of what appears to be some version of bcprov-jdk15, for a total of 162M, and 51 copies of jackson-databind. total size copies jar
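The duplicate-dependency count described in the message above could be reproduced with a short script along these lines. This is only a sketch, not the method Tony actually used: it assumes NAR files can be read as zip archives with the bundled jars stored as entries, and the names `artifact_name` and `duplicate_jars`, as well as the version-stripping heuristic, are hypothetical.

```python
import re
import zipfile
from collections import defaultdict
from pathlib import Path

# Heuristic (an assumption): a version suffix starts at the first "-<digit>"
# and runs to the ".jar" extension, e.g. "-2.9.2.jar".
_VERSION_SUFFIX = re.compile(r"-\d[\w.\-]*\.jar$")


def artifact_name(jar_path):
    """Strip the directory, version suffix, and extension from a jar entry name."""
    base = jar_path.rsplit("/", 1)[-1]
    name = _VERSION_SUFFIX.sub("", base)
    if name.endswith(".jar"):  # jar without a recognizable version suffix
        name = name[: -len(".jar")]
    return name


def duplicate_jars(nar_dir):
    """Return (artifact, copies, total_uncompressed_bytes) rows, largest total first."""
    stats = defaultdict(lambda: [0, 0])  # artifact -> [copies, total bytes]
    for nar in Path(nar_dir).rglob("*.nar"):
        with zipfile.ZipFile(nar) as archive:
            for info in archive.infolist():
                if info.filename.endswith(".jar"):
                    entry = stats[artifact_name(info.filename)]
                    entry[0] += 1
                    entry[1] += info.file_size
    rows = [(name, copies, size) for name, (copies, size) in stats.items()]
    return sorted(rows, key=lambda row: (-row[2], row[0]))
```

Calling `duplicate_jars` on the directory that holds the NARs yields rows in the same "total size, copies, jar" shape as the wiki table. The version heuristic will mis-group artifacts whose names contain a "-<digit>" segment, so the output should be treated as a rough guide.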

Re: [DISCUSS] Apache NiFi distribution has grown too large

2018-01-13 Thread Joey Frazee
I tend to have feelings similar to Michael about a multi-repo approach. I’ve rarely seen it help and more often seen it hurt — it’s confusing (especially to newcomers), stuff gets neglected because it’s easier to ignore, you need another master project or some such to do an entire build. Maybe

Re: [DISCUSS] Apache NiFi distribution has grown too large

2018-01-13 Thread Brandon DeVries
I agree... Long term extension registry, short term one repo with different assemblies (e.g. standard, slim, analytic, etc...). Brandon On Sat, Jan 13, 2018 at 1:35 PM Pierre Villard wrote: > Option #3 also has my preference. But it's probably a good idea to only > keep one git repo and play wi

Re: [DISCUSS] Apache NiFi distribution has grown too large

2018-01-13 Thread Pierre Villard
Option #3 also has my preference. But it's probably a good idea to keep only one git repo and play with the assembly and Maven profiles for the releases, no? It would certainly be easier for the release management process. But this decision could also depend on how the option #3 is going to be implemented

Re: [DISCUSS] Apache NiFi distribution has grown too large

2018-01-13 Thread Joe Witt
thanks tony! On Jan 12, 2018 10:48 PM, "Tony Kurc" wrote: > I put some of the data I was working with on the wiki - > > https://cwiki.apache.org/confluence/display/NIFI/NiFi+1.5.0+nar+files > > On Fri, Jan 12, 2018 at 10:28 PM, Jeremy Dyer wrote: > > > So my favorite option is Bryan’s option nu

Re: [DISCUSS] Apache NiFi distribution has grown too large

2018-01-12 Thread Tony Kurc
I put some of the data I was working with on the wiki - https://cwiki.apache.org/confluence/display/NIFI/NiFi+1.5.0+nar+files On Fri, Jan 12, 2018 at 10:28 PM, Jeremy Dyer wrote: > So my favorite option is Bryan’s option number “three” of using the > extension registry. Now my thought is do we

Re: [DISCUSS] Apache NiFi distribution has grown too large

2018-01-12 Thread Jeremy Dyer
So my favorite option is Bryan’s option number “three” of using the extension registry. Now my thought is: do we really need to add complexity and do anything in the meantime, or just focus on that? Meaning we have roughly 500MB of available capacity today, so why don’t we spend those man hours we

Re: [DISCUSS] Apache NiFi distribution has grown too large

2018-01-12 Thread Tony Kurc
I was looking at nar sizes, and thought some data may be helpful. I used my recent RC1 verification as a basis for getting file sizes, and just got the file size for each file in the assembly named "*.nar". I don't know whether the images I pasted in will go through, but I made some graphs. The fi
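The measurement described in the message above (the file size of every "*.nar" file in the assembly) can be sketched as follows. The function names and the idea of passing the assembly directory as an argument are assumptions for illustration, not part of the original message.

```python
from pathlib import Path


def nar_sizes(assembly_dir):
    """Return (file name, size in bytes) for every *.nar under assembly_dir, largest first."""
    sizes = [(p.name, p.stat().st_size) for p in Path(assembly_dir).rglob("*.nar")]
    # Sort by descending size; break ties by name so the order is stable.
    return sorted(sizes, key=lambda item: (-item[1], item[0]))


def total_size(sizes):
    """Sum the byte counts returned by nar_sizes."""
    return sum(size for _, size in sizes)
```

Printing the first few rows of `nar_sizes(...)` next to `total_size(...)` gives a quick view of which NARs dominate the binary.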

Re: [DISCUSS] Apache NiFi distribution has grown too large

2018-01-12 Thread Michael Moser
And of course, as I hit send, I thought of one more thing. We could keep all of the code in 1 git repo (1 project) but the nifi-assembly part of the build could be broken up to build core NiFi separately from the tar/zip functional grouping of other NARs. On Fri, Jan 12, 2018 at 5:01 PM, Michael Moser

Re: [DISCUSS] Apache NiFi distribution has grown too large

2018-01-12 Thread Michael Moser
Long term I would also like to see #3 be the solution. I think what Joseph N described could be part of the capabilities of #3. I would like to add a note of caution with respect to reorganizing and releasing extension bundles separately: - the burden on release manager expands because many m

Re: [DISCUSS] Apache NiFi distribution has grown too large

2018-01-12 Thread Chris Herrera
I very much like the solution proposed by Bryan below. This would allow for a cleaner docker image as well, while still providing the functionality as needed. For sure, the extension registry will be great, but in the meantime this is an adequate intermediate step. Regards, Chris On Jan 12, 2018, 2:52 P

Re: [DISCUSS] Apache NiFi distribution has grown too large

2018-01-12 Thread Bryan Bende
Long term I'd like to see the extension registry take form and have that be the solution (#3). In the more near term, we could separate all of the NARs, except for framework and maybe standard processors & services, into a separate git repo. In that new git repo we could organize things like Joe

Re: [DISCUSS] Apache NiFi distribution has grown too large

2018-01-12 Thread Joseph Niemiec
Just a random thought: drop-in lib packs... All the Hadoop ones in one package, for example, that can be added to a slim NiFi install. Another may be for Cloud, or Database Interactions, Integration (JMS, FTP, etc.). Of course, defining these groups would be the tricky part... Or perhaps some type of i

[DISCUSS] Apache NiFi distribution has grown too large

2018-01-12 Thread Joe Witt
Team, The NiFi convenience binary (tar.gz/zip) has grown to 1.1GB in the latest release. Apache infra expanded our allowance to 1.6GB but has stated this is the last time. https://issues.apache.org/jira/browse/INFRA-15816 We need to consider: 1) removing old nars/less commonly used n