Since the limit was bumped to 1.6GB, it might be prudent not to do too much in NiFi 1.X and instead to focus on a comprehensive solution that coincides with 2.0. I think that is when a lot of users would expect, and be tolerant of, breaking changes on issues like this.
Also, is there a clear process for deprecating processors? If not, there should be, because it would be really helpful for doing cleanup.

On Sat, Jan 13, 2018 at 7:53 PM, Brett Ryan <[email protected]> wrote:

> Why are core modules not listing everything as provided?
>
> IDEs solve this problem with the use of dependency libraries. As an
> example NetBeans nbm’s have a single purpose, you must export the packages
> to be exposed.
>
> We do the same with confluence modules using felix.
>
> Why is NiFi doing things different just so the person who wants to install
> many custom nars can be lazy?
>
>
> > On 14 Jan 2018, at 08:59, Tony Kurc <[email protected]> wrote:
> >
> > I added some more stats to the wiki page, trying to determine what
> > dependencies are included in jars. It seems like there is opportunity.
> >
> > Highlights, 50 copies of what appears to be some version of bcprov-jdk15
> > for a total of 162M. 51 copies of jackson-databind.
> >
> > total size  copies  jar
> > 30.97MB     65      META-INF/bundled-dependencies/commons-lang3-XXX.jar
> > 32.53MB     50      META-INF/bundled-dependencies/bcpkix-jdk15on-XXX.jar
> > 33.55MB     16      META-INF/bundled-dependencies/guava-XXX.jar
> > 39.62MB     1       META-INF/bundled-dependencies/jython-shaded-XXX.jar
> > 63.06MB     51      META-INF/bundled-dependencies/jackson-databind-XXX.jar
> > 162.07MB    50      META-INF/bundled-dependencies/bcprov-jdk15on-XXX.jar
> >
> >
> >> On Sat, Jan 13, 2018 at 2:09 PM, Joey Frazee <[email protected]> wrote:
> >>
> >> I tend to have feelings similar to Michael about a multi-repo approach.
> >> I’ve rarely seen it help and more often seen it hurt: it’s confusing
> >> (especially to newcomers), stuff gets neglected because it’s easier to
> >> ignore, you need another master project or some such to do an entire build.
> >>
> >> Maybe git submodules could help mitigate this, but creating independent
> >> assemblies or using different build profiles to enable building and
> >> packaging the binaries in different ways would satisfy everything except
> >> disentangling the releases.
> >>
> >> -joey
> >>
> >>> On Jan 13, 2018, 12:40 PM -0600, Brandon DeVries <[email protected]>, wrote:
> >>> I agree... Long term extension registry, short term one repo with different
> >>> assemblies (e.g. standard, slim, analytic, etc...).
> >>>
> >>> Brandon
> >>>
> >>> On Sat, Jan 13, 2018 at 1:35 PM Pierre Villard <[email protected]> wrote:
> >>>
> >>>> Option #3 also has my preference. But it's probably a good idea to only
> >>>> keep one git repo and play with the assembly and Maven profiles for the
> >>>> releases, no? It'd be certainly easier for release management process. But
> >>>> this decision could also depend on how the option #3 is going to be
> >>>> implemented I guess.
> >>>>
> >>>> 2018-01-13 6:36 GMT-07:00 Joe Witt <[email protected]>:
> >>>>
> >>>>> thanks tony!
> >>>>>
> >>>>>> On Jan 12, 2018 10:48 PM, "Tony Kurc" <[email protected]> wrote:
> >>>>>>
> >>>>>> I put some of the data I was working with on the wiki -
> >>>>>>
> >>>>>> https://cwiki.apache.org/confluence/display/NIFI/NiFi+1.5.0+nar+files
> >>>>>>
> >>>>>> On Fri, Jan 12, 2018 at 10:28 PM, Jeremy Dyer <[email protected]> wrote:
> >>>>>>
> >>>>>>> So my favorite option is Bryan’s option number “three” of using the
> >>>>>>> extension registry. Now my thought is do we really need to add complexity
> >>>>>>> and do anything in the mean time or just focus on that? Meaning we have
> >>>>>>> roughly 500mb of available capacity today so why don’t we spend those man
> >>>>>>> hours we would spend on getting the second repo up on the extension
> >>>>>>> registry instead?
> >>>>>>>
> >>>>>>> @Bryan do you have thoughts about the deployment of those nars in the
> >>>>>>> extension registry? Since we won’t be able to build the release binary
> >>>>>>> anymore would we still need to create separate repos for the nars or
> >>>>>>> no?? I have used the registry a little but I’m not 100% sure on your
> >>>>>>> vision for the nars
> >>>>>>>
> >>>>>>> - Jeremy Dyer
> >>>>>>>
> >>>>>>> Sent from my iPhone
> >>>>>>>
> >>>>>>>> On Jan 12, 2018, at 10:18 PM, Tony Kurc <[email protected]> wrote:
> >>>>>>>>
> >>>>>>>> I was looking at nar sizes, and thought some data may be helpful. I used
> >>>>>>>> my recent RC1 verification as a basis for getting file sizes, and just got
> >>>>>>>> the file size for each file in the assembly named "*.nar". I don't know
> >>>>>>>> whether the images I pasted in will go through, but I made some graphs.
> >>>>>>>> The first is a histogram of nar file size in buckets of 10MB. The second
> >>>>>>>> is similar to a cumulative distribution: the x-axis is the "rank" of the
> >>>>>>>> nar (smallest to largest), and the y-axis is what fraction of the total
> >>>>>>>> size of all the nars comes from nars of that rank or lower. In other
> >>>>>>>> words, on the graph, the dot at 60 and ~27 means that the smallest 60
> >>>>>>>> nars contribute only ~27% of the total. Of note, the standard and
> >>>>>>>> framework nars are at 83 and 84.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>> On Fri, Jan 12, 2018 at 5:04 PM, Michael Moser <[email protected]> wrote:
> >>>>>>>>> And of course, as I hit <send> I thought of one more thing.
> >>>>>>>>>
> >>>>>>>>> We could keep all of the code in 1 git repo (1 project) but the
> >>>>>>>>> nifi-assembly part of the build could be broken up to build core NiFi
> >>>>>>>>> separately from the tar/zip functional grouping of other NARs.
> >>>>>>>>>
> >>>>>>>>> On Fri, Jan 12, 2018 at 5:01 PM, Michael Moser <[email protected]> wrote:
> >>>>>>>>>
> >>>>>>>>>> Long term I would also like to see #3 be the solution. I think what
> >>>>>>>>>> Joseph N described could be part of the capabilities of #3.
> >>>>>>>>>>
> >>>>>>>>>> I would like to add a note of caution with respect to reorganizing and
> >>>>>>>>>> releasing extension bundles separately:
> >>>>>>>>>>
> >>>>>>>>>> - the burden on release manager expands because many more projects
> >>>>>>>>>>   have to be released; probably not all on each release cycle but it
> >>>>>>>>>>   could still be many
> >>>>>>>>>> - the chance of accidentally forgetting to release a project in a
> >>>>>>>>>>   release cycle becomes non-zero
> >>>>>>>>>> - sharing code between projects gets a bit harder because you have to
> >>>>>>>>>>   manage releasing projects in a specific order
> >>>>>>>>>> - it becomes harder to find all of the projects that need to change
> >>>>>>>>>>   when shared code is added
> >>>>>>>>>> - the simple act of finding code becomes harder ... in which project
> >>>>>>>>>>   is that class in? (IDEs like IntelliJ can search in 1 project, but
> >>>>>>>>>>   if they search across multiple projects, then I haven't learned how)
> >>>>>>>>>>
> >>>>>>>>>> I used to maintain several nars in separate projects, and recently
> >>>>>>>>>> reorganized them into 1 project (following NiFi's multi-module maven
> >>>>>>>>>> build) and life has become much easier!
> >>>>>>>>>>
> >>>>>>>>>> -- Mike
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On Fri, Jan 12, 2018 at 4:33 PM, Chris Herrera <[email protected]> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> I very much like the solution proposed by Bryan below. This would allow
> >>>>>>>>>>> for a cleaner docker image as well, while still providing the
> >>>>>>>>>>> functionality as needed. For sure, the extension registry will be
> >>>>>>>>>>> great, but in the mean time this is an adequate mid step.
> >>>>>>>>>>>
> >>>>>>>>>>> Regards,
> >>>>>>>>>>> Chris
> >>>>>>>>>>>
> >>>>>>>>>>> On Jan 12, 2018, 2:52 PM -0600, Bryan Bende <[email protected]>, wrote:
> >>>>>>>>>>>> Long term I'd like to see the extension registry take form and have
> >>>>>>>>>>>> that be the solution (#3).
> >>>>>>>>>>>>
> >>>>>>>>>>>> In the more near term, we could separate all of the NARs, except for
> >>>>>>>>>>>> framework and maybe standard processors & services, into a separate
> >>>>>>>>>>>> git repo.
> >>>>>>>>>>>>
> >>>>>>>>>>>> In that new git repo we could organize things like Joe N just
> >>>>>>>>>>>> described according to some kind of functional grouping. Each of these
> >>>>>>>>>>>> functional bundles could produce its own tar/zip which we can make
> >>>>>>>>>>>> available for download.
> >>>>>>>>>>>>
> >>>>>>>>>>>> That would separate the release cycles between core NiFi and the other
> >>>>>>>>>>>> NARs, and also avoid having any single binary artifact that gets too
> >>>>>>>>>>>> large.
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Fri, Jan 12, 2018 at 3:43 PM, Joseph Niemiec <[email protected]> wrote:
> >>>>>>>>>>>>> just a random thought.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Drop In Lib packs... All the Hadoop ones in one package for example that
> >>>>>>>>>>>>> can be added to a slim NiFi install. Another may be for Cloud, or
> >>>>>>>>>>>>> Database Interactions, Integration (JMS, FTP, etc) of course defining
> >>>>>>>>>>>>> these groups would be the tricky part... Or perhaps some type of
> >>>>>>>>>>>>> installer which allows you to elect which packages to download to add
> >>>>>>>>>>>>> to the slim install?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Fri, Jan 12, 2018 at 3:10 PM, Joe Witt <[email protected]> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> Team,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> The NiFi convenience binary (tar.gz/zip) size has grown to 1.1GB now
> >>>>>>>>>>>>>> in the latest release. Apache infra expanded it to 1.6GB allowance
> >>>>>>>>>>>>>> for us but has stated this is the last time.
> >>>>>>>>>>>>>> https://issues.apache.org/jira/browse/INFRA-15816
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> We need to consider:
> >>>>>>>>>>>>>> 1) removing old nars/less commonly used nars/or particularly massive
> >>>>>>>>>>>>>> nars from the assembly we distribute by default. Folks can still use
> >>>>>>>>>>>>>> these things if they want just not from our convenience binary
> >>>>>>>>>>>>>> 2) collapsing nars with highly repeating deps
> >>>>>>>>>>>>>> 3) Getting the extension registry baked into the Flow Registry then
> >>>>>>>>>>>>>> moving to separate releases for extension bundles. The main release
> >>>>>>>>>>>>>> then would be just the NiFi framework.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Any other ideas ?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I'll plan to start identifying candidates for removal soon.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Thanks
> >>>>>>>>>>>>>> Joe
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> --
> >>>>>>>>>>>>> Joseph
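
For anyone who wants to reproduce the kind of per-jar tally quoted above (copies and total size of each bundled dependency), here is a minimal Python sketch. It is not the script behind the wiki numbers, which is not shown in this thread. It assumes that each nar can be opened as a zip archive with its dependencies under META-INF/bundled-dependencies/, that a version can be stripped from a jar name with a simple regex, and that "size" means uncompressed size. The function names and the "nifi/lib" path are hypothetical.

```python
import re
import zipfile
from collections import defaultdict
from pathlib import Path

BUNDLED_PREFIX = "META-INF/bundled-dependencies/"
# Assumption: the version starts at the first "-<digit>" in the jar name.
VERSION_RE = re.compile(r"-\d[\w.\-]*(?=\.jar$)")


def normalize(entry_name):
    """Replace the version in a jar name with XXX (guava-18.0.jar -> guava-XXX.jar)."""
    return VERSION_RE.sub("-XXX", entry_name, count=1)


def tally_bundled_jars(nar_dir):
    """Return {normalized jar path: (copies, total uncompressed bytes)} for all *.nar files."""
    totals = defaultdict(lambda: [0, 0])
    for nar in sorted(Path(nar_dir).rglob("*.nar")):
        with zipfile.ZipFile(nar) as archive:
            for info in archive.infolist():
                name = info.filename
                if name.startswith(BUNDLED_PREFIX) and name.endswith(".jar"):
                    entry = totals[normalize(name)]
                    entry[0] += 1               # one more copy of this jar
                    entry[1] += info.file_size  # uncompressed size in bytes
    return {name: (copies, size) for name, (copies, size) in totals.items()}


def report(totals):
    """Print the tally sorted by total size, smallest first."""
    print("total size  copies  jar")
    for name, (copies, size) in sorted(totals.items(), key=lambda kv: kv[1][1]):
        print(f"{size / (1024 * 1024):.2f}MB  {copies}  {name}")


if __name__ == "__main__":
    # Hypothetical location of the unpacked assembly's nar files.
    report(tally_bundled_jars("nifi/lib"))
```

Jars with many copies and a large total are the natural candidates for option 2 (collapsing nars with highly repeating deps). Swapping info.file_size for info.compress_size would measure the on-disk contribution instead.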
