Also maybe #4: Message Queue support (JMS, Kafka, etc.)

On Tue, Jan 16, 2018 at 5:13 AM, Mike Thomsen <[email protected]> wrote:
> One possibility: 3 "packs." Such as:
>
> 1. Big Data.
> 2. Search
> 3. Non-BD NoSQL.
>
> Each pack would be an assembly of NARs that correspond to the category.
>
> The core would have JDBC support and all of the data mutator processors.
>
> On Mon, Jan 15, 2018 at 11:54 PM, James Wing <[email protected]> wrote:
>
>> I think a reduced build is a good way forward until the extension registry is ready. If we can publish the remaining processors in one or more additional artifacts, that would be ideal. The admin burden of more git repositories or separate releases does not appeal to me, especially since we do not believe it to be our long-term path.
>>
>> It's not going to be easy to decide on a "core" build with "extras" sold separately. But we will have to confront the division for the registry solution in any case, so we might as well get started on it.
>>
>> On Sun, Jan 14, 2018 at 1:37 PM, Mike Thomsen <[email protected]> wrote:
>>
>> > Since the limit was bumped to 1.6GB, it might be prudent to not do too much in NiFi 1.X and instead focus on a comprehensive solution that coincides with 2.0. I think that would be a time when a lot of users might expect and be tolerant of breaking changes on issues like this.
>> >
>> > Also, is there a clear process for deprecating processors? If not, there should be because it would be really helpful for doing cleanup.
>> >
>> > On Sat, Jan 13, 2018 at 7:53 PM, Brett Ryan <[email protected]> wrote:
>> >
>> > > Why are core modules not listing everything as provided?
>> > >
>> > > IDEs solve this problem with the use of dependency libraries. As an example, NetBeans NBMs have a single purpose: you must export the packages to be exposed.
>> > >
>> > > We do the same with Confluence modules using Felix.
>> > >
>> > > Why is NiFi doing things differently just so the person who wants to install many custom nars can be lazy?
>> > >
>> > > > On 14 Jan 2018, at 08:59, Tony Kurc <[email protected]> wrote:
>> > > >
>> > > > I added some more stats to the wiki page, trying to determine what dependencies are included in jars. It seems like there is opportunity.
>> > > >
>> > > > Highlights, 50 copies of what appears to be some version of bcprov-jdk15 for a total of 162M. 51 copies of jackson-databind.
>> > > >
>> > > > total size  copies  jar
>> > > >    30.97MB      65  META-INF/bundled-dependencies/commons-lang3-XXX.jar
>> > > >    32.53MB      50  META-INF/bundled-dependencies/bcpkix-jdk15on-XXX.jar
>> > > >    33.55MB      16  META-INF/bundled-dependencies/guava-XXX.jar
>> > > >    39.62MB       1  META-INF/bundled-dependencies/jython-shaded-XXX.jar
>> > > >    63.06MB      51  META-INF/bundled-dependencies/jackson-databind-XXX.jar
>> > > >   162.07MB      50  META-INF/bundled-dependencies/bcprov-jdk15on-XXX.jar
>> > > >
>> > > >> On Sat, Jan 13, 2018 at 2:09 PM, Joey Frazee <[email protected]> wrote:
>> > > >>
>> > > >> I tend to have feelings similar to Michael about a multi-repo approach. I’ve rarely seen it help and more often seen it hurt: it’s confusing (especially to newcomers), stuff gets neglected because it’s easier to ignore, you need another master project or some such to do an entire build.
>> > > >>
>> > > >> Maybe git submodules could help mitigate this, but creating independent assemblies or using different build profiles to enable building and packaging the binaries in different ways would satisfy everything except disentangling the releases.
>> > > >>
>> > > >> -joey
>> > > >>
>> > > >>> On Jan 13, 2018, 12:40 PM -0600, Brandon DeVries <[email protected]>, wrote:
>> > > >>> I agree... Long term extension registry, short term one repo with different assemblies (e.g. standard, slim, analytic, etc...).
>> > > >>>
>> > > >>> Brandon
>> > > >>>
>> > > >>> On Sat, Jan 13, 2018 at 1:35 PM Pierre Villard <[email protected]> wrote:
>> > > >>>
>> > > >>>> Option #3 also has my preference. But it's probably a good idea to only keep one git repo and play with the assembly and Maven profiles for the releases, no? It'd certainly be easier for the release management process. But this decision could also depend on how option #3 is going to be implemented, I guess.
>> > > >>>>
>> > > >>>> 2018-01-13 6:36 GMT-07:00 Joe Witt <[email protected]>:
>> > > >>>>
>> > > >>>>> thanks tony!
>> > > >>>>>
>> > > >>>>>> On Jan 12, 2018 10:48 PM, "Tony Kurc" <[email protected]> wrote:
>> > > >>>>>>
>> > > >>>>>> I put some of the data I was working with on the wiki -
>> > > >>>>>>
>> > > >>>>>> https://cwiki.apache.org/confluence/display/NIFI/NiFi+1.5.0+nar+files
>> > > >>>>>>
>> > > >>>>>> On Fri, Jan 12, 2018 at 10:28 PM, Jeremy Dyer <[email protected]> wrote:
>> > > >>>>>>
>> > > >>>>>>> So my favorite option is Bryan’s option number “three” of using the extension registry. Now my thought is do we really need to add complexity and do anything in the meantime or just focus on that? Meaning we have roughly 500MB of available capacity today so why don’t we spend those man hours we would spend on getting the second repo up on the extension registry instead?
>> > > >>>>>>>
>> > > >>>>>>> @Bryan do you have thoughts about the deployment of those nars in the extension registry? Since we won’t be able to build the release binary anymore would we still need to create separate repos for the nars or no? I have used the registry a little but I’m not 100% sure on your vision for the nars
>> > > >>>>>>>
>> > > >>>>>>> - Jeremy Dyer
>> > > >>>>>>>
>> > > >>>>>>> Sent from my iPhone
>> > > >>>>>>>
>> > > >>>>>>>> On Jan 12, 2018, at 10:18 PM, Tony Kurc <[email protected]> wrote:
>> > > >>>>>>>>
>> > > >>>>>>>> I was looking at nar sizes, and thought some data may be helpful. I used my recent RC1 verification as a basis for getting file sizes, and just got the file size for each file in the assembly named "*.nar". I don't know whether the images I pasted in will go through, but I made some graphs. The first is a histogram of nar file size in buckets of 10MB. The second is similar to a cumulative distribution: the x-axis is the "rank" of the nar (smallest to largest), and the y-axis is what fraction of the total size of all the nars comes from nars of that rank or lower. In other words, on the graph, the dot at 60 and ~27 means that the smallest 60 nars contribute only ~27% of the total. Of note, the standard and framework nars are at 83 and 84.
>> > > >>>>>>>>
>> > > >>>>>>>>> On Fri, Jan 12, 2018 at 5:04 PM, Michael Moser <[email protected]> wrote:
>> > > >>>>>>>>> And of course, as I hit <send> I thought of one more thing.
>> > > >>>>>>>>>
>> > > >>>>>>>>> We could keep all of the code in 1 git repo (1 project) but the nifi-assembly part of the build could be broken up to build core NiFi separately from the tar/zip functional grouping of other NARs.
>> > > >>>>>>>>>
>> > > >>>>>>>>> On Fri, Jan 12, 2018 at 5:01 PM, Michael Moser <[email protected]> wrote:
>> > > >>>>>>>>>
>> > > >>>>>>>>>> Long term I would also like to see #3 be the solution. I think what Joseph N described could be part of the capabilities of #3.
>> > > >>>>>>>>>>
>> > > >>>>>>>>>> I would like to add a note of caution with respect to reorganizing and releasing extension bundles separately:
>> > > >>>>>>>>>>
>> > > >>>>>>>>>> - the burden on the release manager expands because many more projects have to be released; probably not all on each release cycle but it could still be many
>> > > >>>>>>>>>> - the chance of accidentally forgetting to release a project in a release cycle becomes non-zero
>> > > >>>>>>>>>> - sharing code between projects gets a bit harder because you have to manage releasing projects in a specific order
>> > > >>>>>>>>>> - it becomes harder to find all of the projects that need to change when shared code is added
>> > > >>>>>>>>>> - the simple act of finding code becomes harder ... which project is that class in? (IDEs like IntelliJ can search in 1 project, but if they search across multiple projects, then I haven't learned how)
>> > > >>>>>>>>>>
>> > > >>>>>>>>>> I used to maintain several nars in separate projects, and recently reorganized them into 1 project (following NiFi's multi-module Maven build) and life has become much easier!
>> > > >>>>>>>>>>
>> > > >>>>>>>>>> -- Mike
>> > > >>>>>>>>>>
>> > > >>>>>>>>>> On Fri, Jan 12, 2018 at 4:33 PM, Chris Herrera <[email protected]> wrote:
>> > > >>>>>>>>>>
>> > > >>>>>>>>>>> I very much like the solution proposed by Bryan below. This would allow for a cleaner Docker image as well, while still providing the functionality as needed. For sure, the extension registry will be great, but in the meantime this is an adequate mid step.
>> > > >>>>>>>>>>>
>> > > >>>>>>>>>>> Regards,
>> > > >>>>>>>>>>> Chris
>> > > >>>>>>>>>>>
>> > > >>>>>>>>>>> On Jan 12, 2018, 2:52 PM -0600, Bryan Bende <[email protected]>, wrote:
>> > > >>>>>>>>>>>> Long term I'd like to see the extension registry take form and have that be the solution (#3).
>> > > >>>>>>>>>>>>
>> > > >>>>>>>>>>>> In the more near term, we could separate all of the NARs, except for framework and maybe standard processors & services, into a separate git repo.
>> > > >>>>>>>>>>>>
>> > > >>>>>>>>>>>> In that new git repo we could organize things like Joe N just described according to some kind of functional grouping. Each of these functional bundles could produce its own tar/zip which we can make available for download.
>> > > >>>>>>>>>>>>
>> > > >>>>>>>>>>>> That would separate the release cycles between core NiFi and the other NARs, and also avoid having any single binary artifact that gets too large.
>> > > >>>>>>>>>>>>
>> > > >>>>>>>>>>>> On Fri, Jan 12, 2018 at 3:43 PM, Joseph Niemiec <[email protected]> wrote:
>> > > >>>>>>>>>>>>> just a random thought.
>> > > >>>>>>>>>>>>>
>> > > >>>>>>>>>>>>> Drop In Lib packs... All the Hadoop ones in one package, for example, that can be added to a slim NiFi install. Another may be for Cloud, or Database Interactions, Integration (JMS, FTP, etc). Of course defining these groups would be the tricky part... Or perhaps some type of installer which allows you to elect which packages to download to add to the slim install?
>> > > >>>>>>>>>>>>>
>> > > >>>>>>>>>>>>> On Fri, Jan 12, 2018 at 3:10 PM, Joe Witt <[email protected]> wrote:
>> > > >>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>> Team,
>> > > >>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>> The NiFi convenience binary (tar.gz/zip) size has grown to 1.1GB now in the latest release. Apache infra expanded our allowance to 1.6GB but has stated this is the last time.
>> > > >>>>>>>>>>>>>> https://issues.apache.org/jira/browse/INFRA-15816
>> > > >>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>> We need to consider:
>> > > >>>>>>>>>>>>>> 1) removing old nars, less commonly used nars, or particularly massive nars from the assembly we distribute by default. Folks can still use these things if they want, just not from our convenience binary
>> > > >>>>>>>>>>>>>> 2) collapsing nars with highly repeating deps
>> > > >>>>>>>>>>>>>> 3) Getting the extension registry baked into the Flow Registry then moving to separate releases for extension bundles. The main release then would be just the NiFi framework.
>> > > >>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>> Any other ideas?
>> > > >>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>> I'll plan to start identifying candidates for removal soon.
>> > > >>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>> Thanks
>> > > >>>>>>>>>>>>>> Joe
>> > > >>>>>>>>>>>>>
>> > > >>>>>>>>>>>>> --
>> > > >>>>>>>>>>>>> Joseph
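
For anyone who wants to reproduce the kind of per-jar table Tony Kurc posted above (total size and number of copies of each jar under META-INF/bundled-dependencies/), here is a minimal Python sketch. It is not the script behind the wiki numbers. It assumes each .nar is an ordinary zip archive, that sizes are counted uncompressed, and that a trailing "-<digit>...jar" is the version suffix; the function names and the lib directory argument are hypothetical.

```python
import re
import zipfile
from collections import defaultdict
from pathlib import Path

BUNDLED_PREFIX = "META-INF/bundled-dependencies/"
# Assumption: the version is whatever follows the first "-<digit>" in the name.
VERSION_SUFFIX = re.compile(r"-\d[\w.\-]*\.jar$")


def normalize(jar_name):
    """Replace the version suffix with -XXX.jar so copies group together."""
    return VERSION_SUFFIX.sub("-XXX.jar", jar_name)


def tally_bundled_jars(nar_paths):
    """Return {normalized jar name: [copies, total uncompressed bytes]}."""
    totals = defaultdict(lambda: [0, 0])
    for nar in nar_paths:
        with zipfile.ZipFile(nar) as archive:
            for info in archive.infolist():
                name = info.filename
                if name.startswith(BUNDLED_PREFIX) and name.endswith(".jar"):
                    key = normalize(name[len(BUNDLED_PREFIX):])
                    totals[key][0] += 1
                    totals[key][1] += info.file_size  # uncompressed size
    return dict(totals)


def report(lib_dir):
    """Print one row per jar, smallest total first (hypothetical layout)."""
    totals = tally_bundled_jars(sorted(Path(lib_dir).glob("*.nar")))
    for jar, (copies, size) in sorted(totals.items(), key=lambda kv: kv[1][1]):
        print(f"{size / 1e6:9.2f}MB {copies:4d} {BUNDLED_PREFIX}{jar}")
```

Calling report() on the lib directory of an unpacked assembly would list the most duplicated dependencies last, which is one way to pick candidates for option 2 (collapsing nars with highly repeating deps).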

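
The two graphs Tony Kurc describes above (a histogram of nar sizes in 10MB buckets, and the share of the total contributed by the N smallest nars) did not survive in the archive. The sketch below shows how such numbers could be computed from a list of file sizes in bytes. It is an illustration only: the function names are hypothetical, and treating 10MB as 10,000,000 bytes is an assumption.

```python
from collections import Counter
from itertools import accumulate


def size_histogram(sizes, bucket=10 * 1000 * 1000):
    """Count files per bucket; bucket 0 holds sizes in [0, bucket)."""
    return Counter(size // bucket for size in sizes)


def cumulative_share(sizes):
    """Return [(rank, percent of the total held by the `rank` smallest files)]."""
    ordered = sorted(sizes)
    total = sum(ordered)
    if total == 0:
        return []
    return [
        (rank, 100.0 * running / total)
        for rank, running in enumerate(accumulate(ordered), start=1)
    ]
```

With the real nar sizes, the entry for rank 60 in cumulative_share() would correspond to the "dot at 60 and ~27" mentioned in the message.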