Also maybe #4: Message Queue support (JMS, Kafka, etc.)

On Tue, Jan 16, 2018 at 5:13 AM, Mike Thomsen <[email protected]>
wrote:

> One possibility: 3 "packs." Such as:
>
> 1. Big Data.
> 2. Search
> 3. Non-BD NoSQL.
>
> Each pack would be an assembly of NARs that correspond to the category.
>
> The core would have JDBC support and all of the data mutator processors.
>
> On Mon, Jan 15, 2018 at 11:54 PM, James Wing <[email protected]> wrote:
>
>> I think a reduced build is a good way forward until the extension registry
>> is ready.  If we can publish the remaining processors in one or more
>> additional artifacts, that would be ideal.  The admin burden of more git
>> repositories or separate releases does not appeal to me, especially since
>> we do not believe it to be our long-term path.
>>
>> It's not going to be easy to decide on a "core" build with "extras" sold
>> separately. But we will have to confront the division for the registry
>> solution in any case, so we might as well get started on it.
>>
>> On Sun, Jan 14, 2018 at 1:37 PM, Mike Thomsen <[email protected]>
>> wrote:
>>
>> > Since the limit was bumped to 1.6GB, it might be prudent to not do too
>> > much in NiFi 1.X and instead focus on a comprehensive solution that
>> > coincides with 2.0. I think that would be a time when a lot of users
>> > might expect and be tolerant of breaking changes on issues like this.
>> >
>> > Also, is there a clear process for deprecating processors? If not, there
>> > should be because it would be really helpful for doing cleanup.
>> >
>> > On Sat, Jan 13, 2018 at 7:53 PM, Brett Ryan <[email protected]>
>> wrote:
>> >
>> > > Why are core modules not listing everything as provided?
>> > >
>> > > IDEs solve this problem with the use of dependency libraries. As an
>> > > example, NetBeans NBMs have a single purpose, and you must export the
>> > > packages to be exposed.
>> > >
>> > > We do the same with Confluence modules using Felix.
>> > >
>> > > Why is NiFi doing things differently just so the person who wants to
>> > > install many custom nars can be lazy?
>> > >
>> > > > On 14 Jan 2018, at 08:59, Tony Kurc <[email protected]> wrote:
>> > > >
>> > > > I added some more stats to the wiki page, trying to determine what
>> > > > dependencies are included in jars. It seems like there is opportunity.
>> > > >
>> > > > Highlights: 50 copies of what appears to be some version of
>> > > > bcprov-jdk15 for a total of 162MB, and 51 copies of jackson-databind.
>> > > >
>> > > > total size       copies  jar
>> > > >     30.97MB     65     META-INF/bundled-dependencies/commons-lang3-XXX.jar
>> > > >     32.53MB     50     META-INF/bundled-dependencies/bcpkix-jdk15on-XXX.jar
>> > > >     33.55MB     16     META-INF/bundled-dependencies/guava-XXX.jar
>> > > >     39.62MB      1     META-INF/bundled-dependencies/jython-shaded-XXX.jar
>> > > >     63.06MB     51     META-INF/bundled-dependencies/jackson-databind-XXX.jar
>> > > >    162.07MB     50     META-INF/bundled-dependencies/bcprov-jdk15on-XXX.jar
>> > > >
>> > > >
>> > > >> On Sat, Jan 13, 2018 at 2:09 PM, Joey Frazee <
>> [email protected]>
>> > > wrote:
>> > > >>
>> > > >> I tend to have feelings similar to Michael about a multi-repo
>> > > >> approach. I've rarely seen it help and more often seen it hurt: it's
>> > > >> confusing (especially to newcomers), stuff gets neglected because
>> > > >> it's easier to ignore, and you need another master project or some
>> > > >> such to do an entire build.
>> > > >>
>> > > >> Maybe git submodules could help mitigate this, but creating
>> > > >> independent assemblies or using different build profiles to enable
>> > > >> building and packaging the binaries in different ways would satisfy
>> > > >> everything except disentangling the releases.
>> > > >>
>> > > >> -joey
>> > > >>
>> > > >>> On Jan 13, 2018, 12:40 PM -0600, Brandon DeVries <[email protected]>,
>> > wrote:
>> > > >>> I agree... Long term extension registry, short term one repo with
>> > > >>> different assemblies (e.g. standard, slim, analytic, etc...).
>> > > >>>
>> > > >>> Brandon
>> > > >>>
>> > > >>> On Sat, Jan 13, 2018 at 1:35 PM Pierre Villard <
>> > > >> [email protected]
>> > > >>> wrote:
>> > > >>>
>> > > >>>> Option #3 also has my preference. But it's probably a good idea
>> > > >>>> to only keep one git repo and play with the assembly and Maven
>> > > >>>> profiles for the releases, no? It would certainly be easier for
>> > > >>>> the release management process. But this decision could also
>> > > >>>> depend on how option #3 is going to be implemented, I guess.
>> > > >>>>
>> > > >>>> 2018-01-13 6:36 GMT-07:00 Joe Witt <[email protected]>:
>> > > >>>>
>> > > >>>>> thanks tony!
>> > > >>>>>
>> > > >>>>>> On Jan 12, 2018 10:48 PM, "Tony Kurc" <[email protected]>
>> wrote:
>> > > >>>>>>
>> > > >>>>>> I put some of the data I was working with on the wiki -
>> > > >>>>>>
>> > > >>>>>> https://cwiki.apache.org/confluence/display/NIFI/NiFi+1.5.0+nar+files
>> > > >>>>>>
>> > > >>>>>> On Fri, Jan 12, 2018 at 10:28 PM, Jeremy Dyer <
>> [email protected]
>> > > >>>> wrote:
>> > > >>>>>>
>> > > >>>>>>> So my favorite option is Bryan's option number "three" of
>> > > >>>>>>> using the extension registry. Now my thought is: do we really
>> > > >>>>>>> need to add complexity and do anything in the meantime, or
>> > > >>>>>>> just focus on that? Meaning we have roughly 500MB of available
>> > > >>>>>>> capacity today, so why don't we spend the man hours we would
>> > > >>>>>>> spend on getting the second repo up on the extension registry
>> > > >>>>>>> instead?
>> > > >>>>>>>
>> > > >>>>>>> @Bryan do you have thoughts about the deployment of those nars
>> > > >>>>>>> in the extension registry? Since we won't be able to build the
>> > > >>>>>>> release binary anymore, would we still need to create separate
>> > > >>>>>>> repos for the nars or not? I have used the registry a little
>> > > >>>>>>> but I'm not 100% sure on your vision for the nars.
>> > > >>>>>>>
>> > > >>>>>>> - Jeremy Dyer
>> > > >>>>>>>
>> > > >>>>>>> Sent from my iPhone
>> > > >>>>>>>
>> > > >>>>>>>> On Jan 12, 2018, at 10:18 PM, Tony Kurc <[email protected]>
>> > > >> wrote:
>> > > >>>>>>>>
>> > > >>>>>>>> I was looking at nar sizes, and thought some data may be
>> > > >>>>>>>> helpful. I used my recent RC1 verification as a basis for
>> > > >>>>>>>> getting file sizes, and just got the file size for each file
>> > > >>>>>>>> in the assembly named "*.nar". I don't know whether the
>> > > >>>>>>>> images I pasted in will go through, but I made some graphs.
>> > > >>>>>>>> The first is a histogram of nar file size in buckets of 10MB.
>> > > >>>>>>>> The second is similar to a cumulative distribution: the
>> > > >>>>>>>> x-axis is the "rank" of the nar (smallest to largest), and
>> > > >>>>>>>> the y-axis is what fraction of the total size of all nars is
>> > > >>>>>>>> contributed by nars of that rank or lower. In other words, on
>> > > >>>>>>>> the graph, the dot at 60 and ~27 means that the smallest 60
>> > > >>>>>>>> nars contribute only ~27% of the total. Of note, the standard
>> > > >>>>>>>> and framework nars are at 83 and 84.
>> > > >>>>>>>>
>> > > >>>>>>>>
>> > > >>>>>>>>
>> > > >>>>>>>>
>> > > >>>>>>>>
>> > > >>>>>>>>> On Fri, Jan 12, 2018 at 5:04 PM, Michael Moser <
>> > > >>>> [email protected]
>> > > >>>>>>> wrote:
>> > > >>>>>>>>> And of course, as I hit <send> I thought of one more thing.
>> > > >>>>>>>>>
>> > > >>>>>>>>> We could keep all of the code in 1 git repo (1 project) but
>> > > >>>>>>>>> the nifi-assembly part of the build could be broken up to
>> > > >>>>>>>>> build core NiFi separately from the tar/zip functional
>> > > >>>>>>>>> grouping of other NARs.
>> > > >>>>>>>>>
>> > > >>>>>>>>> On Fri, Jan 12, 2018 at 5:01 PM, Michael Moser <
>> > > >>>> [email protected]
>> > > >>>>>>> wrote:
>> > > >>>>>>>>>
>> > > >>>>>>>>>> Long term I would also like to see #3 be the solution. I
>> > > >>>>>>>>>> think what Joseph N described could be part of the
>> > > >>>>>>>>>> capabilities of #3.
>> > > >>>>>>>>>>
>> > > >>>>>>>>>> I would like to add a note of caution with respect to
>> > > >>>>>>>>>> reorganizing and releasing extension bundles separately:
>> > > >>>>>>>>>>
>> > > >>>>>>>>>> - the burden on the release manager expands because many
>> > > >>>>>>>>>>   more projects have to be released; probably not all on
>> > > >>>>>>>>>>   each release cycle but it could still be many
>> > > >>>>>>>>>> - the chance of accidentally forgetting to release a
>> > > >>>>>>>>>>   project in a release cycle becomes non-zero
>> > > >>>>>>>>>> - sharing code between projects gets a bit harder because
>> > > >>>>>>>>>>   you have to manage releasing projects in a specific order
>> > > >>>>>>>>>> - it becomes harder to find all of the projects that need
>> > > >>>>>>>>>>   to change when shared code is added
>> > > >>>>>>>>>> - the simple act of finding code becomes harder ... which
>> > > >>>>>>>>>>   project is that class in? (IDEs like IntelliJ can search
>> > > >>>>>>>>>>   in 1 project, but if they search across multiple
>> > > >>>>>>>>>>   projects, then I haven't learned how)
>> > > >>>>>>>>>>
>> > > >>>>>>>>>> I used to maintain several nars in separate projects, and
>> > > >>>>>>>>>> recently reorganized them into 1 project (following NiFi's
>> > > >>>>>>>>>> multi-module Maven build) and life has become much easier!
>> > > >>>>>>>>>>
>> > > >>>>>>>>>> -- Mike
>> > > >>>>>>>>>>
>> > > >>>>>>>>>>
>> > > >>>>>>>>>>
>> > > >>>>>>>>>> On Fri, Jan 12, 2018 at 4:33 PM, Chris Herrera <
>> > > >>>>>>> [email protected]
>> > > >>>>>>>>>> wrote:
>> > > >>>>>>>>>>
>> > > >>>>>>>>>>> I very much like the solution proposed by Bryan below.
>> > > >>>>>>>>>>> This would allow for a cleaner Docker image as well, while
>> > > >>>>>>>>>>> still providing the functionality as needed. For sure, the
>> > > >>>>>>>>>>> extension registry will be great, but in the meantime this
>> > > >>>>>>>>>>> is an adequate mid step.
>> > > >>>>>>>>>>>
>> > > >>>>>>>>>>> Regards,
>> > > >>>>>>>>>>> Chris
>> > > >>>>>>>>>>>
>> > > >>>>>>>>>>> On Jan 12, 2018, 2:52 PM -0600, Bryan Bende <
>> > > >> [email protected]
>> > > >>>>> ,
>> > > >>>>>>> wrote:
>> > > >>>>>>>>>>>> Long term I'd like to see the extension registry take
>> > > >>>>>>>>>>>> form and have that be the solution (#3).
>> > > >>>>>>>>>>>>
>> > > >>>>>>>>>>>> In the more near term, we could separate all of the
>> > > >>>>>>>>>>>> NARs, except for framework and maybe standard processors
>> > > >>>>>>>>>>>> & services, into a separate git repo.
>> > > >>>>>>>>>>>>
>> > > >>>>>>>>>>>> In that new git repo we could organize things like Joe N
>> > > >>>>>>>>>>>> just described according to some kind of functional
>> > > >>>>>>>>>>>> grouping. Each of these functional bundles could produce
>> > > >>>>>>>>>>>> its own tar/zip which we can make available for download.
>> > > >>>>>>>>>>>>
>> > > >>>>>>>>>>>> That would separate the release cycles between core NiFi
>> > > >>>>>>>>>>>> and the other NARs, and also avoid having any single
>> > > >>>>>>>>>>>> binary artifact that gets too large.
>> > > >>>>>>>>>>>>
>> > > >>>>>>>>>>>>
>> > > >>>>>>>>>>>>
>> > > >>>>>>>>>>>> On Fri, Jan 12, 2018 at 3:43 PM, Joseph Niemiec <
>> > > >>>>>>> [email protected]
>> > > >>>>>>>>>>> wrote:
>> > > >>>>>>>>>>>>> just a random thought.
>> > > >>>>>>>>>>>>>
>> > > >>>>>>>>>>>>> Drop In Lib packs... All the Hadoop ones in one package,
>> > > >>>>>>>>>>>>> for example, that can be added to a slim NiFi install.
>> > > >>>>>>>>>>>>> Another may be for Cloud, or Database Interactions, or
>> > > >>>>>>>>>>>>> Integration (JMS, FTP, etc). Of course, defining these
>> > > >>>>>>>>>>>>> groups would be the tricky part... Or perhaps some type
>> > > >>>>>>>>>>>>> of installer which allows you to elect which packages to
>> > > >>>>>>>>>>>>> download to add to the slim install?
>> > > >>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>
>> > > >>>>>>>>>>>>> On Fri, Jan 12, 2018 at 3:10 PM, Joe Witt <
>> > > >>>>> [email protected]
>> > > >>>>>>> wrote:
>> > > >>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>> Team,
>> > > >>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>> The NiFi convenience binary (tar.gz/zip) size has grown
>> > > >>>>>>>>>>>>>> to 1.1GB now in the latest release. Apache infra
>> > > >>>>>>>>>>>>>> expanded our allowance to 1.6GB but has stated this is
>> > > >>>>>>>>>>>>>> the last time.
>> > > >>>>>>>>>>>>>> https://issues.apache.org/jira/browse/INFRA-15816
>> > > >>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>> We need to consider:
>> > > >>>>>>>>>>>>>> 1) Removing old nars, less commonly used nars, or
>> > > >>>>>>>>>>>>>> particularly massive nars from the assembly we
>> > > >>>>>>>>>>>>>> distribute by default. Folks can still use these things
>> > > >>>>>>>>>>>>>> if they want, just not from our convenience binary.
>> > > >>>>>>>>>>>>>> 2) Collapsing nars with highly repeating deps.
>> > > >>>>>>>>>>>>>> 3) Getting the extension registry baked into the Flow
>> > > >>>>>>>>>>>>>> Registry, then moving to separate releases for
>> > > >>>>>>>>>>>>>> extension bundles. The main release then would be just
>> > > >>>>>>>>>>>>>> the NiFi framework.
>> > > >>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>> Any other ideas?
>> > > >>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>> I'll plan to start identifying candidates for removal
>> > > >>>>>>>>>>>>>> soon.
>> > > >>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>> Thanks
>> > > >>>>>>>>>>>>>> Joe
>> > > >>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>
>> > > >>>>>>>>>>>>> --
>> > > >>>>>>>>>>>>> Joseph
>> > > >>>>>>>>>>>
>> > > >>>>>>>>>>
>> > > >>>>>>>>>>
>> > > >>>>>>>>
>> > > >>>>>>>
>> > > >>>>>>
>> > > >>>>>
>> > > >>>>
>> > > >>
>> > >
>> >
>>
>
>
