Chris,

Thanks for the reply and recommendations. It seems like some of the work to
reorganize the module structure could be done outside of a major release,
but it would be great to target any breaking changes for 2.0. Perhaps a
separate feature proposal on module restructuring, with the goal of
supporting optimized builds, would be a helpful way to move that part of
the discussion forward.

Regarding updating AWS SDK to version 2, it seems like that might be
possible now. I haven't taken a close look at the referencing components,
so I'm not sure about the level of effort involved. Minor NiFi version
updates have incorporated major new versions of dependencies. For example,
NiFi 1.14 included an upgrade from Spring Framework 4 to 5. On the one
hand, including the AWS SDK update as part of a major release seems
helpful, but unless there are changes that break existing component
properties, upgrading the AWS SDK could be worked independently. Others may
have more insight into particular usage of that library.

Regards,
David Handermann

On Sun, Jul 25, 2021 at 2:12 AM Chris Sampson
<chris.samp...@naimuri.com.invalid> wrote:

> Might be worth considering refactoring the build as part of this work too,
> e.g. only building the bits of the repo affected by a commit, etc. -
> discussed briefly in previous threads but don't think any changes made yet.
> If NARs/components are likely to be split up and refactored then such work
> around the build probably makes sense to consider.
>
> I've a couple of PRs open that include updates to Elasticsearch versions
> already, although I stopped at 7.10.2 (after which Elastic changed licence)
> in case there were licence concerns. But more work can be done to tidy up
> the processors, absolutely.
>
> AWS libraries to v2 would seem a sensible move and a refactor of those
> processors as well.
>
>
> Cheers,
>
> Chris Sampson
>
> On Sat, 24 Jul 2021, 17:47 David Handermann, <exceptionfact...@apache.org>
> wrote:
>
> > Thanks for pointing out the standard NAR bundles, Mark. There are a number
> > of components in the standard NAR bundles with particular dependencies that
> > would make more sense in separate NARs. Reorganizing the standard NAR to
> > include only components with limited dependencies and wide applicability
> > would definitely help with future maintenance.
> >
> > Regards,
> > David Handermann
> >
> > On Sat, Jul 24, 2021 at 10:57 AM Mark Payne <marka...@hotmail.com> wrote:
> >
> > > There’s also some code that exists in order to maintain backward
> > > compatibility in the repositories. I would very much like the
> > > repositories to contain no unnecessary code. And the swap file format
> > > supports really old formats. And the old impls of the repositories
> > > themselves, like PersistentProvRepo instead of WriteAheadProvRepo, etc.
> > > Lots of stuff there that can be removed. And some methods in
> > > ProcessSession that are never used by any processor in the codebase but
> > > exist in the public API so can’t be removed till 2.0.
> > >
> > > I think this is also a great time to clean up the “standard nar.” At this
> > > point, it’s something like 70 MB. And many of the components there are not
> > > really “standard” - things like connecting to FTP & SFTP servers, XML
> > > processing, Jolt transform, etc. could potentially be moved into other
> > > nars. The nifi-standard-content-viewer-1.15.0-SNAPSHOT.war is 6.9 MB and
> > > is not necessary for stateless or minifi java. Lots of things probably to
> > > reconsider within the standard nar.
> > >
> > > I definitely think this is a reasonable approach, to allow for a 2.0 that
> > > is not a huge feature release but allows the project to be simpler and
> > > more nimble.
> > >
> > > Thanks
> > > -Mark
> > >
> > > On Jul 24, 2021, at 10:59 AM, Mike Thomsen <mikerthom...@gmail.com> wrote:
> > >
> > > Russell,
> > >
> > > AFAICT from looking at Elastic's repos, the low level REST client is
> > > still fine.
> > >
> > > https://github.com/elastic/elasticsearch/blob/e5518e07f13701e3bb3dcc6842b9023966752497/client/rest/src/main/java/org/elasticsearch/client/RestClient.java
> > >
> > > Our Elasticsearch support is spread over two NARs at present. One uses
> > > OkHttp and the other uses that low level Elastic REST client.
> > > Therefore, I think we're fine on licensing for the moment.
> > >
> > > Mike
> > >
> > > On Fri, Jul 23, 2021 at 1:10 PM Russell Bateman <r...@windofkeltia.com>
> > > wrote:
> > >
> > > Bringing up Elastic also reminds me that the Elastic framework has just
> > > recently transitioned out of Open Source, so to acknowledge that, maybe
> > > some effort toward OpenSearch--I say this not understanding exactly how
> > > this sort of thing is considered in a large-scale, world-class software
> > > project like Apache NiFi. (I'm not a contributor, just a grateful
> > > consumer.)
> > >
> > > Russ
> > >
> > > On 7/23/21 10:28 AM, Matt Burgess wrote:
> > > Along with the itemized list for ancient components we should look at
> > > updating versions of drivers, SDKs, etc. for external systems such as
> > > Elasticsearch, Cassandra, etc. There may be breaking changes but 2.0
> > > is probably the right time to get things up to date to make them more
> > > useful to more people.
> > >
> > > On Fri, Jul 23, 2021 at 12:21 PM Nathan Gough <thena...@gmail.com> wrote:
> > > I'm a +1 for removing pretty much all of this stuff. There are security
> > > implications to keeping old dependencies around, so the more old code we
> > > can remove the better. I agree that eventually we need to move to
> > > supporting only Java 11+, and as our next release will probably be about
> > > 4 - 6 months from now that doesn't seem too soon. We could potentially
> > > break this in two and remove the deprecated processors and leave 1.x on
> > > Java 8, and finally start on 2.x which would support only Java 11. I'm
> > > unsure of what implications changing the date and time handling would
> > > have - for running systems that use long term historical logs,
> > > unexpected impacts to time logging could be a problem.
> > >
> > > As Joe says I think feature work will have to be dedicated to 2.x and we
> > > could support 1.x for security fixes for some period of time. 2.x seems
> > > like a gargantuan task but it's probably time to get started. Not sure
> > > how we handle all open PRs and the transition between 1.x and 2.x.
> > >
> > > On Fri, Jul 23, 2021 at 10:57 AM Joe Witt <joe.w...@gmail.com> wrote:
> > >
> > > Jon
> > >
> > > You're right we have to be careful and you're right there are still
> > > significant Java 8 users out there.  But we also have to be careful
> > > about security and sustainability of the codebase.  If we had talked
> > > about this last year when that article came out I'd have agreed it is
> > > too early.  Interestingly that link seems to get updated and I tried
> > > [1] and found more recent data (not sure how recent).  Anyway it
> > > suggests Java 8 is still the top dog but we see good growth on 11.  In
> > > my $dayjob this aligns to what I'm seeing too.  Customers didn't seem
> > > to care about Java 11 until the latter half of last year and now
> > > suddenly it is all over the place.
> > >
> > > I think once we put out a NiFi 2.0 release we'd see rapid decrease in
> > > work on the 1.x line just being blunt.  We did this many years ago
> > > with 0.x to 1.x and we stood behind 0.x for a while (maybe a year or
> > > so) but it was purely bug fix/security related bits.  We would need to
> > > do something similar.  But feature work would almost certainly go to
> > > the 2.x line.  Maybe there are other workable models but my instinct
> > > suggests this is likely to follow a similar path.
> > >
> > > ...anyway I agree it isn't that easy of a call to dump Java 8.  We
> > > need to make the call in both the interests of the user base and the
> > > contributor base of the community.
> > >
> > > [1] https://www.jetbrains.com/lp/devecosystem-2021/java/
> > >
> > >
> > > Thanks
> > > Joe
> > >
> > > On Fri, Jul 23, 2021 at 7:46 AM Joe Witt <joe.w...@gmail.com> wrote:
> > > Russ
> > >
> > > Yeah the flow registry is a key part of it.  But also now you can
> > > download the flow definition in JSON (upload I think is there now
> > > too).  Templates posed a series of challenges. For example, we store
> > > them in the flow definition, which has made flows massive in an
> > > unintended way, and that isn't fun for cluster behavior.
> > >
> > > We have a couple cases where we headed down a particular concept and
> > > came up with better approaches later.  We need to reconcile these with
> > > the benefit of hindsight, and while being careful to be not overly
> > > disruptive to existing users, to reduce the codebase/maintenance
> > > burden and allow continued evolution of the project.
> > >
> > > Thanks
> > >
> > > On Fri, Jul 23, 2021 at 7:43 AM Russell Bateman <r...@windofkeltia.com>
> > > wrote:
> > > Joe,
> > >
> > > I apologize for the off-topic intrusion, but what replaces templates?
> > > The Registry? Templates rocked and we have used them since 0.5.x.
> > >
> > > Russ
> > >
> > > On 7/23/21 8:31 AM, Joe Witt wrote:
> > > David,
> > >
> > > I think this is a highly reasonable approach and such a focus will
> > > greatly help make a 2.0 release far more approachable to knock out.
> > > Not only that but tech debt reduction would help make work towards
> > > major features we'd think about in a 'major release' sense more
> > > approachable.
> > >
> > > We should remove all deprecated things (as well as verify we have the
> > > right list).  We should remove/consider removal of deprecated concepts
> > > like templates.  We should consider whether we can resolve the various
> > > ways we've handled what are now parameters down to one clean approach.
> > > We should remove options in the nifi.properties which turn out to
> > > never be used quite right (if there are any).  There is quite a bit we
> > > can do purely in the name of tech debt reduction.
> > >
> > > Lots to consider here but I think this is the right discussion.
> > >
> > > Thanks
> > >
> > > On Fri, Jul 23, 2021 at 7:26 AM Bryan Bende <bbe...@gmail.com> wrote:
> > > I'm a +1 for this... Not sure if this falls under "Removing Deprecated
> > > Components", but I think we should also look at anything that has been
> > > marked as deprecated throughout the code base as a candidate for
> > > removal. There are quite a few classes, methods, properties, etc that
> > > have been waiting for a chance to be removed.
> > >
> > > On Fri, Jul 23, 2021 at 10:13 AM David Handermann
> > > <exceptionfact...@apache.org> wrote:
> > > Team,
> > >
> > > With all of the excellent work that many have contributed to NiFi over
> > > the years, the code base has also accumulated some amount of technical
> > > debt. A handful of components have been marked as deprecated, and some
> > > components remain in the code base to support integration with old
> > > versions of various products. Following the principles of semantic
> > > versioning, introducing a major release would provide the opportunity to
> > > remove these deprecated and unsupported components.
> > >
> > > Rather than focusing the next major release on new features, what do you
> > > think about focusing on technical debt removal? This approach would not
> > > make for the most interesting release, but it provides the opportunity
> > > to clean up elements that involve breaking changes.
> > >
> > > Focusing on technical debt, at least three primary goals come to mind
> > > for the next major release:
> > >
> > > 1. Removal of deprecated and unmaintained components
> > > 2. Require Java 11 as the minimum supported version
> > > 3. Transition internal date and time handling to JSR 310 java.time
> > >    components
> > >
> > > *Removing Deprecated Components*
> > >
> > > Removing support for older and deprecated components provides a great
> > > opportunity to improve the overall security posture when it comes to
> > > maintaining dependencies. The OWASP dependency plugin report currently
> > > generates 50 MB of HTML for questionable dependencies, many of which are
> > > related to old versions of various libraries.
> > >
> > > As a starting point, here are a handful of components and extension
> > > modules that could be targeted for removal in a major version:
> > >
> > > - PostHTTP and GetHTTP
> > > - ListenLumberjack and the entire nifi-lumberjack-bundle
> > > - ListenBeats and the entire nifi-beats-bundle
> > > - Elasticsearch 5 components
> > > - Hive 1 and 2 components
> > >
> > > *Requiring Java 11*
> > >
> > > Java 8 is now over seven years old, and NiFi has supported general
> > > compatibility with Java 11 for several years. NiFi 1.14.0 incorporated
> > > internal improvements specifically related to TLS 1.3, which allowed
> > > closing out the long-running Java 11 compatibility epic NIFI-5174.
> > > Making Java 11 the minimum required version provides the opportunity to
> > > address any lingering edge cases and put NiFi in a better position to
> > > support current Java versions.
> > >
> > > *JSR 310 for Date and Time Handling*
> > >
> > > Without making the scope too broad, transitioning internal date and time
> > > handling to use DateTimeFormatter instead of SimpleDateFormat would
> > > provide a number of advantages. The Java Time components provide much
> > > better clarity when it comes to handling localized date and time
> > > representations, and also avoid the inherent confusion of java.sql.Date
> > > extending java.util.Date. Many internal components, specifically
> > > Record-oriented processors and services, rely on date parsing, leading
> > > to confusion and various workarounds. The pattern formats of
> > > SimpleDateFormat and DateTimeFormatter are very similar, but there are a
> > > few subtle differences. Making this transition would provide a much
> > > better foundation going forward.
> > >
> > > *Conclusion*
> > >
> > > Thanks for giving this proposal some consideration. Many of you have
> > > been developing NiFi for years and I look forward to your feedback. I
> > > would be glad to put together a more formalized recommendation on
> > > Confluence and write up Jira epics if this general approach sounds
> > > agreeable to the community.
> > >
> > > Regards,
> > > David Handermann
> > >
> > >
> > >
> >
>
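
For reference, one of the subtle pattern differences between SimpleDateFormat
and DateTimeFormatter mentioned in the original proposal can be sketched as
follows. This is only an illustration, not NiFi code: the class and method
names (PatternLetterDifference, formatLegacy, formatJavaTime) are hypothetical,
and the sample date is arbitrary. The pattern letter 'u' means "day number of
week" in SimpleDateFormat but "year" in DateTimeFormatter, so a pattern that
was valid before a migration can silently change meaning afterwards.

```java
import java.text.SimpleDateFormat;
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper class, for illustration only.
public class PatternLetterDifference {

    // Formats with the legacy API. The zone is pinned to UTC so the result
    // does not depend on the JVM default time zone.
    static String formatLegacy(String pattern, Instant instant) {
        SimpleDateFormat format = new SimpleDateFormat(pattern, Locale.ROOT);
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        return format.format(Date.from(instant));
    }

    // Formats the same instant with java.time, using the same pattern string.
    static String formatJavaTime(String pattern, Instant instant) {
        DateTimeFormatter formatter = DateTimeFormatter.ofPattern(pattern, Locale.ROOT);
        LocalDate date = instant.atZone(ZoneOffset.UTC).toLocalDate();
        return formatter.format(date);
    }

    public static void main(String[] args) {
        // Sunday, 25 July 2021 at noon UTC.
        Instant instant = Instant.parse("2021-07-25T12:00:00Z");

        // Common patterns behave the same in both APIs.
        System.out.println(formatLegacy("yyyy-MM-dd", instant));   // 2021-07-25
        System.out.println(formatJavaTime("yyyy-MM-dd", instant)); // 2021-07-25

        // 'u' is day number of week (1 = Monday ... 7 = Sunday) in
        // SimpleDateFormat, but year in DateTimeFormatter.
        System.out.println(formatLegacy("u", instant));   // 7
        System.out.println(formatJavaTime("u", instant)); // 2021
    }
}
```

A migration along these lines would probably need to review user-supplied
patterns in component properties rather than pass them through unchanged,
since both APIs accept the same string without error.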
