Hi Joe, Arpad and all,
I'm strongly in favor of moving all Python components to a separate
repository. It could be called apache/nifi-python-extensions or
-components, and contain all Python components that this community
maintains. I would prefer that over a separate repo for each extension,
because it seems easier to keep track of all components maintained by
the community if they are in the same repo than if they were separate.
Since MiNiFi C++ implemented a large subset of the same Python API, I
think it makes our lives easier if we share the code, and keep the
Python components in their own dedicated location. As NiFi and MiNiFi
C++ both approach their next major version, and we commit to a stable
Python API, I expect it to become easier to maintain the Python
components separately, targeting this stable API, and we could align the
release frequency with the maintenance needs of the Python components.
I'm neutral on whether to package them with convenience binaries or
leave that up to the user. Hopefully we can come up with a user-friendly
way to install them if they're not included. I wouldn't include them in
the source tarballs.
I would keep the Java components separate from Python components.
Whether that's in the NiFi repo or somewhere else, either is fine with me.
Regarding introducing breaking changes: on the NiFi side, unit tests
should cover the API well enough, and after 2.0 GA, I expect it to
remain backwards-compatible until the next major version. So I think the
API will not be a moving target (things only added, not changed), and it
will be easy to keep things working. But I think we should set up
automated testing that runs tests with the extensions, checking their
functionality with NiFi 2.0, NiFi latest, and at least one MiNiFi C++
version, to catch breakages early if they rear their heads anyway.
Thanks and have a great weekend,
Marton
On 6/21/24 23:18, Joe Witt wrote:
"I would suggest starting with
moving the Python ones to a dedicated repo, let's have a workflow
established and polished there, might follow with some Java ones in case it
works well."
Yeah kinda where my head is too
On Fri, Jun 21, 2024 at 2:07 PM Arpad Boda <ab...@apache.org> wrote:
Joe,
Interesting thoughts, I see a lot of pros and cons. Let me list the most
important ones of both:
+CVEs in extensions don't automatically make NiFi "vulnerable", as they
live in a different repo.
+the responsibility for staying up to date moves to the maintainers
of the given extension; the same applies to the stability of the tests
covering that extension
-easier to introduce breaking changes accidentally: a breaking change might
go through and get committed. Especially in the case of Java extensions; the
Python API is pretty thin (yet!). Only an extension developer will find it,
most probably not immediately, but when things already depend on the breaking
change, and at that point it gets very difficult to make the right call
-might lose some extensions as they get even less maintained than they are
now
Overall I have no strong opinion either way. I would suggest starting with
moving the Python ones to a dedicated repo, let's have a workflow
established and polished there, might follow with some Java ones in case it
works well.
Cheers,
Arpad
On Friday, June 21, 2024, Joe Witt <joew...@apache.org> wrote:
Team,
For the longest time we had all these Java based extensions and it was
often inconvenient for them to live within the codebase. Indeed it makes
the builds crazy long and it delays getting new components out. We had a
lot of work to do for this to be convenient and perhaps we still have gaps
remaining.
Now we have these Python components. I am not confident we really want
these in the codebase for similar but even more important reasons. The
Python components have similar issues when it comes to Licensing and Notice
recognition. They have their own rapid vulnerability tracking. Our
current tooling doesn't make tracking that very easy.
I'm concerned about where the Python ones are heading in terms of
maintainability but also generally for the builds as well with the Java
ones. Is it time to move to a repo for the Java extensions and its own
project/group name and versioning? Same for Python extensions?
This lets them evolve on their own schedule. It does bring up an
interesting challenge as it relates to a convenience binary. The ideal
state is that extensions are released and shipped independently of the NiFi
application. But we'd need to make that really nice/easy for the users.
We have a lot going on so maybe still not time to tackle this. Curious to
hear thoughts
Thanks