Great work David - thanks!

On Wed, Jun 26, 2024 at 2:36 PM David Handermann <exceptionfact...@apache.org> wrote:
> Team,
>
> Thanks to collaboration with Joe Witt, the new nifi-python-extensions
> repository [1] is now populated with the initial set of Python
> Processors.
>
> The repository includes a standard GitHub workflow for pull request
> validation that checks license headers and Python code formatting.
>
> The project uses Hatch [2] to run code formatting as well as build
> source and binary distribution packages.
>
> The source distribution and binary wheel packages both contain the
> Python Processors, which can be placed into an Apache NiFi 2
> installation.
>
> The source distribution archive will provide a suitable release
> candidate file when we are ready to release a version of
> nifi-python-extensions.
>
> In Jira, there is a new python-extensions-2.0.0 version to target for
> features and fixes.
>
> There is certainly more room for documentation and improvement, but
> this should provide a reasonable foundation for decoupled Python
> Processor development efforts.
>
> Regards,
> David Handermann
>
> [1] https://github.com/apache/nifi-python-extensions
> [2] https://hatch.pypa.io
>
> On Sat, Jun 22, 2024 at 9:06 AM David Handermann
> <exceptionfact...@apache.org> wrote:
> >
> > Joe,
> >
> > Thanks for raising the discussion, and thanks to everyone for the
> > feedback thus far. This tracks our previous discussion on the topic [1].
> >
> > I am also strongly in favor of separating out extensions into their own
> > repositories for many of the reasons already mentioned. Starting with a
> > single dedicated repository named nifi-python-extensions should be a
> > good opportunity to prove out the concept. I agree that considering the
> > Java extensions is more involved, and I think we should consider that
> > separately.
> >
> > I would be glad to handle the initial work of creating the new
> > repository and setting up the initial build structure. I am familiar
> > with the work necessary to publish to the PyPI repository, and that
> > could provide an optional distribution channel.
> > We would still make the source distribution available through standard
> > Apache channels, following standard project policies. Based on the
> > current structure, it should be straightforward to download an archive
> > of the Python extensions and expand that for those who want to have
> > them as part of their NiFi installation.
> >
> > Once the initial repository is in place, the last initial step would be
> > a pull request to remove the Python extensions from the main repository.
> >
> > I can proceed along these lines, unless any substantive objections come
> > up, and the pull request process will also provide opportunity for
> > additional consideration and review.
> >
> > Regards,
> > David Handermann
> >
> > [1] https://lists.apache.org/thread/nok561sg1dzw3zrott06gkl34hdjxbb3
> >
> > On Fri, Jun 21, 2024, 9:14 PM Marton Szasz <sza...@apache.org> wrote:
> >>
> >> Hi Joe, Arpad and all,
> >>
> >> I'm strongly in favor of moving all Python components to a separate
> >> repository. It could be called apache/nifi-python-extensions or
> >> -components, and contain all Python components that this community
> >> maintains. I would prefer that over a separate repo for each extension,
> >> because it seems easier to keep track of all components maintained by
> >> the community if they are in the same repo than if they were separate.
> >>
> >> Since MiNiFi C++ implemented a large subset of the same Python API, I
> >> think it makes our lives easier if we share the code, and keep the
> >> Python components in their own dedicated location. As NiFi and MiNiFi
> >> C++ both approach their next major version, and we commit to a stable
> >> Python API, I expect it to become easier to maintain the Python
> >> components separately, targeting this stable API, and we could align the
> >> release frequency with the maintenance needs of the Python components.
> >>
> >> I'm neutral on whether to package them with convenience binaries or
> >> leave that up to the user.
> >> Hopefully we can come up with a user-friendly way to install them if
> >> they're not included. I wouldn't include them in the source tarballs.
> >>
> >> I would keep the Java components separate from Python components.
> >> Whether that's in the NiFi repo or somewhere else, both are fine with
> >> me.
> >>
> >> Regarding introducing breaking changes: on the NiFi side, unit tests
> >> should cover the API well enough, and after 2.0 GA, I expect it to
> >> remain backwards-compatible until the next major version. So I think the
> >> API will not be a moving target (things only added, not changed), and it
> >> will be easy to keep things working. But I think we should set up
> >> automated testing that runs tests with the extensions, checking their
> >> functionality with NiFi 2.0, NiFi latest, and at least one MiNiFi C++
> >> version, to catch breakages early if they rear their heads anyway.
> >>
> >> Thanks and have a great weekend,
> >> Marton
> >>
> >>
> >> On 6/21/24 23:18, Joe Witt wrote:
> >> > "I would suggest starting with
> >> > moving the Python ones to a dedicated repo, let's have a workflow
> >> > established and polished there, might follow with some Java ones in
> >> > case it works well."
> >> >
> >> > Yeah kinda where my head is too
> >> >
> >> > On Fri, Jun 21, 2024 at 2:07 PM Arpad Boda <ab...@apache.org> wrote:
> >> >
> >> >> Joe,
> >> >>
> >> >> Interesting thoughts, I see a lot of pros and cons. Let me list the
> >> >> most important ones of both:
> >> >> +CVEs in extensions don't make NiFi "vulnerable" automatically as
> >> >> they live in a different repo.
> >> >> +the responsibility of being up-to-date is being moved to the
> >> >> maintainers of the given extension, same applies for the stability
> >> >> of the tests covering that extension
> >> >>
> >> >> -easier to introduce breaking changes accidentally: a breaking
> >> >> change might go through and get committed.
> >> >> Especially in case of Java extensions, the Python API is pretty
> >> >> thin (yet!). Only an extension developer will find it, most
> >> >> probably not immediately, when things already depend on the
> >> >> breaking change and it gets very difficult to make the right call
> >> >> in this case
> >> >> -might lose some extensions as they get even less maintained than
> >> >> they are now
> >> >>
> >> >> Overall I have no strong opinion either way, I would suggest
> >> >> starting with moving the Python ones to a dedicated repo, let's
> >> >> have a workflow established and polished there, might follow with
> >> >> some Java ones in case it works well.
> >> >>
> >> >> Cheers,
> >> >> Arpad
> >> >>
> >> >> On Friday, June 21, 2024, Joe Witt <joew...@apache.org> wrote:
> >> >>
> >> >>> Team,
> >> >>>
> >> >>> For the longest time we had all these Java based extensions and it
> >> >>> was often inconvenient for them to live within the codebase.
> >> >>> Indeed it makes the builds crazy long and it delays getting new
> >> >>> components out. We had a lot of work to do for this to be
> >> >>> convenient and perhaps we still have gaps remaining.
> >> >>>
> >> >>> Now we have these Python components. I am not confident we really
> >> >>> want these in the codebase for similar but even more important
> >> >>> reasons. The Python components have similar issues when it comes
> >> >>> to Licensing and Notice recognition. They have their own rapid
> >> >>> vulnerability tracking. Our current tooling doesn't make tracking
> >> >>> that very easy.
> >> >>>
> >> >>> I'm concerned about where the Python ones are heading in terms of
> >> >>> maintainability but also generally for the builds as well with the
> >> >>> Java ones. Is it time to move to a repo for the Java extensions
> >> >>> and its own project/group name and versioning? Same for Python
> >> >>> extensions?
> >> >>>
> >> >>> This lets them evolve on their own schedule.
> >> >>> It does bring up an interesting challenge as it relates to a
> >> >>> convenience binary. The ideal state is extensions are released and
> >> >>> shipped independent of the nifi application. But we'd need to make
> >> >>> that really nice/easy for the users.
> >> >>>
> >> >>> We have a lot going on so maybe still not time to tackle this.
> >> >>> Curious to hear thoughts
> >> >>>
> >> >>> Thanks
> >> >>>