Marc I'm not opposed to it staying if there is zero license issue. I only mentioned the maintenance as a 'additional thing to consider' - that is more of a discussion item. L&N isn't discussion - it is hard rule.
Ok so in the minificpp case you're saying "there is no source or binary dependency (that we'd package)" that isn't entirely ALv2/ASF compatible and so if people want to use this they can do so by adding in the 'h2o' bits at their own effort, right? In this line of thinking I still don't love it but it is arguably not a lot different that our components which talk to services you have to have a paid or otherwise agreement with like Cloud services offering by AWS, Azure, GCP, etc.. So in the case of minificpp we are certain there is no L&N concern regarding either our source OR convenience binary? Lastly, why are the JIRAs still open if this is merged? Can you please close them and assign to the proper fix version if these are 'done'? Thanks On Mon, May 4, 2020 at 11:46 AM Marc Parisi <[email protected]> wrote: > "We should find a way for vendors to offer extension points like this > without having to take on the burden of maintenance. How can we possibly do > this well? We're learning this lesson the hard way in NiFi itself and this > is why the registry is being formulated." > > I think there is a fine line when walking vendor tie-in. Having the > registry allows for an arm's length agreement with the Apache community. I > know others will disagree but my opinion is that extension points that > don't introduce licensing concerns from libraries should be fair game but > the fact that a paid product requires a license for that code to be useful > should not be a non-starter -- if that makes sense. I believe I stated, "I > don't have a strong argument" as I don't really have a strong opinion ( and > certainly can be wrong too ) -- so I'm interested in hearing others in > regards to this. > > > > On Mon, May 4, 2020 at 11:38 AM Marc Parisi <[email protected]> wrote: > > > Hi Joe, > > > > When merging I did not use the NiFi processors to test as I already have > > tooling around H20 driven ML in my home. While I don't admittedly use it > > very often, I thought the python processors were quite cool and could > > certainly be useful for others. My rationale is that the dependencies are > > not built into source or binary for MiNiFi C++. I would agree on the Java > > side that this would be unavoidable. While it would be preferential to > have > > the NiFi analogue I do not think it would be required when considering > the > > MiNiFi C++ processors; however, James should be the arbiter of that > > decision. > > > > While I disagree that the MiNiFi portion should be removed, I don't have > a > > strong argument to keep or remove the MiNiFI C++ code so I'd be happy to > > revert if the community feels the need to. > > > > Thanks, > > Marc > > > > On Mon, May 4, 2020 at 10:50 AM Joe Witt <[email protected]> wrote: > > > >> Team > >> > >> The below two commits raise serious concerns. > >> > >> I want to be clear here and point out that h2o is cool. Having such > >> integration is a neat concept and idea and one that certainly warrants > >> determination on how best to do so. > >> > >> My issue is with these two commits as it relates to licensing and > >> maintenance. > >> > >> License: > >> We should support vendors like this wanting to bring their capabilities > >> into NiFi generally sure. But the licensing and mode of use is critical. > >> In talking about this with the contributor for NiFi as well it is clear > >> that at least some or important portions of this require the user to > have > >> a > >> 'driverless ai license' so they can include their jar or build pipelines > >> or > >> whatnot. Thus it isn't usable without that first. So it might be the > case > >> that this stuff is source dependent only on ASF compatible licenses in > >> terms of source - but it certainly doesn't seem to be true in terms of > >> binary dependencies. Where in the PR or JIRA is there any discussion or > >> review of the licensing? I see plenty from James to believe this needs > to > >> be reverted. Any contrary info? > >> https://github.com/apache/nifi/pull/4242#issuecomment-622654527 > >> > >> > >> Maintenance: > >> We should find a way for vendors to offer extension points like this > >> without having to take on the burden of maintenance. How can we possibly > >> do > >> this well? We're learning this lesson the hard way in NiFi itself and > >> this > >> is why the registry is being formulated. > >> > >> JIRA/hygiene: > >> https://issues.apache.org/jira/browse/MINIFICPP-1199 > >> and > >> https://issues.apache.org/jira/browse/MINIFICPP-1201 > >> Both are open. Yet we've merged commits that claim to be against each. > >> > >> I believe both of these commits need to be reverted as I do not believe > >> the > >> licensing considerations have been addressed. I'd like to see > discussion > >> on the above maintenance concern as well but that is less pressing. > >> > >> > >> Thanks > >> Joe > >> > >> > >> commit 7206c62240647520cf35649868d5d87903a256c2 > >> Author: James Medel <[email protected]> > >> Date: Wed Apr 29 12:38:04 2020 -0700 > >> > >> MINIFI-1201: Integrate H2O Driverless AI MSP in MiNFi (#766) > >> > >> MINIFI-1201: Integrate H2O Driverless AI MSP in MiNFi (#766) > >> > >> commit 6e5f96518764df7791519c0ee625a94a207ddc69 > >> Author: James Medel <[email protected]> > >> Date: Wed Apr 29 12:37:00 2020 -0700 > >> > >> MINIFI-1199: Integrate H2O Driverless AI PSP in MiNiFi (#763) > >> > >> * MINIFI-1199: Integrate H2O Driverless AI in MiNiFi > >> > >> MiNiFi C++ and H2O Driverless AI Integration via Custom Python > >> Processors: > >> Integrates MiNiFi C++ with H2O's Driverless AI by Using Driverless > >> AI's > >> Python Scoring Pipeline and MiNiFi's Custom Python Processors. Uses the > >> Python Processors to execute the Python Scoring Pipeline scorer to do > >> batch > >> scoring and real-time scoring for one or more predicted labels on test > >> data > >> in the incoming flow file content. I would like to contribute my > >> processors > >> to MiNiFi C++ as a new feature. > >> > >> * Update H2oPspScoreBatches processor > >> > >> This update includes passing "index=False" to > >> "batch_scores_df.to_string(index=False)". By updating this line of code, > >> we > >> tell pandas to not include the DataFrame index when converting the > >> DataFrame to a string. The reason for this update is that we don't want > >> this extra column pandas tries to add to our predictions frame, we only > >> want the predictions. Thus, later when the predictions get saved to a > csv > >> file, it will only include the predictions. > >> > >> * Moved ConvertDsToCsv to h2o base dir > >> > >> Since this ConvertDsToCsv python processor is used by the > >> Python Scoring Pipeline and MOJO Scoring Pipeline processors, > >> I moved ConvertDsToCsv to h2o base dir for easier access to it. > >> > > >
