Roman - thank you for taking a look at the PR! The PR should compile out-of-box if you have R and the evaluate package installed.
I have been working on a “release” version rebased to match 5.5-release. It has documentation, including what features are added with optional (incompatible-license) R packages. And how to build a seamless Spark pipeline that mixes scala, python, and R without breaking lazy evaluation. I can send you a zip of that tomorrow, and will contact you directly to provide it. From: Roman Shaposhnik <[email protected]> Reply: [email protected] <[email protected]> Date: December 2, 2015 at 12:30:52 AM To: [email protected] <[email protected]> CC: moon soo Lee <[email protected]> Subject: Re: contributions impasse. Was: [GitHub] incubator-zeppelin pull request: R Interpreter for Zeppelin Hi! reading through the thread made me realize that what Amos is saying makes perfect sense to me. That said, I don't like to take Moon's concerns lightly either. Hence, would it be possible to get the *bundle* of a binary package built with the patch applied and ready to go? I'd like to provide an extra pair of eyes to help settle this. Thanks, Roman. P.S. I do a lot of this type of work around IP at my $DAYJOB On Tue, Dec 1, 2015 at 5:25 PM, Amos B. Elberg <[email protected]> wrote: > Moon - If there were seriously a licensing issue, then you or someone else > would be able to say clearly and specifically what it is. > > There plainly is not. This idea you have about a “compiler exception” has > nothing to do with any of this. The link you sent doesn’t talk about any > “compiler exception.” (I’m not sure what a 2008 e-mail from a developer who > wanted to release a branded version of R has to do with Apache’s policy > concerning linking to GPL’d code anyway…) > > My e-mail was accidentally sent incomplete. This is the rest: > > You say this: > > My point is not 'KnitR' is optional or not. Point is 'KnitRInterpreter' you > wrote is not an optional feature. Which is clearly not optionally enabled > code and feature. And that depends on KnitR library which is GPL. > > That’s false. > > *** > I addressed the licensing issue only because waving the “license” flag could > scare people off. > > If someone really believed there was a licensing issue, they would be able to > say clearly what that issue is. > > I am not going to respond to the rest of Moon’s e-mail. > > > From: moon soo Lee <[email protected]> > Reply: moon soo Lee <[email protected]> > Date: December 1, 2015 at 8:16:36 PM > To: Amos B. Elberg <[email protected]>, [email protected] > <[email protected]> > Subject: Re: contributions impasse. Was: [GitHub] incubator-zeppelin pull > request: R Interpreter for Zeppelin > > Is knitR is commonly considered as a interpreter/compiler? or is it > considered as a library routine? > > Thanks, > moon > > On Wed, Dec 2, 2015 at 10:12 AM Amos B. Elberg <[email protected]> wrote: > > Moon - you give this as an explanation of the licensing issue: > https://stat.ethz.ch/pipermail/r-help/2008-July/169332.html > > According to that, there is an exception in the GPL for interpreter > languages. As long as you don’t distribute the code, its fine to talk to an > interpreted language. > > Well, if that’s the case, then the PR plainly does not have a license issue. > It doesn’t distribute any GPL’d R code. > > I’m not sure what’s confusing about this. It seems completely > straightforward. > > Regarding this: > > > -- > Amos Elberg > Sent with Airmail > > From: moon soo Lee <[email protected]> > Reply: [email protected] <[email protected]> > Date: December 1, 2015 at 6:48:47 PM > > To: [email protected] <[email protected]> > Subject: Re: contributions impasse. Was: [GitHub] incubator-zeppelin pull > request: R Interpreter for Zeppelin > > On Wed, Dec 2, 2015 at 1:09 AM Amos B. Elberg <[email protected]> wrote: > >> I am going to try to minimize my reaction to Moon’s e-mail. >> >> The tl;dr is this: >> >> The reason we are having this discussion now is that active users of the >> PR — which now has its own user base — went public to complain about this. > > >> The PR has been tested by an active user base for more than three months. >> No-one has been able to identify any specific actual licensing problem, and >> the PR was prepared based on an extensive, careful review of the relevant >> licensing issues and after contacting the relevant people. >> >> > > I admire every software that used by user and helping people. That includes > your work. But that's not the topic we're in discussion. Active user does > not mean your contribution can ignore the review. > > > >> It is not an explanation for someone who has been ignoring my “how can I >> move this forward…” emails for three months to point the finger and say I >> didn’t contact the right person or file the right report. >> >> > This is also not the topic in this discussion. > > >> The burden for providing an explanation for the inaction is on the PMCC at >> this point. > > I'm sorry, but the other PRs are passing CI. If it's problem on Zeppelin >> core, why do you think other PRs are passing CI? >> They’re not! I often see comments on PRs to just ignore that CI is >> failing. >> >> One of the most common reasons this PR fails CI, is CI times-out >> downloading Spark to install. How could that possibly be caused by the PR? >> >> It looks to me like the only PRs with changes to the relevant parts of the >> code — the SparkInterpreter — are being made by the person who wrote the >> testing suite. >> >> So, that would explain why some other PRs pass CI: Neither the >> SparkInterpreter nor the testing suite are stable or robust, but since the >> PRs are coming from the person who wrote both… >> >> And let's say Zeppelin core has problem and that makes your PR fails on CI >> test. That's possible. But it still does not mean we can merge it with CI >> failing. >> >> It means you should be working with me to figure out why the CI is failing. >> >> This PR has been tested by an active user base for the past three months. >> If CI is continuing to fail, and dozens of hours of effort have not >> resolved the CI issues, then it is time to start considering whether the >> testing suite is part of the problem. >> >> The level of defensiveness about the CI and SparkInterpreter is not >> helping to resolve these issues. >> >> If you think it's problem on Zeppelin core, then file an issue that >> reproduce the problem on Zeppelin core, that might be more efficient than >> keep trying yourself. >> I contacted you numerous times about such issues... >> > > > I remember i commented your issue about CI. but you just keep repeated it's > not your problem but Zeppelin core problem. > > Then please file an issue about the problem you found in Zeppelin Core. > Then everyone will get into the problem. > > > >> >> In my interpretation, KnitRInterpreter is not an optional feature while it >> is always enabled when compiling Zeppelin and always enabled when running >> Zeppelin. And it requires dynamically linked GPL library on runtime. (yes >> it will fail when no KnitR is installed on the system) >> >> Its not always enabled. >> It is not dynamically linked at runtime. >> It will not fail when knitr is missing. If knitr is not present, the repl >> interpreter starts and a note is written to to the log that the knitr >> interpreter isn’t available because knitr is not present. >> >> no Apache code can ever call a shell script, on the purpose of dynamic >> linking with GPL library. >> You misunderstand. >> >> The *shell* is GPL'd. >> >> Is Zeppelin “linked" against the GPL’d shell because Zeppelin depends on a >> shell script to launch? >> > Obviously not. >> >> The interaction with R in the PR is the same. >> >> > > Again, bash is one of exceptions of GPL, like other GPL licensed > compiler/interpreter. > > Check here why Bash and R is okay with Apache License. > https://stat.ethz.ch/pipermail/r-help/2008-July/169332.html > > I'm not sure we can apply the same exception for 'using' KnitR. > > My point is not 'KnitR' is optional or not. Point is 'KnitRInterpreter' you > wrote is not an optional feature. Which is clearly not optionally enabled > code and feature. And that depends on KnitR library which is GPL. > > > >> I was guessing SparkR can be still in Apache License even if it is depends >> on R. Because of GPL licensed compiler generated output is not GPL license. >> and R is sort of compiler. If you can get answer from Spark community how >> SparkR get managed to stay in Apache License while R is GPL, the answer >> might help. >> The description of SparkR is not accurate in any respect. (Do you think >> SparkR is not talking to GPL-licensed libraries?) >> >> I don’t see that any genuine issue is being raised here. >> >> If there is an issue, the burden is on you to identify it. >> >> If i give you one suggestion, Zeppelin committers sometimes ask rebase the >> contribution branch for some reason. It is not the really the best >> practice, but still okay while most contributions are not including large >> code base changes >> However, your one, has more than 4000 lines of code change. If you rebase >> then review should be started from the beginning, again. So you might want >> to minimize the rebase your branch. >> >> Are you actually complaining that the problem is that I rebased the code >> during the three-month period when no-one looked at it and Zeppelin went >> through a release? >> >> I cannot take it seriously when you say things like this. >> >> Having to “start from the beginning” cannot be a problem if you never >> actually started the first time... >> >> > > You wanted coordination and cooperation. So i gave you suggestion that > helping review process. For example, your code has been rebased since my > comment and jongyoul's comment. that means committers will need to look > from the beginning. That'll require more time. And maybe, i guess that's > not what you want. But If you don't care, feel free to rebase. > > Thanks, > moon > > > >> >> From: moon soo Lee <[email protected]> >> Reply: moon soo Lee <[email protected]> >> Date: December 1, 2015 at 4:42:06 AM >> To: Amos B. Elberg <[email protected]>, >> [email protected] <[email protected]> >> Subject: Re: contributions impasse. Was: [GitHub] incubator-zeppelin pull >> request: R Interpreter for Zeppelin >> >> On Tue, Dec 1, 2015 at 4:40 PM Amos B. Elberg <[email protected]> >> wrote: >> Thank you, Cos. >> >> I’d like to briefly address the issues raised by Moon: >> >> 1. This PR does not passes CI >> The CI fails on core Zeppelin, *not* code in this PR. >> >> I’ve been seeking assistance on this since August. >> >> The most common reason is that SparkInterpreter is unable to launch Spark >> and open a Spark Backend. This is necessary to test the PR. >> >> 60+ hours, has been spent adapting and re-basing when the SparkInterpreter >> architecture changed and broke the PR’s *tests.* >> >> >> I'm sorry, but the other PRs are passing CI. If it's problem on Zeppelin >> core, why do you think other PRs are passing CI? >> >> And let's say Zeppelin core has problem and that makes your PR fails on CI >> test. That's possible. But it still does not mean we can merge it with CI >> failing. >> >> If you think it's problem on Zeppelin core, then file an issue that >> reproduce the problem on Zeppelin core, that might be more efficient than >> keep trying yourself. >> >> >> 2. Not 100% sure this PR has no license issue. (about KniteR) >> What license problem *specifically* do you believe may exist? >> >> In preparing the PR, I: >> >> * Reviewed the Apache policy in detail. >> >> * Contacted the FSF to ask their interpretation of the “linking” >> provisions of the GPL license. >> >> * Reviewed how other Apache software deals with this issue (e.g., Spark >> talks to R, which is GPL'd). >> >> * No necessary *dependencies* of the PR have license conflicts. In >> several cases, I contacted package authors who agreed to re-issue their >> packages under Apache-compatible licenses. (Usually I had to do a bit of >> coding in exchange…) >> >> * Where the license had to stay GPL, the packages are *not necessary* and >> *not dependencies.* If the optional packages are present, they will be >> used to enable additional functionality. Knitr is an example. The PR will >> compile and run fine without knitr. If knitr is available (it is part of >> the most common R distribution), the PR will enable the knitr interpreter. >> >> * This is exactly how this issue is addressed through the Apache >> ecosystem. >> The tl;dr is this: When Apache code is written to talk to libraries that >> may or may not be present on the user’s system, or where it talks to an API >> but is agnostic about implementation, that is not “linking” in a way that >> implicate the anti-linking provision of the GPL. >> >> Otherwise, no Apache code could ever call a shell script! Let alone run >> on Linux, or talk to R. >> >> >> I'm not a legal expert. So following could be wrong. >> >> In my interpretation, KnitRInterpreter is not an optional feature while it >> is always enabled when compiling Zeppelin and always enabled when running >> Zeppelin. And it requires dynamically linked GPL library on runtime. (yes >> it will fail when no KnitR is installed on the system) >> >> And of course, no Apache code can ever call a shell script, on the purpose >> of dynamic linking with GPL library. >> >> I was guessing SparkR can be still in Apache License even if it is depends >> on R. Because of GPL licensed compiler generated output is not GPL license. >> and R is sort of compiler. >> >> If you can get answer from Spark community how SparkR get managed to stay >> in Apache License while R is GPL, the answer might help. >> >> >> 3. Need more time to review. >> Has any reviewer has downloaded the PR or run the demo notebook? (Which >> is there for the benefit of reviewers, and isn’t intended to go in final >> distribution.) >> >> How many +1 comments from users would you like to see? >> >> How much time do you believe is required? >> >> >> It all depends on when CI is going to pass, when license problem is going >> to be cleared, and when a committer willing to review and responsible to >> commit your contribution. >> >> >> 1. Large code base change >> Large code base changes require coordination and cooperation. This PR >> necessarily implicates the build scripts, testing code, the >> SparkInterpreter, etc. >> >> I have been seeking to coordinate since August. >> >> I continue to stand ready to do so. >> >> -Amos >> >> >> If i give you one suggestion, Zeppelin committers sometimes ask rebase the >> contribution branch for some reason. It is not the really the best >> practice, but still okay while most contributions are not including large >> code base changes. >> >> However, your one, has more than 4000 lines of code change. If you rebase >> then review should be started from the beginning, again. So you might want >> to minimize the rebase your branch. >> >> Thanks, >> moon >> >> >> From: moon soo Lee <[email protected]> >> Reply: [email protected] < >> [email protected]> >> Date: December 1, 2015 at 1:34:19 AM >> To: [email protected] <[email protected]> >> Subject: Re: contributions impasse. Was: [GitHub] incubator-zeppelin pull >> request: R Interpreter for Zeppelin >> >> Hi Cos, >> >> Thanks for opening a discussion. >> My answer about 'Why this PR is open for three months' is >> >> 1. This PR does not passes CI >> 2. Not 100% sure this PR has no license issue. (about KniteR) >> 3. Need more time to review. >> >> It's my personal answer, other committers may have different opinion. >> >> >> I think the question should be generalized. Because this PR is not the only >> PR that is in impasse. There're more. For example >> >> Here's some examples that PR is not been merged. >> https://github.com/apache/incubator-zeppelin/pull/53, >> https://github.com/apache/incubator-zeppelin/pull/60 >> and so on. >> >> I can categorize the cases, based on experience of involving Zeppelin >> community from the beginning. >> >> 1. Large code base change >> >> When contribution has large code base changes, it tend to take more time to >> review and merged. Normally, most contributions merged in 1day~1 week. >> But some contribution has large code base changes take 2~4 weeks. Few >> contribution that has very large code base change take months. >> >> 2. Communication lost >> >> Sometimes, committer is not responding, or contributor is not responding. >> >> 3. Opinion diverges >> >> For some changes, included in contribution. When people have different >> opinion and it continue to diverges, those PRs are not been merged. >> >> >> I think having a guide such as ping other committer when a committer is not >> responding, and divide contribution into small peaces if possible, would >> help most of the cases. >> Of course committer have to pay attention more to the contribution and >> helping people. That's the first one. >> >> Thanks, >> moon >> >> On Tue, Dec 1, 2015 at 1:54 PM Konstantin Boudnik <[email protected]> wrote: >> >> > To make sure we're on the same page, here are two threads that I found >> > related >> > to this topic. >> > >> > Thread 1: >> > Subject: R? >> > Started on: July 1, 2015 >> > >> > Thread 2: >> > Subject: [GitHub] incubator-zeppelin pull request: R Interpreter for >> > Zeppelin >> > Started on: August 13, 2015 >> > >> > If you want to fetch these from our archive send emails to >> > [email protected] >> > [email protected] >> > >> > Cos >> > >> > On Mon, Nov 30, 2015 at 06:27PM, Konstantin Boudnik wrote: >> > > Guys, >> > > >> > > While catching up on my emails from the last a couple of weeks, this >> > thread >> > > caught my attention. I am not normally paying much attention to the >> code >> > > reviews traffic from GH, but this one is pretty different as it spans >> > three >> > > months and counting. >> > > >> > > First, here are my five cents: >> > > - r/R/rzeppelin/LICENSE is wrong: if the code is aimed to be >> > contributed to >> > > an ASF project this file should simply contain ASL2 text, like in [1] >> > > - r/pom.xml perhaps shouldn't contain a separate <developers> section, >> > but >> > > Zeppelin might have different guidelines on it. Say, Bigtop doesn't >> > > maintain this in the source code, but we have the list of all the >> > > committers on the project's site [2] Every new committer's first >> > commit is >> > > to update the team page ;) >> > > - comments like in >> > r/src/main/java/org/apache/zeppelin/rinterpreter/KnitR.java >> > > >> > > +/** >> > > + * Created by aelberg on 7/28/15. >> > > + */ >> > > >> > > is better to be removed. It has been already discussed here [3]. And >> > the >> > > initial discussion on the topic [4] was linked as well >> > > - same goes to r/R/rzeppelin/DESCRIPTION. I am not sure if this is >> > R-specific >> > > stuff - I have no idea about R, honestly, but >> > > >> > > +License: GPL (>= 2) | BSD_3_clause + file LICENSE >> > > ... >> > > +Author: David B. Dahl >> > > >> > > shouldn't be here, IMO. Normally, if some additional licenses are >> > used, >> > > they have to be listed in the top-level NOTICE file (already there). >> > > >> > > - I am not going to offer any comments on the technical merits of the >> > patch, >> > > as I haven't tried it personally. However, I don't see any serious >> > > technical objections to the functionality in question. >> > > >> > > So, the question is - why the PR is open for three months? I hasn't >> been >> > able >> > > to get a clear answer. What I found out though is pretty unsettling, >> > really. >> > > The communication on the topic is almost non-existent, except for this >> > sparse >> > > and bitter thread in the GH. >> > > >> > > I would love to hear from the committers what's stopping the acceptance >> > of the >> > > code, besides of the minor issues I've mentioned earlier? What are the >> > reasons for it? >> > > Is there anything wrong with it? >> > > One of the responsibilities of the committers is to make sure the >> > > contributions are reviewed; new contributors are guided and do >> > understand how >> > > the project ticks. The easy feedback flow attracts new people, allowing >> > the >> > > community to grow and thrive. >> > > >> > > I am asking _explicitely_ not to re-start the bickering I have already >> > > seen. At this point I am interested in the purely technical side of >> this. >> > > >> > > [1] https://github.com/apache/bigtop/blob/master/LICENSE >> > > [2] http://bigtop.apache.org/team-list.html >> > > [3] >> > >> http://apache-nifi-developer-list.39713.n7.nabble.com/author-tags-td1335.html >> >> > > [4] http://s.apache.org/iZl >> > > >> > > With regards, >> > > Cos >> > > >> > > On Mon, Nov 16, 2015 at 11:06PM, elbamos wrote: >> > > > Github user elbamos commented on the pull request: >> > > > >> > > > >> > >> https://github.com/apache/incubator-zeppelin/pull/208#issuecomment-157203411 >> >> > > > >> > > > The current push should resolve some issues with changes in the >> > > > Spark-Zeppelin interface that had created issues for users, as >> > well as >> > > > support for 1.5.1. >> > > > >> > > > Knitr support is improved, and the reason for a separate knitr >> > interpreter may be clearer now. >> > > > >> > > > The amount of code borrowed from rscala is reduced. >> > > > >> > > > I did not address issues with @author tags, or files under the R/ >> > > > folder. The reason is, to be blunt, I don't understand what the >> > precise >> > > > concerns actually are. >> > > > >> > > > Please note that I changed .travis.yml to only use spark 1.4 and >> > later. >> > > > I'm sure there is a better way to do it, but I'm just not in a >> > position >> > > > to try to figure out the entire Zeppelin build process. >> > > > >> > > > Integrating this with the rest of zeppelin is going to take some >> > work >> > > > regarding pom's, travis, and so forth. I can do a lot of that, >> > but I'm >> > > > going to need to discuss it with the people who have been "owning" >> > those >> > > > files. There are just too many moving pieces here. >> > > > >> > > > Because of the size of this (which is, unfortunately, necessary), >> > > > posting here is probably not the most efficient way. That is also >> > true >> > > > because certain people chose to use this PR as a forum to air other >> > > > issues. Therefore, it would be better if reviewers e-mail me >> > directly. >> > >> > >> > >>
