On Tue, Dec 1, 2015 at 4:40 PM Amos B. Elberg <amos.elb...@gmail.com> wrote:
> Thank you, Cos.
>
> I’d like to briefly address the issues raised by Moon:
>
> 1. This PR does not pass CI
>
> The CI fails on core Zeppelin, *not* code in this PR.
>
> I’ve been seeking assistance on this since August.
>
> The most common reason is that SparkInterpreter is unable to launch Spark
> and open a Spark Backend. This is necessary to test the PR.
>
> 60+ hours have been spent adapting and re-basing when the SparkInterpreter
> architecture changed and broke the PR’s *tests.*
>

I'm sorry, but the other PRs are passing CI. If it's a problem in Zeppelin
core, why do you think other PRs are passing CI?

And let's say Zeppelin core has a problem and that makes your PR fail the CI
test. That's possible. But it still does not mean we can merge it with CI
failing.

If you think it's a problem in Zeppelin core, then file an issue that
reproduces the problem on Zeppelin core. That might be more efficient than
continuing to try by yourself.

> 2. Not 100% sure this PR has no license issue. (about KnitR)
>
> What license problem *specifically* do you believe may exist?
>
> In preparing the PR, I:
>
> * Reviewed the Apache policy in detail.
>
> * Contacted the FSF to ask their interpretation of the “linking”
> provisions of the GPL license.
>
> * Reviewed how other Apache software deals with this issue (e.g., Spark
> talks to R, which is GPL'd).
>
> * No necessary *dependencies* of the PR have license conflicts. In
> several cases, I contacted package authors who agreed to re-issue their
> packages under Apache-compatible licenses. (Usually I had to do a bit of
> coding in exchange…)
>
> * Where the license had to stay GPL, the packages are *not necessary* and
> *not dependencies.* If the optional packages are present, they will be
> used to enable additional functionality. Knitr is an example. The PR will
> compile and run fine without knitr. If knitr is available (it is part of
> the most common R distribution), the PR will enable the knitr interpreter.
> * This is exactly how this issue is addressed throughout the Apache
> ecosystem.
>
> The tl;dr is this: When Apache code is written to talk to libraries that
> may or may not be present on the user’s system, or where it talks to an API
> but is agnostic about implementation, that is not “linking” in a way that
> implicates the anti-linking provision of the GPL.
>
> Otherwise, no Apache code could ever call a shell script! Let alone run
> on Linux, or talk to R.
>

I'm not a legal expert, so the following could be wrong.

In my interpretation, KnitRInterpreter is not an optional feature, since it
is always enabled when compiling Zeppelin and always enabled when running
Zeppelin. And it requires a dynamically linked GPL library at runtime. (Yes,
it will fail when KnitR is not installed on the system.)

And of course, no Apache code can call a shell script for the purpose of
dynamically linking with a GPL library.

I was guessing that SparkR can stay under the Apache License even though it
depends on R, because the output generated by a GPL-licensed compiler is not
itself GPL-licensed, and R is a sort of compiler.

If you can get an answer from the Spark community on how SparkR manages to
stay under the Apache License while R is GPL, that answer might help.

> 3. Need more time to review.
>
> Has any reviewer downloaded the PR or run the demo notebook? (Which
> is there for the benefit of reviewers, and isn’t intended to go in final
> distribution.)
>
> How many +1 comments from users would you like to see?
>
> How much time do you believe is required?
>

It all depends on when CI is going to pass, when the license problem is going
to be cleared, and when a committer is willing to review and take
responsibility for committing your contribution.

> 1. Large code base change
>
> Large code base changes require coordination and cooperation. This PR
> necessarily implicates the build scripts, testing code, the
> SparkInterpreter, etc.
>
> I have been seeking to coordinate since August.
>
> I continue to stand ready to do so.
>
> -Amos
>

If I may give you one suggestion: Zeppelin committers sometimes ask
contributors to rebase the contribution branch for some reason. It is not
really the best practice, but it is still okay as long as most contributions
do not include large code base changes. However, yours has more than 4000
lines of code changes. If you rebase, the review has to start again from the
beginning. So you might want to minimize rebasing your branch.

Thanks,
moon

> From: moon soo Lee <m...@apache.org>
> Reply: dev@zeppelin.incubator.apache.org
> Date: December 1, 2015 at 1:34:19 AM
> To: dev@zeppelin.incubator.apache.org
> Subject: Re: contributions impasse. Was: [GitHub] incubator-zeppelin
> pull request: R Interpreter for Zeppelin
>
> Hi Cos,
>
> Thanks for opening a discussion.
> My answer about 'Why this PR is open for three months' is
>
> 1. This PR does not pass CI
> 2. Not 100% sure this PR has no license issue. (about KnitR)
> 3. Need more time to review.
>
> It's my personal answer; other committers may have different opinions.
>
>
> I think the question should be generalized, because this PR is not the
> only PR that is at an impasse. There are more.
>
> Here are some examples of PRs that have not been merged:
> https://github.com/apache/incubator-zeppelin/pull/53,
> https://github.com/apache/incubator-zeppelin/pull/60
> and so on.
>
> I can categorize the cases, based on my experience of being involved in
> the Zeppelin community from the beginning.
>
> 1. Large code base change
>
> When a contribution has large code base changes, it tends to take more
> time to review and merge. Normally, most contributions are merged in
> 1 day to 1 week. But some contributions with large code base changes take
> 2~4 weeks. A few contributions with very large code base changes take
> months.
>
> 2. Communication lost
>
> Sometimes a committer is not responding, or a contributor is not
> responding.
>
> 3. Opinion diverges
>
> For some changes included in a contribution, people have different
> opinions. When those opinions continue to diverge, the PRs do not get
> merged.
>
>
> I think having a guide, such as pinging another committer when a committer
> is not responding, and dividing a contribution into small pieces if
> possible, would help in most of the cases.
> Of course, committers have to pay more attention to contributions and to
> helping people. That comes first.
>
> Thanks,
> moon
>
> On Tue, Dec 1, 2015 at 1:54 PM Konstantin Boudnik <c...@apache.org> wrote:
>
> > To make sure we're on the same page, here are two threads that I found
> > related to this topic.
> >
> > Thread 1:
> > Subject: R?
> > Started on: July 1, 2015
> >
> > Thread 2:
> > Subject: [GitHub] incubator-zeppelin pull request: R Interpreter for
> > Zeppelin
> > Started on: August 13, 2015
> >
> > If you want to fetch these from our archive send emails to
> > dev-thread.1...@zeppelin.incubator.apache.org
> > dev-thread.3...@zeppelin.incubator.apache.org
> >
> > Cos
> >
> > On Mon, Nov 30, 2015 at 06:27PM, Konstantin Boudnik wrote:
> > > Guys,
> > >
> > > While catching up on my emails from the last couple of weeks, this
> > > thread caught my attention. I am not normally paying much attention to
> > > the code reviews traffic from GH, but this one is pretty different as
> > > it spans three months and counting.
> > >
> > > First, here are my five cents:
> > > - r/R/rzeppelin/LICENSE is wrong: if the code is aimed to be
> > >   contributed to an ASF project this file should simply contain ASL2
> > >   text, like in [1]
> > > - r/pom.xml perhaps shouldn't contain a separate <developers> section,
> > >   but Zeppelin might have different guidelines on it.
> > >   Say, Bigtop doesn't maintain this in the source code, but we have
> > >   the list of all the committers on the project's site [2]. Every new
> > >   committer's first commit is to update the team page ;)
> > > - comments like in
> > >   r/src/main/java/org/apache/zeppelin/rinterpreter/KnitR.java
> > >
> > > +/**
> > > + * Created by aelberg on 7/28/15.
> > > + */
> > >
> > >   are better removed. This has already been discussed here [3]. And
> > >   the initial discussion on the topic [4] was linked as well.
> > > - same goes for r/R/rzeppelin/DESCRIPTION. I am not sure if this is
> > >   R-specific stuff - I have no idea about R, honestly, but
> > >
> > > +License: GPL (>= 2) | BSD_3_clause + file LICENSE
> > > ...
> > > +Author: David B. Dahl
> > >
> > >   shouldn't be here, IMO. Normally, if some additional licenses are
> > >   used, they have to be listed in the top-level NOTICE file (already
> > >   there).
> > >
> > > - I am not going to offer any comments on the technical merits of the
> > >   patch, as I haven't tried it personally. However, I don't see any
> > >   serious technical objections to the functionality in question.
> > >
> > > So, the question is - why has the PR been open for three months? I
> > > haven't been able to get a clear answer. What I found out though is
> > > pretty unsettling, really. The communication on the topic is almost
> > > non-existent, except for this sparse and bitter thread in the GH.
> > >
> > > I would love to hear from the committers what's stopping the
> > > acceptance of the code, besides the minor issues I've mentioned
> > > earlier. What are the reasons for it?
> > > Is there anything wrong with it?
> > > One of the responsibilities of the committers is to make sure the
> > > contributions are reviewed; new contributors are guided and do
> > > understand how the project ticks. The easy feedback flow attracts new
> > > people, allowing the community to grow and thrive.
> > >
> > > I am asking _explicitly_ not to re-start the bickering I have already
> > > seen. At this point I am interested in the purely technical side of
> > > this.
> > >
> > > [1] https://github.com/apache/bigtop/blob/master/LICENSE
> > > [2] http://bigtop.apache.org/team-list.html
> > > [3] http://apache-nifi-developer-list.39713.n7.nabble.com/author-tags-td1335.html
> > > [4] http://s.apache.org/iZl
> > >
> > > With regards,
> > > Cos
> > >
> > > On Mon, Nov 16, 2015 at 11:06PM, elbamos wrote:
> > > > Github user elbamos commented on the pull request:
> > > >
> > > > https://github.com/apache/incubator-zeppelin/pull/208#issuecomment-157203411
> > > >
> > > > The current push should resolve some issues with changes in the
> > > > Spark-Zeppelin interface that had created issues for users, as well
> > > > as support for 1.5.1.
> > > >
> > > > Knitr support is improved, and the reason for a separate knitr
> > > > interpreter may be clearer now.
> > > >
> > > > The amount of code borrowed from rscala is reduced.
> > > >
> > > > I did not address issues with @author tags, or files under the R/
> > > > folder. The reason is, to be blunt, I don't understand what the
> > > > precise concerns actually are.
> > > >
> > > > Please note that I changed .travis.yml to only use spark 1.4 and
> > > > later. I'm sure there is a better way to do it, but I'm just not in
> > > > a position to try to figure out the entire Zeppelin build process.
> > > >
> > > > Integrating this with the rest of zeppelin is going to take some
> > > > work regarding pom's, travis, and so forth. I can do a lot of that,
> > > > but I'm going to need to discuss it with the people who have been
> > > > "owning" those files. There are just too many moving pieces here.
> > > >
> > > > Because of the size of this (which is, unfortunately, necessary),
> > > > posting here is probably not the most efficient way.
> > > > That is also true because certain people chose to use this PR as a
> > > > forum to air other issues. Therefore, it would be better if
> > > > reviewers e-mail me directly.
> > > >