hi Jacob — sorry to hear that. It's a bummer that you all did not attempt to work in the community for any meaningful amount of time and are choosing not to try based on the perception that it will create unacceptable overhead for you. I believe the benefits would outweigh the costs, but I suppose we will have to agree to disagree.
Can you prepare a pull request to do the requisite repository surgery? I hope the development goes well in the future and look forward to seeing folks from the Julia ecosystem engaged here on growing the Arrow ecosystem.

Thanks,
Wes

On Fri, Apr 2, 2021 at 3:03 PM Jacob Quinn <quinn.jac...@gmail.com> wrote:
>
> Ok, I've had a chance to discuss with a few other Julia developers and review various options. I think it's best to drop the Julia code from the physical apache/arrow repo. The extra overhead on development, release process, and user issue reporting and PR contributing are too much in addition to the technical challenges that we never resolved involving including the past Arrow.jl release version git trees in the apache/arrow repo.
>
> We're still very much committed to working on the Julia implementation and participating in the broader arrow community. I've enjoyed following the user/dev mailing lists and will continue to do so. We monitor format proposals and try to implement new functionality as quickly as possible. We got the initial arrow flight proto code generated just last night in fact. I'd still like to explore official integration with the archery test suite to solidify the Julia implementation with integration tests; I think that would be very valuable for long-term confidence in the cross-language support of the Julia implementation.
>
> We realize one of the main implications will probably be dropping Julia from the list of "official implementations". We're encouraged by the many users who have already started using the Julia implementation and will strive to maintain a high rate of issue responsiveness and feature development to maintain project confidence.
> If there's a possibility of being included somewhere as an "unofficial" or "semi-official" implementation, we'd love to still be bundled with the broader arrow project somehow, like, for example, showing how Julia integrates with the archery test suite, once the work there is done.
>
> Best,
>
> -Jacob
>
> On Tue, Mar 30, 2021 at 4:10 PM Wes McKinney <wesmck...@gmail.com> wrote:
>
> > Also, on the issue that there are no Julia-focused PMC members, note that I helped the JavaScript folks make their own independent releases for quite a while: I called the votes (e.g. [1]) and helped get people to verify and vote on the releases. After a time, it was decided to stop releasing independently because there wasn't enough development activity to justify it.
> >
> > [1]: https://www.mail-archive.com/dev@arrow.apache.org/msg05971.html
> >
> > On Tue, Mar 30, 2021 at 4:54 PM Wes McKinney <wesmck...@gmail.com> wrote:
> > >
> > > hi Jacob,
> > >
> > > On Tue, Mar 30, 2021 at 4:18 PM Jacob Quinn <quinn.jac...@gmail.com> wrote:
> > > >
> > > > I can comment as the primary apache arrow liaison for the Arrow.jl repository and original code donor.
> > > >
> > > > I apologize for the "surprise", but I commented a few times in various places and put a snippet in the README <https://github.com/apache/arrow/tree/master/julia/Arrow#difference-between-this-code-and-the-juliadataarrowjl-repository> about the approach I wanted to take w/ the Julia implementation in terms of keeping the JuliaData/Arrow.jl repository as a "dev branch" of sorts of the apache/arrow code, upstreaming changes periodically. There's even a script <https://github.com/JuliaData/Arrow.jl/blob/main/scripts/update_apache_arrow_code.jl> I wrote to mostly automate this upstreaming.
> > > > I realize now that I didn't consider the "Arrow PMC" position on this kind of setup or seek to affirm that it would be ok to approach things like this.
> > > >
> > > > The reality is that it is deeply ingrained in Julia users to expect Julia packages to live in a single stand-alone github repo, where issues can be opened, and pull requests are welcome. It was hard and still is hard to imagine "turning that off", since I believe we would lose a lot of valuable bug reports and first-time contributions. This isn't necessarily any fault of how the bug report/contribution process is handled for the arrow project overall, though I'm also aware that there's a desire to make it easier <https://lists.apache.org/x/thread.html/r8817dfba08ef8daa210956db69d513fd27b7a751d28fb8f27e39cc7e@%3Cdev.arrow.apache.org%3E> and it currently requires more and different effort than Julia users are used to. I think it's more from how open, welcoming, and how strong the culture is in Julia around encouraging community contributions and the tight integration with github and its open-source project management tools.
> > > >
> > >
> > > Well, we are on track to having 1000 different people contribute to the project and have over 12,000 issues, so I don't think there is evidence that we are failing to attract new contributors or that feature requests / bugs aren't being reported. The way that we work is _different_, so adapting to the Apache process will require change.
> > >
> > > > Additionally, I was and still am concerned about the overall release process of the apache/arrow project.
> > > > I know there have been efforts there as well to make it easier for individual languages to release on their own cadence, but just anecdotally, the JuliaData/Arrow.jl has had/needed/wanted 10 patch and minor releases since the original code donation, whereas the apache/arrow project has had one (3.0.0). This leads to some of the concerns I have with restricting development to just the apache/arrow repository: how exactly does the release process work for individual languages who may desire independent releases apart from the quarterly overall project releases? I think from the Rust thread I remember that you just need a group of language contributors to all agree, but what if I'm the only "active" Julia contributor? It's also unclear what the expectations are for actual development: with the original code donation PRs, I know Neal "reviewed" the PRs, but perhaps missed the details around how I proposed development continue going forward. Is it required to have a certain number of reviews before merging? On the Julia side, I can try to encourage/push for those who have contributed to the JuliaData/Arrow.jl repository to help review PRs to apache/arrow, but I also can't guarantee we would always have someone to review. It just feels pretty awkward if I keep needing to ping non-Julia people to "review" a PR to merge it. Perhaps this is just a problem of the overall Julia implementation "smallness" in terms of contributors, but I'm not sure on the best answer here.
> > >
> > > Several things here:
> > >
> > > * If you want to do separate Julia releases, you are free to do that, but you have to follow the process (voting on the mailing list, publishing GPG-signed source artifacts)
> > > * If you had been working "in the community" since November, you would probably already be a committer, so there is a bootstrapping here that has failed to take place. In the meantime, we are more than happy to help you "earn your wings" (as a committer) as quickly as possible. But from my perspective, I see a code donation and two other commits, which isn't enough to make a case for committership.
> > >
> > > > So in short, I'm not sure on the best path forward. I think strictly restricting development to the apache/arrow physical repository would actively hurt the progress of the Julia implementation, whereas it *has* been progressing with increasing momentum since first released. There are posts on the Julia discourse forum, in the Julia slack and zulip communities, and quite a few issues/PRs being opened at the JuliaData/Arrow.jl repository. There have been several calls for arrow flight support, with a member from Julia Computing actually close to releasing a gRPC client <https://github.com/JuliaComputing/gRPCClient.jl> specifically to help with flight support. But in terms of actual committers, it's been primarily just myself, with a few minor contributions by others.
> > > >
> > > > I guess the big question that comes to mind is what are the hard requirements to be considered an "official implementation"? Does the code *have* to live in the same physical repo? Or if it passed the series of archery integration tests, would that be enough?
> > > > I apologize for my naivete/inexperience on all things "apache", but I imagine that's a big part of it: having official development/releases through the apache/arrow community, though again I'm not exactly sure on the formal processes here? I would like to keep Julia as an official implementation, but I'm also mostly carrying the maintainership alone at the moment and want to be realistic with the future of the project.
> > > >
> > >
> > > The critical matter is whether the development/maintenance work is conducted by the "Arrow community" in accordance with the Apache Way, which is to say individuals collaborating with each other on Apache channels (for communication and development) and avoiding the bad patterns you see sometimes in other communities (e.g. inconsistent openness).
> > >
> > > It's fine (really, no pressure) if you want to be independent and do things your own way, you just have to be clear that you are independent and not operating as part of the Apache Arrow community. You can't have it both ways, though. No hard feelings whatever you decide, but the current "dump code over the wall occasionally" approach, with work happening on independent channels, is not compatible with that. Building healthy open source communities is hard, but this way has been shown to work well, which is why I've spent the last 6 years working hard to bring people together to build this project and ecosystem!
> > >
> > > If you want to maintain a test harness here to verify an independent Julia implementation, that's fine, too. I'm disappointed that things failed to bootstrap after the code donation, so I want to see if we can course correct quickly or if not decide to go our separate ways.
> > >
> > > Thanks,
> > > Wes
> > >
> > > > I'm open to discussion and ideas on the best way forward.
> > > >
> > > > -Jacob
> > > >
> > > > On Tue, Mar 30, 2021 at 2:03 PM Wes McKinney <wesmck...@gmail.com> wrote:
> > > >
> > > > > hi folks,
> > > > >
> > > > > I was very surprised today to learn that the Julia Arrow implementation has continued operating more or less like an independent open source project since the code donation last November:
> > > > >
> > > > > https://github.com/JuliaData/Arrow.jl/commits/main
> > > > >
> > > > > There may have been a misunderstanding about what was expected to occur after the code donation, but it's problematic for a bunch of reasons (IP lineage / governance / community development) to have work happening on the implementation "outside the community".
> > > > >
> > > > > In any case, what is done is done, so the Arrow PMC's position on this would be roughly to regard the work as a hard fork of what's in Apache Arrow, which given its development activity is more or less inactive [1]. (I had actually thought the project was simply inactive after the code donation)
> > > > >
> > > > > The critical question now is, is there interest from Julia developers in working "in the community", which is to say:
> > > > >
> > > > > * Having development discussions on ASF channels (mailing list, GitHub, JIRA), planning and communicating in the open
> > > > > * Doing all development in ASF GitHub repositories
> > > > >
> > > > > The answer to the question may be "no" (which is okay), but if that's the case, I don't think we should be giving the impression that we have an official Julia implementation that is developed and maintained by the community (and so my argument would be unfortunately to drop the donated code from the project).
> > > > >
> > > > > If the answer is "yes", there needs to be a hard commitment to move development to Apache channels and not look back. We would also need to figure out what to do to document and synchronize the new IP that's been created since the code donation.
> > > > >
> > > > > Thanks,
> > > > > Wes
> > > > >
> > > > > [1]: https://github.com/apache/arrow/commits/master/julia/Arrow