I agree with the others and don't see much upside to splitting right now. One small additional note: "The R bindings can also work with old C++ versions" is technically true for some version pairs, but enforcing it has turned out to be awkward. The consensus so far [1] is that we want to pull back on this and return to "the versions must match", leaving folks on their own if they need to mix versions.

-Jon

[1] https://github.com/apache/arrow/issues/43623

On Mon, Mar 3, 2025 at 8:44 AM Antoine Pitrou <anto...@python.org> wrote:

> I agree with Neal that the decoupling is less obviously desirable on the
> R side. About the number of R-related CI jobs, is there still a need for
> testing so many different configurations?
>
>
> On 03/03/2025 at 15:32, Neal Richardson wrote:
> > Thanks for raising this, Kou. I'm personally torn on this because I see
> > some of the upsides of splitting R out, particularly at the project's
> > state of maturity, but it's also not as simple as Rust or Java or others
> > we've split out in the past because of the hard dependency on the C++
> > libraries. It's not just about integration testing the IPC format.
> >
> > For better or worse, I can remember tons of instances where the R-related
> > CI jobs have caught something in a C++ PR because they test with
> > different compilers and toolchains. If we split the projects and all of
> > the R CI jobs are only running on the arrow-r repo, does that mean that
> > the R maintainers will be continually finding CI failures in the tests
> > that build with the latest version of the C++ library and filing bug
> > reports back for the C++ project? Either way, it seems that the monorepo
> > would want to keep some R testing jobs in crossbow to be able to validate
> > changes, at least to be able to confirm that the PR for the issue that
> > the R maintainers filed fixes the issue. Maybe this is the way it should
> > be, but it's not clear that it reduces the collective maintenance burden.
> >
> > Just my thoughts based on the historical perspective, I'm happy to defer
> > to the judgment of those who are currently shouldering that maintenance
> > burden.
> >
> > Neal
> >
> > On Sun, Mar 2, 2025 at 7:53 PM Sutou Kouhei <k...@clear-code.com> wrote:
> >
> >> Hi,
> >>
> >> This is a similar discussion to the "[DISCUSS] Split Go
> >> release process" thread[1] and the "[DISCUSS] Split Java
> >> release process" thread[2]:
> >>
> >> [1] https://lists.apache.org/thread/fstyfvzczntt9mpnd4f0b39lzb8cxlyf
> >> [2] https://lists.apache.org/thread/b99wp2f3rjhy09sx7jqvrfqjkqn9lnyy
> >>
> >> We've split them and they were released from separated
> >> repositories.
> >>
> >> Let's discuss the next target.
> >>
> >> We raised JavaScript as the next candidate in the Java
> >> discussion[3] but we may not find one or more active release
> >> managers for JavaScript.
> >>
> >> [3] https://lists.apache.org/thread/bdko84zy72nlg3k82t772f7pq6zpd0sz
> >>
> >> I propose R as the next candidate because:
> >>
> >> * We have many active committers and PMC members who can
> >>   focus on R
> >> * The current R release process is semi-separated
> >> * In general, we release R packages to CRAN by non-trivial
> >>   release process after our monorepo release.
> >>   e.g.: https://github.com/apache/arrow/issues/45581
> >> * The R bindings can also work with old C++ versions
> >> * The R bindings don't need to align with the monorepo
> >>   versioning. The R bindings can avoid major version up
> >>   per 3-4 months.
> >> * We have many R related CI jobs. If we split the R
> >>   bindings, we can remove many CI jobs from monorepo.
> >>
> >>
> >> What do you think about this?
> >>
> >>
> >> Thanks,
> >> --
> >> kou