This all looks good to me, and I'm happy to review/help with any parts of
the migration as well.

Thanks,
Joel

On Mon, Aug 19, 2024 at 4:16 PM Matt Topol <zotthewiz...@gmail.com> wrote:

> > Based on 6., users need to change their import paths on
> > upgrade whether we keep using apache/arrow or we use new
> > apache/arrow-go.
> >
> > If we use new apache/arrow-go, we will be able to reduce
> > maintenance cost for apache/arrow (e.g. we can remove Go
> > related scripts, CI jobs and so on from apache/arrow). Let's
> > use apache/arrow-go.
> >
> > If nobody objects splitting Apache Arrow Go to
> > apache/arrow-go in this week, I'll start working on this
> > next week. (Do we need a vote for this?)
>
> I think we'll need a vote on it *eventually*, but we probably can wait
> until the repo is up and ready with the vote being when we "pull the
> trigger" so-to-say and turn off Go releases from the main
> github.com/apache/arrow repository and start releasing through
> github.com/apache/arrow-go. We'll also probably need to shift open PRs /
> issues to the new repo after we do this.
>
> I'm happy to review any of the scripts being added to the new repository
> and help out if you need it in getting this going. I think the prospect of
> fewer major releases, and thus easier upgrades, makes this very worthwhile
> for us to pursue.
>
> --Matt
>
> On Mon, Aug 19, 2024 at 1:14 AM Sutou Kouhei <k...@clear-code.com> wrote:
>
> > Hi,
> >
> > Sorry for not working on this.
> >
> > Thanks for sharing the standard docs! I've read it and
> > related docs.
> >
> > Here is the summary I learned in this thread and the
> > standard docs:
> >
> > 1. We're using "github.com/apache/arrow/go/v${VERSION}" such
> >    as "github.com/apache/arrow/go/v17" as our module name
> >    * https://pkg.go.dev/github.com/apache/arrow/go/v17/arrow
> >    * Including the version number part ("v${VERSION}") is
> >      important
> >    * Users can avoid unexpected backward incompatibility by
> >      this style
> > 2. We used to use "github.com/apache/arrow/go" as our module
> >    name in v5 or earlier
> >    * https://pkg.go.dev/github.com/apache/arrow/go/arrow
> >    * 133 modules still use this
> > 3. We want to avoid user side changes as much as possible
> >    * As 2. shows, users may keep using an old version if
> >      any change is required
> > 4. The current users need to change Apache Arrow Go's import
> >    path to "github.com/apache/arrow/go/v${VERSION + 1}" when
> >    they want to upgrade Apache Arrow Go
> >    * We don't want to require more changes than "changing
> >      import path" for users as mentioned in 3.
> > 5. We can't provide backward compatible module name such as
> >    "github.com/apache/arrow/go/v18" for
> >    "github.com/apache/arrow-go/v18"
> >    * Go doesn't provide the feature
> > 6. We want to keep "v${VERSION}" in our module name even if
> >    we split Apache Arrow Go to apache/arrow-go
> >    * It's for avoiding unexpected backward incompatibility
> >      in users' projects
> >
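A minimal Go sketch of points 4 to 6 above, under stated assumptions: the helper name nextMajorPath is hypothetical, and "github.com/apache/arrow-go/v17" is used only as an illustrative path for the proposed repository, not as a released module. It bumps the trailing "/vN" element of a module path, which shows that the import path changes on a major upgrade under either repository layout.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// nextMajorPath returns modulePath with its trailing "/vN" element
// replaced by "/v(N+1)". It is a hypothetical helper for illustration;
// it only accepts N >= 2, because Go module paths carry no suffix for
// v0 and v1.
func nextMajorPath(modulePath string) (string, error) {
	i := strings.LastIndex(modulePath, "/v")
	if i < 0 {
		return "", fmt.Errorf("no major version suffix in %q", modulePath)
	}
	n, err := strconv.Atoi(modulePath[i+2:])
	if err != nil || n < 2 {
		return "", fmt.Errorf("invalid major version suffix in %q", modulePath)
	}
	return modulePath[:i] + "/v" + strconv.Itoa(n+1), nil
}

func main() {
	paths := []string{
		"github.com/apache/arrow/go/v17", // current layout in apache/arrow
		"github.com/apache/arrow-go/v17", // illustrative path in a split repository
	}
	for _, p := range paths {
		next, err := nextMajorPath(p)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> %s\n", p, next)
	}
}
```

In both cases the module path itself changes, so a major upgrade requires an import edit whichever repository hosts the code.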
> >
> > Based on 6., users need to change their import paths on
> > upgrade whether we keep using apache/arrow or we use new
> > apache/arrow-go.
> >
> > If we use new apache/arrow-go, we will be able to reduce
> > maintenance cost for apache/arrow (e.g. we can remove Go
> > related scripts, CI jobs and so on from apache/arrow). Let's
> > use apache/arrow-go.
> >
> > If nobody objects splitting Apache Arrow Go to
> > apache/arrow-go in this week, I'll start working on this
> > next week. (Do we need a vote for this?)
> >
> >
> > Thanks,
> > --
> > kou
> >
> > In <cah4123zxadcug6yrkz2mxupke1muftyrvhg0hh1bqck5fw+...@mail.gmail.com>
> >   "Re: [DISCUSS] Split Go release process" on Mon, 22 Jul 2024 20:47:57 -0400,
> >   Matt Topol <zotthewiz...@gmail.com> wrote:
> >
> > > Hey Kou,
> > >
> > > https://go.dev/doc/modules/release-workflow is the standard docs for
> > > developing module versioning and publishing with Go.
> > >
> > > There isn't really a way to alias an import path to a different git repo
> > > because it uses the GitHub URL itself as the import path.
> > >
> > > But it does seem that people prefer the idea of shifting the Go
> > > implementation to its own repository. I'd still push for us to include the
> > > major version number in the import path, and since we'll have fewer major
> > > releases and more minor releases, users shouldn't have to update their
> > > import paths as frequently.
> > >
> > > --Matt
> > >
> > > On Mon, Jul 22, 2024, 8:37 PM Sutou Kouhei <k...@clear-code.com> wrote:
> > >
> > >> Hi,
> > >>
> > >> >                  Kou, is your plan also counting on moving the
> > >> > specific nightlies there and removing them from the main repo?
> > >>
> > >> Yes. I should have mentioned it explicitly.
> > >>
> > >> We will remove most Go related CI jobs from apache/arrow. We
> > >> will keep Go in integration test CI jobs like we do for
> > >> apache/arrow-rs.
> > >>
> > >>
> > >> Thanks,
> > >> --
> > >> kou
> > >>
> > >> In <cad1rbrr2vtxaunppfrrjgfd+ofca3q4f+yr6npku4ttzlx2...@mail.gmail.com>
> > >>   "Re: [DISCUSS] Split Go release process" on Fri, 19 Jul 2024 17:14:25 +0200,
> > >>   Raúl Cumplido <raulcumpl...@gmail.com> wrote:
> > >>
> > >> > Hi,
> > >> >
> > >> > The conversation around more frequent minor releases and version split
> > >> > per component has been a long one.
> > >> >
> > >> > I am in favour of these changes for the Go implementation because we
> > >> > have several maintainers.
> > >> >
> > >> > It might be difficult to release other implementations that do not
> > >> > have the same number of maintainers. I am not sure what our plan is if
> > >> > one of the split implementations has fewer maintainers and there's a
> > >> > requirement for a release (e.g. a security fix), but that might be
> > >> > something to consider in the future.
> > >> >
> > >> >> I would defer to Raul and Jacob to corroborate this, but because
> > >> >> changes to the CI configuration and release verification scripts don't
> > >> >> affect other implementations, I have been able to maintain that
> > >> >> infrastructure myself without too much effort and don't have to lean
> > >> >> on them for anything except reviews.
> > >> >
> > >> > I think releasing and maintaining release scripts / verifications per
> > >> > component is much easier than for the mono repo. We currently have
> > >> > over 200 nightly CI jobs in the mono repo that are required to pass
> > >> > before releasing. Moving some of those to their own repo helps
> > >> > maintainability. Kou, is your plan also counting on moving the
> > >> > specific nightlies there and removing them from the main repo?
> > >> >
> > >> > I would be in favour of doing a new major release (v18) once the repo
> > >> > and the changes are in place to update the import path to something
> > >> > like:
> > >> > github.com/apache/arrow-go/v18
> > >> >
> > >> > This would avoid confusion with previous releases. We can then follow
> > >> > up with patch/minor/major as required.
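As a rough illustration of what the proposed path change would mean for a user's imports, here is a small Go sketch. The function rewriteImport and both prefix constants are assumptions made for this example, and it further assumes that the package layout below the module root (for example arrow/array) stays the same in the new repository, which this thread does not confirm.

```go
package main

import (
	"fmt"
	"strings"
)

const (
	// Current module path in the apache/arrow monorepo.
	oldPrefix = "github.com/apache/arrow/go/v17"
	// Path proposed in this thread for the split repository.
	newPrefix = "github.com/apache/arrow-go/v18"
)

// rewriteImport maps an import path under oldPrefix to the same package
// under newPrefix. Paths outside oldPrefix are returned unchanged.
// This is a hypothetical helper, not part of any Arrow tooling.
func rewriteImport(path string) string {
	if path == oldPrefix {
		return newPrefix
	}
	// Require a "/" after the prefix so that an unrelated module such as
	// ".../go/v170" is not matched by accident.
	if strings.HasPrefix(path, oldPrefix+"/") {
		return newPrefix + strings.TrimPrefix(path, oldPrefix)
	}
	return path
}

func main() {
	for _, p := range []string{
		"github.com/apache/arrow/go/v17/arrow/array",
		"github.com/apache/arrow/go/v17/parquet",
		"fmt",
	} {
		fmt.Printf("%s -> %s\n", p, rewriteImport(p))
	}
}
```

The rewrite is a plain prefix substitution, the same kind of edit users already make when moving from one major version to the next.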
> > >> >
> > >> > I am also happy to help with the releases and infrastructure if
> > >> > necessary as I've done with the main Arrow one (I can also help on
> > >> > nanoarrow, adbc if necessary).
> > >> >
> > >> > Kind regards,
> > >> > Raul
> > >> >
> > >> >
> > >> >
> > >> >>
> > >> >> [1] https://github.com/apache/arrow-nanoarrow/pull/557
> > >> >>
> > >> >> On Thu, Jul 18, 2024 at 7:53 PM Matt Topol <zotthewiz...@gmail.com> wrote:
> > >> >> >
> > >> >> > Part of the goal of splitting out the release processes is that we'd be
> > >> >> > able to do minor version releases more frequently instead of major
> > >> >> > version releases.
> > >> >> >
> > >> >> > The general convention in the Go community is to include a major version
> > >> >> > "v#" in the import path for all major versions past v1 so that if there's
> > >> >> > a breaking change, it's explicit and prevents potential issues from
> > >> >> > different major versions being used simultaneously. Being able to do
> > >> >> > minor version releases more frequently would mean changing the import
> > >> >> > paths only when we actually make a breaking change, rather than every
> > >> >> > 3-6 months.
> > >> >> > On Thu, Jul 18, 2024, 3:55 PM George Godik <ggo...@gmail.com> wrote:
> > >> >> >
> > >> >> > > > If we shift the Go lib to a new/different import
> > >> >> > > > path we'll end up with the same problem where people will rely on
> > >> >> > > > older versions and an incorrect path.
> > >> >> > >
> > >> >> > > Major version upgrades already require changing the import paths by
> > >> >> > > increasing the version. The proposed change would require everyone to
> > >> >> > > go through a similar process one last time.
> > >> >> > >
> > >> >> > > > More to the point, there would be the question of whether or not we
> > >> >> > > > should port over the same major version number, i.e.
> > >> >> > > > `github.com/apache/arrow-go/v17` or something to that end? Or
> > >> >> > > > do we restart back at v1 (which I think would be confusing)?
> > >> >> > >
> > >> >> > > My vote - for whatever it's worth - would be to do away with the
> > >> >> > > version-in-path naming convention and rely on the go version/package
> > >> >> > > system for major upgrades.
> > >> >> > >
> > >> >> > > Benefits: I don't have to change import paths every 3-6 months
> > >> >> > >
> > >> >> > > On Thu, Jul 18, 2024 at 3:34 PM Matt Topol <zotthewiz...@gmail.com> wrote:
> > >> >> > >
> > >> >> > > > My thoughts:
> > >> >> > > >
> > >> >> > > > > * Go doesn't depend on other components such as C++
> > >> >> > > > > * Go has some active PMC member (Matt) and committer (Joel)
> > >> >> > > > >   * Could you become a release manager for Go?
> > >> >> > > >
> > >> >> > > > I'd happily be the release manager for the Go implementation.
> > >> >> > > >
> > >> >> > > > > Here is my idea how to proceed this:
> > >> >> > > > >
> > >> >> > > > > 1. Extract go/ in apache/arrow to apache/arrow-go like
> > >> >> > > > >     apache/arrow-rs
> > >> >> > > > >     * Filter go/ related commits from apache/arrow and create
> > >> >> > > > >       apache/arrow-go with them like we did for apache/arrow-rs
> > >> >> > > > >     * Remove go/ related code from apache/arrow
> > >> >> > > > > 2. Prepare integration test CI like apache/arrow-rs does:
> > >> >> > > > >    https://github.com/apache/arrow-rs/blob/master/.github/workflows/integration.yml
> > >> >> > > > > 3. Prepare release script based on apache/arrow-julia,
> > >> >> > > > >     apache/arrow-adbc and/or apache/arrow-flight-sql-postgresql
> > >> >> > > >
> > >> >> > > > Personally I would prefer that we do not extract it to its own separate
> > >> >> > > > repository purely because I don't want to change the import path for
> > >> >> > > > users again. We already have this issue from before we introduced the
> > >> >> > > > major version into the import path and shifted it up to allow for the
> > >> >> > > > Parquet lib in the same repository. If you look at [1] you see that
> > >> >> > > > there's still over 100 projects that never upgraded to v6 or higher
> > >> >> > > > because they are still using the old import path. If we shift the Go
> > >> >> > > > lib to a new/different import path we'll end up with the same problem
> > >> >> > > > where people will rely on older versions and an incorrect path.
> > >> >> > > >
> > >> >> > > > If we as a community decide that splitting out the implementations all
> > >> >> > > > into separate repositories is the best way forward, I won't hold it up
> > >> >> > > > by strictly hammering on this. I'm just concerned about the realities
> > >> >> > > > and difficulties of communicating the import path change, ensuring we
> > >> >> > > > don't break any consumers, and ensuring that users still end up being
> > >> >> > > > able to upgrade easily.
> > >> >> > > >
> > >> >> > > > > The import path could be "github.com/apache/arrow-go" instead of
> > >> >> > > > > "github.com/apache/arrow-go/arrow". Since go will allow users to use
> > >> >> > > > > `arrow.Abc` directly if user imports `github.com/apache/arrow-go`.
> > >> >> > > >
> > >> >> > > > The import path would still have to be
> > >> >> > > > `github.com/apache/arrow-go/arrow` since it would also contain the
> > >> >> > > > parquet implementation in `github.com/apache/arrow-go/parquet`. More
> > >> >> > > > to the point, there would be the question of whether or not we should
> > >> >> > > > port over the same major version number, i.e.
> > >> >> > > > `github.com/apache/arrow-go/v17` or something to that end? Or
> > >> >> > > > do we restart back at v1 (which I think would be confusing)?
> > >> >> > > >
> > >> >> > > > --Matt
> > >> >> > > >
> > >> >> > > > [1]: https://pkg.go.dev/github.com/apache/arrow/go/arrow
> > >> >> > > >
> > >> >> > > > On Thu, Jul 18, 2024 at 7:33 AM Antoine Pitrou <anto...@python.org> wrote:
> > >> >> > > >
> > >> >> > > > >
> > >> >> > > > > Hi Kou,
> > >> >> > > > >
> > >> >> > > > > Le 18/07/2024 à 11:33, Sutou Kouhei a écrit :
> > >> >> > > > > >
> > >> >> > > > > > Here is my idea how to proceed this:
> > >> >> > > > > >
> > >> >> > > > > > 1. Extract go/ in apache/arrow to apache/arrow-go like
> > >> >> > > > > >     apache/arrow-rs
> > >> >> > > > > >     * Filter go/ related commits from apache/arrow and create
> > >> >> > > > > >       apache/arrow-go with them like we did for apache/arrow-rs
> > >> >> > > > > >     * Remove go/ related code from apache/arrow
> > >> >> > > > > > 2. Prepare integration test CI like apache/arrow-rs does:
> > >> >> > > > > >    https://github.com/apache/arrow-rs/blob/master/.github/workflows/integration.yml
> > >> >> > > > > > 3. Prepare release script based on apache/arrow-julia,
> > >> >> > > > > >     apache/arrow-adbc and/or apache/arrow-flight-sql-postgresql
> > >> >> > > > >
> > >> >> > > > > I think this is a good idea, but I'm not part of the Go maintainers.
> > >> >> > > > >
> > >> >> > > > > > Cons of this idea:
> > >> >> > > > > >
> > >> >> > > > > > * This is a backward incompatible change
> > >> >> > > > > >    * Users need to change their "import" to
> > >> >> > > > > >      "github.com/apache/arrow-go/arrow" from
> > >> >> > > > > >      "github.com/apache/arrow/go/arrow"
> > >> >> > > > >
> > >> >> > > > > Is there no way to leave some kind of alias or redirection in the
> > >> >> > > > > apache/arrow repository?
> > >> >> > > > >
> > >> >> > > > > Regards
> > >> >> > > > >
> > >> >> > > > > Antoine.
> > >> >> > > > >
> > >> >> > > >
> > >> >> > >
> > >>
> >
>
