> Based on 6., users need to change their import paths on
> upgrade whether we keep using apache/arrow or we use new
> apache/arrow-go.
>
> If we use the new apache/arrow-go, we will be able to reduce
> the maintenance cost for apache/arrow (e.g. we can remove Go
> related scripts, CI jobs and so on from apache/arrow). Let's
> use apache/arrow-go.
>
> If nobody objects to splitting Apache Arrow Go into
> apache/arrow-go this week, I'll start working on this
> next week. (Do we need a vote for this?)

I think we'll need a vote on it *eventually*, but we can probably wait
until the repo is up and ready. The vote would then happen when we "pull
the trigger", so to speak: turn off Go releases from the main
github.com/apache/arrow repository and start releasing through
github.com/apache/arrow-go. We'll also probably need to move open PRs and
issues to the new repo after we do this.

I'm happy to review any of the scripts being added to the new repository
and help out if you need it in getting this going. I think the prospect of
fewer major releases, and thus easier upgrades, makes this very worthwhile
for us to pursue.
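
To make the user-side impact concrete, here is a minimal sketch of the
import path change, assuming the first release from the new repository
uses the module path github.com/apache/arrow-go/v18 (suggested in this
thread, not decided) and that the package layout under the module
(arrow/, parquet/) stays the same. The helper name rewriteImport is
hypothetical and exists only for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// Assumptions for illustration only: oldModule is the current module path
// discussed in this thread, newModule is the path proposed in this thread
// for the first release from the split repository.
const (
	oldModule = "github.com/apache/arrow/go/v17"
	newModule = "github.com/apache/arrow-go/v18"
)

// rewriteImport maps an import path inside the old module to the same
// package inside the new module. Paths that do not belong to the old
// module are returned unchanged.
func rewriteImport(path string) string {
	if path == oldModule {
		return newModule
	}
	prefix := oldModule + "/"
	if strings.HasPrefix(path, prefix) {
		return newModule + "/" + path[len(prefix):]
	}
	return path
}

func main() {
	paths := []string{
		"github.com/apache/arrow/go/v17/arrow",
		"github.com/apache/arrow/go/v17/parquet/file",
		"github.com/apache/arrow/go/arrow", // pre-v6 path, not part of v17
		"fmt",
	}
	for _, p := range paths {
		fmt.Printf("%s -> %s\n", p, rewriteImport(p))
	}
}
```

The point is that the edit is a prefix swap on import statements (plus the
matching require line in go.mod), which is the same kind of change users
already make on every major version bump today.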

--Matt

On Mon, Aug 19, 2024 at 1:14 AM Sutou Kouhei <k...@clear-code.com> wrote:

> Hi,
>
> Sorry for not working on this.
>
> Thanks for sharing the standard docs! I've read them and
> related docs.
>
> Here is a summary of what I learned from this thread and the
> standard docs:
>
> 1. We're using "github.com/apache/arrow/go/v${VERSION}" such
>    as "github.com/apache/arrow/go/v17" as our module name
>    * https://pkg.go.dev/github.com/apache/arrow/go/v17/arrow
>    * Including the version number part ("v${VERSION}") is
>      important
>    * With this style, users can avoid unexpected backward
>      incompatibility
> 2. We used to use "github.com/apache/arrow/go" as our module
>    name in v5 or earlier
>    * https://pkg.go.dev/github.com/apache/arrow/go/arrow
>    * 133 modules still use this
> 3. We want to avoid user side changes as much as possible
>    * As 2. shows, users may keep using an old version if
>      any change is required
> 4. The current users need to change Apache Arrow Go's import
>    path to "github.com/apache/arrow/go/v${VERSION + 1}" when
>    they want to upgrade Apache Arrow Go
>    * We don't want to require more changes than "changing
>      import path" for users as mentioned in 3.
> 5. We can't provide a backward compatible module name such as
>    "github.com/apache/arrow/go/v18" for
>    "github.com/apache/arrow-go/v18"
>    * Go doesn't provide such a feature
> 6. We want to keep "v${VERSION}" in our module name even if
>    we split Apache Arrow Go to apache/arrow-go
>    * It's for avoiding unexpected backward incompatibility
>      in users' projects
>
>
> Based on 6., users need to change their import paths on
> upgrade whether we keep using apache/arrow or we use new
> apache/arrow-go.
>
> If we use the new apache/arrow-go, we will be able to reduce
> the maintenance cost for apache/arrow (e.g. we can remove Go
> related scripts, CI jobs and so on from apache/arrow). Let's
> use apache/arrow-go.
>
> If nobody objects to splitting Apache Arrow Go into
> apache/arrow-go this week, I'll start working on this
> next week. (Do we need a vote for this?)
>
>
> Thanks,
> --
> kou
>
> In <cah4123zxadcug6yrkz2mxupke1muftyrvhg0hh1bqck5fw+...@mail.gmail.com>
>   "Re: [DISCUSS] Split Go release process" on Mon, 22 Jul 2024 20:47:57 -0400,
>   Matt Topol <zotthewiz...@gmail.com> wrote:
>
> > Hey Kou,
> >
> > https://go.dev/doc/modules/release-workflow is the standard docs for
> > developing module versioning and publishing with Go.
> >
> > There isn't really a way to alias an import path to a different git repo
> > because it uses the GitHub URL itself as the import path.
> >
> > But it does seem that people prefer the idea of shifting the Go
> > implementation to its own repository. I'd still push for us to include the
> > major version number in the import path, and since we'll have fewer major
> > releases and more minor releases, users shouldn't have to update their
> > import paths as frequently.
> >
> > --Matt
> >
> > On Mon, Jul 22, 2024, 8:37 PM Sutou Kouhei <k...@clear-code.com> wrote:
> >
> >> Hi,
> >>
> >> >                  Kou, is your plan also counting on moving the
> >> > specific nightlies there and removing them from the main repo?
> >>
> >> Yes. I should have mentioned it explicitly.
> >>
> >> We will remove most Go related CI jobs from apache/arrow. We
> >> will keep Go in integration test CI jobs like we do for
> >> apache/arrow-rs.
> >>
> >>
> >> Thanks,
> >> --
> >> kou
> >>
> >> In <cad1rbrr2vtxaunppfrrjgfd+ofca3q4f+yr6npku4ttzlx2...@mail.gmail.com>
> >>   "Re: [DISCUSS] Split Go release process" on Fri, 19 Jul 2024 17:14:25 +0200,
> >>   Raúl Cumplido <raulcumpl...@gmail.com> wrote:
> >>
> >> > Hi,
> >> >
> >> > The conversation around more frequent minor releases and version split
> >> > per component has been a long one.
> >> >
> >> > I am in favour of these changes for the Go implementation because we
> >> > have several maintainers.
> >> >
> >> > It might be difficult to release other implementations that do not
> >> > have the same number of maintainers. I am not sure what our plan is if
> >> > one of the split implementations has fewer maintainers and there's a
> >> > requirement for a release (e.g. a security fix) but that might be
> >> > something to consider in the future.
> >> >
> >> >> I would defer to Raul and Jacob to corroborate this, but because
> >> >> changes to the CI configuration and release verification scripts don't
> >> >> affect other implementations, I have been able to maintain that
> >> >> infrastructure myself without too much effort and don't have to lean
> >> >> on them for anything except reviews.
> >> >
> >> > I think releasing and maintaining release scripts / verifications per
> >> > component is much easier than for the mono repo. We currently have
> >> > over 200 nightly CI jobs in the mono repo that are required to pass
> >> > before releasing. Moving some of those to their own repo helps
> >> > maintainability. Kou, is your plan also counting on moving the
> >> > specific nightlies there and removing them from the main repo?
> >> >
> >> > I would be in favour of doing a new major release (v18) once the repo
> >> > and the changes are in-place to update the import path to something
> >> > like:
> >> > github.com/apache/arrow-go/v18
> >> >
> >> > This would avoid confusion with previous releases. We can then follow
> >> > up with patch/minor/major as required.
> >> >
> >> > I am also happy to help with the releases and infrastructure if
> >> > necessary as I've done with the main Arrow one (I can also help on
> >> > nanoarrow, adbc if necessary).
> >> >
> >> > Kind regards,
> >> > Raul
> >> >
> >> >
> >> >
> >> >>
> >> >> [1] https://github.com/apache/arrow-nanoarrow/pull/557
> >> >>
> >> >> On Thu, Jul 18, 2024 at 7:53 PM Matt Topol <zotthewiz...@gmail.com> wrote:
> >> >> >
> >> >> > Part of the goal of splitting out the release processes is that we'd be
> >> >> > able to do minor version releases more frequently instead of major version
> >> >> > releases.
> >> >> >
> >> >> > The general convention in the Go community is to include a major version
> >> >> > "v#" in the import path for all major versions past v1 so that if there's a
> >> >> > breaking change, it's explicit and prevents potential issues from different
> >> >> > major versions being used simultaneously. Being able to do minor version
> >> >> > releases more frequently would lead to not having to change the import
> >> >> > paths every 3-6 months, but only if we actually do a breaking change.
> >> >> >
> >> >> > On Thu, Jul 18, 2024, 3:55 PM George Godik <ggo...@gmail.com> wrote:
> >> >> >
> >> >> > > > If we shift the Go lib to a new/different import
> >> >> > > > path we'll end up with the same problem where people will rely on older
> >> >> > > > versions and an incorrect path.
> >> >> > >
> >> >> > > Major version upgrades already require changing the import paths by
> >> >> > > increasing the version. The proposed change would require everyone to go
> >> >> > > through a similar process one last time.
> >> >> > >
> >> >> > > > More to the point, there would be the question of whether or not we
> >> >> > > > should port over the same major version number, i.e.
> >> >> > > > `github.com/apache/arrow-go/v17` or something to that end? Or do we
> >> >> > > > restart back at v1 (which I think would be confusing)?
> >> >> > >
> >> >> > > My vote - for whatever it's worth - would be to do away with the
> >> >> > > version-in-path naming convention and rely on the go version/package
> >> >> > > system for major upgrades.
> >> >> > >
> >> >> > > Benefits: I don't have to change import paths every 3-6 months
> >> >> > >
> >> >> > > On Thu, Jul 18, 2024 at 3:34 PM Matt Topol <zotthewiz...@gmail.com> wrote:
> >> >> > >
> >> >> > > > My thoughts:
> >> >> > > >
> >> >> > > > > * Go doesn't depend on other components such as C++
> >> >> > > > > * Go has some active PMC member (Matt) and committer (Joel)
> >> >> > > > >   * Could you become a release manager for Go?
> >> >> > > >
> >> >> > > > I'd happily be the release manager for the Go implementation.
> >> >> > > >
> >> >> > > > > Here is my idea for how to proceed with this:
> >> >> > > > >
> >> >> > > > > 1. Extract go/ in apache/arrow to apache/arrow-go like
> >> >> > > > >     apache/arrow-rs
> >> >> > > > >     * Filter go/ related commits from apache/arrow and create
> >> >> > > > >       apache/arrow-go with them like we did for apache/arrow-rs
> >> >> > > > >     * Remove go/ related code from apache/arrow
> >> >> > > > > 2. Prepare integration test CI like apache/arrow-rs does:
> >> >> > > > >     https://github.com/apache/arrow-rs/blob/master/.github/workflows/integration.yml
> >> >> > > > > 3. Prepare release script based on apache/arrow-julia,
> >> >> > > > >     apache/arrow-adbc and/or apache/arrow-flight-sql-postgresql
> >> >> > > >
> >> >> > > > Personally I would prefer that we do not extract it to its own separate
> >> >> > > > repository purely because I don't want to change the import path for users
> >> >> > > > again. We already have this issue from before we introduced the major
> >> >> > > > version into the import path and shifted it up to allow for the Parquet lib
> >> >> > > > in the same repository. If you look at [1] you see that there are still over
> >> >> > > > 100 projects that never upgraded to v6 or higher because they are still
> >> >> > > > using the old import path. If we shift the Go lib to a new/different import
> >> >> > > > path we'll end up with the same problem where people will rely on older
> >> >> > > > versions and an incorrect path.
> >> >> > > >
> >> >> > > > If we as a community decide that splitting out the implementations all into
> >> >> > > > separate repositories is the best way forward, I won't hold it up by
> >> >> > > > strictly hammering on this. I'm just concerned about the realities and
> >> >> > > > difficulties of communicating the import path change, ensuring we don't
> >> >> > > > break any consumers, and ensuring that users still end up being able to
> >> >> > > > upgrade easily.
> >> >> > > >
> >> >> > > > > The import path could be "github.com/apache/arrow-go" instead of
> >> >> > > > > "github.com/apache/arrow-go/arrow". Since go will allow users to use
> >> >> > > > > `arrow.Abc` directly if user imports `github.com/apache/arrow-go`.
> >> >> > > >
> >> >> > > > The import path would still have to be
> >> >> > > > `github.com/apache/arrow-go/arrow` since it would also contain the
> >> >> > > > parquet implementation in `github.com/apache/arrow-go/parquet`. More to
> >> >> > > > the point, there would be the question of whether or not we should port
> >> >> > > > over the same major version number, i.e.
> >> >> > > > `github.com/apache/arrow-go/v17` or something to that end? Or do we
> >> >> > > > restart back at v1 (which I think would be confusing)?
> >> >> > > >
> >> >> > > > --Matt
> >> >> > > >
> >> >> > > > [1]: https://pkg.go.dev/github.com/apache/arrow/go/arrow
> >> >> > > >
> >> >> > > > On Thu, Jul 18, 2024 at 7:33 AM Antoine Pitrou <anto...@python.org> wrote:
> >> >> > > >
> >> >> > > > >
> >> >> > > > > Hi Kou,
> >> >> > > > >
> >> >> > > > > On 18/07/2024 at 11:33, Sutou Kouhei wrote:
> >> >> > > > > >
> >> >> > > > > > Here is my idea for how to proceed with this:
> >> >> > > > > >
> >> >> > > > > > 1. Extract go/ in apache/arrow to apache/arrow-go like
> >> >> > > > > >     apache/arrow-rs
> >> >> > > > > >     * Filter go/ related commits from apache/arrow and create
> >> >> > > > > >       apache/arrow-go with them like we did for apache/arrow-rs
> >> >> > > > > >     * Remove go/ related code from apache/arrow
> >> >> > > > > > 2. Prepare integration test CI like apache/arrow-rs does:
> >> >> > > > > >     https://github.com/apache/arrow-rs/blob/master/.github/workflows/integration.yml
> >> >> > > > > > 3. Prepare release script based on apache/arrow-julia,
> >> >> > > > > >     apache/arrow-adbc and/or apache/arrow-flight-sql-postgresql
> >> >> > > > >
> >> >> > > > > I think this is a good idea, but I'm not part of the Go
> >> maintainers.
> >> >> > > > >
> >> >> > > > > > Cons of this idea:
> >> >> > > > > >
> >> >> > > > > > * This is a backward incompatible change
> >> >> > > > > >    * Users need to change their "import" to
> >> >> > > > > >      "github.com/apache/arrow-go/arrow" from
> >> >> > > > > >      "github.com/apache/arrow/go/arrow"
> >> >> > > > >
> >> >> > > > > Is there no way to leave some kind of alias or redirection in the
> >> >> > > > > apache/arrow repository?
> >> >> > > > >
> >> >> > > > > Regards
> >> >> > > > >
> >> >> > > > > Antoine.
> >> >> > > > >
> >> >> > > >
> >> >> > >
> >>
>
