Re: Updated Sourceware infrastructure plans

Ben Boeckel via Gcc Sat, 04 May 2024 12:59:00 -0700

On Wed, May 01, 2024 at 23:26:18 +0200, Mark Wielaard wrote:
> On Wed, May 01, 2024 at 04:04:37PM -0400, Jason Merrill wrote:
> > Do you (or others) have any thoughts about GitLab FOSS?
> 
> The gitlab "community edition" still feels not very much "community".
> We could run our own instance, but it will still be "open core" with
> features missing to try to draw you towards the proprietary hosted
> saas version. Also it seems to have way too much overhead. The focus
> is clearly corporate developers where managers want assurances the
> mandatory "pipelines" are executed and "workflows" followed exactly.


I'll offer my experience here. We (at Kitware) have been using GitLab
FOSS for around 8 years. We can't use the other editions because of the
per-account pricing and having open registration (since pretty much
everything there is FOSS code). GitLab is receptive to patches sent
their way and has considered moving features to the FOSS edition to help
large FOSS organizations (freedesktop.org, GNOME, KDE, probably others
too). There's also been discussion of implementing features such as
commit message review in order to court Linux developers, given the
forge-like discussion happening there. FWIW, Fedora is also evaluating
forges:

    https://discussion.fedoraproject.org/t/2024-git-forge-evaluation/111795

That said, there are definitely gaps to fill. We have our tooling here:

    https://gitlab.kitware.com/utils/rust-ghostflow (core actions)
    https://gitlab.kitware.com/utils/ghostflow-director (service deployment)

We use it to implement things including:

  - Basic content checks (scripts are executable, no binaries, file size
    limits, formatting, etc.) either on a commit-by-commit basis or by
    looking at the MR (patch series, PR, whatever the forge calls it) as
    a whole. Docs for currently-implemented checks are here:
    
    https://gitlab.kitware.com/utils/rust-ghostflow/-/blob/master/ghostflow-cli/doc/checks.md
  - Reformatting upon request; if the formatter(s) in use support
    writing the content as intended, there is code to rewrite each
    individual patch to conform. This avoids wasting time on either side
    for things that can be done automatically (of course, you're also at
    the mercy of what the formatter wants…I find it worth it on balance).
  - More advanced merging including gathering trailers for the merge
    commit message from comments and other metadata including
    `Reviewed-by` and `Tested-by` (also from CI). Also supported is
    merging into multiple branches at once (e.g., backports to older
    branches with a single MR).
  - Merge train support (we call it the "stage"); this feature is
    otherwise locked behind for-pay editions of GitLab.
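
To make the trailer gathering above concrete, here is a minimal sketch in
Python. It is not ghostflow's actual code (which is Rust). The comment
format it assumes (a line such as `Reviewed-by: Name <email>` anywhere in a
comment body), the set of trailer keys, and the names `gather_trailers` and
`merge_message` are all hypothetical and chosen only for illustration.

```python
import re

# Trailer keys to collect; this set is an assumption for illustration.
TRAILER_KEYS = ("Reviewed-by", "Tested-by", "Acked-by")
TRAILER_RE = re.compile(
    r"^(?P<key>%s):[ \t]*(?P<value>\S.*?)[ \t]*$"
    % "|".join(map(re.escape, TRAILER_KEYS)),
    re.MULTILINE,
)


def gather_trailers(comments):
    """Collect unique trailers from comment bodies, in first-seen order."""
    seen = set()
    trailers = []
    for body in comments:
        for match in TRAILER_RE.finditer(body):
            trailer = (match.group("key"), match.group("value"))
            if trailer not in seen:
                seen.add(trailer)
                trailers.append(trailer)
    return trailers


def merge_message(title, comments):
    """Build a merge commit message with the gathered trailers at the end."""
    lines = [title]
    trailers = gather_trailers(comments)
    if trailers:
        # A blank line separates the subject from the trailer block.
        lines.append("")
        lines.extend(f"{key}: {value}" for key, value in trailers)
    return "\n".join(lines) + "\n"
```

Calling `merge_message("Merge topic 'fix'", comment_bodies)` would yield a
message whose final block lists each distinct trailer once, however many
comments repeated it. A real service would presumably also check who is
allowed to add which trailer; that is left out here.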

Right now, GitLab and GitHub are supported, but other forges can be
supported as well. In addition to the service (which is triggered by
webhook delivery), there's a command line tool for local usage (though
it only implements checking and reformatting at the moment mainly due to
a lack of available time to work on it).

There are other features that are probably of interest for supply chain
concerns and the like, such as:

  - every push is stored in a ghostflow-director-side unique ref
    (`refs/mr/ID/heads/N` where `N` is an incrementing integer) to avoid
    forge-side garbage collection (especially problematic on GitHub;
    I've not noticed GitLab collecting so eagerly)
  - all webhooks are delivered via filesystem and can be archived
    (`webhook-listen` is the program that listens and delivers them:
    https://gitlab.kitware.com/utils/webhook-listen); events which
    trigger failures are stored with some context about what happened;
    those that are ignored are stored with a reason for the ignore (see
    this crate for the "event loop" of `ghostflow-director` itself:
    https://gitlab.kitware.com/utils/rust-json-job-dispatch)
  - the forge is the source of truth; if a ref is force-pushed,
    `ghostflow` will accept the state on the forge as gospel; the only
    off-forge state, aside from logging and historical tracking, is:
    - the config file
    - formatter installation (formatting is designed to only use trusted
      binaries; nothing from the repo itself other than which to use)
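
As an illustration of the per-push ref naming in the first point, the
following Python sketch picks the next free `refs/mr/ID/heads/N` name from
a list of existing ref names. The function name `next_push_ref` and the
choice to start counting at 0 are assumptions; the message above only says
that `N` is an incrementing integer.

```python
import re


def next_push_ref(existing_refs, mr_id):
    """Return the ref name under which the next push to an MR would be kept."""
    pattern = re.compile(r"^refs/mr/%s/heads/(\d+)$" % re.escape(str(mr_id)))
    taken = []
    for ref in existing_refs:
        match = pattern.match(ref)
        if match:
            taken.append(int(match.group(1)))
    # Starting at 0 is an assumption made for this sketch.
    n = max(taken) + 1 if taken else 0
    return f"refs/mr/{mr_id}/heads/{n}"
```

The resulting name could then be handed to something like
`git update-ref` so that the pushed commit stays reachable even if the
forge later garbage-collects the force-pushed-over history.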

On the first two points, we had some data loss on our instance once and
using the webhook history and stored refs, I was able to restore code
pushed to projects and "replay" comments that happened since the last
backup (I copied the content and @mentioned the original author).
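
The kind of recovery described above depends on every webhook event having
been written to disk along with its outcome. A rough Python sketch of that
idea follows. It does not reflect how `webhook-listen` or
`ghostflow-director` actually lay out their files; the directory names
(`accept`, `fail`, `ignore`), the JSON record shape, and the function names
are hypothetical.

```python
import json
from pathlib import Path

# Outcome names are assumptions for this sketch.
OUTCOMES = ("accept", "fail", "ignore")


def archive_event(root, event, outcome, reason=None):
    """Store one handled webhook event as JSON under root/<outcome>/."""
    if outcome not in OUTCOMES:
        raise ValueError(f"unknown outcome: {outcome}")
    directory = Path(root) / outcome
    directory.mkdir(parents=True, exist_ok=True)
    record = {"event": event}
    if reason is not None:
        # Failed and ignored events carry the context for why.
        record["reason"] = reason
    # Zero-padded sequence numbers keep files sorted in arrival order.
    seq = sum(1 for _ in directory.glob("*.json"))
    path = directory / f"{seq:08d}.json"
    path.write_text(json.dumps(record, indent=2, sort_keys=True))
    return path


def replay(root, outcome="accept"):
    """Yield archived events of one outcome in the order they were stored."""
    for path in sorted((Path(root) / outcome).glob("*.json")):
        yield json.loads(path.read_text())["event"]
```

After a restore from backup, iterating `replay(root)` gives the events that
arrived since then, in order, which is enough to decide what needs to be
re-pushed or re-posted by hand.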

> At the moment though the only thing people seem to agree on is that
> any system will be based on git. So the plan for now is to first setup
> a larger git(olite) system so that every contributor (also those who
> don't currently have commit access) can easily "post" their git
> repo. This can then hopefully integrate with the systems we already
> have setup (triggering builder CI, flag/match with patchwork/emails,
> etc.) or any future "pull request" like system.

As a fellow FOSS maintainer I definitely appreciate the benefit of being
email-based (`mutt` is far better at wrangling notifications from
umpteen places than…well basically any website is at even their own),
but as a *contributor* it is utterly opaque. It's not always clear if my
patch has been seen, if it is waiting on maintainer time, or for me to
do something. After one review, what is the courtesy time before pushing
a new patchset to avoid a review "crossing in the night" as I push more
patches? Did I get everyone that commented on the patch the first time
in the Cc list properly? Is a discussion considered resolved? (FWIW,
GitHub is annoying with its conversation resolution behavior IMO;
GitLab's explicit closing is much better.) Has it been merged? To the
right place? And that's for patches I author; figuring out the status of
patches I'm interested in but not the author of is even harder. A forge
surfaces a lot of this information pretty well and, to me, GitLab at
least offers usable enough email messages (e.g., discussions on threads
will thread in email too) that the public tracking of such things is far
more useful on the whole.

--Ben
