Re: Updated Sourceware infrastructure plans

2024-05-10 Thread Ben Boeckel via Gcc
On Tue, May 07, 2024 at 16:17:24 +, Joseph Myers via Gcc wrote:
> I'd say we have two kinds of patch submission (= two kinds of pull request 
> in a pull request workflow) to consider in the toolchain, and it's 
> important that a PR-based system supports both of them well (and supports 
> a submission changing from one kind to the other, and preferably 
> dependencies between multiple PRs where appropriate).

The way I'd handle this in `ghostflow` is with a description trailer
like `Squash-merge: true` (already implemented trailers include
`Backport`, `Fast-forward`, `Backport-ff`, and `Topic-rename` as
description trailers, so this is a natural extension there).
Alternatively a label can be used, but that is not directly editable by
MR authors who are not also members of the project. There's also a
checkbox at MR creation and/or merge time to select between them, but I
don't find that easily discoverable (I believe only those with merge
rights can see the button state in general).
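As a sketch, an MR description using such trailers might look like this (the trailer names `Squash-merge` and `Backport` come from the list above; the exact value syntax shown here is a guess):

```text
Add frobnicator support

This series implements the frobnicator and wires it into the build.

Squash-merge: true
Backport: release-1.2
```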

> * Simple submissions that are intended to end up as a single commit on the 
> mainline (squash merge).  The overall set of changes to be applied to the 
> mainline is subject to review, and the commit message also is subject to 
> review (review of commit messages isn't always something that PR-based 
> systems seem to handle that well).  But for the most part there isn't a 
> need to rebase these - fixes as a result of review can go as subsequent 
> commits on the source branch (making it easy to review either the 
> individual fixes, or the whole updated set of changes), and merging from 
> upstream into that branch is also OK.  (If there *is* a rebase, the 
> PR-based system should still preserve the history of and comments on 
> previous versions, avoid GCing them and avoid getting confused.)
> 
> * Complicated submissions of patch series, that are intended to end up as 
> a sequence of commits on the mainline (non-squash merge preserving the 
> sequence of commits).  In this case, fixes (or updating from upstream) 
> *do* involve rebases to show what the full new sequence of commits should 
> be (and all individual commits and their commit messages should be subject 
> to review, not just the overall set of changes to be applied).  Again, 
> rebases need handling by the system in a history-preserving way.

There's been a long-standing issue requesting `range-diff` support in
GitLab. I really don't know why it isn't higher priority, but I suppose
having groups like Sourceware and/or kernel.org interested could move it
up their priority list.

https://gitlab.com/gitlab-org/gitlab/-/issues/24096
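For readers unfamiliar with the feature being requested: `git range-diff` compares two versions of a patch series commit-by-commit, which is exactly what reviewing a rebased series needs. A self-contained sketch (all repo and branch names are illustrative):

```shell
set -e
cd "$(mktemp -d)"
git init -q -b main repo && cd repo
git config user.email demo@example.com
git config user.name Demo
echo base > file && git add file && git commit -qm "base"

# v1: first submitted version of the patch
git switch -qc v1
echo one > feature && git add feature && git commit -qm "add feature"

# v2: the same patch after review fixes (rewritten, not appended)
git switch -qc v2 main
echo two > feature && git add feature && git commit -qm "add feature (address review)"

# Show what changed between the two versions of the series
git range-diff main v1 v2
```

The output pairs up corresponding commits from both versions and shows an inter-patch diff, rather than the misleading `git diff X..Y` between head commits that forges tend to show after a force-push.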

FWIW, there's also a "comment on commit messages" issue:

https://gitlab.com/gitlab-org/gitlab/-/issues/19691

That said, I've had few issues with rebases losing commits or
discussion on GitLab, whereas I've definitely seen things get lost on
Github. I'm not familiar enough with other forges to say (other than
that Gerrit-likes that track patches are generally workable with
rebases).

> GitHub (as an example - obviously not appropriate itself for the 
> toolchain) does much better on simple submissions (either with squash 
> merges, or with merges showing the full history if you don't care about a 
> clean bisectable history), apart from review of commit messages, than it 
> does on complicated submissions or dependencies between PRs (I think 
> systems sometimes used for PR dependencies on GitHub may actually be 
> third-party add-ons).

The way I've tended to handle this is to have one "main MR" that is the
"whole story" with component MRs split out for separate review. Once the
separate MRs are reviewed and merged (with cross references), the main
MR is rebased to incorporate the merged code and simplify its diff. This
helps to review smaller bits while also having the full story available
for viewing.

> Pull request systems have obvious advantages over mailing lists for 
> tracking open submissions - but it's still very easy for an active project 
> to end up with thousands of open PRs, among which it's very hard to find 
> anything.

In CMake, the mechanism used to keep the queue manageable is to have a
`triage:expired` label for closed-for-inactivity (or other reasons) so
that closed-but-only-neglected MRs can be distinguished from
closed-because-not-going-to-be-merged MRs. The "active patch queue"
tends to stay under 20, but sometimes swells to 30 in busy times (as of
this writing, it is at 10 open MRs).

--Ben


Re: Updated Sourceware infrastructure plans

2024-05-07 Thread Joseph Myers via Gcc
On Sat, 4 May 2024, Ben Boeckel via Gcc wrote:

>   - every push is stored in a ghostflow-director-side unique ref
> (`refs/mr/ID/heads/N` where `N` is an incrementing integer) to avoid
> forge-side garbage collection (especially problematic on Github;
> I've not noticed GitLab collecting so eagerly)

That's the sort of thing I was talking about for ensuring all the versions 
of every pull request remain available - it doesn't need anything more 
than providing such refs (that someone can get with git clone --mirror if 
they wish).  (And there has been and is work on ensuring git scales well to 
repositories with millions of refs, such as you get with PR-based systems 
storing all PRs or all versions of PRs as refs.)
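The `git clone --mirror` point can be sketched locally: refs outside `refs/heads/` survive a mirror clone, so per-version MR refs in the `refs/mr/ID/heads/N` scheme quoted above remain fetchable by anyone (repo names here are illustrative):

```shell
set -e
work=$(mktemp -d) && cd "$work"

# A stand-in for the forge-side repository
git init -q -b main upstream && cd upstream
git config user.email demo@example.com
git config user.name Demo
echo hello > file && git add file && git commit -qm "initial"

# Store a pushed MR version under a GC-proof ref, following the
# refs/mr/ID/heads/N scheme described in the quoted message
git update-ref refs/mr/1/heads/1 HEAD
cd ..

# --mirror copies *all* refs, not just branches and tags
git clone -q --mirror upstream mirror.git
git --git-dir=mirror.git show-ref refs/mr/1/heads/1
```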

-- 
Joseph S. Myers
josmy...@redhat.com



Re: Updated Sourceware infrastructure plans

2024-05-07 Thread Joseph Myers via Gcc
On Thu, 2 May 2024, Fangrui Song wrote:

> > On the other hand, GitHub structures the concept of pull requests 
> > around branches and enforces a branch-centric workflow. A pull request 
> > centers on the difference (commits) between the base branch and the 
> > feature branch. GitHub does not employ a stable identifier for commit 
> > tracking. If commits are rebased, reordered, or combined, GitHub can 
> > easily become confused.

I'd say we have two kinds of patch submission (= two kinds of pull request 
in a pull request workflow) to consider in the toolchain, and it's 
important that a PR-based system supports both of them well (and supports 
a submission changing from one kind to the other, and preferably 
dependencies between multiple PRs where appropriate).

* Simple submissions that are intended to end up as a single commit on the 
mainline (squash merge).  The overall set of changes to be applied to the 
mainline is subject to review, and the commit message also is subject to 
review (review of commit messages isn't always something that PR-based 
systems seem to handle that well).  But for the most part there isn't a 
need to rebase these - fixes as a result of review can go as subsequent 
commits on the source branch (making it easy to review either the 
individual fixes, or the whole updated set of changes), and merging from 
upstream into that branch is also OK.  (If there *is* a rebase, the 
PR-based system should still preserve the history of and comments on 
previous versions, avoid GCing them and avoid getting confused.)

* Complicated submissions of patch series, that are intended to end up as 
a sequence of commits on the mainline (non-squash merge preserving the 
sequence of commits).  In this case, fixes (or updating from upstream) 
*do* involve rebases to show what the full new sequence of commits should 
be (and all individual commits and their commit messages should be subject 
to review, not just the overall set of changes to be applied).  Again, 
rebases need handling by the system in a history-preserving way.

GitHub (as an example - obviously not appropriate itself for the 
toolchain) does much better on simple submissions (either with squash 
merges, or with merges showing the full history if you don't care about a 
clean bisectable history), apart from review of commit messages, than it 
does on complicated submissions or dependencies between PRs (I think 
systems sometimes used for PR dependencies on GitHub may actually be 
third-party add-ons).

Pull request systems have obvious advantages over mailing lists for 
tracking open submissions - but it's still very easy for an active project 
to end up with thousands of open PRs, among which it's very hard to find 
anything.

-- 
Joseph S. Myers
josmy...@redhat.com



Re: Updated Sourceware infrastructure plans

2024-05-06 Thread Ben Boeckel via Gcc
On Sun, May 05, 2024 at 08:22:12 +0300, Benson Muite wrote:
> On 04/05/2024 22.56, Ben Boeckel via Overseers wrote:
> > As a fellow FOSS maintainer I definitely appreciate the benefit of being
> > email-based (`mutt` is far better at wrangling notifications from
> > umpteen places than…well basically any website is at even their own),
> > but as a *contributor* it is utterly opaque. It's not always clear if my
> > patch has been seen, if it is waiting on maintainer time, or for me to
> > do something. After one review, what is the courtesy time before pushing
> > a new patchset to avoid a review "crossing in the night" as I push more
> > patches? Did I get everyone that commented on the patch the first time
> in the Cc list properly? Is a discussion considered resolved? (FWIW,
> Github is annoying with its conversation resolution behavior IMO;
> GitLab's explicit closing is much better.) Has it been merged? To the
> > right place? And that's for patches I author; figuring out the status of
> > patches I'm interested in but not the author of is even harder. A forge
> > surfaces a lot of this information pretty well and, to me, GitLab at
> > least offers usable enough email messages (e.g., discussions on threads
> > will thread in email too) that the public tracking of such things is far
> > more useful on the whole.
> 
> This is an area that also needs standardization of important
> functionality.  Some method of archiving the content is also helpful -
> email does this well but typically does not offer a dashboard. Sourcehut
> makes reading threads using the web interface very easy.

The other thing that email makes difficult is jumping in on an existing
discussion without having been subscribed previously. I know how to tell
`mutt` to set an `In-Reply-To` header and munge a proper reply by hand
once I find a `Message-Id` (though a fully proper `References` header is
usually way too much work to be worth it), but this is not something I
expect others to be able to perform easily.
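For illustration, the headers such a hand-munged reply would need might look like the following (the `Message-Id` value is invented; `References` would ideally list the whole ancestor chain, which is the part that is tedious by hand):

```text
To: gcc@gcc.gnu.org
Subject: Re: Updated Sourceware infrastructure plans
In-Reply-To: <20240507161724.example@redhat.com>
References: <20240507161724.example@redhat.com>
```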

> Web interfaces are difficult to automate, but friendlier for occasional
> use and encouraging new contributions.  Tools separate from the version
> control system such as Gerrit, Phabricator, RhodeCode and Review Board
> also enable discussion management and overview.

Note that forges tend to have very rich APIs. It's certainly not as easy
as clicking around manually for one-off tasks or setting up a shell
pipeline to process some emails, but building automation isn't
impossible.

--Ben


Re: Updated Sourceware infrastructure plans

2024-05-04 Thread Benson Muite via Gcc
On 04/05/2024 22.56, Ben Boeckel via Overseers wrote:
> On Wed, May 01, 2024 at 23:26:18 +0200, Mark Wielaard wrote:
>> On Wed, May 01, 2024 at 04:04:37PM -0400, Jason Merrill wrote:

> 
>> At the moment though the only thing people seem to agree on is that
>> any system will be based on git. So the plan for now is to first setup
>> a larger git(olite) system so that every contributor (also those who
>> don't currently have commit access) can easily "post" their git
>> repo. This can then hopefully integrate with the systems we already
>> have setup (triggering builder CI, flag/match with patchwork/emails,
>> etc.) or any future "pull request" like system.

It may be helpful to determine minimal forge features that people find
useful and make these into a standard.  This would enable
interoperability and automation.  In addition to Git, other tools such
as Sapling, Mercurial, Fossil and Subversion are also used and are more
suitable for many projects.

> 
> As a fellow FOSS maintainer I definitely appreciate the benefit of being
> email-based (`mutt` is far better at wrangling notifications from
> umpteen places than…well basically any website is at even their own),
> but as a *contributor* it is utterly opaque. It's not always clear if my
> patch has been seen, if it is waiting on maintainer time, or for me to
> do something. After one review, what is the courtesy time before pushing
> a new patchset to avoid a review "crossing in the night" as I push more
> patches? Did I get everyone that commented on the patch the first time
> in the Cc list properly? Is a discussion considered resolved? (FWIW,
> Github is annoying with its conversation resolution behavior IMO;
> GitLab's explicit closing is much better.) Has it been merged? To the
> right place? And that's for patches I author; figuring out the status of
> patches I'm interested in but not the author of is even harder. A forge
> surfaces a lot of this information pretty well and, to me, GitLab at
> least offers usable enough email messages (e.g., discussions on threads
> will thread in email too) that the public tracking of such things is far
> more useful on the whole.

This is an area that also needs standardization of important
functionality.  Some method of archiving the content is also helpful -
email does this well but typically does not offer a dashboard. Sourcehut
makes reading threads using the web interface very easy.

Web interfaces are difficult to automate, but friendlier for occasional
use and encouraging new contributions.  Tools separate from the version
control system such as Gerrit, Phabricator, RhodeCode and Review Board
also enable discussion management and overview.




Re: Updated Sourceware infrastructure plans

2024-05-04 Thread Ben Boeckel via Gcc
On Wed, May 01, 2024 at 23:26:18 +0200, Mark Wielaard wrote:
> On Wed, May 01, 2024 at 04:04:37PM -0400, Jason Merrill wrote:
> > Do you (or others) have any thoughts about GitLab FOSS?
> 
> The gitlab "community edition" still feels not very much "community".
> We could run our own instance, but it will still be "open core" with
> features missing to try to draw you towards the proprietary hosted
> saas version. Also it seems to have way too much overhead. The focus
> is clearly corporate developers where managers want assurances the
> mandatory "pipelines" are executed and "workflows" followed exactly.

I'll offer my experience here. We (at Kitware) have been using GitLab
FOSS for around 8 years. We can't use the other editions because of the
per-account pricing and having open registration (since pretty much
everything there is FOSS code). GitLab is receptive to patches sent
their way and has considered moving things to the FOSS edition to help
large FOSS organizations (freedesktop.org, GNOME, KDE, probably others
too). There's also been discussion of implementing features such as
commit message review in order to court Linux developers, given the
forge-like discussion happening there. FWIW, Fedora is also evaluating
forges:

https://discussion.fedoraproject.org/t/2024-git-forge-evaluation/111795

That said, there are definitely gaps to fill. We have our tooling here:

https://gitlab.kitware.com/utils/rust-ghostflow (core actions)
https://gitlab.kitware.com/utils/ghostflow-director (service deployment)

We use it to implement things including:

  - Basic content checks (scripts are executable, no binaries, file size
limits, formatting, etc.) either on a commit-by-commit basis or by
looking at the MR (patch series, PR, whatever the forge calls it) as
a whole. Docs for currently-implemented checks are here:

https://gitlab.kitware.com/utils/rust-ghostflow/-/blob/master/ghostflow-cli/doc/checks.md
  - Reformatting upon request; if the formatter(s) in use supports
writing the content as intended, there is code to rewrite each
individual patch to conform. This avoids wasting time on either side
for things that can be done automatically (of course, you're also at
the mercy of what the formatter wants…I find it worth it on balance).
  - More advanced merging including gathering trailers for the merge
commit message from comments and other metadata including
`Reviewed-by` and `Tested-by` (also from CI). Also supported is
merging into multiple branches at once (e.g., backports to older
branches with a single MR).
  - Merge train support (we call it the "stage"); this feature is
otherwise locked behind for-pay editions of GitLab.

Right now, GitLab and Github are supported, but other forges can be
supported as well. In addition to the service (which is triggered by
webhook delivery), there's a command line tool for local usage (though
it only implements checking and reformatting at the moment mainly due to
a lack of available time to work on it).

There are other things that are probably of interest to supply chain or
other things such as:

  - every push is stored in a ghostflow-director-side unique ref
(`refs/mr/ID/heads/N` where `N` is an incrementing integer) to avoid
forge-side garbage collection (especially problematic on Github;
I've not noticed GitLab collecting so eagerly)
  - all webhooks are delivered via filesystem and can be archived
(`webhook-listen` is the program that listens and delivers them:
https://gitlab.kitware.com/utils/webhook-listen); events which
trigger failures are stored with some context about what happened;
those that are ignored are stored with a reason for the ignore (see
this crate for the "event loop" of `ghostflow-director` itself:
https://gitlab.kitware.com/utils/rust-json-job-dispatch)
  - the forge is the source of truth; if a ref is force-pushed,
`ghostflow` will accept the state on the forge as gospel instead;
the only non-logging/historical tracking state off-forge includes:
- the config file
- formatter installation (formatting is designed to only use trusted
  binaries; nothing from the repo itself other than which to use)

On the first two points, we had some data loss on our instance once and
using the webhook history and stored refs, I was able to restore code
pushed to projects and "replay" comments that happened since the last
backup (I copied the content and @mentioned the original author).

> At the moment though the only thing people seem to agree on is that
> any system will be based on git. So the plan for now is to first setup
> a larger git(olite) system so that every contributor (also those who
> don't currently have commit access) can easily "post" their git
> repo. This can then hopefully integrate with the systems we already
> have setup (triggering builder CI, flag/match with patchwork/emails,
> etc.) or any future "pull request" 

Re: Updated Sourceware infrastructure plans

2024-05-02 Thread Ian Lance Taylor
Pedro Alves via Overseers  writes:

> When GDB upstream tried to use gerrit, I found it basically impossible to
> follow development, given the volume...  The great thing with email is the
> threading of discussions.  A discussion can fork into its own subthread, and
> any sane email client will display the discussion tree.  Email archives also
> let you follow the discussion subthreads.  That is great for archaeology too.
> With Gerrit that was basically lost, everything is super flat.  And then
> following development via the gerrit instance website alone is just basically
> impossible too.
> I mean, gerrit is great to track your own patches, and for the actual review
> and diffing between versions.  But for a maintainer who wants to stay on top
> of a project, then it's severely lacking, IME and IMO.

My experience is the exact opposite.  As I'm sure you know, Gerrit
supports specific comments on a code review, and discussions on those
comments are tracked separately.  For a complex patch, or series of
patches, you don't get lost in lots of separate discussions, as Gerrit

But it's true that to use that effectively you have to look at the web
interface.  The comments are available via git commands, but not in a
directly usable format.

Ian


Re: Updated Sourceware infrastructure plans

2024-05-02 Thread Fangrui Song
On Thu, May 2, 2024 at 8:35 AM Pedro Alves  wrote:
>
> On 2024-05-01 22:04, Simon Marchi wrote:
> > The Change-Id trailer works very well for Gerrit: once you have the hook
> > installed you basically never have to think about it again, and Gerrit
> > is able to track patch versions perfectly accurately.  A while ago, I
> > asked patchwork developers if they would be open to support something
> > like that to track patches, and they said they wouldn't be against it
> > (provided it's not mandatory) [1].  But somebody would have to implement
> > it.
> >
> > Simon
> >
> > [1] https://github.com/getpatchwork/patchwork/issues/327
>
> +1000.  It's mind boggling to me that people would accept Gerrit, which
> means that they'd accept Change-Id:, but then they wouldn't accept
> Change-Id: with a different system...  :-)
>

Gerrit uses "Change-Id:" as stable identifiers to track patches.
https://gregoryszorc.com/blog/2020/01/07/problems-with-pull-requests-and-how-to-fix-them/
has some analysis of how they are much better than the alternatives.

Perhaps URLs as stable identifiers will work better.
If a reader wants to find relevant discussions, they can just click
the link in many browsers, terminals, and editors.
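Gerrit's commit-msg hook appends the `Change-Id:` trailer to each commit message; the same textual edit can be sketched with stock `git interpret-trailers` (the Change-Id value below is invented for illustration — the real hook derives one from the commit itself):

```shell
set -e
cd "$(mktemp -d)" && git init -q .

cat > msg.txt <<'EOF'
Fix off-by-one in ref iteration

The loop stopped one ref short when the pack was empty.
EOF

# Append a Change-Id trailer the way Gerrit's commit-msg hook would;
# because it is a trailer, it survives rebase/amend and lets Gerrit
# match every rewritten version of the commit to one review
git interpret-trailers \
  --trailer 'Change-Id: I0123456789abcdef0123456789abcdef01234567' msg.txt
```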

Currently, searching for discussions about a specific commit requires
searching its title on https://inbox.sourceware.org/gcc-patches/ .
For older patches, I might even need to dig through
https://gcc.gnu.org/pipermail/gcc-patches/-/ archives.

I agree with Jeff that principal reviewers will drive improvement to
the code review process.
I am sharing two code review services LLVM has used.

---

Between 2012 and Sep 2023, LLVM had relied on its self-hosted
Phabricator instance for code review.
Fetching a patch to your local branch was as simple as `arc patch D12345`.
Similarly, creating or updating a patch involved `arc diff`.

I believe other code review services provide similar command-line functionality.

---

In September 2023, LLVM transitioned to GitHub for code review.
I really dislike its code review service (however, it is still a large
step forward from email-based code review). From
https://maskray.me/blog/2023-09-09-reflections-on-llvm-switch-to-github-pull-requests

> On the other hand, GitHub structures the concept of pull requests around 
> branches and enforces a branch-centric workflow. A pull request centers on 
> the difference (commits) between the base branch and the feature branch. 
> GitHub does not employ a stable identifier for commit tracking. If commits 
> are rebased, reordered, or combined, GitHub can easily become confused.
>
> When you force-push a branch after a rebase, the user interface displays a 
> line such as "force-pushed the BB branch from X to Y". Clicking the "compare" 
> button in GitHub presents something like git diff X..Y, which includes 
> unrelated commits. Ideally, GitHub would show the difference between the two 
> patch files, as Phabricator does, but it only displays the difference between 
> the two head commits. These unrelated in-between commits might be acceptable 
> for projects with lower commit frequency but can be challenging for a project 
> with a code frequency of 100+ commits every day.
>
> The fidelity of preserving inline comments after a force push has always been 
> a weakness. The comments may be presented as "outdated". In the past, there 
> was a notorious "lost inline comment" problem. Nowadays, the situation has 
> improved, but some users still report that inline comments may occasionally 
> become misplaced.

Thankfully, getcord/spr comes to the rescue.
User branches allow me to create/update a patch using `spr diff` like
`arc diff` for Phabricator.


Re: Updated Sourceware infrastructure plans

2024-05-02 Thread Pedro Alves
On 2024-05-01 22:04, Simon Marchi wrote:
> The Change-Id trailer works very well for Gerrit: once you have the hook
> installed you basically never have to think about it again, and Gerrit
> is able to track patch versions perfectly accurately.  A while ago, I
> asked patchwork developers if they would be open to support something
> like that to track patches, and they said they wouldn't be against it
> (provided it's not mandatory) [1].  But somebody would have to implement
> it.
> 
> Simon
> 
> [1] https://github.com/getpatchwork/patchwork/issues/327

+1000.  It's mind boggling to me that people would accept Gerrit, which
means that they'd accept Change-Id:, but then they wouldn't accept 
Change-Id: with a different system...  :-)



Re: Updated Sourceware infrastructure plans

2024-05-02 Thread Pedro Alves
On 2024-05-01 22:26, Mark Wielaard wrote:
> For now I am cleaning up Sergio's gerrit setup and upgrading it to the
> latest version, so people can at least try it out. Although I must
> admit that I seem to be the only Sourcewware PLC member that believes
> this is very useful use of our resources. Even the biggest proponents
> of gerrit seem to believe no project will actually adopt it. And on
> irc there were some people really critical of the effort. It seems you
> either love or really hate gerrit...

When GDB upstream tried to use gerrit, I found it basically impossible to
follow development, given the volume...  The great thing with email is the
threading of discussions.  A discussion can fork into its own subthread, and any
sane email client will display the discussion tree.  Email archives also let
you follow the discussion subthreads.  That is great for archaeology too.
With Gerrit that was basically lost, everything is super flat.  And then
following development via the gerrit instance website alone is just
basically impossible too.
I mean, gerrit is great to track your own patches, and for the actual review
and diffing between versions.  But for a maintainer who wants to stay on
top of a project, then it's severely lacking, IME and IMO.

(Note: I've been using Gerrit for a few years at AMD internally.)



Re: Updated Sourceware infrastructure plans

2024-05-02 Thread Simon Marchi via Gcc
On 5/2/24 2:47 AM, Richard Biener via Overseers wrote:
> We'd only know for sure if we try.  But then I'm almost 100% sure that
> having to click in a GUI is slower than 'nrOK^X' in the text-mode mail UA
> I am using for "quick" patch review.  It might be comparable to the
> review parts I do in the gmail UI or when I have to download attachments
> and cut parts into the reply.  It might be also more convenient
> for "queued" for review patches which just end up in New state in either
> inbox.

Speaking of my Gerrit experience.  I don't think that it will ever be
quite as fast and responsive as whatever terminal application you're
using (because web app vs highly optimized native app).  But the time
saved in patch management, tracking down stuff, diffing patch versions,
ease of downloading patches locally to try them out, and CI integration
more than makes up for it in terms of productivity, in my case.

The particular case you describe is just one click in Gerrit.  The
current version of Gerrit has a "Code review +2" button on the top
right, which is equivalent to an "OK" without further comments:

https://i.imgur.com/UEz5xmM.png

So, pretty quick too.

If you want to add a general comment on the patch (a comment not bound
to a specific line), typing `a` anywhere within a patch review brings
you to the place where you can do that, and you can do `shift + enter`
to post.  In general, I think that Gerrit has a pretty good set of
keyboard shortcuts to do most common operations:

https://i.imgur.com/RrREsTt.png

Not sure what you mean with the gmail UI and cut & paste part.  I don't
think you'd ever need to do something like that with Gerrit or similar
review system.  To put a comment on a line, you click on that line and
type in the box.

> But then would using gitlab or any similar service enforce the use of
> pull requests / forks for each change done or can I just continue to
> post patches and push them from the command-line for changes I
> can approve myself?

Like Ian said, with Gerrit, you can configure a repo such that you're
still able to git push directly.  If a patch review exists with the same
Change-Id (noted as a git trailer in each commit) as a commit that gets
directly pushed, that patch review gets automatically closed (marked as
"Merged").  So you, as a maintainer with the proper rights, could for
instance download a patch review I uploaded, fix some nits and git push
directly.  Gerrit will mark my patch review as Merged and the final
version of the patch review will reflect whatever you pushed.

Let's say you spot a typo in the code and want to push a trivial patch,
you don't need to create a patch review on Gerrit, you just push
directly (if the repo is configured to allow it).  On the other hand,
creating a patch review on Gerrit is not a big overhead, it's basically
one "git push" to a magic remote.  It prints you the URL, you can click
on it, and you're there.
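The "magic remote" push mentioned above targets a ref under `refs/for/`; a real Gerrit intercepts that push server-side and creates or updates the review. The sketch below only demonstrates the client-side syntax, with a plain bare repo standing in for the server (so the ref is simply stored, not turned into a review):

```shell
set -e
cd "$(mktemp -d)"

# Stand-in for the Gerrit-managed server repository
git init -q --bare server.git

git init -q -b main client && cd client
git config user.email demo@example.com
git config user.name Demo
echo fix > typo && git add typo && git commit -qm "Fix trivial typo"

# Against a real Gerrit, this creates (or updates) a patch review
# targeting the 'main' branch; here the plain repo just stores the ref
git push -q ../server.git HEAD:refs/for/main
git --git-dir=../server.git show-ref refs/for/main
```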

Simon


Re: Updated Sourceware infrastructure plans

2024-05-02 Thread Claudio Bantaloukas via Gcc



On 5/1/2024 10:26 PM, Mark Wielaard wrote:

Hi Jason,

On Wed, May 01, 2024 at 04:04:37PM -0400, Jason Merrill wrote:

On 5/1/24 12:15, Jeff Law wrote:

We're currently using patchwork to track patches tagged with
RISC-V.  We don't do much review with patchwork.  In that model
patchwork ultimately just adds overhead as I'm constantly trying
to figure out what patches have been integrated vs what are still
outstanding.

Patchwork definitely isn't the answer IMHO.  Nor is gitlab MRs
which we use heavily internally.  But boy I want to get away from
email and to a pull request kind of flow.


Do you (or others) have any thoughts about GitLab FOSS?


The gitlab "community edition" still feels not very much "community".
We could run our own instance, but it will still be "open core" with
features missing to try to draw you towards the proprietary hosted
saas version. Also it seems to have way too much overhead. The focus
is clearly corporate developers where managers want assurances the
mandatory "pipelines" are executed and "workflows" followed exactly.


Hi Mark,

I'm clearly in the "corporate developers" category here, and can testify 
that managers don't care about "pipelines and workflows". They do care 
that people's (very expensive) time be used sparingly.
The most expensive time is reviewer time so the reasoning behind running 
CI pipelines or workflows before review is that it's not a good use of 
people's time to review a patch which breaks tests.
The next most expensive time is that spent investigating breakages which 
could have been avoided with automated testing.


It doesn't matter what the tool's name is: as long as one can see the 
change (diffs),  whether the change is likely to break stuff (that's CI) 
and fellow hacker's comments in one place without having to trawl 
through multiple systems or set up complex local machinery, that's a 
recipe for faster reviews happening once a build is "green".

I think we can agree that's a good thing!

Github and GitLab are good from a corporate standpoint and they do the 
"all the info I need in one place" thing well, but are not exactly Libre.


Has anyone considered https://forgejo.org/ ?
It's a Libre fork of Gitea. It has functionality similar to Github's but
is Libre and openly advocates compatibility between forges.


Cheers,
--
Claudio Bantaloukas


Re: Updated Sourceware infrastructure plans

2024-05-02 Thread Mark Wielaard
Hi Jeff,

On Wed, 2024-05-01 at 15:38 -0600, Jeff Law wrote:
> What works well?  If you've wired up some CI bits, it's extremely 
> useful to test an under development patch.  Develop, push a branch, 
> raise an MR.  At that point the CI system kicks in.  Subsequent pushes 
> to the branch trigger fresh CI runs.  This aspect I really like and if 
> you were to see internal flows, you'd see dev branches churning as a 
> patch gets iterated on.  It also has features like "when this passes CI, 
> automatically commit it", which we often use on the final patch 
> iteration if there was a nit of some kind.

Although not as sophisticated (there are no triggers, just reports),
builder.sourceware.org not only does normal CI runs, but also offers
try-runs for various Sourceware projects (just binutils, gdb,
elfutils, libabigail and valgrind for now) when someone pushes to their
own user's try-branch.

As the binutils wiki describes it:
https://sourceware.org/binutils/wiki/Buildbot

git checkout -b frob
hack, hack, hack... OK, looks good to submit
git commit -a -m "Awesome hack"
git push origin frob:users/username/try-frob
... wait for the emails to come in or watch buildbot try logs
or watch bunsen logs ...
Send in patches and mention what the try bot reported

This is pretty nice for developing patches that you aren't totally sure
yet are ready to submit.

And there is of course the Linaro buildbot that watches (and updates)
patchwork with results from various ARM systems, which does something
similar but for patches already submitted (to the mailing list).

The idea is to provide something similar for GCC and RISC-V once we get
the larger Pioneer Box:
https://riscv.org/blog/2023/06/sophgo-donates-50-risc-v-motherboards-learn-more-about-the-pioneer-box/
But this has been postponed a few times now. Latest update (from about
a week ago) is: "The supplier has reached out to let us know that they
are still experiencing supply issues.  At the moment they are expecting
at least two months to get the hardware together."

Cheers,

Mark


Re: Updated Sourceware infrastructure plans

2024-05-02 Thread Ian Lance Taylor via Gcc
On Wed, May 1, 2024 at 11:48 PM Richard Biener
 wrote:
>
> We'd only know for sure if we try.  But then I'm almost 100% sure that
> having to click in a GUI is slower than 'nrOK^X' in the text-mode mail UA
> I am using for "quick" patch review.  It might be comparable to the
> review parts I do in the gmail UI or when I have to download attachments
> and cut parts into the reply.  It might be also more convenient
> for "queued" for review patches which just end up in New state in either
> inbox.

Gerrit does not require clicking in a GUI, though that is of course
the more widely used option.  Patches can be approved from the command
line.


> But then would using gitlab or any similar service enforce the use of
> pull requests / forks for each change done or can I just continue to
> post patches and push them from the command-line for changes I
> can approve myself?

Gerrit permits submitting patches from the command line for people
who can self-approve.


> Btw, for any forge like tool I'd even consider there'd be the requirement
> that _all_ functionality is also accessible via a documented (stable) API,
> aka there's command-line tooling available or at least possible to write.

True of Gerrit.

Ian


Re: Updated Sourceware infrastructure plans

2024-05-02 Thread Richard Biener via Gcc
On Wed, May 1, 2024 at 11:41 PM Jeff Law via Gcc  wrote:
>
>
>
> On 5/1/24 2:04 PM, Jason Merrill wrote:
> > On 5/1/24 12:15, Jeff Law wrote:
> >>
> >>
> >> On 4/22/24 9:24 PM, Tom Tromey wrote:
> >>> Jason> Someone mentioned earlier that gerrit was previously tried
> >>> Jason> unsuccessfully.
> >>>
> >>> We tried it in gdb and then abandoned it.  We tried to integrate it
> >>> into the traditional gdb development style, having it send email to
> >>> gdb-patches.  I found these somewhat hard to read and in the end we
> >>> agreed not to use it.
> >>>
> >>> I've come around again to thinking we should probably abandon email
> >>> instead.  For me the main benefit is that gerrit has patch tracking,
> >>> unlike our current system, where losing patches is fairly routine.
> >>>
> >>> Jason> I think this is a common pattern in GCC at least: someone has an
> >>> Jason> idea for a workflow improvement, and gets it working, but it
> >>> Jason> isn't widely adopted.
> >>>
> >>> It essentially has to be mandated, IMO.
> >>>
> >>> For GCC this seems somewhat harder since the community is larger, so
> >>> there's more people to convince.
> >> I tend to think it's the principal reviewers that will drive this.  If
> >> several of the key folks indicated they were going to use system XYZ,
> >> whatever it is, that would drive everyone to that system.
> >>
> >> We're currently using patchwork to track patches tagged with RISC-V.
> >> We don't do much review with patchwork.  In that model patchwork
> >> ultimately just adds overhead as I'm constantly trying to figure out
> >> what patches have been integrated vs what are still outstanding.
> >>
> >> Patchwork definitely isn't the answer IMHO.  Nor is gitlab MRs which
> >> we use heavily internally.  But boy I want to get away from email and
> >> to a pull request kind of flow.
> >
> > Do you (or others) have any thoughts about GitLab FOSS?
> I would assume it's basically the same as gitlab, except with anything
> proprietary removed, and that new features land in the enterprise version
> first and presumably migrate to the FOSS version over time.
>
>
> What works well?  If you've wired up some CI bits, it's extremely
> useful to test an under development patch.  Develop, push a branch,
> raise an MR.  At that point the CI system kicks in.  Subsequent pushes
> to the branch trigger fresh CI runs.  This aspect I really like and if
> you were to see internal flows, you'd see dev branches churning as a
> patch gets iterated on.  It also has features like "when this passes CI,
> automatically commit it", which we often use on the final patch
> iteration if there was a nit of some kind.
>
>
>
>
> What doesn't?   Finding things in gitlab is *awful*.  Now we're just
> talking about one repo, so it may be more manageable in that regard.
> And we're not talking about using it for bug tracking.  As long as we
> kept on top of the MR queue, maybe it would be feasible.
>
> So maybe I should soften my stance on gitlab.  If we're not using it for
> bug tracking and hosting many projects, then maybe it's viable.
>
> I think the driving force will be whether or not folks like you, Richi
> and others that do a ton of patch review would be more efficient in a
> gui or not.  I don't think I am, but maybe that would change if I did it
> every day for decades like I did with email :-)

We'd only know for sure if we try.  But then I'm almost 100% sure that
having to click in a GUI is slower than 'nrOK^X' in the text-mode mail UA
I am using for "quick" patch review.  It might be comparable to the
review parts I do in the gmail UI or when I have to download attachments
and cut parts into the reply.  It might also be more convenient
for "queued" for review patches which just end up in New state in either
inbox.

But then would using gitlab or any similar service enforce the use of
pull requests / forks for each change done or can I just continue to
post patches and push them from the command-line for changes I
can approve myself?

Btw, for any forge like tool I'd even consider there'd be the requirement
that _all_ functionality is also accessible via a documented (stable) API,
aka there's command-line tooling available or at least possible to write.

Richard.

> jeff
>
>


Re: Updated Sourceware infrastructure plans

2024-05-01 Thread Tom Tromey
> Do you (or others) have any thoughts about GitLab FOSS?

Dunno about the FOSS edition specifically, but I've used many review
tools in anger in the last 5 years: github, gitlab, gerrit, phabricator,
and a couple that ran in bugzilla ("MozReview", not sure if it had some
other name; and a second one that I think was nameless).

For the most part they are pretty similar, IMO, and just the gloss
differs.  I didn't like some aspects of phabricator but I no longer fully
recall what.  Maybe its support for patch series was weaker.

For github/gitlab, I think it's probably nicer if you're also using
their bug tracker.  Those also seem totally fine by and large.  Nicer
than bugzilla in some ways (nicer / more responsive UI), worse in others
(searching is not as good).

Tom


Re: Updated Sourceware infrastructure plans

2024-05-01 Thread Sergio Durigan Junior via Gcc
On Wednesday, May 01 2024, Mark Wielaard wrote:

[...]
> But the part that interests me most is the self-registration part that
> Sergio setup. I believe we will need that for whatever system we end
> up with to make it as easy to contribute as it is with email.
> https://blog.sergiodj.net/posts/installing-gerrit-and-keycloak/
[...]

Hey Mark,

If I were to set this up today, I would look at Authentik (which is what
I'm using for my personal services).  It is a bit simpler than Keycloak.
I would also certainly go for a container deployment of the service
instead, because (as you can see in the blog post) it's not trivial to
set things up in a correct manner.

Let me know if you need help with this!

Thanks,

-- 
Sergio
GPG key ID: 237A 54B1 0287 28BF 00EF  31F4 D0EB 7628 65FC 5E36
Please send encrypted e-mail if possible
https://sergiodj.net/


Re: Updated Sourceware infrastructure plans

2024-05-01 Thread Jeff Law via Gcc




On 5/1/24 2:04 PM, Jason Merrill wrote:

On 5/1/24 12:15, Jeff Law wrote:



On 4/22/24 9:24 PM, Tom Tromey wrote:

Jason> Someone mentioned earlier that gerrit was previously tried
Jason> unsuccessfully.

We tried it in gdb and then abandoned it.  We tried to integrate it
into the traditional gdb development style, having it send email to
gdb-patches.  I found these somewhat hard to read and in the end we
agreed not to use it.

I've come around again to thinking we should probably abandon email
instead.  For me the main benefit is that gerrit has patch tracking,
unlike our current system, where losing patches is fairly routine.

Jason> I think this is a common pattern in GCC at least: someone has an
Jason> idea for a workflow improvement, and gets it working, but it
Jason> isn't widely adopted.

It essentially has to be mandated, IMO.

For GCC this seems somewhat harder since the community is larger, so
there's more people to convince.
I tend to think it's the principal reviewers that will drive this.  If 
several of the key folks indicated they were going to use system XYZ, 
whatever it is, that would drive everyone to that system.


We're currently using patchwork to track patches tagged with RISC-V.  
We don't do much review with patchwork.  In that model patchwork 
ultimately just adds overhead as I'm constantly trying to figure out 
what patches have been integrated vs what are still outstanding.


Patchwork definitely isn't the answer IMHO.  Nor is gitlab MRs which 
we use heavily internally.  But boy I want to get away from email and 
to a pull request kind of flow.


Do you (or others) have any thoughts about GitLab FOSS?
I would assume it's basically the same as gitlab, except with anything
proprietary removed, and that new features land in the enterprise version
first and presumably migrate to the FOSS version over time.



What works well?  If you've wired up some CI bits, it's extremely
useful to test an under development patch.  Develop, push a branch, 
raise an MR.  At that point the CI system kicks in.  Subsequent pushes 
to the branch trigger fresh CI runs.  This aspect I really like and if 
you were to see internal flows, you'd see dev branches churning as a 
patch gets iterated on.  It also has features like "when this passes CI, 
automatically commit it", which we often use on the final patch 
iteration if there was a nit of some kind.





What doesn't?   Finding things in gitlab is *awful*.  Now we're just 
talking about one repo, so it may be more manageable in that regard. 
And we're not talking about using it for bug tracking.  As long as we 
kept on top of the MR queue, maybe it would be feasible.


So maybe I should soften my stance on gitlab.  If we're not using it for 
bug tracking and hosting many projects, then maybe it's viable.


I think the driving force will be whether or not folks like you, Richi 
and others that do a ton of patch review would be more efficient in a 
gui or not.  I don't think I am, but maybe that would change if I did it 
every day for decades like I did with email :-)


jeff




Re: Updated Sourceware infrastructure plans

2024-05-01 Thread Mark Wielaard
Hi Jason,

On Wed, May 01, 2024 at 04:04:37PM -0400, Jason Merrill wrote:
> On 5/1/24 12:15, Jeff Law wrote:
> >We're currently using patchwork to track patches tagged with
> >RISC-V.  We don't do much review with patchwork.  In that model
> >patchwork ultimately just adds overhead as I'm constantly trying
> >to figure out what patches have been integrated vs what are still
> >outstanding.
> >
> >Patchwork definitely isn't the answer IMHO.  Nor is gitlab MRs
> >which we use heavily internally.  But boy I want to get away from
> >email and to a pull request kind of flow.
> 
> Do you (or others) have any thoughts about GitLab FOSS?

The gitlab "community edition" still feels not very much "community".
We could run our own instance, but it will still be "open core" with
features missing to try to draw you towards the proprietary hosted
SaaS version. Also it seems to have way too much overhead. The focus
is clearly corporate developers where managers want assurances the
mandatory "pipelines" are executed and "workflows" followed exactly.

For now I am cleaning up Sergio's gerrit setup and upgrading it to the
latest version, so people can at least try it out. Although I must
admit that I seem to be the only Sourceware PLC member who believes
this is a very useful use of our resources. Even the biggest proponents
of gerrit seem to believe no project will actually adopt it. And on
irc there were some people really critical of the effort. It seems you
either love or really hate gerrit...

But the part that interests me most is the self-registration part that
Sergio setup. I believe we will need that for whatever system we end
up with to make it as easy to contribute as it is with email.
https://blog.sergiodj.net/posts/installing-gerrit-and-keycloak/

My personal favorite, if we really want a full "forge" would be
sourcehut. We already have mirrors of all projects at
https://sr.ht/~sourceware/ and there is a kind of sample "workflow"
(turning a "pull request" into an email thread) at
https://gnu.wildebeest.org/~mark/fsf-sourceware/presentation.html#slide18

At the moment though the only thing people seem to agree on is that
any system will be based on git. So the plan for now is to first setup
a larger git(olite) system so that every contributor (also those who
don't currently have commit access) can easily "post" their git
repo. This can then hopefully integrate with the systems we already
have setup (triggering builder CI, flag/match with patchwork/emails,
etc.) or any future "pull request" like system.

Cheers,

Mark


Re: Updated Sourceware infrastructure plans

2024-05-01 Thread Simon Marchi via Gcc



On 2024-05-01 16:53, Tom Tromey via Overseers wrote:
> Mark> See also https://sourceware.org/bugzilla/show_bug.cgi?id=30997
> Mark> We really should automate this. There are several people running
> Mark> scripts by hand. The easiest would be to simply run it from a git
> Mark> hook.  patchwork comes with a simple script that just calculates the
> Mark> hash and pings patchwork, which can then mark the patch associated
> Mark> with that hash as committed. If people really believe calculating a
> Mark> hash is too much work from a git hook then we can also simply run it
> Mark> from builder.sourceware.org. We already run a builder for each commit
> Mark> anyway. It would just be one extra build step checking the commit
> Mark> against patchwork.
> 
> There's just no possibility this approach will work for gdb.  It can't
> reliably recognize when a series is re-sent, or when patches land that
> are slightly different from what was submitted.  Both of these are
> commonplace events in gdb.
> 
> Tom

IMO, asking to always post the committed version as-is (effectively
preventing doing "pushed with those nits fixed", or solving trivial
merge conflicts just before pushing) just to make patchwork happy would
be annoying, an additional burden, and noise on the mailing list.

The Change-Id trailer works very well for Gerrit: once you have the hook
installed you basically never have to think about it again, and Gerrit
is able to track patch versions perfectly accurately.  A while ago, I
asked patchwork developers if they would be open to support something
like that to track patches, and they said they wouldn't be against it
(provided it's not mandatory) [1].  But somebody would have to implement
it.
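
(To illustrate the mechanism: Gerrit's hook appends a stable `Change-Id:`
trailer to each commit message, and the Id survives rebases because
amending keeps the existing trailer. The real hook is a shell script served
by the Gerrit server and derives the Id differently; the sketch below is
only a simplified Python illustration of the idea, not Gerrit's actual
implementation.)

```python
import hashlib
import os
import sys

def add_change_id(msg: str) -> str:
    """Append a Change-Id trailer if the commit message lacks one.
    The Id only needs to be unique at creation time and then stay
    stable across amends/rebases, so hashing the initial message
    plus some randomness is enough for this sketch."""
    if "Change-Id:" in msg:
        return msg  # already tracked; amending keeps the same Id
    seed = msg.encode("utf-8") + os.urandom(16)
    change_id = "I" + hashlib.sha1(seed).hexdigest()
    return msg.rstrip("\n") + "\n\nChange-Id: " + change_id + "\n"

if __name__ == "__main__" and len(sys.argv) > 1:
    # a commit-msg hook is invoked with the message file as argv[1]
    with open(sys.argv[1], "r+") as f:
        msg = f.read()
        f.seek(0)
        f.write(add_change_id(msg))
        f.truncate()
```

Installed as `.git/hooks/commit-msg`, something along these lines runs on
every commit, which is why authors never have to think about it again.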

Simon

[1] https://github.com/getpatchwork/patchwork/issues/327


Re: Updated Sourceware infrastructure plans

2024-05-01 Thread Tom Tromey
Mark> See also https://sourceware.org/bugzilla/show_bug.cgi?id=30997
Mark> We really should automate this. There are several people running
Mark> scripts by hand. The easiest would be to simply run it from a git
Mark> hook.  patchwork comes with a simple script that just calculates the
Mark> hash and pings patchwork, which can then mark the patch associated
Mark> with that hash as committed. If people really believe calculating a
Mark> hash is too much work from a git hook then we can also simply run it
Mark> from builder.sourceware.org. We already run a builder for each commit
Mark> anyway. It would just be one extra build step checking the commit
Mark> against patchwork.

There's just no possibility this approach will work for gdb.  It can't
reliably recognize when a series is re-sent, or when patches land that
are slightly different from what was submitted.  Both of these are
commonplace events in gdb.

Tom


Re: Updated Sourceware infrastructure plans

2024-05-01 Thread Mark Wielaard
Hi Jonathan,

On Wed, May 01, 2024 at 08:38:26PM +0100, Jonathan Wakely wrote:
> On Wed, 1 May 2024 at 20:19, Jeff Law via Gcc  wrote:
> > We're currently using patchwork to track patches tagged with RISC-V.  We
> > don't do much review with patchwork.  In that model patchwork ultimately
> > just adds overhead as I'm constantly trying to figure out what patches
> > have been integrated vs what are still outstanding.
> 
> If patches sent by email exactly match what's committed, then the
> update_gcc_pw.sh script that I run will correctly update patchwork to
> say they're committed. I tend to only bother running that once a week,
> because it misses so many and so is of limited use. If we are now
> supposed to send generated files in the patches, and we discourage
> people from committing something close-but-not-identical to what they
> sent by email, then the script will do a better job of updating
> patchwork, and then we should look at running it automatically (not
> just when I think to run it manually).

See also https://sourceware.org/bugzilla/show_bug.cgi?id=30997
We really should automate this. There are several people running
scripts by hand. The easiest would be to simply run it from a git
hook.  patchwork comes with a simple script that just calculates the
hash and pings patchwork, which can then mark the patch associated
with that hash as committed. If people really believe calculating a
hash is too much work from a git hook then we can also simply run it
from builder.sourceware.org. We already run a builder for each commit
anyway. It would just be one extra build step checking the commit
against patchwork.
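
The hash-based matching described above could be sketched roughly as
follows. This is a hypothetical simplification: patchwork's actual hasher
normalizes more fields than this, so treat it as an illustration of why
the scheme is cheap enough to run from a git hook, not as its
implementation.

```python
import hashlib

def patch_hash(diff: str) -> str:
    """Hash a diff while ignoring metadata that changes when the same
    patch is rebased or regenerated: blob index lines and hunk line
    offsets are normalized away before hashing."""
    h = hashlib.sha1()
    for line in diff.splitlines():
        if line.startswith(("index ", "old mode", "new mode")):
            continue  # varies across rebases, not part of the change
        if line.startswith("@@"):
            # keep the section heading but drop the line offsets
            line = "@@ -,+ @@" + line.split("@@")[-1]
        h.update(line.encode("utf-8", "replace") + b"\n")
    return h.hexdigest()

diff_v1 = """diff --git a/f.c b/f.c
index 111..222 100644
@@ -1,3 +1,3 @@ int main
-return 1;
+return 0;
"""
# the same change after a rebase: new blob ids, new line offsets
diff_rebased = diff_v1.replace("111..222", "333..444") \
                      .replace("-1,3 +1,3", "-7,3 +7,3")
print(patch_hash(diff_v1) == patch_hash(diff_rebased))  # prints True
```

A git hook (or a builder step) would compute this over each pushed
commit and ask patchwork to mark the matching patch as committed.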

Cheers,

Mark


Re: Updated Sourceware infrastructure plans

2024-05-01 Thread Jason Merrill via Gcc

On 5/1/24 12:15, Jeff Law wrote:



On 4/22/24 9:24 PM, Tom Tromey wrote:

Jason> Someone mentioned earlier that gerrit was previously tried
Jason> unsuccessfully.

We tried it in gdb and then abandoned it.  We tried to integrate it
into the traditional gdb development style, having it send email to
gdb-patches.  I found these somewhat hard to read and in the end we
agreed not to use it.

I've come around again to thinking we should probably abandon email
instead.  For me the main benefit is that gerrit has patch tracking,
unlike our current system, where losing patches is fairly routine.

Jason> I think this is a common pattern in GCC at least: someone has an
Jason> idea for a workflow improvement, and gets it working, but it
Jason> isn't widely adopted.

It essentially has to be mandated, IMO.

For GCC this seems somewhat harder since the community is larger, so
there's more people to convince.
I tend to think it's the principal reviewers that will drive this.  If 
several of the key folks indicated they were going to use system XYZ, 
whatever it is, that would drive everyone to that system.


We're currently using patchwork to track patches tagged with RISC-V.  We 
don't do much review with patchwork.  In that model patchwork ultimately 
just adds overhead as I'm constantly trying to figure out what patches 
have been integrated vs what are still outstanding.


Patchwork definitely isn't the answer IMHO.  Nor is gitlab MRs which we 
use heavily internally.  But boy I want to get away from email and to a 
pull request kind of flow.


Do you (or others) have any thoughts about GitLab FOSS?

Jason



Re: Updated Sourceware infrastructure plans

2024-05-01 Thread Jonathan Wakely via Gcc
On Wed, 1 May 2024 at 20:19, Jeff Law via Gcc  wrote:
>
>
>
> On 4/22/24 9:24 PM, Tom Tromey wrote:
> > Jason> Someone mentioned earlier that gerrit was previously tried
> > Jason> unsuccessfully.
> >
> > We tried it in gdb and then abandoned it.  We tried to integrate it
> > into the traditional gdb development style, having it send email to
> > gdb-patches.  I found these somewhat hard to read and in the end we
> > agreed not to use it.
> >
> > I've come around again to thinking we should probably abandon email
> > instead.  For me the main benefit is that gerrit has patch tracking,
> > unlike our current system, where losing patches is fairly routine.
> >
> > Jason> I think this is a common pattern in GCC at least: someone has an
> > Jason> idea for a workflow improvement, and gets it working, but it
> > Jason> isn't widely adopted.
> >
> > It essentially has to be mandated, IMO.
> >
> > For GCC this seems somewhat harder since the community is larger, so
> > there's more people to convince.
> I tend to think it's the principal reviewers that will drive this.  If
> several of the key folks indicated they were going to use system XYZ,
> whatever it is, that would drive everyone to that system.
>
> We're currently using patchwork to track patches tagged with RISC-V.  We
> don't do much review with patchwork.  In that model patchwork ultimately
> just adds overhead as I'm constantly trying to figure out what patches
> have been integrated vs what are still outstanding.

If patches sent by email exactly match what's committed, then the
update_gcc_pw.sh script that I run will correctly update patchwork to
say they're committed. I tend to only bother running that once a week,
because it misses so many and so is of limited use. If we are now
supposed to send generated files in the patches, and we discourage
people from committing something close-but-not-identical to what they
sent by email, then the script will do a better job of updating
patchwork, and then we should look at running it automatically (not
just when I think to run it manually).

I think there's still an issue where a patch has been superseded by a
v2 which has been committed. I don't think patchwork does a good job
of noticing that the v1 patch is no longer relevant, so somebody still
has to manually update those ones.

So overall, I agree that patchwork isn't the answer. It requires too
much manual housekeeping, and that's a huge task with the volume of
patches that GCC has.

>
> Patchwork definitely isn't the answer IMHO.  Nor is gitlab MRs which we
> use heavily internally.  But boy I want to get away from email and to a
> pull request kind of flow.
>
> jeff


Re: Updated Sourceware infrastructure plans

2024-05-01 Thread Jeff Law via Gcc




On 4/22/24 9:24 PM, Tom Tromey wrote:

Jason> Someone mentioned earlier that gerrit was previously tried
Jason> unsuccessfully.

We tried it in gdb and then abandoned it.  We tried to integrate it
into the traditional gdb development style, having it send email to
gdb-patches.  I found these somewhat hard to read and in the end we
agreed not to use it.

I've come around again to thinking we should probably abandon email
instead.  For me the main benefit is that gerrit has patch tracking,
unlike our current system, where losing patches is fairly routine.

Jason> I think this is a common pattern in GCC at least: someone has an
Jason> idea for a workflow improvement, and gets it working, but it
Jason> isn't widely adopted.

It essentially has to be mandated, IMO.

For GCC this seems somewhat harder since the community is larger, so
there's more people to convince.
I tend to think it's the principal reviewers that will drive this.  If 
several of the key folks indicated they were going to use system XYZ, 
whatever it is, that would drive everyone to that system.


We're currently using patchwork to track patches tagged with RISC-V.  We 
don't do much review with patchwork.  In that model patchwork ultimately 
just adds overhead as I'm constantly trying to figure out what patches 
have been integrated vs what are still outstanding.


Patchwork definitely isn't the answer IMHO.  Nor is gitlab MRs which we 
use heavily internally.  But boy I want to get away from email and to a 
pull request kind of flow.


jeff


RE: Updated Sourceware infrastructure plans

2024-04-24 Thread Aktemur, Tankut Baris via Gcc
On Tuesday, April 23, 2024 5:26 PM, Simon Marchi wrote:
> On 2024-04-23 11:08, Tom Tromey wrote:
> >> Indeed.  Though Patchwork is another option for patch tracking, that
> >> glibc seem to be having success with.
> >
> > We tried this in gdb as well.  It was completely unworkable -- you have
> > to manually clear out the patch queue, meaning it's normally full of
> > patches that already landed.  I know glibc has success with it, but I
> > wouldn't consider it for gdb unless it gained some new abilities.
> 
> The thing that Gerrit excels at is tracking the different versions of a
> given patch, being able to easily diff versions, etc.  And then mark a
> patch as merged once it's committed to master.

FWIW, I think this is a very useful feature.  I used gitlab and github, too;
their revision tracking is far worse.

Regards
-Baris

> Patchwork doesn't have this capability built-in (AFAIK).  You can try to
> do some automation, but I doubt that any system based solely on getting
> patches from a mailing list can ever be as good as something like Gerrit
> for this.
> 
> Simon





Re: Updated Sourceware infrastructure plans

2024-04-23 Thread Simon Marchi via Gcc



On 2024-04-23 11:08, Tom Tromey wrote:
>> Indeed.  Though Patchwork is another option for patch tracking, that
>> glibc seem to be having success with.
> 
> We tried this in gdb as well.  It was completely unworkable -- you have
> to manually clear out the patch queue, meaning it's normally full of
> patches that already landed.  I know glibc has success with it, but I
> wouldn't consider it for gdb unless it gained some new abilities.

The thing that Gerrit excels at is tracking the different versions of a
given patch, being able to easily diff versions, etc.  And then mark a
patch as merged once it's committed to master.

Patchwork doesn't have this capability built-in (AFAIK).  You can try to
do some automation, but I doubt that any system based solely on getting
patches from a mailing list can ever be as good as something like Gerrit
for this.

Simon


Re: Updated Sourceware infrastructure plans

2024-04-23 Thread Tom Tromey
> Indeed.  Though Patchwork is another option for patch tracking, that
> glibc seem to be having success with.

We tried this in gdb as well.  It was completely unworkable -- you have
to manually clear out the patch queue, meaning it's normally full of
patches that already landed.  I know glibc has success with it, but I
wouldn't consider it for gdb unless it gained some new abilities.

Tom


Re: Updated Sourceware infrastructure plans

2024-04-23 Thread Ian Lance Taylor via Gcc
On Tue, Apr 23, 2024 at 2:32 AM Richard Earnshaw (lists) via Gcc
 wrote:
>
> I've been forced to use gerrit occasionally.  I hate it.  No, I LOATHE it.  
> The UI is anything but intuitive, with features hidden behind unobvious 
> selections.  IMO it's not a tool for a casual developer, which makes it a bad 
> introduction to developing software.

I would be shocked if GCC ever adopted Gerrit.  But I don't understand
this objection.  Yes, Gerrit is not a tool for a casual developer.
But so what?  Casual developers don't have to use it, except that they
have to run a particular git command to submit a patch.  It's the GCC
maintainers who have to use Gerrit.

Ian


Re: Updated Sourceware infrastructure plans

2024-04-23 Thread Florian Weimer via Gcc
* Jason Merrill:

> On Mon, Apr 22, 2024 at 11:42 AM Tom Tromey  wrote:
>
>  > "Frank" == Frank Ch Eigler  writes:
>
>  >> [...]  I suggest that a basic principle for such a system is that it
>  >> should be *easy* to obtain and maintain a local copy of the history
>  >> of all pull requests.  That includes all versions of a pull request,
>  >> if it gets rebased, and all versions of comments, if the system
>  >> allows editing comments.  A system that uses git as the source of
>  >> truth for all the pull request data and has refs [...]
>
>  Frank> Do you know of a system with these characteristics?
>
>  Based on:
>
>  https://gerrit-review.googlesource.com/Documentation/dev-design.html#_notedb
>
>  ... it sounds like this is what gerrit does.
>
> Someone mentioned earlier that gerrit was previously tried
> unsuccessfully.  I think this is a common pattern in GCC at least:
> someone has an idea for a workflow improvement, and gets it working,
> but it isn't widely adopted.

We used it for glibc briefly.  It failed in part because we were too
kind and didn't give negative feedback in the tool itself (making it
less useful for contributors), and because it was deployed on the side
alongside the usual mailing list patch submission process.

It may be worth a try again, but this time with brutally honest feedback
(-2 and whatnot).  On the other hand, Gerrit appears to require Bazel to
build, and as far as I understand it, setting up and maintaining a Bazel
build environment that meets our requirements (basically: no mystery
binaries) is a very big task.

Thanks,
Florian



Re: Updated Sourceware infrastructure plans

2024-04-23 Thread Richard Earnshaw (lists) via Gcc
On 23/04/2024 09:56, Mark Wielaard wrote:
> Hi,
> 
> On Mon, Apr 22, 2024 at 11:51:00PM -0400, Jason Merrill wrote:
>> On Mon, Apr 22, 2024 at 11:24 PM Tom Tromey  wrote:
>>> Jason> Someone mentioned earlier that gerrit was previously tried
>>> Jason> unsuccessfully.
>>>
>>> We tried it in gdb and then abandoned it.  We tried to integrate it
>>> into the traditional gdb development style, having it send email to
>>> gdb-patches.  I found these somewhat hard to read and in the end we
>>> agreed not to use it.
>>>
>>> I've come around again to thinking we should probably abandon email
>>> instead.  For me the main benefit is that gerrit has patch tracking,
>>> unlike our current system, where losing patches is fairly routine.
>>
>> Indeed.  Though Patchwork is another option for patch tracking, that glibc
>> seem to be having success with.
> 
> Patchwork works if you have people that like it and keep on top of
> it. For elfutils Aaron and I are probably the only ones that use it,
> but if we just go over it once a week it keeps being manageable and
> nobody else needs to care. That is also why it seems to work for
> glibc. If you can carve out an hour a week going over the submitted
> patches and delegate them then it is a really useful patch tracking
> tool. Obviously that only works if the patch flow isn't overwhelming
> to begin with...
> 
> I'll work with Sergio, who set up the original gerrit instance, to
> upgrade it to the latest gerrit version so people can try it out. One
> nice thing he did was to automate self-service user registration,
> although that is one of the things I don't really like about it. As
> Tom said, it feels like gerrit is an all-or-nothing solution that has
> to be mandated and requires everybody to have a centralized login.
> But if you do want that then how Sergio set it up is pretty nice. It
> is just one more thing to monitor for spam accounts...
> 
> Cheers,
> 
> Mark

I've been using patchwork with GCC since, roughly, last year's cauldron.  Its 
main weakness is a poor search function, which means that, since most patches 
in the queue aren't being actively managed, finding the relevant ones is a bit 
hit-and-miss.

Its other problem is that it expects a particular workflow model: rather than 
replying to an existing patch discussion with an updated patch, it expects 
patches to be reposted as an entire series with a new version number, 
Linux-style.

But there are some benefits to using it: I can integrate it with my mail client 
- it can show me the patch series in patchwork when I receive a mail directly; 
it integrates well with git with the git-pw module, so I can pull an entire 
patch series off the list into my worktree from the command line just by 
knowing the patch series number; and I can manage/track patches in bundles, 
which is great if I don't have time in any particular day to deal with the 
email volume.

Finally, it's been integrated with our CI systems (thanks Linaro!), so it can 
automatically pull reviews and run validations on them, then report the results 
back; often before I've even had time to look at the patch.

R.


Re: Updated Sourceware infrastructure plans

2024-04-23 Thread Richard Earnshaw (lists) via Gcc
On 23/04/2024 04:24, Tom Tromey wrote:
> Jason> Someone mentioned earlier that gerrit was previously tried
> Jason> unsuccessfully.
> 
> We tried it in gdb and then abandoned it.  We tried to integrate it
> into the traditional gdb development style, having it send email to
> gdb-patches.  I found these somewhat hard to read and in the end we
> agreed not to use it.
> 
> I've come around again to thinking we should probably abandon email
> instead.  For me the main benefit is that gerrit has patch tracking,
> unlike our current system, where losing patches is fairly routine.
> 
> Jason> I think this is a common pattern in GCC at least: someone has an
> Jason> idea for a workflow improvement, and gets it working, but it
> Jason> isn't widely adopted.
> 
> It essentially has to be mandated, IMO.
> 
> For GCC this seems somewhat harder since the community is larger, so
> there's more people to convince.
> 
> Tom

I've been forced to use gerrit occasionally.  I hate it.  No, I LOATHE it.  The 
UI is anything but intuitive, with features hidden behind unobvious selections.  
IMO it's not a tool for a casual developer, which makes it a bad introduction 
to developing software.

R.


Re: Updated Sourceware infrastructure plans

2024-04-23 Thread Mark Wielaard
Hi,

On Mon, Apr 22, 2024 at 11:51:00PM -0400, Jason Merrill wrote:
> On Mon, Apr 22, 2024 at 11:24 PM Tom Tromey  wrote:
> > Jason> Someone mentioned earlier that gerrit was previously tried
> > Jason> unsuccessfully.
> >
> > We tried it in gdb and then abandoned it.  We tried to integrate it
> > into the traditional gdb development style, having it send email to
> > gdb-patches.  I found these somewhat hard to read and in the end we
> > agreed not to use it.
> >
> > I've come around again to thinking we should probably abandon email
> > instead.  For me the main benefit is that gerrit has patch tracking,
> > unlike our current system, where losing patches is fairly routine.
> 
> Indeed.  Though Patchwork is another option for patch tracking, that glibc
> seem to be having success with.

Patchwork works if you have people that like it and keep on top of
it. For elfutils Aaron and I are probably the only ones that use it,
but if we just go over it once a week it keeps being manageable and
nobody else needs to care. That is also why it seems to work for
glibc. If you can carve out an hour a week going over the submitted
patches and delegate them then it is a really useful patch tracking
tool. Obviously that only works if the patch flow isn't overwhelming
to begin with...

I'll work with Sergio, who set up the original gerrit instance, to
upgrade it to the latest gerrit version so people can try it out. One
nice thing he did was to automate self-service user registration,
although that is one of the things I don't really like about it. As
Tom said, it feels like gerrit is an all-or-nothing solution that has
to be mandated and requires everybody to have a centralized login.
But if you do want that then how Sergio set it up is pretty nice. It
is just one more thing to monitor for spam accounts...

Cheers,

Mark


Re: Updated Sourceware infrastructure plans

2024-04-22 Thread Ian Lance Taylor
Tom Tromey via Overseers  writes:

> Jason> Someone mentioned earlier that gerrit was previously tried
> Jason> unsuccessfully.
>
> We tried it in gdb and then abandoned it.  We tried to integrate it
> into the traditional gdb development style, having it send email to
> gdb-patches.  I found these somewhat hard to read and in the end we
> agreed not to use it.

Current Gerrit e-mails are pretty good, with a nice diff of the change.
And patches can be submitted entirely via git, which is not the same as
today but should be acceptable for almost all contributors.  What
doesn't work in Gerrit, as far as I know, is a pure e-mail based
workflow for maintainers.  To approve a patch, maintainers have to go to
a web site and click a button, or they have to run a command line tool
("ssh  gerrit review").
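For illustration, the command-line approval path looks roughly like this. This is a sketch that needs a live Gerrit instance to run against; the host, user, and change,patchset identifiers are placeholders, and port 29418 is only Gerrit's conventional SSH port:

```shell
# Approve a change from the command line, without visiting the web UI.
# "12345,2" means change 12345, patchset 2; a commit SHA also works.
ssh -p 29418 user@gerrit.example.org gerrit review --code-review +2 12345,2
```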


> I've come around again to thinking we should probably abandon email
> instead.  For me the main benefit is that gerrit has patch tracking,
> unlike our current system, where losing patches is fairly routine.

You can lose patches in Gerrit quite easily, but at least there is a
dashboard showing all the ones you lost.

I'm definitely a Gerrit fan.

Ian


Re: Updated Sourceware infrastructure plans

2024-04-22 Thread Jason Merrill via Gcc
On Mon, Apr 22, 2024 at 11:24 PM Tom Tromey  wrote:

> Jason> Someone mentioned earlier that gerrit was previously tried
> Jason> unsuccessfully.
>
> We tried it in gdb and then abandoned it.  We tried to integrate it
> into the traditional gdb development style, having it send email to
> gdb-patches.  I found these somewhat hard to read and in the end we
> agreed not to use it.
>
> I've come around again to thinking we should probably abandon email
> instead.  For me the main benefit is that gerrit has patch tracking,
> unlike our current system, where losing patches is fairly routine.
>

Indeed.  Though Patchwork is another option for patch tracking, that glibc
seem to be having success with.

Jason> I think this is a common pattern in GCC at least: someone has an
> Jason> idea for a workflow improvement, and gets it working, but it
> Jason> isn't widely adopted.
>
> It essentially has to be mandated, IMO.
>
> For GCC this seems somewhat harder since the community is larger, so
> there's more people to convince.
>

Absolutely, but now with the office hours it seems more feasible to build
momentum (or see that there isn't enough support) without having to wait
until Cauldron.

Jason


Re: Updated Sourceware infrastructure plans

2024-04-22 Thread Tom Tromey
Jason> Someone mentioned earlier that gerrit was previously tried
Jason> unsuccessfully.

We tried it in gdb and then abandoned it.  We tried to integrate it
into the traditional gdb development style, having it send email to
gdb-patches.  I found these somewhat hard to read and in the end we
agreed not to use it.

I've come around again to thinking we should probably abandon email
instead.  For me the main benefit is that gerrit has patch tracking,
unlike our current system, where losing patches is fairly routine.

Jason> I think this is a common pattern in GCC at least: someone has an
Jason> idea for a workflow improvement, and gets it working, but it
Jason> isn't widely adopted.

It essentially has to be mandated, IMO.

For GCC this seems somewhat harder since the community is larger, so
there's more people to convince.

Tom


Re: Updated Sourceware infrastructure plans

2024-04-22 Thread Simon Marchi via Gcc
On 2024-04-22 22:55, Jason Merrill via Overseers wrote:
> On Mon, Apr 22, 2024 at 11:42 AM Tom Tromey  wrote:
> 
>>> "Frank" == Frank Ch Eigler  writes:
>>
>>>> [...]  I suggest that a basic principle for such a system is that it
>>>> should be *easy* to obtain and maintain a local copy of the history
>>>> of all pull requests.  That includes all versions of a pull request,
>>>> if it gets rebased, and all versions of comments, if the system
>>>> allows editing comments.  A system that uses git as the source of
>>>> truth for all the pull request data and has refs [...]
>>
>> Frank> Do you know of a system with these characteristics?
>>
>> Based on:
>>
>>
>> https://gerrit-review.googlesource.com/Documentation/dev-design.html#_notedb
>>
>> ... it sounds like this is what gerrit does.
>>
> 
> Someone mentioned earlier that gerrit was previously tried unsuccessfully.
> I think this is a common pattern in GCC at least: someone has an idea for a
> workflow improvement, and gets it working, but it isn't widely adopted.  I
> think this is a problem with the "if you build it he will come" model
> rather than with any particular technology; people are more or less
> comfortable with their current workflow and disinclined to experiment with
> someone else's new thing, even if it could eventually be a big improvement.

Agreed.

Gerrit has many nice features, but using it would require compromising
on things that some community members consider very important.  We would
have to give up some things we take for granted today, such as being
able to interact entirely by email.  But staying with the current way of
working because we can't find another one that ticks every box also has
an opportunity cost.  I doubt we'll ever find a system that ticks
absolutely every box, but that doesn't mean that the current system is
the best.

> I think that the office hours, for both sourceware and the toolchain, offer
> a better path for building enthusiasm about a new direction.

As someone who pushed to try Gerrit back then, I'd be happy to chat at
the next office hours (if you remind me).

> Is gerrit still installed?

Hum, yes actually, you can check `gnutoolchain-gerrit dot osci dot io`.
But it should really be taken down; it's irresponsible to leave an
unattended, outdated web service running like that.

Simon


Re: Updated Sourceware infrastructure plans

2024-04-22 Thread Jason Merrill via Gcc
On Mon, Apr 22, 2024 at 11:42 AM Tom Tromey  wrote:

> > "Frank" == Frank Ch Eigler  writes:
>
> >> [...]  I suggest that a basic principle for such a system is that it
> >> should be *easy* to obtain and maintain a local copy of the history
> >> of all pull requests.  That includes all versions of a pull request,
> >> if it gets rebased, and all versions of comments, if the system
> >> allows editing comments.  A system that uses git as the source of
> >> truth for all the pull request data and has refs [...]
>
> Frank> Do you know of a system with these characteristics?
>
> Based on:
>
>
> https://gerrit-review.googlesource.com/Documentation/dev-design.html#_notedb
>
> ... it sounds like this is what gerrit does.
>

Someone mentioned earlier that gerrit was previously tried unsuccessfully.
I think this is a common pattern in GCC at least: someone has an idea for a
workflow improvement, and gets it working, but it isn't widely adopted.  I
think this is a problem with the "if you build it he will come" model
rather than with any particular technology; people are more or less
comfortable with their current workflow and disinclined to experiment with
someone else's new thing, even if it could eventually be a big improvement.

I think that the office hours, for both sourceware and the toolchain, offer
a better path for building enthusiasm about a new direction.

Is gerrit still installed?

Jason


Re: Updated Sourceware infrastructure plans

2024-04-22 Thread Frank Ch. Eigler via Gcc
Hi -

> Would it be possible for gitsigur to support signing commits with ssh
> keys as well as gpg? Git supports this, and it's much easier for
> everybody than having to set up gpg. [...]

It would save some effort, but OTOH plenty of people have gpg keys
too, and the common desktop key agents support both.

> We already need an SSH key on sourceware.org to push to Git, so all
> those public keys could be treated as trusted (via git config
> gpg.ssh.allowedSignersFile). [...]

One difference is that gitsigur aims to prevent impersonation, by
tying the recorded committer to a designated set of keys for that
committer.  The git builtin ssh-signing gadget doesn't attempt this.
But maybe just a small matter of wrapping might do the job.

Filed https://sourceware.org/bugzilla/show_bug.cgi?id=31670 .

> I'm already signing my GCC commits that way, without needing to use
> gpg or gitsigur:

Great, keep it up!  Nothing has been stopping people from signing
their commits any way they like, including even more complex ways like
sigstore.  gitsigur verification is not enabled (even in permissive
mode) at all for gcc at this time.

> commit 7c2a9dbcc2c1cb1563774068c59d5e09edc59f06 [r14-10008-g7c2a9dbcc2c1cb]
> Good "git" signature for jwak...@redhat.com with RSA key
> SHA256:8rFaYhDWn09c3vjsYIg2JE9aSpcxzTnCqajoKevrUUo

Thanks, this will help test a prototype later on.

- FChE



Re: Updated Sourceware infrastructure plans

2024-04-22 Thread Tom Tromey
> "Frank" == Frank Ch Eigler  writes:

>> [...]  I suggest that a basic principle for such a system is that it
>> should be *easy* to obtain and maintain a local copy of the history
>> of all pull requests.  That includes all versions of a pull request,
>> if it gets rebased, and all versions of comments, if the system
>> allows editing comments.  A system that uses git as the source of
>> truth for all the pull request data and has refs [...]

Frank> Do you know of a system with these characteristics?

Based on:

https://gerrit-review.googlesource.com/Documentation/dev-design.html#_notedb

... it sounds like this is what gerrit does.

Tom


Re: Updated Sourceware infrastructure plans

2024-04-22 Thread Joseph Myers via Gcc
On Mon, 22 Apr 2024, Mark Wielaard wrote:

> >   A system that uses git as the source of 
> > truth for all the pull request data and has refs through which all this 
> > can be located (with reasonably straightforward, documented formats for 
> > the data, not too closely tied to any particular implementation of a 
> > pull-request system), so that a single clone --mirror has all the data, 
> > might be suitable (people have worked on ensuring git scales well with 
> > very large numbers of refs, which you'd probably get in such a system 
> > storing all the data in git);
> 
> Yes, git is pretty nice for storing lots of variants of somewhat
> identical sources/texts. But this also seems to imply that when we
> offer a system to store "contributor" git trees/forks of projects to
> easily create "pull requests" then we can never remove such users/forks
> and must disallow rebasing any trees that have been "submitted".

For example, GitHub has some version of the source branch for a pull 
request under refs/pull/ in the target repository - that doesn't rely on 
the source branch or repository staying around.  However, that's only one 
version - it doesn't work so well when the source branch is rebased 
(though GitHub itself is reported to keep all forks of a repository in a 
single repository internally, rarely garbage collected, so the previous 
versions probably remain there, just not accessible from any ref).  But 
you could certainly have a convention for ref naming that ensures all 
versions of a PR are available even when it's rebased.  Things like the 
"git evolve" proposal  could also be 
relevant (maybe that particular proposal wasn't intended for the goal of 
ensuring all submitted versions of a change remain permanently available, 
but at least it's dealing with a similar problem - and the more you have a 
standard way of representing this kind of information in git, rather than 
something very specific to a particular system built on top of git, the 
better).
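As a concrete sketch, those GitHub-style refs can be copied into a local clone with an explicit refspec (the refs/pull/ layout is GitHub's convention, not something every forge provides):

```shell
# Fetch every pull-request head the server stores under refs/pull/
# into local tracking refs, so an ordinary clone retains them
git fetch origin '+refs/pull/*/head:refs/remotes/origin/pr/*'

# Inspect the submitted version of, say, PR 1
git log --oneline refs/remotes/origin/pr/1
```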

-- 
Joseph S. Myers
josmy...@redhat.com



Re: Updated Sourceware infrastructure plans

2024-04-22 Thread Mark Wielaard
Hi Joseph,

On Thu, 2024-04-18 at 15:56 +, Joseph Myers wrote:
> On Thu, 18 Apr 2024, Mark Wielaard wrote:
> 
> > But we like to get more feedback on what people really think a
> > "pull-request" style framework should look like. We used to have a
> > gerrit setup which wasn't really popular. And we already have a
> > sourcehut mirror that can be used to turn your "pull-requests" into a
> > git send-email style submission (without having to setup any
> > email/smtp yourself): https://sr.ht/~sourceware/
> 
> The xz backdoor showed up one issue with some implementations of 
> pull-request systems: GitHub removed access to the repository, and with it 
> access to the past pull requests, so disrupting investigation into the 
> sequence of bad-faith contributions.

Agreed. I tried to analyze the valgrind issues after the fact (we
clearly missed them before; there were warnings, but they were fixed so
quickly we didn't really look into them as we should have). And it was
a bit difficult because the github repository had disappeared. But
luckily the project did have a "real" git repo:
https://git.tukaani.org/
This obviously didn't contain any "pull requests", but I am not sure
those were used on the xz github mirror anyway. Does github keep pull
requests around? What if someone closes/removes their own
fork/repo/account - do the commits transfer to the project?

>   I suggest that a basic principle for 
> such a system is that it should be *easy* to obtain and maintain a local 
> copy of the history of all pull requests.  That includes all versions of a 
> pull request, if it gets rebased, and all versions of comments, if the 
> system allows editing comments.

So in a somewhat crude form we now have that with our email workflow.
In theory every patch is posted and reviewed on one of the patches
mailinglists and the public-inbox instance at
https://inbox.sourceware.org/ allows you to git clone the whole archive
for local inspection.

>   A system that uses git as the source of 
> truth for all the pull request data and has refs through which all this 
> can be located (with reasonably straightforward, documented formats for 
> the data, not too closely tied to any particular implementation of a 
> pull-request system), so that a single clone --mirror has all the data, 
> might be suitable (people have worked on ensuring git scales well with 
> very large numbers of refs, which you'd probably get in such a system 
> storing all the data in git);

Yes, git is pretty nice for storing lots of variants of somewhat
identical sources/texts. But this also seems to imply that when we
offer a system to store "contributor" git trees/forks of projects to
easily create "pull requests" then we can never remove such users/forks
and must disallow rebasing any trees that have been "submitted".

That can probably be done, but it is different from what we now require
from user or devel branches in our git repos. Where we do allow users
to delete their branches and devel branches can be rebased. Should such
branches also become "immutable"?

Cheers,

Mark


Re: Updated Sourceware infrastructure plans

2024-04-19 Thread Jonathan Wakely via Gcc
On Thu, 18 Apr 2024 at 07:06, Thomas Koenig via Gcc  wrote:
>
> On 18.04.24 at 01:27, Mark Wielaard wrote:
> > We also should make sure that all generated files (either in git or in
> > the release/snapshot tar balls) can be reliably and reproducibly
> > regenerated. This also helps the (pre-commit) CI buildbots. We already
> > have the autoregen bots for gcc and binutils-gdb. And Christoph has
> > been working on extending the scripts to regenerate more kinds of
> > files.
>
> I regenerate auto* files from time to time for libgfortran. Regenerating
> them has always been very fragile (using --enable-maintainer-mode),
> and difficult to get right.

I've been curious for ages why gfortran requires using maintainer mode
for that. Nobody else uses maintainer mode for GCC development, so it
causes friction when somebody has to make changes across the whole of
gcc, including the gfortran parts.

If it doesn't work with a simple autoreconf then that should be fixed.


Re: Updated Sourceware infrastructure plans

2024-04-18 Thread Matt Rice via Gcc
On Thu, Apr 18, 2024 at 5:38 PM Frank Ch. Eigler  wrote:
>
> Hi -
>
> > [...]  I suggest that a basic principle for such a system is that it
> > should be *easy* to obtain and maintain a local copy of the history
> > of all pull requests.  That includes all versions of a pull request,
> > if it gets rebased, and all versions of comments, if the system
> > allows editing comments.  A system that uses git as the source of
> > truth for all the pull request data and has refs [...]
>
> Do you know of a system with these characteristics?
>
> - FChE

The closest thing I know of which may have these characteristics is
alibaba's AGit-Flow described here:
https://git-repo.info/en/2020/03/agit-flow-and-git-repo/
It actually sends pull-requests through the git protocol using a
custom proc-receive hook.

I'm a bit uncertain how code-review comments are handled in their
system, and it isn't exactly something that can just be used off the
shelf; AFAIK their server-side implementation hasn't been released.

I had written a prototype-worthy implementation of the server-side git
hook here:

https://github.com/pullreqr/pullreqr_githook

It basically allows sending a pull request through git push, along with
a cover letter, but I've never really used it in a full PR review cycle
beyond that.

But protocol-wise IMO it seems like a good basis for building a system
with these characteristics to me.


Re: Updated Sourceware infrastructure plans

2024-04-18 Thread Joseph Myers via Gcc
On Thu, 18 Apr 2024, Frank Ch. Eigler via Gcc wrote:

> Hi -
> 
> > [...]  I suggest that a basic principle for such a system is that it
> > should be *easy* to obtain and maintain a local copy of the history
> > of all pull requests.  That includes all versions of a pull request,
> > if it gets rebased, and all versions of comments, if the system
> > allows editing comments.  A system that uses git as the source of
> > truth for all the pull request data and has refs [...]
> 
> Do you know of a system with these characteristics?

I haven't studied the features supported by different systems, though (for 
example) there are some discussions in comments on 
https://lwn.net/Articles/867956/ of various systems.

-- 
Joseph S. Myers
josmy...@redhat.com



Re: Updated Sourceware infrastructure plans

2024-04-18 Thread Frank Ch. Eigler via Gcc
Hi -

> [...]  I suggest that a basic principle for such a system is that it
> should be *easy* to obtain and maintain a local copy of the history
> of all pull requests.  That includes all versions of a pull request,
> if it gets rebased, and all versions of comments, if the system
> allows editing comments.  A system that uses git as the source of
> truth for all the pull request data and has refs [...]

Do you know of a system with these characteristics?

- FChE



Re: Updated Sourceware infrastructure plans

2024-04-18 Thread Joseph Myers via Gcc
On Thu, 18 Apr 2024, Mark Wielaard wrote:

> But we like to get more feedback on what people really think a
> "pull-request" style framework should look like. We used to have a
> gerrit setup which wasn't really popular. And we already have a
> sourcehut mirror that can be used to turn your "pull-requests" into a
> git send-email style submission (without having to setup any
> email/smtp yourself): https://sr.ht/~sourceware/

The xz backdoor showed up one issue with some implementations of 
pull-request systems: GitHub removed access to the repository, and with it 
access to the past pull requests, so disrupting investigation into the 
sequence of bad-faith contributions.  I suggest that a basic principle for 
such a system is that it should be *easy* to obtain and maintain a local 
copy of the history of all pull requests.  That includes all versions of a 
pull request, if it gets rebased, and all versions of comments, if the 
system allows editing comments.  A system that uses git as the source of 
truth for all the pull request data and has refs through which all this 
can be located (with reasonably straightforward, documented formats for 
the data, not too closely tied to any particular implementation of a 
pull-request system), so that a single clone --mirror has all the data, 
might be suitable (people have worked on ensuring git scales well with 
very large numbers of refs, which you'd probably get in such a system 
storing all the data in git); a system that requires use of rate-limited 
APIs to access pull request data, not designed for maintaining such a 
local copy, rather less so.
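The "single clone --mirror has all the data" property can be checked mechanically: a mirror clone fetches every ref namespace the server advertises, not just branches and tags (the URL below is a placeholder):

```shell
# A mirror clone uses the refspec +refs/*:refs/*, so any refs/pull/* or
# refs/changes/* namespaces holding review data come along automatically
git clone --mirror https://example.org/project.git project.git
git --git-dir=project.git for-each-ref   # lists every ref, review refs included
```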

There are some other considerations as well, such as ensuring the proposed 
commit message is just as much subject to review as the proposed code 
changes, and allowing both pull requests that propose a single commit 
(with subsequent fixups in the PR branch intended to be squashed) and pull 
requests that propose a series of commits (where fixups found in the 
review process need to be integrated into the relevant individual commit 
and the branch rebased before merge).

-- 
Joseph S. Myers
josmy...@redhat.com



Re: Updated Sourceware infrastructure plans

2024-04-18 Thread Janne Blomqvist via Gcc
On Thu, Apr 18, 2024 at 11:15 AM FX Coudert  wrote:
>
> > I regenerate auto* files from time to time for libgfortran. Regenerating
> > them has always been very fragile (using --enable-maintainer-mode),
> > and difficult to get right.
>
> I have never found them difficult to regenerate, but if you have only a
> non-maintainer build, it is a pain to have to make a new maintainer
> build for a minor change.
>
> Moreover, our m4 code is particularly painful to use and unreadable. I have 
> been wondering for some time: should we switch to simpler Python scripts? It 
> would also mean that we would have fewer files in the generated/ folder: 
> right now, every time we add new combinations of types, we have a 
> combinatorial explosion of files.
>
> $ ls generated/sum_*
> generated/sum_c10.c generated/sum_c17.c generated/sum_c8.c  
> generated/sum_i16.c generated/sum_i4.c  generated/sum_r10.c 
> generated/sum_r17.c generated/sum_r8.c
> generated/sum_c16.c generated/sum_c4.c  generated/sum_i1.c  
> generated/sum_i2.c  generated/sum_i8.c  generated/sum_r16.c generated/sum_r4.c
>
> We could imagine having a single file for all sum intrinsics.
>
> How do Fortran maintainers feel about that?

For the time being I'm not an active maintainer, so my opinion doesn't
per se have weight, but back when I was active I did think about this
issue. IMHO the best of my ideas was to convert these into C++
templates. What we're essentially doing with the M4 stuff and the
proposed in-house Python reimplementation is to make up for lack of
monomorphization in plain old C. Rather than doing some DIY templates,
switch the implementation language to something which has that feature
built-in, in this case C++.  No need to convert the entire libgfortran
to C++ if you don't want to, just those objects that are generated
from the M4 templates. Something like

template <typename T>
void matmul(T* a, T* b, T* c, ...)
{
   // actual matmul code here
}

extern "C" {
  // Instantiate the template for every type and export the symbol
  void matmul_r4(gfc_array_r4* a, gfc_array_r4* b, gfc_array_r4* c, ...)
  {
    matmul(a, b, c, ...);
  }
  // And so on for other types
}


-- 
Janne Blomqvist


Re: Updated Sourceware infrastructure plans

2024-04-18 Thread Christophe Lyon via Gcc
Hi,

On Thu, 18 Apr 2024 at 10:15, FX Coudert  wrote:
>
> > I regenerate auto* files from time to time for libgfortran. Regenerating
> > them has always been very fragile (using --enable-maintainer-mode),
> > and difficult to get right.
>
> I have never found them difficult to regenerate, but if you have only a
> non-maintainer build, it is a pain to have to make a new maintainer
> build for a minor change.
>

FWIW, we have noticed lots of warnings from autoreconf in libgfortran.
I didn't try to investigate, since the regenerated files are identical
to what is currently in the repo.

For instance, you can download the "stdio" output from the
autoregen.py step in
https://builder.sourceware.org/buildbot/#/builders/269/builds/4373

Thanks,

Christophe


> Moreover, our m4 code is particularly painful to use and unreadable. I have 
> been wondering for some time: should we switch to simpler Python scripts? It 
> would also mean that we would have fewer files in the generated/ folder: 
> right now, every time we add new combinations of types, we have a 
> combinatorial explosion of files.
>
> $ ls generated/sum_*
> generated/sum_c10.c generated/sum_c17.c generated/sum_c8.c  
> generated/sum_i16.c generated/sum_i4.c  generated/sum_r10.c 
> generated/sum_r17.c generated/sum_r8.c
> generated/sum_c16.c generated/sum_c4.c  generated/sum_i1.c  
> generated/sum_i2.c  generated/sum_i8.c  generated/sum_r16.c generated/sum_r4.c
>
> We could imagine having a single file for all sum intrinsics.
>
> How do Fortran maintainers feel about that?
>
> FX


Re: Updated Sourceware infrastructure plans

2024-04-18 Thread FX Coudert via Gcc
> I regenerate auto* files from time to time for libgfortran. Regenerating
> them has always been very fragile (using --enable-maintainer-mode),
> and difficult to get right.

I have never found them difficult to regenerate, but if you have only a
non-maintainer build, it is a pain to have to make a new maintainer
build for a minor change.

Moreover, our m4 code is particularly painful to use and unreadable. I have 
been wondering for some time: should we switch to simpler Python scripts? It 
would also mean that we would have fewer files in the generated/ folder: right 
now, every time we add new combinations of types, we have a combinatorial 
explosion of files.

$ ls generated/sum_*
generated/sum_c10.c generated/sum_c17.c generated/sum_c8.c  generated/sum_i16.c 
generated/sum_i4.c  generated/sum_r10.c generated/sum_r17.c generated/sum_r8.c
generated/sum_c16.c generated/sum_c4.c  generated/sum_i1.c  generated/sum_i2.c  
generated/sum_i8.c  generated/sum_r16.c generated/sum_r4.c

We could imagine having a single file for all sum intrinsics.

How do Fortran maintainers feel about that?

FX

Re: Updated Sourceware infrastructure plans

2024-04-18 Thread Thomas Koenig via Gcc

On 18.04.24 at 01:27, Mark Wielaard wrote:

We also should make sure that all generated files (either in git or in
the release/snapshot tar balls) can be reliably and reproducibly
regenerated. This also helps the (pre-commit) CI buildbots. We already
have the autoregen bots for gcc and binutils-gdb. And Christoph has
been working on extending the scripts to regenerate more kinds of
files.


I regenerate auto* files from time to time for libgfortran. Regenerating
them has always been very fragile (using --enable-maintainer-mode),
and difficult to get right.

If there is a better process available to do it the right way, one
that is documented and easy to use, this will make work easier.

If not, it has the potential to stop the work I am planning to
contribute in a project that is about a month from starting
(and maybe stop the project altogether).

Can anybody point me towards the tools that will be the
gold standard in the future, and the reproducible way
of regenerating them?

Best regards

Thomas