On Tue, Jul 12, 2022 at 11:08:30PM +0000, Klemens Nanni wrote:
> On 13/07/2022 02:52, Stuart Henderson wrote:
> > On 2022/07/12 08:41, Aaron Bieber wrote:
> > > On Mon, 11 Jul 2022 at 11:17:55 +0000, Klemens Nanni wrote:
> > > > GH_* ports obviously set MASTER_SITES to MASTER_SITES_GITHUB which
> > > > defaults github.com/${account}/${project}/archive/${commit-or-tag}.
> > > >
> > > > If additional patches need to be fetched, e.g. pending PRs to fix the
> > > > local port, additional MASTER_SITES0-9 must be defined which always
> > > > duplicate the GH_* values:
> > > >
> > > > MASTER_SITES0 = https://github.com/account/project/pull/
> > > > PATCHFILES += many_fixes-{}number.patch:0
> > > > or
> > > > MASTER_SITES0 = https://github.com/account/project/commit/
> > > > PATCHFILES += one_fix-{}id.patch:0
> > > > or
> > > > MASTER_SITES0 = https://github.com/account/project/
> > > > PATCHFILES += many_fixes-{pull/}number.patch:0
> > > > PATCHFILES += one_fix-{commit/}id.patch:0
> > > >
> > > > Not super important but a bit annoying, the automatically generated
> > > > MASTER_SITES_GITHUB can't really be reused (without dirty make fiddling)
> > > > so MASTER_SITES0 is always a duplicate and PATCHFILES need :0 to work.
> > > >
> > > >
> > > > But all we need to make this reusable is move the "archive/" part from
> > > > MASTER_SITES_GITHUB to GH_DISTFILE, i.e. internally do
> > > >
> > > > MASTER_SITES = https://github.com/account/project/
> > > > DISTNAME = account-project-{archive/}commit_or_tag.tar.gz
> > > > instead of
> > > > MASTER_SITES = https://github.com/account/project/archive/
> > > > DISTNAME = account-project-{}commit_or_tag.tar.gz
> >
> > If that is changing I think it might be better to go for
> >
> > MASTER_SITES = https://github.com/
> > DISTNAME = account-project-{account/project/archive/}commit_or_tag.tar.gz
Yeah, that'd work as well.
It blows up DISTFILES in Makefile a little when collecting distfiles
and/or patches from github, but that's not much of a problem, I think.
This would further simply what I currently do for WIP ports where I have
to fetch submodules since I'm building latest HEAD, i.e. no release
tarball containing prefetched submodules and stuff.
This way, I could condense four MASTER_SITES into one currently.
I'll test and send out a new diff for this once the following two nits
landed (just sent them to ports@):
Subject: bsd.port.mk: cosmetic readability fix
Subject: bsd.port.mk: tiny GH_DISTFILE improvement
> >
> > Then we don't need multiple MASTER_SITES for a second github distfile either
> >
> > > > since this way the aforementioned patch fetching can be simplified to
> > > >
> > > > PATCHFILES += many_fixes-{pull/}number.patch
> > > > or
> > > > PATCHFILES += one_fix-{commit/}id.patch
> > > >
> > > > without additional MASTER_SITES0-9.
> > > >
> > > > It's a rather unimportant detail but juggling with WIP ports, own PRs,
> > > > fixes, etc. gets a little easier with this since the Makefile stays a
> > > > bit smaller.
> > > >
> > > > For testing, I have fetched several ports, both using GH_COMMIT and
> > > > GH_TAGNAME, without any failures or distinfo changes.
> > > >
> > > > As for the logic in bsd.port.mk itself, this also lifts the quirky
> > > > add-trailing-slash-to-MASTER_SITES_GITHUB-only-if-GH_TAGNAME-is-set
> > > > since it now ends with a fixed string and the additional URL paths only
> > > > come in where GH_TAGNAME or whatever follows is definitely set.
> > > >
> > > > Feedback? Objection? OK?
> > > >
> > >
> > > I like it! I can't speak to the make voodoo but the functionality is OK
> > > abieber@ :D
> >
> > I worry that these upstream patchfiles might become inconsistent fairly
> > easily (even more than git-archive tarballs). A post-commit tweak to the
> > commit log message or an edit to the PR will break fetches because the
> > file will change.
>
> That's only true if /pull/<number>.patch is used and certainly a valid
> point, but that's already a problem regardless of this diff.
>
> <commit>.patch URLs remain the same, we'd just keep fetching the old
> stuff if someone updated a PR or force pushed to a repo.
>
> > Another problem is that it can be used to misrepresent patches as being
> > real repo commits when they are spoofed. e.g. this was never committed to
> > the real github/dmca repo, yet the URL looks official:
> > https://github.com/github/dmca/commit/48c5663c5f7dd9ecc4720f7c1522627665197939.patch
>
> While this is a fundemantally GitHub specific issue, we can try our best
> to avoid shady URLs by using author/project/pull/<number>/<commit>
> instead of author/project/<commit>.
correction: author/project/pull/<number>/commits/<commit>
> I myself usually annotate fetched patches with
> # pending "do this"
> # https://link.to/PR/or/so
>
> Does the proposed change make it easier to misuse GitHub URLs?
>
> > If you view the non-patch version it does look a bit more suspicious with
> > the header on the page, but there's no cue like that in a port with such a
> > line
> > https://github.com/github/dmca/commit/48c5663c5f7dd9ecc4720f7c1522627665197939
> >
> > (Method at https://gist.github.com/lrvick/02088ee5466ca51116bdaf1e709ddd7c)
> >
> >
> >
> > > patches upstream.. for example:
> > > https://patch-diff.githubusercontent.com/raw/tailscale/tailscale/pull/4838.patch
> > > )
> > >
> > > I wonder if it would make sense to have an "alias" for a
> > > MASTER_SITES entry that could be referenced in a more human
> > > frindly/generic way. Maybe something like:
> > >
> > > PATCHFILES += github:account/project/many_fixes-{pull/}number.patch
> > > PATCHFILES += gitlab:account/project/one_fix-{commit/}id.patch
> >
> > There's a lot of common handling between DISTFILES, PATCHFILES,
> > SUPDISTFILES.
> > Considering that some ports have thousands of distfiles I'd worry that more
> > complex processing of this will really slow down make(1), it's already
> > pretty
> > bad in some cases.
>
> Yes, we've had DISTFILES tweaks in go.port.mk with noticable performance
> impacts already, so this should be looked after.
>