On 13/07/2022 02:52, Stuart Henderson wrote:
On 2022/07/12 08:41, Aaron Bieber wrote:
On Mon, 11 Jul 2022 at 11:17:55 +0000, Klemens Nanni wrote:
GH_* ports obviously set MASTER_SITES to MASTER_SITES_GITHUB which
defaults github.com/${account}/${project}/archive/${commit-or-tag}.
If additional patches need to be fetched, e.g. pending PRs to fix the
local port, additional MASTER_SITES0-9 must be defined which always
duplicate the GH_* values:
MASTER_SITES0 = https://github.com/account/project/pull/
PATCHFILES += many_fixes-{}number.patch:0
or
MASTER_SITES0 = https://github.com/account/project/commit/
PATCHFILES += one_fix-{}id.patch:0
or
MASTER_SITES0 = https://github.com/account/project/
PATCHFILES += many_fixes-{pull/}number.patch:0
PATCHFILES += one_fix-{commit/}id.patch:0
Not super important but a bit annoying, the automatically generated
MASTER_SITES_GITHUB can't really be reused (without dirty make fiddling)
so MASTER_SITES0 is always a duplicate and PATCHFILES need :0 to work.
But all we need to make this reusable is move the "archive/" part from
MASTER_SITES_GITHUB to GH_DISTFILE, i.e. internally do
MASTER_SITES = https://github.com/account/project/
DISTNAME = account-project-{archive/}commit_or_tag.tar.gz
instead of
MASTER_SITES = https://github.com/account/project/archive/
DISTNAME = account-project-{}commit_or_tag.tar.gz
If that is changing I think it might be better to go for
MASTER_SITES = https://github.com/
DISTNAME = account-project-{account/project/archive/}commit_or_tag.tar.gz
Then we don't need multiple MASTER_SITES for a second github distfile either
since this way the aforementioned patch fetching can be simplified to
PATCHFILES += many_fixes-{pull/}number.patch
or
PATCHFILES += one_fix-{commit/}id.patch
without additional MASTER_SITES0-9.
It's a rather unimportant detail but juggling with WIP ports, own PRs,
fixes, etc. gets a little easier with this since the Makefile stays a
bit smaller.
For testing, I have fetched several ports, both using GH_COMMIT and
GH_TAGNAME, without any failures or distinfo changes.
As for the logic in bsd.port.mk itself, this also lifts the quirky
add-trailing-slash-to-MASTER_SITES_GITHUB-only-if-GH_TAGNAME-is-set
since it now ends with a fixed string and the additional URL paths only
come in where GH_TAGNAME or whatever follows is definitely set.
Feedback? Objection? OK?
I like it! I can't speak to the make voodoo but the functionality is OK
abieber@ :D
I worry that these upstream patchfiles might become inconsistent fairly
easily (even more than git-archive tarballs). A post-commit tweak to the
commit log message or an edit to the PR will break fetches because the
file will change.
That's only true if /pull/<number>.patch is used and certainly a valid
point, but that's already a problem regardless of this diff.
<commit>.patch URLs remain the same, we'd just keep fetching the old
stuff if someone updated a PR or force pushed to a repo.
Another problem is that it can be used to misrepresent patches as being
real repo commits when they are spoofed. e.g. this was never committed to
the real github/dmca repo, yet the URL looks official:
https://github.com/github/dmca/commit/48c5663c5f7dd9ecc4720f7c1522627665197939.patch
While this is a fundemantally GitHub specific issue, we can try our best
to avoid shady URLs by using author/project/pull/<number>/<commit>
instead of author/project/<commit>.
I myself usually annotate fetched patches with
# pending "do this"
# https://link.to/PR/or/so
Does the proposed change make it easier to misuse GitHub URLs?
If you view the non-patch version it does look a bit more suspicious with
the header on the page, but there's no cue like that in a port with such a line
https://github.com/github/dmca/commit/48c5663c5f7dd9ecc4720f7c1522627665197939
(Method at https://gist.github.com/lrvick/02088ee5466ca51116bdaf1e709ddd7c)
patches upstream.. for example:
https://patch-diff.githubusercontent.com/raw/tailscale/tailscale/pull/4838.patch
)
I wonder if it would make sense to have an "alias" for a
MASTER_SITES entry that could be referenced in a more human
frindly/generic way. Maybe something like:
PATCHFILES += github:account/project/many_fixes-{pull/}number.patch
PATCHFILES += gitlab:account/project/one_fix-{commit/}id.patch
There's a lot of common handling between DISTFILES, PATCHFILES, SUPDISTFILES.
Considering that some ports have thousands of distfiles I'd worry that more
complex processing of this will really slow down make(1), it's already pretty
bad in some cases.
Yes, we've had DISTFILES tweaks in go.port.mk with noticable performance
impacts already, so this should be looked after.