Re: distexpand for autogenerated upstream distfile resources (was: standardize and simplify GitHub submodule handling in ports?)

2023-08-09 Thread Marc Espie
On Wed, Aug 09, 2023 at 12:54:12AM -0400, Thomas Frohwein wrote:
> - It includes logic that finds the first MASTER_SITESn that isn't
>   otherwise used, and throws an ERROR if it overruns past
>   MASTER_SITES9.

That logic will hopefully soon be 100% obsolete.

I need some okays on the .VARIABLES make patch.

I have code to be able to use more or less arbitrary
suffixes in MASTER_SITES in bsd.port.mk, and the corresponding
stuff for sqlports and dpb ought to be fairly trivial.



distexpand for autogenerated upstream distfile resources (was: standardize and simplify GitHub submodule handling in ports?)

2023-08-08 Thread Thomas Frohwein
On Mon, Aug 07, 2023 at 09:17:05PM +0100, Stuart Henderson wrote:

[...]

> I think maybe I'd prefer to have some variable that could be used
> *instead* of the existing GH_* variables rather than in conjunction with
> them (so they can be used for all GH archive ports, rather than have
> them a special case for multi-distfile ports). If that's the standard
> way to do things we can have a sweep of the tree converting other
> ports (or at least the ones that don't use go.port.mk ;)
> 
> It would be kind-of helpful if it could support more than just github
> too (gitlab.com, sr.ht, ..). While that could be done with different
> variables (GH_xx, GL_xx, SRHT_xx etc) they're all a similar enough
> layout to each other that making the site part of the variable itself
> rather than the name would be simpler and easier to add more sites
> (plus it covers the case where you have some port using one file from
> github and one from gitlab, etc).
> 
> Playing with syntax ideas, maybe something like this would be easy to
> use for ports not needing a rename -
> 
> SOMEVAR+= github vim vim refs/tags/v9.0.1677
> SOMEVAR+= github vim colorschemes 22986fa2a3d2f7229efd4019fcbca411caa6afbb
> 
> or with some auto-renaming (and specifying more of the path to avoid the
> extra GH_WRKSRC which I think might not be enough in some cases anyway -
> a port may have several distfiles that need to go into different base
> dirs) -
> 
> SOMEVAR+= github fortran-lang fpm refs/tags/v0.7.0
> OTHERVAR+=github toml-f toml-f e49f5523e4ee67db6628618864504448fb8c8939 vendor/toml-f
> OTHERVAR+=github urbanjost M_CLI2 90a1a146e19c8ad37b0469b8cbd04bc28eb67a50 vendor/M_CLI2
> 
> (no idea what to use as real names instead of SOMEVAR/OTHERVAR though!)
> 
> How does that sort of thing seem to you? (i.e. using the same basic idea as
> you have for submodules, but making it the standard for all gh distfiles)?

I ran with your suggestion and came up with a solution that I've named
distexpand. The idea is to use templates for commonly used,
automatically generated and therefore predictably named, stored, and
packaged dist files. Two variables take different arguments/parts that
are 'expanded' with the template to working MASTER_SITESn and
DISTFILES.

The current configuration in the ports Makefile is done like this,
after putting distexpand.port.mk into /usr/ports/infrastructure/mk/:

MODULES += distexpand
DISTEXPAND += template account1 project1 id1(commithash/tag)
DISTEXPANDX += template account2 project2 id2(commithash/tag) targetdir

'template' is currently set up for github, gitlab, and sourcehut. You
can use multiple DISTEXPAND and DISTEXPANDX as needed. This will _not_
use up more MASTER_SITESn, as long as the template stays the same.
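
For illustration, the fpm example from the quoted mail might look like
this in a port Makefile. This is an untested sketch: it assumes the
template keyword is literally 'github', and that the tag is given as the
bare tag name (v0.7.0) rather than as refs/tags/v0.7.0. The accounts,
projects, and hashes are the ones from the quoted example.

```make
# Sketch only: assumes distexpand.port.mk is installed in
# /usr/ports/infrastructure/mk/ and that 'github' is a valid template name.
MODULES +=	distexpand

# Main distfile: template, account, project, id (here a tag).
DISTEXPAND +=	github fortran-lang fpm v0.7.0

# Extra distfiles: same four fields plus the target dir they are moved to.
DISTEXPANDX +=	github toml-f toml-f e49f5523e4ee67db6628618864504448fb8c8939 vendor/toml-f
DISTEXPANDX +=	github urbanjost M_CLI2 90a1a146e19c8ad37b0469b8cbd04bc28eb67a50 vendor/M_CLI2
```

All three lines use the same template, so by the rule above they should
share a single MASTER_SITESn.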

Regarding the naming, I'm definitely open to discuss other suggestions.
DISTEXPAND is what I've been able to think of that most clearly conveys
the use of the fragments that are expanded to a full address for
fetching the distfile. DISTEXPANDX - the last 'X' is meant to stand for
'extended' as this is the version that relocates the extracted files to
a target dir. I'm slightly partial to naming the variables
'DISTEXPAND4' and 'DISTEXPAND5' instead, which would remind the porter
of the number of components each version takes.

For the templating, I used %account, %project, %id, %subdir as the
placeholders. Those are substituted later with :S. I'm open to
suggestions if there may be a more established pattern for placeholders
in strings in Makefile context.
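
The placeholder substitution described above can be sketched with
chained :S modifiers. The template URL and the variable names
(_TEMPLATE_github, _URL) are assumptions made for this example and are
not taken from distexpand.port.mk, which may use a different layout.

```make
# Hypothetical template; the real one in distexpand.port.mk may differ.
_TEMPLATE_github =	https://github.com/%account/%project/archive/%id.tar.gz

# Each :S replaces one placeholder with the matching DISTEXPAND field.
_URL =	${_TEMPLATE_github:S/%account/vim/:S/%project/vim/:S/%id/v9.0.1677/}

# Print the expanded URL, e.g. with: make -f thisfile show
show:
	@echo ${_URL}
```

Because '%' has no special meaning in the :S pattern, the placeholders
can be matched literally without escaping.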

This can replace GH_{ACCOUNT,PROJECT,TAGNAME,COMMIT}. Tags are detected
as such, and in that case a DISTNAME will be set to $project-$tag if
not otherwise set. In other scenarios, a DISTNAME or PKGNAME may need
to be set.

A couple of other things to note compared to before:
- GH_WRKSRC is gone without replacement. Its usefulness was
  questionable.
- It includes logic that finds the first MASTER_SITESn that isn't
  otherwise used, and throws an ERROR if it overruns past
  MASTER_SITES9.
- Tags are now used by just providing '0.1.0' or 'v0.11.2' or another
  non-commithash string (the heuristic checks the length to determine
  if this is a tag or a commit hash).
- It currently uses 2 longer for-loops that are almost identical, but
  one for DISTEXPAND, and the other one for DISTEXPANDX. Given the
  limitations in Makefiles, I couldn't think of a way to reuse more
  code there.
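
A length-based tag/commit check like the one mentioned in the list
above could be written roughly as follows. This is a guess at the shape
of such a heuristic, not the code from the module: it assumes a commit
is always a full 40-character hash, and the names _ID, _Q10, _PAT and
_KIND are made up for the example.

```make
# Hypothetical sketch: classify an id as tag or commit by length alone.
_ID =	22986fa2a3d2f7229efd4019fcbca411caa6afbb

# Ten '?' wildcards, repeated four times, match exactly 40 characters.
_Q10 =	??????????
_PAT =	${_Q10}${_Q10}${_Q10}${_Q10}

# :M keeps the word only if it matches the pattern, so a non-empty
# result means the id has the length of a full commit hash.
.if !empty(_ID:M${_PAT})
_KIND =	commit
.else
_KIND =	tag
.endif

show:
	@echo ${_ID} is treated as a ${_KIND}
```

A check on length alone would misclassify an abbreviated hash as a tag,
which is one reason to spell out commit hashes in full.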

This doesn't need to be in a module, but this way it's easy to plug in
and experiment with.

I'm attaching the distexpand.port.mk, as well as the patch for using it
with neovim as an example. I've tested this with about 3 dozen ports
that use combinations of mostly github sites, but also a gitlab and a
sourcehut dist source [1].

[1] https://thfr.info/tmp/distexpand-ports.txt
Index: Makefile
===
RCS file: /cvs/ports/editors/neovim/Makefile,v
retrieving revision 1.37
diff -u -p -r1.37 Makefile
---