Hi, everyone.

Here's a topic that came to mind in light of recent events. I don't
think we have any official guidelines on how to handle the various
kinds of forks so that users know what they're getting. So I'd like to
start a discussion on the topic.

For a start, I'd like to classify forks into three groups:

1. classic forks -- package B is forked out of A, and the development of
both continues independently (eudev/systemd, ffmpeg/libav);

2. large patch sets / continuously rebased forks -- where the particular
set of changes is usually applied to mainline or regularly rebased
against mainline but without full separation (kernel patchsets, bitcoin
patches);

3. abandoned package forks -- package A stops being maintained upstream,
someone forks it and starts working on top of that (xarchiver).


For group 1, I think it's pretty clear that most of the time we want to
use separate packages for the forks. This is because:

1a. usually upstreams use distinct names, and supplying packages with
matching names avoids confusion;

1b. even if releases of the forks are somehow synced, upstreams rarely
use matching version numbers;

1c. usually one of the forks diverges enough to require major
differences in the ebuild.

The remaining question is, should we inform users about the fork
somehow? Tell them 'you can now decide to use Y instead of X'? If yes,
then how should we do it? For libraries, the users will usually notice
something, like a new virtual being installed or new USE flags being
added -- but that doesn't cover end user packages.


The second group (patch sets) is less clear. AFAICS some people argue
that packages with major patch sets applied should be distinguished by
separate package names. Others find that applying them via USE flags is
easier.

Separate packages are used e.g. for different kernel patch sets. This
has the following advantages:

2a1. clearer distinction between the base and patched versions,

2a2. cleaner when patch sets imply major changes, e.g. when some
of the USE flags apply to the patched version only,

2a3. the packages can be bumped independently, without worrying that
the patch set has not been updated yet.

A single package with USE flags is used e.g. for openssh (hpn patch
set), bitcoincore (ljr patch set). This has the following advantages:

2b1. available patches are cleanly exposed via USE flags,

2b2. multiple patch sets can be combined in a single package,

2b3. usually there is less work for the package maintainer.
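
To make the USE flag approach concrete, here is a rough sketch of what
such an ebuild could look like. The package, the flag name and the patch
file name are made up for illustration, and I'm assuming EAPI 6 so that
eapply and eapply_user are available:

```shell
# Hypothetical fragment of foo-1.0.ebuild (not a real package).
EAPI=6

# Expose the optional patch set as a USE flag.
IUSE="hpn"

src_prepare() {
	# Apply the patch set only when the user has enabled the flag.
	if use hpn; then
		eapply "${FILESDIR}/${P}-hpn.patch"
	fi

	# Apply any user-supplied patches (mandatory in EAPI 6).
	eapply_user
}
```

With separate packages, the same patch would instead be applied
unconditionally in a second ebuild under its own name.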


The third group (dead package forks) is the least clear to me,
especially since such forks frequently continue using the original
package name.

The advantage of treating the fork as a continuation of the original
package is that it requires no effort from users, and is clear from
a keywording/stabilization perspective. However, it means that users
suddenly start using a package from a different upstream -- but then,
is that much different from a change of upstream developers?

Using separate packages would clearly indicate that we're switching to 
a fork. However, usually this would mean inventing a custom package name
(like 'xarchiver-ib'), and somehow informing users about the switch. For
stable packages, we'd also have to figure out some reasonable way to
suggest the upgrade first to ~arch users, then to stable.


What do you think?

-- 
Best regards,
Michał Górny
