Hi, everyone.

Here's a topic that came to mind in light of recent events. I don't think we have any official guidelines on how to handle the various kinds of forks so that users know what they're getting. So I'd like to start a discussion on the topic.
For a start, I'd like to classify forks into three groups:

1. classic forks -- package B is forked out of A, and the development of both continues independently (eudev/systemd, ffmpeg/libav);

2. large patch sets / continuously rebased forks -- where a particular set of changes is usually applied to mainline or regularly rebased against mainline, but without full separation (kernel patch sets, bitcoin patches);

3. abandoned package forks -- package A stops being maintained upstream, someone forks it and starts working on top of that (xarchiver).

For group 1, I think it's pretty clear that most of the time we want to use separate packages for the forks. This is because:

1a. upstreams usually use distinct names, and supplying packages with matching names avoids confusion;

1b. even if releases of the forks are somehow synced, upstreams rarely use matching version numbers;

1c. usually one of the forks diverges enough to require major differences in the ebuild.

The remaining question is: should we inform users about the fork somehow? Tell them 'you can now decide to use Y instead of X'? If yes, then how should we do it? For libraries, the users will usually notice something, like a new virtual being installed or new USE flags being added -- but that doesn't cover end user packages.

The second group (patch sets) is less clear. AFAICS some people argue that packages with major patch sets applied should be distinguished by separate package names. Others find that applying them via USE flags is easier.

Separate packages are used e.g. for different kernel patch sets. This has the following advantages:

2a1. clearer distinction between the base and patched versions,

2a2. cleaner when patch sets imply major changes, e.g. when some of the USE flags apply to the patched version only,

2a3. the packages can be bumped independently, without worrying that the patch set has not been updated yet.

A single package with USE flags is used e.g. for openssh (hpn patch set), bitcoincore (ljr patch set). This has the following advantages:

2b1. available patches are cleanly exposed via USE flags,

2b2. multiple patch sets can be combined in a single package,

2b3. usually there is less work for the package maintainer.

The third group (dead package forks) is the least clear to me, especially since those kinds of forks frequently continue using the original package name.

The advantage of treating the fork as a continuation of the original package is that it requires no effort from users, and is clear from a keywording/stabilization perspective. However, it means that users suddenly start using a package from a different upstream -- but then, does it differ much from when upstream developers change?

Using separate packages would clearly indicate that we're switching to a fork. However, usually this would mean inventing a custom package name (like 'xarchiver-ib'), and somehow informing users about the switch. For stable packages, we'd also have to figure out some reasonable way to suggest the upgrade first to ~arch users, then to stable.

What do you think?

-- 
Best regards,
Michał Górny
