[gentoo-dev] Re: About EGO_SUM
* "Robin H. Johnson" : Wrote on Wed, 8 Jun 2022 20:42:48 +:
> EGO_SUM vs dependency tarballs:
> - bloats ebuilds
> - bloats Manifests
> - bloats metadata/md5-cache/ (SRC_URI etc)
> - doesn't bloat mirrors with gentoo-unique distfiles
> - EGO_SUM is verifiable/reproducible from Upstream Go systems
> - less downloads on upgrades (only changed Go deps, not entire dep tarballs)
>
> EGO_SUM data right now adds, to every user's system:
> - 2.6MB of text to ebuilds (340k after de-dupe)
> - 7MB of text to Manifests (2M after de-dupe)
> - 6.4MB+ of text to metadata/md5-cache (I don't have a easy way to
>   calc deduped amount here)
> On the server side:
> - The sum total of Go distfiles mirrored on Gentoo mirrors right now
>   is only 3.4GB.
> - less downloads
>
> Dependency tarballs:
> - Right now ~15GiB on each mirror, plus storage of the primary copy
>   somewhere (dev.g.o right now, but not great)
> - Conservatively if the remaining EGO_SUM packages converted to Dep
>   tarballs, it would need another 8GB each of primary location and
>   mirrors.
> - larger downloads for users who DO want to upgrade a Go package (all
>   new deps tarball even if only one or two deps changed)
> - must be preserved much longer, unless we can introduce a guaranteed
>   way to regenerate them for any prior ebuild.
>
> I was trying to introduce a third option, but I haven't had the time to
> write an entire GLEP.
>
> The TL;DR is introducing a 2nd-level Manifest+metadata file, that tries
> to move just the metadata out of the tree, in a way that can be
> regenerated (specifically, a 1:1 reproducible creation from a given go.sum).
> It DOES need to contain slightly more data than the present Manifest,
> specifically a full SRC_URI entry for each file (upstream URI plus what
> to rename it to on Gentoo side)
>
> The 2nd-level Manifest would be listed as SRC_URI, and be handled in
> src_fetch/src_unpack. Download & verify the extra distfiles, against the
> Manifest checksum data (and for Golang against go.sum checksums).
>
> The Portage mirrordist code needs the most work in this case, as it
> would need to fetch the 2nd-level Manifests so it can populate Gentoo
> mastermirror with the distfiles mirrored from upstream.
>
> The storage costs for the proposed idea:
> - same 1:1 base distfile storage as EGO_SUM (e.g. upstream distfiles are
>   mirrored 1:1 content, just different naming)
> - Probably 1 Metadata-Manifest file per ebuild $PVR (conceptually it
>   could be split more or shared between some ebuilds/packages)
> - Main tree Manifests: 1 DIST entry per Metadata-Manifest in a given package
> - Main tree ebuilds: 1 line for the Metadata-Manifest in the ebuild.
> - metadata/md5-cache: 1 src_uri line!
> - mirrors: add the Metadata-Manifest

[Without claiming to have fully understood the proposal above]

Around Apr 15th 22 I tried suggesting to WilliamH on IRC that perhaps portage should implement the dirhash approach that Go took to solve the problem of upstream sources when they invented go.sum.

From go/src/cmd/vendor/golang.org/x/mod/sumdb/dirhash/hash.go in the Go sources:

    // Hash1 is "h1:" followed by the base64-encoded SHA-256 hash of a summary
    // prepared as if by the Unix command:
    //
    //     find . -type f | sort | sha256sum

Loosely speaking, the "manifest" could publish this dirhash of the contents of go-mod/cache (which would have been bundled in the -deps.tar.xz).

The immediate motivation was to avoid the network when I already had the sources locally: instead of downloading a -deps.tar.xz I could create it locally and dump it in distdir. Portage would check the (hypothetically) published dirhash and let it through; the local timestamps and uid in my tarball and the upstream tarball wouldn't upset it.

One unchecked assumption is that go-mod/cache can be recreated by unpacking sources. If so, then with a notion of a "second level manifest" (the equivalent of go.sum) the contents can be assembled without having to store or download the actual -deps tarball.

I didn't get very far in convincing WilliamH of my need, so I dropped the idea.

(I'm not sure if I'm being any clearer; if I'm missing something, do let me know)
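To make the dirhash idea above concrete, here is a minimal Python sketch of an "h1:"-style hash as I understand it from the dirhash package: one line per file (hex SHA-256 of the content, two spaces, the file name, a newline), sorted by name, then SHA-256 over that summary, base64-encoded. The function names `hash1` and `hash_dir` and the `prefix` parameter are my own choices for illustration, and nothing here is existing Portage code.

```python
import base64
import hashlib
import os


def hash1(files: dict[str, bytes]) -> str:
    """Return an "h1:"-style hash over a mapping of file name -> content."""
    summary = hashlib.sha256()
    for name in sorted(files):  # order must not depend on how files were listed
        if "\n" in name:
            raise ValueError("file names containing newlines are not allowed")
        digest = hashlib.sha256(files[name]).hexdigest()
        # One summary line per file: "<hex sha256>  <name>\n"
        summary.update(f"{digest}  {name}\n".encode("utf-8"))
    return "h1:" + base64.b64encode(summary.digest()).decode("ascii")


def hash_dir(root: str, prefix: str) -> str:
    """Hash every regular file under root, naming each one prefix/relative/path."""
    files = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            full = os.path.join(dirpath, filename)
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            with open(full, "rb") as handle:
                files[f"{prefix}/{rel}"] = handle.read()
    return hash1(files)
```

Only names and contents enter the hash, so timestamps, ownership, and tar member order in a locally built -deps tarball would not change the result, which is the property the idea relies on.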
Re: [gentoo-dev] Interest in a yarn / NPM eclass
I have an npm.eclass (https://github.com/bekcpear/ryans-repos/blob/main/eclass/npm.eclass) that works like EGO_SUM. However, the package-lock.json file should be patched to convert sha1 to sha512 due to the default mechanism (script: https://github.com/bekcpear/npm-lockfile-to-sha512.sh).

I maintain a package, www-apps/filebrowser, with this eclass in my overlay.

On Thu, Jun 9, 2022 at 4:44 AM Robin H. Johnson wrote:
> On Wed, Jun 08, 2022 at 07:23:15PM +0200, Alessandro Barbieri wrote:
> > I'm interested in an eclass that doesn't bundle everything together. Also
> > I'm interested in anyone that can share the package maintainership (in guru
> > first).
> >
> > I've already tried 3 approaches:
> ...
>
> Since you know this yarn/NPM ecosystem well, could you evaluate two
> other ideas?
> 4) Solutions like EGO_SUM
> 5) EGO_SUM successor of 2nd-level-Metadata-Manifest that I described in
>    the recent EGO_SUM thread.
>
> --
> Robin Hugh Johnson
> Gentoo Linux: Dev, Infra Lead, Foundation Treasurer
> E-Mail   : robb...@gentoo.org
> GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
> GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136
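As a rough illustration of the sha1-to-sha512 step mentioned in the reply above (this is not the linked script, just a sketch), the following Python finds lockfile entries whose `integrity` value is still `sha1-...` and shows how a replacement `sha512-...` string would be formed from the tarball bytes. The helper names are hypothetical. Fetching the tarball from its `resolved` URL is left out.

```python
import base64
import hashlib


def sha512_integrity(tarball: bytes) -> str:
    """Format the SHA-512 digest of a tarball as an SRI-style integrity string."""
    digest = hashlib.sha512(tarball).digest()
    return "sha512-" + base64.b64encode(digest).decode("ascii")


def find_sha1_entries(node, path=""):
    """Yield (json_path, resolved_url) for each lockfile entry pinned only by sha1.

    Walks the parsed package-lock.json recursively, so it does not care whether
    entries live under "packages" or under nested "dependencies".
    """
    if isinstance(node, dict):
        integrity = node.get("integrity")
        if isinstance(integrity, str) and integrity.startswith("sha1-"):
            yield path, node.get("resolved")
        for key, value in node.items():
            yield from find_sha1_entries(value, f"{path}/{key}")
```

A conversion pass would load the lockfile with `json.load`, download each reported `resolved` URL, and write `sha512_integrity(data)` back into that entry.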
[gentoo-dev] Re: Packages up for grabs: e.g. www-servers/nginx, www-apps/nikola, app-admin/rsyslog, ...
On 2022-06-05 09:28, Joonas Niilola wrote:
> sys-process/incron

I'll take this one, with Infra as fallback maintainers.

--
Marecki
Re: [gentoo-dev] Packages up for grabs: e.g. www-servers/nginx, www-apps/nikola, app-admin/rsyslog, ...
Infra needs/wants a few of these packages, so please consider us fallback maintainers:

On Sun, Jun 05, 2022 at 11:28:30AM +0300, Joonas Niilola wrote:
> app-metrics/mysqld_exporter
> net-libs/zeromq
> net-misc/httpstat
> sys-apps/hponcfg
> sys-block/hpacucli
> sys-block/hpssacli
> sys-block/storcli
> sys-process/incron

--
Robin Hugh Johnson
Gentoo Linux: Dev, Infra Lead, Foundation Treasurer
E-Mail   : robb...@gentoo.org
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136
Re: [gentoo-dev] Interest in a yarn / NPM eclass
On Wed, Jun 08, 2022 at 07:23:15PM +0200, Alessandro Barbieri wrote:
> I'm interested in an eclass that doesn't bundle everything together. Also
> I'm interested in anyone that can share the package maintainership (in guru
> first).
>
> I've already tried 3 approaches:
...

Since you know this yarn/NPM ecosystem well, could you evaluate two
other ideas?
4) Solutions like EGO_SUM
5) EGO_SUM successor of 2nd-level-Metadata-Manifest that I described in
   the recent EGO_SUM thread.

--
Robin Hugh Johnson
Gentoo Linux: Dev, Infra Lead, Foundation Treasurer
E-Mail   : robb...@gentoo.org
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136
Re: [gentoo-dev] About EGO_SUM
On Fri, Jun 03, 2022 at 01:18:08PM +0200, Florian Schmaus wrote:
> EGO_SUM is marked as 'deprecated' in go-module.eclass [1, 2]. I
> acknowledge that there are packages where the usage of EGO_SUM is very
> problematic. However, I wonder if there are packages where using
> dependency tarballs is problematic while using EGO_SUM would be not.
...
[snip all the great points]

> Even more problematic are that dependency tarballs require additional
> steps that would not be required when EGO_SUM is used. While those steps
> appear simple, behavioral theory shows that even the tiniest additional
> steps have a huge impact (e.g., online shops loose a relative large
> share of customers if for each an additional checkout step). If we force
> dependency tarballs for Go software, then packaging Go software just
> become a little bit harder.
Your above is entirely correct, and I was against the plan to introduce
dependency tarballs.

> This leads me to the question why are we actually deprecating EGO_SUM?
> It seems like a nice alternative for Go packaging that we may want to
> keep. But maybe I am missing something?
EGO_SUM vs dependency tarballs:
- bloats ebuilds
- bloats Manifests
- bloats metadata/md5-cache/ (SRC_URI etc)
- doesn't bloat mirrors with gentoo-unique distfiles
- EGO_SUM is verifiable/reproducible from Upstream Go systems
- less downloads on upgrades (only changed Go deps, not entire dep tarballs)

EGO_SUM data right now adds, to every user's system:
- 2.6MB of text to ebuilds (340k after de-dupe)
- 7MB of text to Manifests (2M after de-dupe)
- 6.4MB+ of text to metadata/md5-cache (I don't have a easy way to
  calc deduped amount here)
On the server side:
- The sum total of Go distfiles mirrored on Gentoo mirrors right now
  is only 3.4GB.
- less downloads

Dependency tarballs:
- Right now ~15GiB on each mirror, plus storage of the primary copy
  somewhere (dev.g.o right now, but not great)
- Conservatively if the remaining EGO_SUM packages converted to Dep
  tarballs, it would need another 8GB each of primary location and
  mirrors.
- larger downloads for users who DO want to upgrade a Go package (all
  new deps tarball even if only one or two deps changed)
- must be preserved much longer, unless we can introduce a guaranteed
  way to regenerate them for any prior ebuild.

I was trying to introduce a third option, but I haven't had the time to
write an entire GLEP.

The TL;DR is introducing a 2nd-level Manifest+metadata file, that tries
to move just the metadata out of the tree, in a way that can be
regenerated (specifically, a 1:1 reproducible creation from a given go.sum).
It DOES need to contain slightly more data than the present Manifest,
specifically a full SRC_URI entry for each file (upstream URI plus what
to rename it to on Gentoo side)

The 2nd-level Manifest would be listed as SRC_URI, and be handled in
src_fetch/src_unpack. Download & verify the extra distfiles, against the
Manifest checksum data (and for Golang against go.sum checksums).

The Portage mirrordist code needs the most work in this case, as it
would need to fetch the 2nd-level Manifests so it can populate Gentoo
mastermirror with the distfiles mirrored from upstream.

The storage costs for the proposed idea:
- same 1:1 base distfile storage as EGO_SUM (e.g. upstream distfiles are
  mirrored 1:1 content, just different naming)
- Probably 1 Metadata-Manifest file per ebuild $PVR (conceptually it
  could be split more or shared between some ebuilds/packages)
- Main tree Manifests: 1 DIST entry per Metadata-Manifest in a given package
- Main tree ebuilds: 1 line for the Metadata-Manifest in the ebuild.
- metadata/md5-cache: 1 src_uri line!
- mirrors: add the Metadata-Manifest

--
Robin Hugh Johnson
Gentoo Linux: Dev, Infra Lead, Foundation Treasurer
E-Mail   : robb...@gentoo.org
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136
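To illustrate the "1:1 reproducible creation from a given go.sum" part of the proposal above, here is a small Python sketch that turns go.sum text into a sorted list of entries, each carrying an upstream URI, a Gentoo-side distfile name, and the go.sum checksum. No GLEP exists for this, so the `Entry` layout is purely hypothetical. The URI layout follows the GOPROXY protocol (`<module>/@v/<version>.zip` or `.mod`, with uppercase letters escaped as `!` plus lowercase). Replacing `/` with `%2F` in the distfile name is an assumption modelled on how I understand EGO_SUM names its distfiles. The Manifest checksums and sizes cannot be derived from go.sum alone and are left out.

```python
from dataclasses import dataclass

GOPROXY = "https://proxy.golang.org"  # assumption: any GOPROXY-protocol server works


def escape_path(text: str) -> str:
    """GOPROXY escaping: each uppercase ASCII letter becomes '!' + its lowercase."""
    return "".join("!" + c.lower() if "A" <= c <= "Z" else c for c in text)


@dataclass(frozen=True)
class Entry:
    uri: str       # where the file is fetched from upstream
    distfile: str  # what it is renamed to on the Gentoo side (hypothetical scheme)
    h1: str        # checksum copied from go.sum for the second verification pass


def entries_from_gosum(text: str) -> list[Entry]:
    """Build a deterministic entry list from the contents of a go.sum file."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        module, version, h1 = fields
        ext = "zip"
        if version.endswith("/go.mod"):
            version, ext = version[: -len("/go.mod")], "mod"
        rel = f"{escape_path(module)}/@v/{escape_path(version)}.{ext}"
        entries.append(Entry(f"{GOPROXY}/{rel}", rel.replace("/", "%2F"), h1))
    # Sorting makes the output independent of line order in go.sum.
    return sorted(entries, key=lambda entry: entry.distfile)
```

Because the output depends only on the go.sum contents, anyone can regenerate the same entry list for a given ebuild, which is what would let the bulk metadata live outside the tree.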
Re: [gentoo-dev] Interest in a yarn / NPM eclass
On Mon, 6 Jun 2022, 13:16 Andrew Ammerlaan wrote:
> Hi Chris,
>
> I think @Alessandro (CC) has already done some work on this over at [1].
> I'm not sure what the status of it is though. Maybe you two can discuss
> this together to avoid doing double work.
>
> Best regards,
> Andrew
>
> [1] https://github.com/Alessandro-Barbieri/node-overlay
>
> On 05/06/2022 07:47, Chris Pritchard wrote:
> > Hello,
> >
> > Would there be any interest in a yarn / NPM eclass that supports offline
> > installs. For a personal overlay I’ve got a working yarn.eclass
> > (https://github.com/chriscpritchard/overseerr-overlay/blob/main/eclass/yarn.eclass)
> > and I’ve been able to make a version that supports npm from
> > NPM-Shrinkwrap or npm-lock.json files (this is still being tested).
> >
> > If there is an interest, would anyone be willing to support me in having
> > an eclass added to the tree?
> >
> > Thanks,
> >
> > Chris

I'm interested in an eclass that doesn't bundle everything together. I'm also interested in anyone who can share the package maintainership (in guru first).

I've already tried 3 approaches:

1) One package per dependency. The npm eclass is working fine for now, and you can unbundle packages that depend on system libs (like sqlite). The major issue I've found is the circular dependencies of the rollup package.

2) Bundle everything. This approach doesn't always work: some packages fail to build dependencies written in C, and you can't unbundle them.

3) Package every runtime dependency and bundle the build-time dependencies. Since rollup is a build-time dep, I've tried to bundle every build-time dep, but this requires creating custom stuff and hosting it somewhere.
[gentoo-dev] Reliably find MPI implementation
I have a package that incorrectly guesses the MPI implementation because it greps /usr/include/mpi.h, and since that file is a multilib wrapper, it doesn't contain the defines. I can specify the MPI implementation while configuring.

How can I reliably detect the MPI implementation? I need something (an eclass?) that returns one of the following: openmpi/mpich/mpich2/mpich3/lam/etc., based on what is currently installed.
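One possible approach, offered only as a sketch and not as an existing eclass: instead of grepping the wrapper header, let the preprocessor resolve it and look at which implementation-specific macros end up defined. The Python below assumes an `mpicc` wrapper around a GCC- or Clang-compatible compiler (for the `-dM -E` flags), and relies on `OPEN_MPI`, `MPICH_VERSION`, and `LAM_MPI` being defined by the respective implementations' real headers. It does not try to tell mpich, mpich2, and mpich3 apart. The function names are made up for illustration.

```python
import subprocess
from typing import Optional

# (macro, implementation name), checked in this order.
MARKERS = (("OPEN_MPI", "openmpi"), ("MPICH_VERSION", "mpich"), ("LAM_MPI", "lam"))


def classify(macro_dump: str) -> Optional[str]:
    """Map a dump of '#define NAME value' lines to an MPI implementation name."""
    defined = set()
    for line in macro_dump.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "#define":
            defined.add(parts[1])
    for macro, name in MARKERS:
        if macro in defined:
            return name
    return None


def dump_macros(compiler: str = "mpicc") -> str:
    """Preprocess '#include <mpi.h>' and return every macro defined afterwards."""
    result = subprocess.run(
        [compiler, "-dM", "-E", "-x", "c", "-"],
        input="#include <mpi.h>\n",
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

With these, `classify(dump_macros())` would report the implementation the active compiler wrapper actually resolves to, regardless of what the multilib wrapper header itself contains.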