Re: [gentoo-dev] EGO_SUM (was: [gentoo-project] Gentoo Council Election 202306 ... Nominations Open in Just Over 24 Hours.)

2023-07-04 Thread Oskari Pirhonen
On Tue, Jul 04, 2023 at 21:56:26 +, Robin H. Johnson wrote:
> On Tue, Jul 04, 2023 at 12:44:39PM +0200, Gerion Entrup wrote:
> > just to be curious about the whole discussion. I did not follow in the
> > deepest detail but what I got is:
> > - EGO_SUM blows up the Manifest file, since every little Go module needs
> >   to be respected. A lot of these Manifest files lead to an extremely
> >   increased Portage tree size. EGO_SUM is just one example (though the
> >   biggest one). Statically linked languages like Rust etc. have the same
> >   problem.
> > - The current solution is to prepackage all modules, put it somewhere on
> >   a webserver and just manifest that file. This makes the Portage tree
> >   small in size again, but requires a webserver/mirror and is thus
> >   unfriendly for overlay devs.
> > 
> > I'm not sure if it was mentioned before but has anyone considered hash
> > trees / Merkle trees for the manifest file? The idea would be to hash
> > the standard manifest file a second time if it gets too big and write
> > down that hash as new manifest file and leave EGO_SUM as is.
> This is out-of-tree/indirect Manifests, that I proposed here, more than
> a year ago:
> https://marc.info/?l=gentoo-dev&m=168280762310716&w=2
> https://marc.info/?l=gentoo-dev&m=165472088822215&w=2
> 
> Developing it requires PMS work in addition to package manager
> development, because it introduces phases.
> 
> - primary fetch of $SRC_URI per ebuild, including indirect Manifest
> - primary validation of distfiles
> - secondary fetch of $SRC_URI per indirect Manifest
> - secondary validation of additional distfiles
> 
> A significantly impacted use case is "emerge -f": it now needs to run
> downloads twice.
> 

I'm not sure double downloading is required. Consider a flow similar to
this:

1. distfiles are fetched as per the ebuild
2. distfiles are hashed into a temporary Manifest
3. temporary Manifest is hashed and compared with the hashes stored in
   the in-tree Manifest for the direct Manifest
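To make steps 2 and 3 concrete, here is a minimal Python sketch. It is
only an illustration under stated assumptions, not how Portage does or
would do it: the function names are made up, the DIST line layout
follows the usual "name size ALGO hash ..." shape, SHA512 is picked
arbitrarily as the outer hash, and entries are sorted by file name to
get a reproducible Manifest text.

```python
import hashlib


def dist_entry(name: str, data: bytes) -> str:
    """Format one DIST line (name, size, BLAKE2B, SHA512) for a fetched distfile."""
    blake2b = hashlib.blake2b(data).hexdigest()
    sha512 = hashlib.sha512(data).hexdigest()
    return f"DIST {name} {len(data)} BLAKE2B {blake2b} SHA512 {sha512}"


def temporary_manifest(distfiles: dict[str, bytes]) -> str:
    """Step 2: hash every fetched distfile into a temporary Manifest.

    Sorting by file name is an assumption made here so that the same set
    of distfiles always yields byte-identical text.
    """
    return "".join(dist_entry(name, distfiles[name]) + "\n" for name in sorted(distfiles))


def manifest_digest(manifest_text: str) -> str:
    """Step 3, first half: hash the temporary Manifest itself."""
    return hashlib.sha512(manifest_text.encode("utf-8")).hexdigest()


def verify(distfiles: dict[str, bytes], expected_digest: str) -> bool:
    """Step 3, second half: compare with the digest stored in the in-tree Manifest."""
    return manifest_digest(temporary_manifest(distfiles)) == expected_digest
```

Everything is fetched exactly once; the only extra work is hashing the
temporary Manifest text, which is small compared with the distfiles.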

A new Manifest format would be required in order to differentiate the
current ones from an indirect one. This may require PMS changes,
although I suspect amending GLEP 74 may be enough since the PMS seems
to just refer to the GLEP for a description of Manifests.

This would also either rely on a stable ordering of Manifest contents
when generating it or having a separate file listing in the indirect
Manifest which corresponds to the order in the direct Manifest. For the
latter, it should also have separate entries for different package
versions so that every single distfile for every single version of said
package does not need to be fetched in order to build the direct
Manifest.

I'm imagining something along these lines:

INDIRECT true
PACKAGE category/package-version distfile1 distfile2 ... ALGO1 hash1 ALGO2 hash2 ...
PACKAGE ...

Here `ALGO1` and `hash1` correspond to the hash of the direct Manifest
containing the distfiles (and potentially other files if a repo does not
have thin-manifests enabled) and their hashes in the order specified
previously.
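A rough Python sketch of how such a file could be read and checked,
purely to test the idea. Everything here is hypothetical: the parser,
the PackageEntry type, and the rule that trailing "ALGO hash" pairs are
recognised by a fixed set of algorithm names (so a distfile literally
named "SHA512" would be ambiguous). The direct Manifest is rebuilt in
the order given on the PACKAGE line.

```python
import hashlib
from dataclasses import dataclass

# Algorithms this sketch recognises at the end of a PACKAGE line (assumption).
KNOWN_ALGOS = {"BLAKE2B": hashlib.blake2b, "SHA512": hashlib.sha512}


@dataclass
class PackageEntry:
    cpv: str                # category/package-version
    distfiles: list[str]    # order in which the direct Manifest lists them
    hashes: dict[str, str]  # ALGO -> hash of the direct Manifest text


def parse_indirect(text: str) -> dict[str, PackageEntry]:
    lines = [line.split() for line in text.splitlines() if line.strip()]
    if not lines or lines[0] != ["INDIRECT", "true"]:
        raise ValueError("not an indirect Manifest")
    entries = {}
    for tokens in lines[1:]:
        if tokens[0] != "PACKAGE" or len(tokens) < 2:
            raise ValueError(f"unexpected line: {' '.join(tokens)}")
        cpv, rest = tokens[1], tokens[2:]
        hashes = {}
        # Peel ALGO/hash pairs off the end; what remains is the ordered file list.
        while len(rest) >= 2 and rest[-2] in KNOWN_ALGOS:
            hashes[rest[-2]] = rest[-1]
            rest = rest[:-2]
        entries[cpv] = PackageEntry(cpv, rest, hashes)
    return entries


def verify_direct(entry: PackageEntry, direct_lines: dict[str, str]) -> bool:
    """Rebuild the direct Manifest in the listed order and check every hash.

    direct_lines maps a distfile name to its freshly computed Manifest line.
    """
    text = "".join(direct_lines[name] + "\n" for name in entry.distfiles)
    data = text.encode("utf-8")
    return bool(entry.hashes) and all(
        KNOWN_ALGOS[algo](data).hexdigest() == digest
        for algo, digest in entry.hashes.items()
    )
```

Because each PACKAGE line carries its own file list, only the distfiles
of the version being built need to be fetched to rebuild and check its
direct Manifest.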

The indirect Manifest as described above would be large-ish for a
package that has lots of distfiles, but likely much smaller than if each
distfile had its set of hashes stored directly.

Please correct me if there's some detail I've overlooked.

- Oskari

> The rest of the posts also go into the matter of duplication within
> EGO_SUM & the indirect Manifests: limiting the growth requires some form
> of content-addressed layout.
> 
> It's absolutely something we should get developed, but it's a lot of
> work.
> 
> The indirect Manifests still provide a hosting challenge for overlays.
> 
> -- 
> Robin Hugh Johnson
> Gentoo Linux: Dev, Infra Lead, Foundation Treasurer
> E-Mail   : robb...@gentoo.org
> GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
> GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136






[gentoo-dev] Up for grabs: Percona DB and friends

2023-07-04 Thread Sam James

Various open bugs for these and bumps pending. mysql@ has very few
members now and needs help in general, but doesn't have the resources
to maintain these.

commit ab270c702a21d69c4ebd099951ff7a79142081d1
Author: Sam James
Date:   Tue Jul 4 23:20:03 2023 +0100

    dev-db/percona-xtrabackup: drop to maintainer-needed

    Signed-off-by: Sam James

commit 632d464008bcaadb49811b5dceef1091db91b99f
Author: Sam James
Date:   Tue Jul 4 23:19:51 2023 +0100

    dev-db/percona-toolkit: drop to maintainer-needed

    Signed-off-by: Sam James

commit dbe60f7ea017a23fd79ac3b1828cd5599e4941cd
Author: Sam James
Date:   Tue Jul 4 23:19:34 2023 +0100

    dev-db/percona-server: drop to maintainer-needed




Re: [gentoo-dev] EGO_SUM (was: [gentoo-project] Gentoo Council Election 202306 ... Nominations Open in Just Over 24 Hours.)

2023-07-04 Thread Robin H. Johnson
On Tue, Jul 04, 2023 at 12:44:39PM +0200, Gerion Entrup wrote:
> just to be curious about the whole discussion. I did not follow in the
> deepest detail but what I got is:
> - EGO_SUM blows up the Manifest file, since every little Go module needs
>   to be respected. A lot of these Manifest files lead to an extremely
>   increased Portage tree size. EGO_SUM is just one example (though the
>   biggest one). Statically linked languages like Rust etc. have the same
>   problem.
> - The current solution is to prepackage all modules, put it somewhere on
>   a webserver and just manifest that file. This makes the Portage tree
>   small in size again, but requires a webserver/mirror and is thus
>   unfriendly for overlay devs.
> 
> I'm not sure if it was mentioned before but has anyone considered hash
> trees / Merkle trees for the manifest file? The idea would be to hash
> the standard manifest file a second time if it gets too big and write
> down that hash as new manifest file and leave EGO_SUM as is.
This is out-of-tree/indirect Manifests, that I proposed here, more than
a year ago:
https://marc.info/?l=gentoo-dev&m=168280762310716&w=2
https://marc.info/?l=gentoo-dev&m=165472088822215&w=2

Developing it requires PMS work in addition to package manager
development, because it introduces phases.

- primary fetch of $SRC_URI per ebuild, including indirect Manifest
- primary validation of distfiles
- secondary fetch of $SRC_URI per indirect Manifest
- secondary validation of additional distfiles

A significantly impacted use case is "emerge -f": it now needs to run
downloads twice.
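
As a toy illustration of why the four phases imply two download rounds,
here is a small Python sketch. It is not Portage code: the "mirror" is
an in-memory dict standing in for the network, the indirect Manifest is
simplified to "name sha512" lines, and all function names are invented
for this example.

```python
import hashlib


def sha512(data: bytes) -> str:
    return hashlib.sha512(data).hexdigest()


def fetch_and_validate(mirror: dict[str, bytes], expected: dict[str, str],
                       log: list[list[str]]) -> dict[str, bytes]:
    """One round: download every expected file and check its SHA512."""
    log.append(sorted(expected))  # record what this round had to download
    files = {}
    for name, digest in expected.items():
        data = mirror[name]  # stand-in for a network download
        if sha512(data) != digest:
            raise ValueError(f"checksum mismatch for {name}")
        files[name] = data
    return files


def two_phase_fetch(mirror: dict[str, bytes], in_tree: dict[str, str],
                    indirect_name: str):
    log: list[list[str]] = []
    # Primary fetch and validation: $SRC_URI entries, including the indirect Manifest.
    files = fetch_and_validate(mirror, in_tree, log)
    # The additional distfiles are only known once the indirect Manifest is parsed.
    extra = {}
    for line in files[indirect_name].decode("utf-8").splitlines():
        name, digest = line.split()
        extra[name] = digest
    # Secondary fetch and validation of the additional distfiles.
    files.update(fetch_and_validate(mirror, extra, log))
    return files, log
```

The returned log has two entries, one per round: the second round
cannot start before the first has been fetched and validated.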

The rest of the posts also go into the matter of duplication within
EGO_SUM & the indirect Manifests: limiting the growth requires some form
of content-addressed layout.

It's absolutely something we should get developed, but it's a lot of
work.

The indirect Manifests still provide a hosting challenge for overlays.

-- 
Robin Hugh Johnson
Gentoo Linux: Dev, Infra Lead, Foundation Treasurer
E-Mail   : robb...@gentoo.org
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136




Re: [gentoo-dev] EGO_SUM (was: [gentoo-project] Gentoo Council Election 202306 ... Nominations Open in Just Over 24 Hours.)

2023-07-04 Thread Gerion Entrup
On Tuesday, 4 July 2023, 09:13:30 CEST, Tim Harder wrote:
> On 2023-07-03 Mon 04:17, Florian Schmaus wrote:
> >On 30/06/2023 13.33, Eray Aslan wrote:
> >>On Fri, Jun 30, 2023 at 03:38:11AM -0600, Tim Harder wrote:
> >>>Why do we have to keep exporting the related variables that generally
> >>>cause these size issues to the environment?
> >>
> >>I really do not want to make a +1 response but this is an excellent
> >>question that we need to answer before implementing EGO_SUM.
> >
> >Could you please discuss why you make the reintroduction of EGO_SUM 
> >dependent on this question?
> 
> Just to be clear, I don't particularly care about EGO_SUM enough to gate
> its reintroduction (and don't have any leverage to do so anyway). I'm
> just tired of the circular discussions around env issues that all seem
> to avoid actual fixes, catering instead to functionality used by a
> vanishingly small subset of ebuilds in the main repo that compels a
> certain design mostly due to how portage functioned before EAPI 0.
> 
> Other than that, supporting EGO_SUM (or any other language ecosystem
> trending towards distro-unfriendly releases) is fine as long as devs are
> cognizant of how the related global-scope eclass design affects everyone
> running or working on the raw repo. I hope devs continue leveraging the
> relatively recent benchmark tooling (and perhaps more future support) to
> improve their work. Along those lines, it could be nice to see sample
> benchmark data in commit messages for large, global-scope eclass work
> just to reinforce that it was taken into account.
> 
> Tim

Hi,

just to be curious about the whole discussion. I did not follow in the
deepest detail but what I got is:
- EGO_SUM blows up the Manifest file, since every little Go module needs
  to be respected. A lot of these Manifest files lead to an extremely
  increased Portage tree size. EGO_SUM is just one example (though the
  biggest one). Statically linked languages like Rust etc. have the same
  problem.
- The current solution is to prepackage all modules, put it somewhere on
  a webserver and just manifest that file. This makes the Portage tree
  small in size again, but requires a webserver/mirror and is thus
  unfriendly for overlay devs.

I'm not sure if it was mentioned before but has anyone considered hash
trees / Merkle trees for the manifest file? The idea would be to hash
the standard manifest file a second time if it gets too big and write
down that hash as new manifest file and leave EGO_SUM as is.

When Portage tries to install the package, it can download all modules
and build the "normal" Manifest file as usual, but instead of comparing
it directly to the Manifest in the tree, it can hash it again and compare
that to the provided Manifest. With this, Portage should have more or
less the same guarantees about the validity of the source code, but the
Manifest file consists of just two hashes again.
What one would lose is the direct comparison of file names (they are
included in the "meta"-hash, though), or am I missing something?
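
A small Python sketch of that trade-off, just to illustrate the point
(the helper names and the single-hash DIST line are simplifications of
mine, not an existing Portage interface): a renamed or modified file
changes the "meta"-hash, but the hash alone only gives a yes/no answer,
while the full Manifest can name the offending entry.

```python
import hashlib


def dist_line(name: str, data: bytes) -> str:
    """Simplified Manifest line with a single hash per file."""
    return f"DIST {name} {len(data)} SHA512 {hashlib.sha512(data).hexdigest()}"


def build_manifest(files: dict[str, bytes]) -> list[str]:
    """Build the "normal" Manifest lines, sorted by file name."""
    return [dist_line(name, files[name]) for name in sorted(files)]


def meta_hash(manifest: list[str]) -> str:
    """Hash the Manifest text itself; file names are part of the input."""
    return hashlib.sha512("\n".join(manifest).encode("utf-8")).hexdigest()


def mismatched(full: list[str], rebuilt: list[str]) -> list[str]:
    """Only possible with the full Manifest at hand: name the bad entries."""
    return sorted(set(rebuilt) - set(full))
```

With only meta_hash() stored in the tree, a failed check says that some
distfile differs; finding out which one needs the full Manifest from
somewhere else.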

Gerion




Re: [gentoo-dev] EGO_SUM (was: [gentoo-project] Gentoo Council Election 202306 ... Nominations Open in Just Over 24 Hours.)

2023-07-04 Thread Tim Harder

On 2023-07-03 Mon 04:17, Florian Schmaus wrote:
>On 30/06/2023 13.33, Eray Aslan wrote:
>>On Fri, Jun 30, 2023 at 03:38:11AM -0600, Tim Harder wrote:
>>>Why do we have to keep exporting the related variables that generally
>>>cause these size issues to the environment?
>>
>>I really do not want to make a +1 response but this is an excellent
>>question that we need to answer before implementing EGO_SUM.
>
>Could you please discuss why you make the reintroduction of EGO_SUM
>dependent on this question?

Just to be clear, I don't particularly care about EGO_SUM enough to gate
its reintroduction (and don't have any leverage to do so anyway). I'm
just tired of the circular discussions around env issues that all seem
to avoid actual fixes, catering instead to functionality used by a
vanishingly small subset of ebuilds in the main repo that compels a
certain design mostly due to how portage functioned before EAPI 0.

Other than that, supporting EGO_SUM (or any other language ecosystem
trending towards distro-unfriendly releases) is fine as long as devs are
cognizant of how the related global-scope eclass design affects everyone
running or working on the raw repo. I hope devs continue leveraging the
relatively recent benchmark tooling (and perhaps more future support) to
improve their work. Along those lines, it could be nice to see sample
benchmark data in commit messages for large, global-scope eclass work
just to reinforce that it was taken into account.

Tim



[gentoo-dev] Last rites: dev-ruby/inflecto

2023-07-04 Thread Hans de Graaff
# Hans de Graaff  (2023-07-04)
# Discontinued by upstream. No reverse dependencies. Upstream
# recommends using dry-inflector. Please file a bug if you would
# like us to package this. Masked for removal on 2023-08-04.
dev-ruby/inflecto




[gentoo-dev] Last rites: dev-ruby/instantiator

2023-07-04 Thread Hans de Graaff
# Hans de Graaff  (2023-07-04)
# Archived by upstream. No reverse dependencies. Does not work with
# ruby32. Masked for removal on 2023-08-04.
dev-ruby/instantiator

