Re: Building with many cores without OOM

2024-12-09 Thread sre4ever

Hi,

Let me plop right into this discussion with no general solution and 
more things to think about. For context, I'm packaging Java things, 
and Java has historically been notoriously bad at guessing how much 
memory it can actually use on a given system. I'm not sure things are 
much better these days. This is just a reminder that the issue is 
nowhere near as easy as it looks, and that many attempts to generalize 
an approach that works well in some cases have failed.


On 2024-12-09 14:42, Guillem Jover wrote:


My thinking here was also about the general case too, say a system
that has many cores relative to its available memory, where each core
would get what we'd consider not enough memory per core


This is actually a common situation on most systems, apart from a few 
privileged developer configurations. This is especially true in 
cloud-like, VM/containerized environments, where it is much easier (i.e. 
with fewer immediate consequences) to overcommit CPU cores than RAM. 
Just look at the price list of any cloud computing provider to get an 
idea of the ratios you could start with. And then the provider may well 
lie about the actual availability of the cores they will readily bill 
you for, and you will only notice when your application grinds to a halt 
at the worst possible time (e.g. on a Black Friday, if your business is 
to sell stuff), but at least it won't get OOM-killed.


There are a few packages that worry me about how I'm going to make them 
build and run their test suites on Salsa without either timing out on 
one side, or getting immediately OOM-killed at the other end of the 
slider. One of them wants to allocate 17 GiB of RAM per test worker, 
and wants at least 3 of them. Another (Gradle) needs approximately 4 GiB 
of RAM (JVM processes alone; adding OS cache + overhead to that probably 
makes the total around 6-7 GiB) per additional worker for its build, and 
I don't know yet how much is needed for its test suites, as my current 
setup lacks the storage space necessary to run them. On my current 
low-end laptop (4 threads, 16 GiB RAM) dpkg's guesses [1] are wrong: I 
can only run a single worker if I want to keep an IDE and a web browser 
running on the side, two if I close the IDE and kill all browser tabs 
and other memory hogs. I would expect FTBFS bug reports if a 
run-of-the-mill dpkg-buildpackage command failed to build the package on 
such a system.



(assuming for
example a baseline for what dpkg-deb might require, plus build helpers
and their interpreters, and what a compiler with say an empty C, C++
or similar file might need, etc).


+1 for taking a baseline into consideration, as the first worker is 
usually significantly more expensive than additional workers. In my 
experience with Java build processes the first-worker penalty is in the 
vicinity of +35%, and it can be much higher for lighter build processes 
(but then they are lighter, and less likely to hit a limit except in 
very constrained environments).
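
To make that concrete, a back-of-the-envelope sketch (the figures and 
the MemAvailable heuristic are illustrative assumptions, not a 
proposal):

  # Derive a worker count from a first-worker baseline plus a
  # per-additional-worker cost, for a Gradle-like build.
  baseline_mib=6144     # assumed cost of the first worker
  per_worker_mib=4096   # assumed cost of each additional worker
  avail_mib=$(awk '/^MemAvailable:/ { print int($2 / 1024) }' /proc/meminfo)
  extra=$(( (avail_mib - baseline_mib) / per_worker_mib ))
  [ "$extra" -lt 0 ] && extra=0
  workers=$(( 1 + extra ))
  echo "workers=$workers"   # ~14 GiB available => 1 + 2 = 3 workers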


Another thing I would like to add is that the requirements may change 
depending on the phase of the build, especially between building and 
testing. For larger projects, building usually requires more memory but 
less parallelism than testing. You could always throw more workers at 
building, but at some point additional workers will just sit mostly 
idle, consuming RAM and other resources, as the critical path only 
allows a limited number of tasks at any given point. Testing, especially 
with larger test suites, usually allows for (and sometimes needs) much 
more parallelism.


Also worth noting, on some projects the time spent testing can be orders 
of magnitude greater than the time spent building.



This could also imply, alternatively or in addition, providing a tool 
or adding some querying logic in an existing tool (in the dpkg 
toolset) to gather that information which the packaging could use, or…


Additional tooling may help a bit, but I think what would really help at 
that point would be to write and publish guidelines relevant to the 
technology being packaged, based on empirical evidence collected while 
fine-tuning the build or packaging, and kept reasonably up to date (i.e. 
never more than 2-3 years old) with the current state of technologies 
and projects. Salsa (or other CI) pipelines could be instrumented to 
provide some data, and once the guidelines cover a majority of packages 
you will have better insight into what, if anything, needs to be done 
with the tooling.



[1]: 
https://salsa.debian.org/jpd/gradle/-/blob/upgrade-to-8.11.1-wip/debian/rules#L49


--
Julien Plissonneau Duquène



Re: Building with many cores without OOM

2024-12-09 Thread Guillem Jover
Hi!

On Thu, 2024-12-05 at 09:23:24 +0100, Helmut Grohne wrote:
> On Wed, Dec 04, 2024 at 02:03:29PM +0100, Guillem Jover wrote:
> > On Thu, 2024-11-28 at 10:54:37 +0100, Helmut Grohne wrote:
> > > For one thing, I propose extending debhelper to provide
> > > --min-ram-per-parallel-core as that seems to be the most common way to
> > > do it. I've proposed
> > > https://salsa.debian.org/debian/debhelper/-/merge_requests/128
> > > to this end.
> > 
> > To me this looks too high in the stack (and too Linux-specific :).

> I don't think being Linux-specific is necessarily bad here and note that
> the /proc interface is also supported by Hurd (I actually checked on a
> porter box). The problem we are solving here is a practical one and the
> solution we pick now probably is no longer relevant in twenty years.
> That's about the time frame for which I expect Linux to be the preferred
> kernel used by Debian (could be longer, but unlikely shorter).

See below for the portability part.

> > I think adding this in dpkg-buildpackage itself would make most sense
> > to me, where it is already deciding what amount of parallelism to use
> > when specifying «auto» for example.
> > 
> > Given that this would be an outside-in interface, I think this would
> > imply declaring these parameters say as debian/control fields for example,
> > or some other file to be parsed from the source tree.
> 
> I find that outside-in vs inside-out distinction quite useful, but I
> actually prefer an inside-out approach. You detail that picking a
> sensible ram-per-core value is environment-specific. Others gave
> examples of how build-systems address this in ways of specifying linker
> groups with reduced parallelism and you go into detail of how the
> compression parallelism is limited based on system ram already. Given
> all of these, I no longer am convinced that reducing the package-global
> parallelism is the desired solution. Rather, each individual step may
> benefit from its own limiting and that's what is already happening in
> the archive. It is that inside-out approach that we see in debian/rules
> in some packages. What I now find missing is better tooling to support
> this inside-out approach.

Not all outside-in interfaces are made equal; as I hinted in that
other mail, some are (let's call them) permeable, where the build
driver performs some default setup or data gathering that it does
not necessarily use itself, and which can be easily overridden by
the inner packaging files.

I don't have a strong opinion on this case though. My initial reaction
was that, because dpkg-buildpackage is already trying to provide a good
default for the number of parallel jobs to use, it seemed like a good
global place to potentially improve that number to influence all users,
if, say, the only thing needed is a declarative hint from the packaging
itself. This being a permeable interface also means the inner processes
could still ignore or tune that value further, or whatever (except for
the --jobs-force option, which is harder to revert from inside).

But I guess it depends on whether we can have a better general heuristic
in the outer parallel job number computation. Or whether for the cases
that we need to tune, any such general heuristic would serve no actual
purpose and all/most of them would need to be overridden anyway.

> > My main concerns would be:
> > 
> >   * Portability.
> 
> I am not concerned. The parallelism limit is a mechanism to increase
> efficiency of builder deployments and not much more. The portable
> solution is to stuff in more RAM or supply a lower parallel value
> outside-in. A 90% solution is more than good enough here.

Right, I agree with the above, because this should be considered an
opportunistic quality-of-life improvement, where the user can always
manually override it if the tool does not get it right. My concern
was about the above debhelper MR failing hard on several conditions
where it should simply disable the improved clamping. See for
example the parallel auto handling in dpkg-buildpackage (after the
«run_hook('preinit')»), or lib/dpkg/compress.c:filter_xz_get_memlimit()
and lib/dpkg/meminfo.c:meminfo_get_available_from_file() for the
dpkg-deb one, where these should gracefully fall back to less accurate
methods if they cannot gather the needed information.

(Now that you mention it, I should probably enable Hurd for the
/proc/meminfo codepath. :)
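
To illustrate the kind of graceful fallback I mean (a sketch of the
pattern only, not dpkg's actual code):

  # Prefer the kernel's MemAvailable estimate; on any failure report
  # "unknown" (0) so the caller keeps the unclamped default instead of
  # failing hard.
  get_avail_mib() {
      if [ -r /proc/meminfo ]; then
          awk '/^MemAvailable:/ { print int($2 / 1024); ok = 1 }
               END { exit !ok }' /proc/meminfo && return
      fi
      echo 0
  }

  jobs=$(nproc)
  avail=$(get_avail_mib)
  if [ "$avail" -gt 0 ] && [ $(( avail / 2048 )) -lt "$jobs" ]; then
      jobs=$(( avail / 2048 ))          # assumed 2 GiB-per-job clamp
      [ "$jobs" -lt 1 ] && jobs=1
  fi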

> >   * Whether this is a local property of the package (so that the
> > maintainer has the needed information to decide on a value, or
> > whether this depends on the builder's setup, or perhaps both).
> 
> All of what I wrote in this thread thus far assumed that this was a
> local property. That definitely is an oversimplification of the matter
> as an upgraded clang, gcc, ghc or rustc has historically yielded
> increased RAM consumption. The packages affected tend to be sensitive to
> changes in these packages in other ways, so they generally know quite
> closely what version of dependencies will be in use and can tailor
> their guesses. [...]

Re: Building with many cores without OOM

2024-12-05 Thread Helmut Grohne
Hi Guillem and others,

Thanks for your extensive reply and the followup clarifying the
inside-out and outside-in distinction.

On Wed, Dec 04, 2024 at 02:03:29PM +0100, Guillem Jover wrote:
> On Thu, 2024-11-28 at 10:54:37 +0100, Helmut Grohne wrote:
> > I think this demonstrates that we probably have something between 10 and
> > 50 packages in unstable that would benefit from a generic parallelism
> > limit based on available RAM. Do others agree that this is a problem
> > worth solving in a more general way?
> 
> I think the general idea makes sense, yes.

Given the other replies on this thread, I conclude that we have rough
consensus on this being a problem worth solving (expending effort and
code and later maintenance cost on).

> > For one thing, I propose extending debhelper to provide
> > --min-ram-per-parallel-core as that seems to be the most common way to
> > do it. I've proposed
> > https://salsa.debian.org/debian/debhelper/-/merge_requests/128
> > to this end.
> 
> To me this looks too high in the stack (and too Linux-specific :).

Let me take the opportunity to characterize this proposal as inside-out,
given your distinction.

I don't think being Linux-specific is necessarily bad here and note that
the /proc interface is also supported by Hurd (I actually checked on a
porter box). The problem we are solving here is a practical one and the
solution we pick now probably is no longer relevant in twenty years.
That's about the time frame for which I expect Linux to be the preferred
kernel used by Debian (could be longer, but unlikely shorter).

> I think adding this in dpkg-buildpackage itself would make most sense
> to me, where it is already deciding what amount of parallelism to use
> when specifying «auto» for example.
> 
> Given that this would be an outside-in interface, I think this would
> imply declaring these parameters say as debian/control fields for example,
> or some other file to be parsed from the source tree.

I find that outside-in vs inside-out distinction quite useful, but I
actually prefer an inside-out approach. You detail that picking a
sensible ram-per-core value is environment-specific. Others gave
examples of how build-systems address this in ways of specifying linker
groups with reduced parallelism and you go into detail of how the
compression parallelism is limited based on system ram already. Given
all of these, I no longer am convinced that reducing the package-global
parallelism is the desired solution. Rather, each individual step may
benefit from its own limiting and that's what is already happening in
the archive. It is that inside-out approach that we see in debian/rules
in some packages. What I now find missing is better tooling to support
this inside-out approach.

> My main concerns would be:
> 
>   * Portability.

I am not concerned. The parallelism limit is a mechanism to increase
efficiency of builder deployments and not much more. The portable
solution is to stuff in more RAM or supply a lower parallel value
outside-in. A 90% solution is more than good enough here.

>   * Whether this is a local property of the package (so that the
> maintainer has the needed information to decide on a value, or
> whether this depends on the builder's setup, or perhaps both).

All of what I wrote in this thread thus far assumed that this was a
local property. That definitely is an oversimplification of the matter
as an upgraded clang, gcc, ghc or rustc has historically yielded
increased RAM consumption. The packages affected tend to be sensitive to
changes in these packages in other ways, so they generally know quite
closely what version of dependencies will be in use and can tailor their
guesses. So while this is a non-local property in principle, my
expectation is that treating it as if it was local is good enough for a
90% solution.

>   * We might need a way to percolate these parameters to children of
> the build/test system (as Paul has mentioned), where sometimes
> you cannot specify this directly in the parent. Setting some
> standardized environment variables would seem sufficient, I think,
> but while all this seems kind of optional, this goes a bit into
> reliance on dpkg-buildpackage being the only supported build
> entry point. :)

To me, this reads as an argument for using an inside-out approach.

Given all of the other replies (on-list and off-list), my vision of how
I'd like to see this approached has changed. I see more and more value
in leaving this in close control of the package maintainer (i.e.
inside-out) to the point where different parts of the build may use
different limits.

How about instead we try to extend coreutils' nproc by adding more
options to it?

  --assume-units=N
  --max-units=N
  --min-ram-per-unit=Z

Then, we could continue to use buildopts.mk and other mechanisms to
extract the passed parallel value from DEB_BUILD_OPTIONS as before and
run it through an nproc invocation for passing it down to a build
system.
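
For illustration, usage from debian/rules could then look roughly like
this (a sketch only: --max-units and --min-ram-per-unit are the
hypothetical options proposed above, not existing coreutils flags, and
the 2G figure is made up):

  # debian/rules fragment (hypothetical nproc options)
  include /usr/share/dpkg/buildopts.mk

  JOBS := $(shell nproc --max-units=$(or $(DEB_BUILD_OPTION_PARALLEL),1) \
                        --min-ram-per-unit=2G)

  override_dh_auto_build:
  	dh_auto_build -- -j$(JOBS)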

Re: Building with many cores without OOM

2024-12-04 Thread Simon Richter

Hi,

On 12/4/24 23:37, Stefano Rivera wrote:


I don't think this can be entirely outside-in: the package needs to say
how much RAM it needs per core, to be able to calculate the appropriate
degree of parallelism. So, we have to declare a value that then gets
calculated against the proposed parallelism.


This.

Also, the Ninja build system provides resource pools -- typical packages 
use only the "CPU" pool, and reserve one CPU per task started, but it is 
possible to also define a "memory" pool and declare tasks with memory 
reservations, which reduces parallel execution only while 
memory-intensive tasks are running.
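
A minimal sketch of what that can look like in a build.ninja file (pool
support is standard Ninja; the pool name and depth here are arbitrary):

  # Cap RAM-hungry link steps at 2 concurrent jobs, while ordinary
  # compiles keep using the full parallelism level.
  pool heavy_memory
    depth = 2

  rule link_big
    command = $cxx -o $out $in
    pool = heavy_memory

CMake can generate such pools for its Ninja generator via the JOB_POOLS
global property and e.g. CMAKE_JOB_POOL_LINK.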


For LLVM specifically, GettingStarted.md documents

- -DLLVM_PARALLEL_{COMPILE,LINK,TABLEGEN}_JOBS=N — Limit the number
  of compile/link/tablegen jobs running in parallel at the same
  time. This is especially important for linking since linking can
  use lots of memory. If you run into memory issues building LLVM,
  try setting this to limit the maximum number of compile/link/
  tablegen jobs running at the same time.

How we arrive at N is left as an exercise for the reader though.
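
One plausible heuristic (my assumption, not something the LLVM docs
prescribe) is one link job per 4 GiB of available RAM, capped at the
core count:

  avail_gib=$(awk '/^MemAvailable:/ { print int($2 / 1048576) }' /proc/meminfo)
  n=$(( avail_gib / 4 )); [ "$n" -lt 1 ] && n=1
  cores=$(nproc); [ "$n" -gt "$cores" ] && n=$cores
  cmake -G Ninja -DLLVM_PARALLEL_LINK_JOBS=$n ../llvm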

   Simon



Re: Building with many cores without OOM

2024-12-04 Thread Guillem Jover
Hi!

On Wed, 2024-12-04 at 14:37:45 +, Stefano Rivera wrote:
> Hi Guillem (2024.12.04_13:03:29_+)
> > > Are there other layers that could reasonably be used to implement a more
> > > general form of parallelism limiting based on system RAM? Ideally, we'd
> > > consolidate these implementations into fewer places.
> > 
> > I think adding this in dpkg-buildpackage itself would make most sense
> > to me, where it is already deciding what amount of parallelism to use
> > when specifying «auto» for example.
> > 
> > Given that this would be an outside-in interface, I think this would
> > imply declaring these parameters say as debian/control fields for example,
> > or some other file to be parsed from the source tree.
> 
> I don't think this can be entirely outside-in: the package needs to say
> how much RAM it needs per core, to be able to calculate the appropriate
> degree of parallelism. So, we have to declare a value that then gets
> calculated against the proposed parallelism.

I _think_ we are saying the same thing, and there might just be a
mismatch in nomenclature (most probably stemming from me being
non-native and using/reusing terms incorrectly)? So let me clarify what
I meant; otherwise I might be misunderstanding your comment, and I'd
appreciate a clarification. :)

When dealing with dpkg packaging build interfaces, in my mind there are
two main models:

  * outside-in: where the build driver (dpkg-buildpackage in this case)
can reach for all needed information and then do stuff based on that,
or pass that information down into debian/rules process hierarchy,
or to tools it invokes itself (say dpkg-genchanges); another such
interface could be R³ where trying to change the default from
debian/rules is already too late, as that's managed by the
build driver.

  * inside-out: where debian/rules, files sourced from it, or tools
invoked from it, fully control the outcome of the operation, and
then dpkg-buildpackage might not be able to tell beforehand
exactly what will happen and will need to pick up the results after
the fact, for example that would include dpkg-deb or dpkg-distaddfile
being currently fully delegated to debian/rules, and then
dpkg-buildpackage, et al. picking that up through debian/files;
debhelper would be a similar paradigm.

(With some exceptions, I consider that the bulk of our build interfaces
are unfortunately mostly inside-out.)

For this particular case, I'd envision the options could look something
like:

  * outside-in:

- We add a new field, say (with this not very good name that would
  need more thought) Build-Parallel-Mem-Limit-Per-Core for the
  debian/control source stanza; then dpkg-buildpackage would be able
  to check the current system memory, and clamp the number of
  computed parallel jobs based on the number of system cores, the
  number of specified parallel jobs and the limit from the above
  field. This would then be passed down via the usual parallel=
  option in DEB_BUILD_OPTIONS (a small sketch of this variant
  follows after this list).

- If we needed the package to provide a dynamic value depending on
  other external factors outside its control, although there's no
  current precedent that I'm aware of, and it seems a bit ugly, I
  guess we could envision some kind of new entry point and a way to
  let the build drivers know they need to call it, for example a
  debian/rules target that gets called and generates some file, or
  a program under debian/ that prints some value, which
  dpkg-buildpackage could use in a similar way as the above point.

  * inside-out:

For this, there could be multiple variants, where a build driver
like dpkg-buildpackage is completely out of the picture, and where
we might end up with parallel settings that are out of sync
between DEB_BUILD_OPTIONS parallel= and the inner one, for example:

- One could be the initially proposed buildopts.mk extension,

- Add a new dpkg-something helper or a new command to an existing
  tool, that could compute the value that debian/rules would use
  or pass further down,

- debhelper/debputy/etc does it, but that leaves out non-helper
  using packages, which was one of the initial concerns from
  Helmut.
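
For concreteness, the first outside-in variant sketched above could
look like this (the field name is the provisional one from that list,
and the clamping formula is only an illustration):

  # debian/control (hypothetical field)
  Source: gradle
  Build-Parallel-Mem-Limit-Per-Core: 4096

  # dpkg-buildpackage would then clamp, roughly:
  #   jobs = min(requested_jobs, ncores, mem_available_mib / 4096)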

Hope this clarifies.

Thanks,
Guillem



Re: Building with many cores without OOM

2024-12-04 Thread Stefano Rivera
Hi Guillem (2024.12.04_13:03:29_+)
> > Are there other layers that could reasonably be used to implement a more
> > general form of parallelism limiting based on system RAM? Ideally, we'd
> > consolidate these implementations into fewer places.
> 
> I think adding this in dpkg-buildpackage itself would make most sense
> to me, where it is already deciding what amount of parallelism to use
> when specifying «auto» for example.
> 
> Given that this would be an outside-in interface, I think this would
> imply declaring these parameters say as debian/control fields for example,
> or some other file to be parsed from the source tree.

I don't think this can be entirely outside-in: the package needs to say
how much RAM it needs per core, to be able to calculate the appropriate
degree of parallelism. So, we have to declare a value that then gets
calculated against the proposed parallelism.

Stefano

-- 
Stefano Rivera
  http://tumbleweed.org.za/
  +1 415 683 3272


Re: Building with many cores without OOM

2024-12-04 Thread Guillem Jover
Hi!

On Wed, 2024-12-04 at 14:03:30 +0100, Guillem Jover wrote:
> On Thu, 2024-11-28 at 10:54:37 +0100, Helmut Grohne wrote:
> > Are there other layers that could reasonably be used to implement a more
> > general form of parallelism limiting based on system RAM? Ideally, we'd
> > consolidate these implementations into fewer places.
> 
> I think adding this in dpkg-buildpackage itself would make most sense
> to me, where it is already deciding what amount of parallelism to use
> when specifying «auto» for example.
> 
> Given that this would be an outside-in interface, I think this would
> imply declaring these parameters say as debian/control fields for example,
> or some other file to be parsed from the source tree.
> 
> My main concerns would be:
> 
>   * Portability.
>   * Whether this is a local property of the package (so that the
> maintainer has the needed information to decide on a value, or
> whether this depends on the builder's setup, or perhaps both).
>   * We might need a way to percolate these parameters to children of
> the build/test system (as Paul has mentioned), where sometimes
> you cannot specify this directly in the parent. Setting some
> standardized environment variables would seem sufficient, I think,
> but while all this seems kind of optional, this goes a bit into
> reliance on dpkg-buildpackage being the only supported build
> entry point. :)

Ah, and I forgot to mention that, for example, dpkg-deb (via libdpkg)
already implements this kind of parallelism limiter based on system
memory when compressing to xz. But in that case we are assisted by
liblzma telling us the amount of memory expected to be used, which
makes it easier to clamp the parallelism based on that. Unfortunately
I'm not sure, in general, that we have this kind of information
available, and my assumption is that in many cases we might end up
deciding on clamping factors from current observations of current
implementation details, which might need manual tracking and
adjustment going forward.
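
The same behaviour can be seen from the command line with xz itself,
which scales the thread count down to fit a given memory limit (and
scales the settings down further if even a single thread would exceed
it):

  # Use all cores, but stay within 50% of system RAM; xz reduces the
  # number of threads as needed to honour the limit.
  xz --threads=0 --memlimit-compress=50% big.tar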

Thanks,
Guillem



Re: Building with many cores without OOM

2024-12-04 Thread Guillem Jover
Hi!

On Thu, 2024-11-28 at 10:54:37 +0100, Helmut Grohne wrote:
> I am one of those who builds a lot of different packages with different
> requirements and found that picking a good parallel=... value in
> DEB_BUILD_OPTIONS is hard. Go too low and your build takes very long. Go
> too high and you swap until the OOM killer terminates your build. (Usage
> of choom recommended in any case.)

> I think this demonstrates that we probably have something between 10 and
> 50 packages in unstable that would benefit from a generic parallelism
> limit based on available RAM. Do others agree that this is a problem
> worth solving in a more general way?

I think the general idea makes sense, yes.

> For one thing, I propose extending debhelper to provide
> --min-ram-per-parallel-core as that seems to be the most common way to
> do it. I've proposed
> https://salsa.debian.org/debian/debhelper/-/merge_requests/128
> to this end.

To me this looks too high in the stack (and too Linux-specific :).

> Unfortunately, the affected packages tend to not just be big, but also
> so special that they cannot use dh_auto_*. As a result, I also looked at
> another layer to support this and found /usr/share/dpkg/buildopts.mk,
> which sets DEB_BUILD_OPTION_PARALLEL by parsing DEB_BUILD_OPTIONS. How
> about extending this file with a mechanism to reduce parallelism? I am
> attaching a possible extension to it to this mail to see what you think.
> Guillem, is that something you consider including in dpkg?

I'm not a huge fan of the make fragment files, as make programming is
rather brittle, and it easily causes lots of processes to spawn if you
look at it the wrong way (ideally I'd really like to be able to get
rid of them once we can rely on something else!). I think we could
consider adding it there, but as a last resort option, if there's no
other better place.

> Are there other layers that could reasonably be used to implement a more
> general form of parallelism limiting based on system RAM? Ideally, we'd
> consolidate these implementations into fewer places.

I think adding this in dpkg-buildpackage itself would make most sense
to me, where it is already deciding what amount of parallelism to use
when specifying «auto» for example.

Given that this would be an outside-in interface, I think this would
imply declaring these parameters say as debian/control fields for example,
or some other file to be parsed from the source tree.

My main concerns would be:

  * Portability.
  * Whether this is a local property of the package (so that the
maintainer has the needed information to decide on a value, or
whether this depends on the builder's setup, or perhaps both).
  * We might need a way to percolate these parameters to children of
the build/test system (as Paul has mentioned), where sometimes
you cannot specify this directly in the parent. Setting some
standardized environment variables would seem sufficient, I think,
but while all this seems kind of optional, this goes a bit into
reliance on dpkg-buildpackage being the only supported build
entry point. :)

> As I am operating build daemons (outside Debian), I note that I have to
> limit their cores below what is actually available to avoid OOM
> kills and even that is insufficient in some cases. In adopting such a
> mechanism, we could generally raise the core count per buildd and
> consider OOM a problem of the package to be fixed by applying a sensible
> parallelism limit.

See above, on whether this is really package or setup dependent.

Thanks,
Guillem



Re: Building with many cores without OOM

2024-11-29 Thread Paul Gevers

Hi Helmut,

On 11/29/24 07:59, Helmut Grohne wrote:

On Thu, Nov 28, 2024 at 02:39:36PM +0100, Paul Gevers wrote:

And doing it in a way that can be reused by how autopkgtests are run would
maybe be good too.


Can you clarify what you mean here? There is autopkgtest
--build-parallel and my understanding is that as packages lower the
requested parallelism by themselves, this aspect of autopkgtest would be
implicitly covered by the proposals at hand. Do you refer to test
parallelism here? Is there any setting or flag to configure that that I
may have missed?


I'm not talking about how /usr/bin/autopkgtest is called (and thus (?) 
not about a package build if that's needed), but about how the tests 
themselves should be dealing with parallelism. I recall some tests 
running out of memory (https://ci.debian.net/status/reject_list/ 
mentions at least two currently).


Paul



Re: Building with many cores without OOM

2024-11-29 Thread Niels Thykier

Helmut Grohne:

Hi Guillem and other developers,

I am one of those who builds a lot of different packages with different
requirements and found that picking a good parallel=... value in
DEB_BUILD_OPTIONS is hard. Go too low and your build takes very long. Go
too high and you swap until the OOM killer terminates your build. (Usage
of choom recommended in any case.)

[...]

I think this demonstrates that we probably have something between 10 and
50 packages in unstable that would benefit from a generic parallelism
limit based on available RAM. Do others agree that this is a problem
worth solving in a more general way?

For one thing, I propose extending debhelper to provide
--min-ram-per-parallel-core as that seems to be the most common way to
do it. I've proposed
https://salsa.debian.org/debian/debhelper/-/merge_requests/128
to this end.

Unfortunately, the affected packages tend to not just be big, but also
so special that they cannot use dh_auto_*. As a result, I also looked at
another layer to support this and found /usr/share/dpkg/buildopts.mk,
which sets DEB_BUILD_OPTION_PARALLEL by parsing DEB_BUILD_OPTIONS. How
about extending this file with a mechanism to reduce parallelism? I am
attaching a possible extension to it to this mail to see what you think.
Guillem, is that something you consider including in dpkg?



My suggestion would be to have `dpkg` provide a "RAM-restrained" 
parallelization limit next to the default one. This is similar to how 
the `debhelper` one works (the option only applies to the dh_auto_* 
steps). This might be what you are proposing here (I was unsure).
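
For reference, my reading of the proposed interface in a rules file
would be something like this (a sketch: the option name is from the MR,
while the value and its unit are my guesses):

  #!/usr/bin/make -f
  %:
  	dh $@

  override_dh_auto_build:
  	dh_auto_build --min-ram-per-parallel-core=2048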


Generally, the RAM limit only applies to the upstream build side of 
things and this can be relevant for some packages. As an example, 
`firefox-esr` builds 20+ packages, so there is value in having "post 
processing" (dh_install ... dh_builddeb) happen with a higher degree 
of parallelization than the upstream build part to keep the build time 
as low as possible.


I think two parallelization limits would be sufficient for most cases 
(at least from a cost/benefit PoV).



Are there other layers that could reasonably be used to implement a more
general form of parallelism limiting based on system RAM? Ideally, we'd
consolidate these implementations into fewer places.

[...]

Helmut


We do have some custom implementations of debhelper build systems around 
in the archive [1] that in theory could use this (though they are 
probably not worth hunting down - more of "update if they become a 
problem"). Then there is `debputy`, but I can have a look at that later, 
after I have reviewed the patch for `debhelper`.


Best regards,
Niels

[1] Any `dh-sequence-X` sequences that replace `dh_auto_*` commands 
would likely fall into this category.







Re: Building with many cores without OOM

2024-11-28 Thread Helmut Grohne
Hi Paul,

On Thu, Nov 28, 2024 at 02:39:36PM +0100, Paul Gevers wrote:
> On 11/28/24 13:01, Chris Hofstaedtler wrote:
> > IMO it would be good to support dealing with this earlier than
> > later.
> 
> And doing it in a way that can be reused by how autopkgtests are run would
> maybe be good too.

Can you clarify what you mean here? There is autopkgtest
--build-parallel and my understanding is that as packages lower the
requested parallelism by themselves, this aspect of autopkgtest would be
implicitly covered by the proposals at hand. Do you refer to test
parallelism here? Is there any setting or flag to configure that that I
may have missed?

Helmut



Re: Building with many cores without OOM

2024-11-28 Thread Paul Gevers

Hi Helmut,

On 11/28/24 13:01, Chris Hofstaedtler wrote:

IMO it would be good to support dealing with this earlier than
later.


And doing it in a way that can be reused by how autopkgtests are run 
would maybe be good too.


Paul



Re: Building with many cores without OOM

2024-11-28 Thread Holger Levsen
On Thu, Nov 28, 2024 at 10:54:37AM +0100, Helmut Grohne wrote:
> I think this demonstrates that we probably have something between 10 and
> 50 packages in unstable that would benefit from a generic parallelism
> limit based on available RAM. Do others agree that this is a problem
> worth solving in a more general way?

yes.


-- 
cheers,
Holger

 ⢀⣴⠾⠻⢶⣦⠀
 ⣾⠁⢠⠒⠀⣿⡁  holger@(debian|reproducible-builds|layer-acht).org
 ⢿⡄⠘⠷⠚⠋⠀  OpenPGP: B8BF54137B09D35CF026FE9D 091AB856069AAA1C
 ⠈⠳⣄

Historians have a word for Germans who joined the Nazi party, not because they
hated Jews,  but out of hope for  restored patriotism,  or a sense of economic
anxiety,  or a hope  to preserve their  religious values,  or dislike of their
opponents,  or raw  political opportunism,  or convenience,  or ignorance,  or 
greed.
That word is "Nazi". Nobody cares about their motives anymore.




Re: Building with many cores without OOM

2024-11-28 Thread Chris Hofstaedtler
* Helmut Grohne  [241128 10:59]:
> I think this demonstrates that we probably have something between 10 and
> 50 packages in unstable that would benefit from a generic parallelism
> limit based on available RAM. Do others agree that this is a problem
> worth solving in a more general way?

Yes. Looking at hardware trends, machines will be more
RAM-constrained per CPU core than ever.

IMO it would be good to support dealing with this earlier than
later.

  Chris