Re: swap on ZRAM, zswap, and Rust was: Better interactivity in low-memory situations

2019-09-18 Thread Chris Murphy
On Wed, Sep 18, 2019 at 6:57 PM Tom Seewald  wrote:
>
> Hi Chris,
>
> Does zswap actually keep the data compressed when the DRAM-based swap is 
> full, and it writes to the spill-over non-volatile swap device?
>
> I'm not an expert on this at all, however my understanding was that zswap 
> must decompress the data before it writes to the backing swap.  But perhaps I 
> am misunderstanding the purpose of zswap_writeback_entry()[1] and/or what it 
> does.
>
> [1] 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/mm/zswap.c#n828

I don't know. But based on the tests I mention upthread, I don't see
how uncompressed pages could be getting swapped to disk that quickly.
Then again, those times are better than even a 2:1 compression ratio
would explain, and 2:1 is the best that zbud/lz4 can achieve.


--
Chris Murphy
___
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: 
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org


Re: swap on ZRAM, zswap, and Rust was: Better interactivity in low-memory situations

2019-09-18 Thread Tom Seewald
Hi Chris,

Does zswap actually keep the data compressed when the DRAM-based swap is full, 
and it writes to the spill-over non-volatile swap device?

I'm not an expert on this at all, however my understanding was that zswap must 
decompress the data before it writes to the backing swap.  But perhaps I am 
misunderstanding the purpose of zswap_writeback_entry()[1] and/or what it does.

[1] 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/mm/zswap.c#n828


Re: swap on ZRAM, zswap, and Rust was: Better interactivity in low-memory situations

2019-09-18 Thread Chris Murphy
Zbyszek,

Do you have any advice on how to assess 'swap on ZRAM' versus 'zswap'
by default for Fedora Workstation? They're really too similar from a
user point of view, so I think it comes down to the technical
arguments.

1a. 'swap on ZRAM' compresses only that which goes to the ZRAM device
1b. zswap compresses everything whether it goes to the memory pool or
swap on disk.
2a. 'swap on ZRAM' must be configured to give priority to the ZRAM
device; once full, disk swap (if present) is used
2b. zswap anticipates the future usage of data, favoring the memory or
disk swap locations accordingly

They both appear equally easy to enable by default for clean installs
and upgrades.

I'd say 'swap on ZRAM' is well suited for the cases where there's no
existing swap partition, and low memory devices. Whereas zswap is
better suited for average to higher end systems, where the main goal
is swap avoidance, but where zswap can help moderate the worst
performance effects of the transition to disk based swap.

It seems premature to drop the creation of a swap partition at
installation time. I think that'd be unexpected by most users, and it
might have consequences beyond the (unsupported) hibernation use
case.

So my assessment, at this point, would be to recommend zswap for
Fedora Workstation. Likely using zbud/lz4. Maybe by Fedora 33 there
will be more confidence and testing done on z3fold.
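
For reference, a minimal sketch of the kernel command line that would
correspond to that recommendation. The zswap.* names are the zswap module
parameters as I understand them from the kernel documentation; the helper
function is purely illustrative, and the 20% pool size is simply the value
used in the tests upthread, not a tuned default.

```python
def zswap_cmdline(compressor="lz4", zpool="zbud", max_pool_percent=20):
    """Build the zswap.* kernel command-line arguments as one string."""
    if not 1 <= max_pool_percent <= 100:
        raise ValueError("max_pool_percent must be between 1 and 100")
    params = {
        "zswap.enabled": 1,
        "zswap.compressor": compressor,
        "zswap.zpool": zpool,
        "zswap.max_pool_percent": max_pool_percent,
    }
    # Dicts keep insertion order, so the output order is stable.
    return " ".join(f"{key}={value}" for key, value in params.items())


print(zswap_cmdline())
```

Switching to z3fold later would then only be a matter of passing
zpool="z3fold", assuming that allocator is built into the kernel.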

-- 
Chris Murphy


Re: swap on ZRAM, zswap, and Rust was: Better interactivity in low-memory situations

2019-09-18 Thread Zbigniew Jędrzejewski-Szmek
Hi,

thank you for all the testing and comparisons between different
approaches. It looks really interesting.

> The ideal scenario is to get everyone on the same page, and so far it
> looks like systemd's zram-generator, built in Rust, meets all the
> requirements. That needs to be confirmed, but also right now there's a
> small problem, it's not working. So we kinda need someone familiar
> with Rust and systemd to take this on, if we want to use the same
> thing everywhere.
> https://github.com/systemd/zram-generator/issues/4

For a while, the only feedback I had for zram-generator was from
people interested in rust. It's great that somebody is giving it a go ;)

I think the report in that issue is a slight exaggeration: IIUC, this
failure only occurs if zram-generator.conf is created and
'systemctl daemon-reload' and 'systemctl start swap.target' are called
on a running system. After reboot, things would still work. Obviously,
it would be better to handle this case too. I have now pushed some
commits to the master branch that close all four open issues, and this
case should be handled now as well.
If anything is wrong, please report it here or in bugzilla.

I'll tag a new version with those changes in a few days if nothing
else pops up.

Zbyszek

PS. I had a really surprising failure mode: on a VM with 2GB RAM (as
shown by free), the generator was doing nothing and simply exiting
with no error. It turns out that the machine had "maximum allocation"
bigger than "current allocation", and for a brief moment during boot
/proc/meminfo would report more memory. Took me a while to figure this
one out.
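
A minimal sketch of how such a silent no-op can arise, assuming the
generator only acts when the reported memory is at or below some limit.
The function names, the 2048 MiB limit, and the sample values are all
illustrative and are not taken from zram-generator itself.

```python
def mem_total_kib(meminfo_text):
    """Return the MemTotal value (in KiB) from /proc/meminfo-style text."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            # The line looks like: "MemTotal:        2097152 kB"
            return int(line.split()[1])
    raise ValueError("MemTotal not found")


def should_create_zram(meminfo_text, limit_mib=2048):
    """Hypothetical check: only set up zram when RAM is at or below the limit."""
    return mem_total_kib(meminfo_text) // 1024 <= limit_mib


# Same VM at two moments (made-up numbers): more memory reported briefly
# during boot, then the steady 2 GiB that free shows later.
during_boot = "MemTotal:        4194304 kB\nMemFree:         3000000 kB\n"
after_boot = "MemTotal:        2097152 kB\nMemFree:         1000000 kB\n"
print(should_create_zram(during_boot))
print(should_create_zram(after_boot))
```

A generator runs once, early in boot, so it sees only the first value and
has no way to notice that the answer changes a moment later.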

Zbyszek


Re: swap on ZRAM, zswap, and Rust was: Better interactivity in low-memory situations

2019-09-14 Thread Chris Murphy
On Fri, Aug 30, 2019 at 1:55 PM Chris Murphy  wrote:
>
> Hi,
> This is yet another follow-up for this thread:
> https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/message/XUZLHJ5O32OX24LG44R7UZ2TMN6NY47N/



(Benchmarks being fraught with peril, synthetic benchmarks even more
fraught with peril but at least their bias is obvious rather than
obfuscated behind unknown cutesy attempts to simulate an environment no
one has.)

This old bash fork bomb example fails (Fedora 31, earlier versions not tested)
$ :(){ :|:& };:
[ 1765.728408] cgroup: fork rejected by pids controller in
/user.slice/user-1000.slice/session-3.scope

So use 'munch' instead
https://gist.github.com/n3rve/7897c8ce1e17c22dc17a1df1b4e645f4
kernel 5.2.13

$ time ./munch

Measuring the time it takes to fill all memory and swap. I'm waiting
about 1 minute between each run, with no other activity happening
during that time.

The typical outcome is:
"Allocated 14729 MB
Killed"

8GiB RAM, 8GiB swap on SSD plain partition

1. 0m43s
2. 0m47s
3. 0m46s

now enable zswap, lz4/zbud, with a 20% pool

1. 0m10s
2. 0m10s
3. 0m11s

now disable all of that (zswap and the swap on SSD)
enable swap on ZRAM at 1:1 ratio with installed RAM, using lz4

1. 0m11s
2. 0m10s
3. 0m11s
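
To put the three sets of numbers side by side, here is a small sketch that
averages the runs and expresses each configuration relative to the plain
swap partition. The timings are copied from the lists above; the labels
and the script itself are only illustrative.

```python
from statistics import mean

# Wall-clock seconds for each of the three runs, copied from the lists above.
runs = {
    "swap partition on SSD": [43, 47, 46],
    "zswap (lz4/zbud, 20% pool)": [10, 10, 11],
    "swap on ZRAM (lz4)": [11, 10, 11],
}

baseline = mean(runs["swap partition on SSD"])
for name, seconds in runs.items():
    avg = mean(seconds)
    print(f"{name}: mean {avg:.1f}s, {baseline / avg:.2f}x vs plain partition")
```

Both compressed setups come out a bit more than four times faster than the
plain partition, which is more than a 2:1 compression ratio alone would
account for.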

---

In the 2nd case, with zswap, swapon does in fact report 8GiB gets used
just before the kill. That's a little confusing because zswap
compresses both the memory pool as well as what spills over to the
swap partition. Do I have more swap available because it's compressed?
Doesn't seem to be the case as reported by 'free' or 'vmstat'. Do I
use less swap on disk? swapon says no. And yet the speed it's getting
to the swap partition suggests it's really completely compressed
already and isn't really generating much of any writes (because it's a
synthetic test, I assume all zeros, and thus highly compressible).

?

So, it's vaguely interesting.

Slightly more interesting, comparison to swapfiles on the same SSD,
per file system.

Btrfs
1. 0m50s
2. 0m46s
3. 0m53s

Ext4
1. 1m18s
2. 1m2s
3. 1m6s

XFS
1. 0m48s
2. 1m2s
3. 1m1s


Btrfs had the disadvantage in that it was not a new file system,
rather substantially used, many file system resizes. The ext4 and XFS
file systems were created just for the test, and were created on
partitions that had 'blkdiscard' issued beforehand. It's a requirement
to use chattr +C (nodatacow) on Btrfs, which implies nocsum and
nocompression. Anyway, they aren't ridiculously out of line with using
a plain partition.

So plausibly someone could create a systemd generator that dynamically
creates and destroys swapfiles. And also creates a hibernation file
just before hibernating. Not sure about encryption for any of that
though.
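
As a rough sketch of what such a generator would have to do per swapfile,
the snippet below only builds and prints the commands, it does not run
them. The helper name, path, and size are hypothetical. The sequence is
what I understand to be the usual procedure, including the chattr +C step
that Btrfs requires while the file is still empty.

```python
import shlex


def swapfile_commands(path, size="8G", btrfs=False):
    """Return the shell commands to create and enable a swapfile (not executed)."""
    cmds = [["truncate", "-s", "0", path]]
    if btrfs:
        # nodatacow has to be set while the file is still empty
        cmds.append(["chattr", "+C", path])
    cmds += [
        ["fallocate", "-l", size, path],
        ["chmod", "600", path],
        ["mkswap", path],
        ["swapon", path],
    ]
    return [shlex.join(cmd) for cmd in cmds]


for line in swapfile_commands("/var/swap/swapfile", btrfs=True):
    print(line)
```

Destroying the file again would be swapoff followed by removing it. On
older kernels or other file systems, fallocate may need to be replaced
with dd, so treat the allocation step as an assumption.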


-- 
Chris Murphy


Re: Better interactivity in low-memory situations

2019-09-13 Thread Daniel Xu
Our team at FB is working on a similar (but more generic) solution. All of our 
work is open source / upstreamed into the linux kernel and we're running it in 
production on quite a large scale already. Results are very promising. We'll be 
presenting about it at All Systems Go (multiple talks) this year.

We'd love to chat in-person if anyone is interested.


Re: Better interactivity in low-memory situations

2019-09-13 Thread Daniel Xu
> On Mo, 12.08.19 09:40, Chris Murphy (lists(a)colorremedies.com) wrote:
> 
> 
> Ideally, GNOME would run all its apps as systemd --user services. We
> could then set DefaultMemoryHigh= globally for the systemd --user
> instance to some percentage value (which is taken relative to the
> physical RAM size). This would then mean every user app individually
> could use — let's say — 75% of the physical RAM size and when it wants
> more it would be penalized during reclaim compared to apps using less.
> 
> If GNOME would run all apps as user services we could do various other
> nice things too. For example, it could dynamically assign the fg app
> more CPU/IO weight than the bg apps, if the system is starved of
> both.
> 

Running each app as a systemd --user service is something we've been trying to
encourage teams to do at FB. It lets us monitor things much better using the
cgroup control files.
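
As an illustration of what monitoring via the control files can look like,
here is a minimal sketch that reads two cgroup v2 memory files from a
cgroup directory. memory.current and memory.high are standard cgroup v2
files; the function name is made up, and the demo uses a fake directory so
it runs anywhere rather than pointing at a real service's cgroup.

```python
from pathlib import Path
import tempfile


def read_memory_stats(cgroup_dir):
    """Read memory.current and memory.high from a cgroup v2 directory."""
    stats = {}
    for name in ("memory.current", "memory.high"):
        raw = (Path(cgroup_dir) / name).read_text().strip()
        # memory.high contains the literal string "max" when no limit is set
        stats[name] = None if raw == "max" else int(raw)
    return stats


# Demo against a fake cgroup directory with made-up values.
with tempfile.TemporaryDirectory() as fake_cgroup:
    Path(fake_cgroup, "memory.current").write_text("1073741824\n")
    Path(fake_cgroup, "memory.high").write_text("max\n")
    demo = read_memory_stats(fake_cgroup)
    print(demo)
```

With one cgroup per app, the same few lines give per-app numbers instead
of one figure for the whole session.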

In addition, it lets us configure oomd
(https://github.com/facebookincubator/oomd) to do much more intelligent things
than kill the entire session. oomd is being proposed as a Fedora package right
now. I think the last missing piece for oomd to be really useful on desktop
systems is the --user slice changes.


Re: Better interactivity in low-memory situations

2019-09-04 Thread Kyle Marek
On 9/4/19 3:38 PM, Chris Murphy wrote:
> On Wed, Sep 4, 2019 at 1:03 PM Chris Murphy  wrote:
>> I'm skeptical of swapfiles for this use case, because it likely
>> increases the chance for file system disk contention, see section
>> 11.17
>> https://www.kernel.org/doc/gorman/html/understand/understand014.html
> Actually that looks outdated now.
>
> https://patchwork.kernel.org/patch/10347293/
>
> Btrfs since 5.0 also supports swapfiles by avoiding bmap, using
> btrfs_swap_activate, the btrfs equivalent of iomap_swap_activate

I misspoke when I said "file". It's an LVM logical volume.


Re: Better interactivity in low-memory situations

2019-09-04 Thread Kyle Marek
On 9/4/19 3:16 PM, Florian Weimer wrote:
> * Kyle Marek:
>
>> I'm finding that 32G is not necessarily sufficient for compiling clang
>> itself. Similarly I've had a hard time compiling UnrealEngine from
>> source. I usually see ld using up to 12G of memory to link each
>> artifact. Using -j$(nproc) on a 16 VCPU system amplifies the issue. I
>> rely on adding another 32G swap file to complete the job.
> Out of curiosity, which ld do you use?  BFD ld, gold, or LLVM's?
>
> Thanks,
> Florian

In both cases, I believe it was GNU ld.

I know it was GNU ld when building clang/llvm tools itself. I only had
GCC and GNU binutils installed at the time.

UnrealEngine requires clang to compile, but I can't remember for sure if
they finally linked with ld (GNU on my $PATH) or lld.


Re: Better interactivity in low-memory situations

2019-09-04 Thread Chris Murphy
On Wed, Sep 4, 2019 at 1:03 PM Chris Murphy  wrote:
>
> I'm skeptical of swapfiles for this use case, because it likely
> increases the chance for file system disk contention, see section
> 11.17
> https://www.kernel.org/doc/gorman/html/understand/understand014.html

Actually that looks outdated now.

https://patchwork.kernel.org/patch/10347293/

Btrfs since 5.0 also supports swapfiles by avoiding bmap, using
btrfs_swap_activate, the btrfs equivalent of iomap_swap_activate

-- 
Chris Murphy


Re: Better interactivity in low-memory situations

2019-09-04 Thread Chris Murphy
per cgroup swap file support
https://lwn.net/Articles/591495/

This might be interesting. What happens if swap is only assigned to
new unprivileged tasks? Your compile might take three days but it
doesn't make the GUI fall over? *shrug*

---
Chris Murphy


Re: Better interactivity in low-memory situations

2019-09-04 Thread Florian Weimer
* Kyle Marek:

> I'm finding that 32G is not necessarily sufficient for compiling clang
> itself. Similarly I've had a hard time compiling UnrealEngine from
> source. I usually see ld using up to 12G of memory to link each
> artifact. Using -j$(nproc) on a 16 VCPU system amplifies the issue. I
> rely on adding another 32G swap file to complete the job.

Out of curiosity, which ld do you use?  BFD ld, gold, or LLVM's?

Thanks,
Florian


Re: Better interactivity in low-memory situations

2019-09-04 Thread Chris Murphy
On Wed, Sep 4, 2019 at 12:01 PM Kyle Marek  wrote:
>
> I'm finding that 32G is not necessarily sufficient for compiling clang
> itself. Similarly I've had a hard time compiling UnrealEngine from
> source. I usually see ld using up to 12G of memory to link each
> artifact. Using -j$(nproc) on a 16 VCPU system amplifies the issue. I
> rely on adding another 32G swap file to complete the job.

I set up swap on a ZRAM device at 1.5X RAM. The GUI was just as
intermittently unresponsive (for seconds at a time, and for periods
above 10 minutes), to the degree that using the laptop as a laptop was
pointless. But the webkitgtk build did complete with the default
command, and in a shorter time than with a swap partition on SSD
(which became sufficiently unresponsive even via ssh that I gave up
and forced power off).

But, I don't think I can recommend such a setup as a default
installation of Fedora Workstation. On a system with 32G RAM, that
would mean a 48G /dev/zram device for swap. If it completely filled
up, it would use ~24G of RAM (assuming roughly 2:1 compression),
leaving 8G for compiling + OS + workstation stuff. You are getting a
huge swap device that's almost as fast as RAM, but it's so fast that
it means burning a huge amount of CPU cycles on
compression/decompression. And kswapd is not threaded: it maxes out at
100% of a single core, and once it's there, that's it, you can't scale
beyond it.
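
The arithmetic behind those numbers, as a small sketch. The 2:1
compression ratio is an assumption (it is what makes a full 48G device
cost about 24G), and the function ignores zram's own metadata overhead.

```python
def zram_ram_cost_gib(disksize_gib, compression_ratio):
    """RAM consumed by a completely full zram device, ignoring metadata overhead."""
    return disksize_gib / compression_ratio


ram_gib = 32
disksize_gib = 1.5 * ram_gib                 # zram device sized at 1.5X RAM
cost = zram_ram_cost_gib(disksize_gib, 2.0)  # assumed 2:1 compression ratio
print(f"zram device: {disksize_gib:.0f} GiB, RAM used when full: {cost:.0f} GiB, "
      f"left for everything else: {ram_gib - cost:.0f} GiB")
```

At a 3:1 ratio the same device would cost 16G instead, so the outcome
depends heavily on how compressible the workload's pages really are.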


> I'm now using an NVME SSD for my swap file. No more hangs for the most
> part! Usually if I lock up, it's for maybe a minute before OOM steps in.
>
> However, I'd definitely like to see a non-zram "solution" for use cases
> like this. Ultimately I'd like to see the use of traditional swap files
> to not hang a system even if it is placed on a md RAID array. I'm
> guessing that the long term fix is for OOM to happen sooner, and/or
> kernel schedulers to be improved, as mentioned elsewhere in this thread.
> Ideally this wouldn't require the use of systemd or cgroups to make it
> possible.

I'm skeptical of swapfiles for this use case, because it likely
increases the chance for file system disk contention, see section
11.17
https://www.kernel.org/doc/gorman/html/understand/understand014.html

And in heavy swap use and memory pressure, I already see evidence of
applications evicting their own executable code and constantly
re-reading it from disk. That's increasing the chance of IO pressure
resulting in stalls.

Incidental swap is fine, and I think it's possible to use either zram
or zswap to moderate the transition to a swap partition. But once this
grows to an aggressive swap dependency, desktop responsiveness is
lost.


-- 
Chris Murphy


Re: Better interactivity in low-memory situations

2019-09-04 Thread Kyle Marek
On 9/3/19 12:30 AM, John Harris wrote:
> On Tuesday, August 20, 2019 10:48:06 PM MST John Harris wrote:
>> On Sunday, August 18, 2019 4:33:47 AM MST Gordan Bobic wrote:
>>
>>> On Sun, Aug 11, 2019 at 10:36 AM  wrote:
>>>
>>>> This seems like a distraction from the real goal here, which is to
>>>> ensure Fedora remains responsive under heavy memory pressure,
>>>
>>> I think this is an overwhelmingly important point, and as somebody
>>> regularly working with ARM machines with tiny amounts of RAM, it is of
>>> considerable interest to me.
>>> I typically use CentOS because stability is important to me, but most
>>> worthwhile things filter to there, so I hope what I'm about to say is not
>>> _too_ deprecated.
>>>
>>> 1) Compile options
>>> From what I can tell from rpm macro options, default on C7 seems to be
>>> -O2.
>>> -Os seems to help in most cases.
>>> Adding -ffunction-sections -fdata-sections to defaults can help
>>> considerably in producing smaller binaries, and is not the default.
>>> Linking with -Wl,--gc-sections helps a lot and is not the default
>>> Extensive stripping seems to already be the default (--strip-unneeded,
>>> removal of .comment and .note sections)
>>>
>>> 2) Runtime configuration
>>> Default stack size is 8192 KiB (ulimit -s). This unnecessarily eats a
>>> considerable amount of memory. I have yet to see anything that actually
>>> experiences problems with 1M.
>>>
>>> 3) zram
>>> This was mentioned earlier in the thread, and on most of my systems,
>>> memory constrained or otherwise, unless I have an overwhelming reason
>>> not to, I run with zram swap equal in size to RAM with lz4 compression
>>> and vm.swappiness=100. I typically see compression ratios between 2:1
>>> and 3:1 in zram, so on a system with, say, 10GB of RAM, it would provide
>>> 10GB of very fast swap at a cost of 3-5GB of RAM. This seems like a
>>> favourable trade off, especially on systems with extremely constrained
>>> RAM (e.g. ARM devices with 512MB of RAM).
>>>
>>> I'm sure there is more that can be done, but this seems like a good start
>>> as far as the cost / benefit is concerned.
>>
>> Python, Lua and a few other common programs can have issues with a stack
>> size  of 1MiB.
>>
>> -- 
>> John M. Harris, Jr. 
>> Splentity
>> https://splentity.com/
> I would also like to add that I don't see how it's even possible to run into
> a low-memory scenario on a system with 10 GiB (That's a *lot* of memory! I run
> on a Core 2 Duo based system that can support a max of 8 GiB as my daily
> driver.) often enough to have a problem with oom_killer.

I'm finding that 32G is not necessarily sufficient for compiling clang
itself. Similarly I've had a hard time compiling UnrealEngine from
source. I usually see ld using up to 12G of memory to link each
artifact. Using -j$(nproc) on a 16 VCPU system amplifies the issue. I
rely on adding another 32G swap file to complete the job.
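
Rough arithmetic for why -j$(nproc) hurts here, as a sketch. It assumes
the worst case where every job hits the 12G peak at the same moment, which
is pessimistic since not all jobs are link steps. The helper name is made
up.

```python
import os


def max_parallel_jobs(ram_gib, per_job_gib, cpus=None):
    """Largest job count whose worst-case memory use still fits in RAM."""
    cpus = cpus or os.cpu_count() or 1
    by_memory = int(ram_gib // per_job_gib)
    return max(1, min(cpus, by_memory))


# 16 vCPUs, 32 GiB RAM, up to 12 GiB per link job
print(max_parallel_jobs(32, 12, cpus=16), "jobs fit in RAM")
print(16 * 12, "GiB needed if all 16 jobs link at once")
```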

I'm now using an NVME SSD for my swap file. No more hangs for the most
part! Usually if I lock up, it's for maybe a minute before OOM steps in.

However, I'd definitely like to see a non-zram "solution" for use cases
like this. Ultimately I'd like to see the use of traditional swap files
to not hang a system even if it is placed on a md RAID array. I'm
guessing that the long term fix is for OOM to happen sooner, and/or
kernel schedulers to be improved, as mentioned elsewhere in this thread.
Ideally this wouldn't require the use of systemd or cgroups to make it
possible.



Re: Better interactivity in low-memory situations

2019-09-03 Thread Dan Čermák
Chris Murphy  writes:

> On Mon, Aug 12, 2019 at 5:47 PM Emery Berger  wrote:
>>
>> For what it's worth, my research group attacked basically exactly this 
>> problem some time ago. We built a modified Linux kernel that we called 
>> Redline that was utterly resilient to fork bombs, malloc bombs, and so on. 
>> No process could take down the system, much less unprivileged ones. I think 
>> some of the ideas we described back then would be worth adopting / adapting 
>> today (the code is of course hopelessly out of date: we published our paper 
>> on this at OSDI 2008).
>
> I'm unable to find any concurring or dissenting opinions on this.  What
> kind of peer review has it received? Was it ever raised with upstream
> kernel developers? What were their responses?

I have only read parts of the Redline paper, so I do not know whether
anyone ever tried to submit it upstream.

Judging from the Redline webpage
(https://emeryberger.com/research/redline/), it appears to have only ever
been implemented on i386 and nowhere else (although that shouldn't be hard
to fix). Furthermore, it does not support NUMA, which might be a bigger
blocker.

My guess is that Redline might clash with upstream Linux's general idea
of how processes should be scheduled. Redline solves the problem of
keeping interactive applications interactive even under severe memory
pressure by changing the way they are scheduled, how they are allocated
memory, and how much data they are allowed to read from disk. If an
application is classified as interactive (in contrast to best-effort
tasks, which correspond to ordinary processes in the current Linux
kernel), then it will get a requested amount of CPU time every x ms
(e.g. to be able to run at 25 fps). Something comparable is done with
memory and disk usage.

This is a pretty nice approach in my opinion but it has certain
downsides:
- scheduling gets more complicated
- you need additional system calls to tell the kernel which processes
  are interactive (otherwise they are treated the "old" way and you gain
  nothing)
- you need a userspace component that has a database of interactive
  tasks (with a small set of configs, e.g. how often does your process
  need a chunk of the CPU time)

It could be that the kernel community would perceive that as a blocker
and would instead prefer a different and more generic solution (this is
just my personal guess). It could also very well be that no one had time
to actually upstream this, as it was an academic project (no offense
intended, I've been in academia myself and know how things
go).

Unfortunately, Redline was developed more than a decade ago, so
upstreaming it nowadays is probably equivalent to a full rewrite, given
the kernel's development pace.


Cheers,

Dan




Re: Better interactivity in low-memory situations

2019-09-02 Thread John Harris
On Tuesday, August 20, 2019 10:48:06 PM MST John Harris wrote:
> On Sunday, August 18, 2019 4:33:47 AM MST Gordan Bobic wrote:
> 
> > On Sun, Aug 11, 2019 at 10:36 AM  wrote:
> > 
> > > This seems like a distraction from the real goal here, which is to
> > > ensure Fedora remains responsive under heavy memory pressure,
> > 
> > 
> > I think this is an overwhelmingly important point, and as somebody
> > regularly working with ARM machines with tiny amounts of RAM, it is of
> > considerable interest to me.
> > I typically use CentOS because stability is important to me, but most
> > worthwhile things filter to there, so I hope what I'm about to say is not
> > _too_ deprecated.
> > 
> > 1) Compile options
> > From what I can tell from rpm macro options, default on C7 seems to be
> > -O2.
> > -Os seems to help in most cases.
> > Adding -ffunction-sections -fdata-sections to defaults can help
> > considerably in producing smaller binaries, and is not the default.
> > Linking with -Wl,--gc-sections helps a lot and is not the default
> > Extensive stripping seems to already be the default (--strip-unneeded,
> > removal of .comment and .note sections)
> > 
> > 2) Runtime configuration
> > Default stack size is 8192 KiB (ulimit -s). This unnecessarily eats a
> > considerable amount of memory. I have yet to see anything that actually
> > experiences problems with 1M.
> > 
> > 3) zram
> > This was mentioned earlier in the thread, and on most of my systems,
> > memory constrained or otherwise, unless I have an overwhelming reason
> > not to, I run with zram swap equal in size to RAM with lz4 compression
> > and vm.swappiness=100. I typically see compression ratios between 2:1
> > and 3:1 in zram, so on a system with, say, 10GB of RAM, it would provide
> > 10GB of very fast swap at a cost of 3-5GB of RAM. This seems like a
> > favourable trade off, especially on systems with extremely constrained
> > RAM (e.g. ARM devices with 512MB of RAM).
> > 
> > I'm sure there is more that can be done, but this seems like a good start
> > as far as the cost / benefit is concerned.
> 
> 
> Python, Lua and a few other common programs can have issues with a stack
> size  of 1MiB.
> 
> -- 
> John M. Harris, Jr. 
> Splentity
> https://splentity.com/

I would also like to add that I don't see how it's even possible to run into a 
low-memory scenario on a system with 10 GiB (That's a *lot* of memory! I run 
on a Core 2 Duo based system that can support a max of 8 GiB as my daily 
driver.) often enough to have a problem with oom_killer.

-- 
John M. Harris, Jr. 
Splentity
https://splentity.com/



Re: swap on ZRAM, zswap, and Rust was: Better interactivity in low-memory situations

2019-09-02 Thread Martin Kolman


- Original Message -
> From: "Chris Murphy" 
> To: "Development discussions related to Fedora" 
> 
> Sent: Friday, August 30, 2019 9:55:52 PM
> Subject: swap on ZRAM, zswap, and Rust was: Better interactivity in 
> low-memory situations
> 
> Hi,
> This is yet another follow-up for this thread:
> https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/message/XUZLHJ5O32OX24LG44R7UZ2TMN6NY47N/
> 
> 
> Basics:
> "zswap" compresses swap and uses a defined memory pool as a cache,
> with spill over (still compressed) going into a conventional swap
> partition. The memory pool doesn't appear as a separate block device.
> A conventional swap partition on a drive is required.
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/Documentation/vm/zswap.rst?h=v5.2.9
> 
> "swap on ZRAM" A ZRAM device appears as a block device, and is
> effectively a compressed RAM disk. It's common for this to be the
> exclusive swap device, of course it is volatile so in that
> configuration your system can't hibernate. But it's also possible to
> use swap priority in fstab to cause the ZRAM device to be used with
> higher priority, and a conventional swap partition on a drive with a
> lower priority.
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/Documentation/blockdev/zram.txt?h=v5.2.9

Just a slight addition to this comparison: AFAIK there is a difference in how
zswap and zram behave when the in-RAM swap is full and the swap device on the
hard drive comes into play.

If the zswap pool becomes full, zswap will, according to the docs, free up
space in RAM by moving the least recently used pages to the disk, so that the
"hot" pages stay in RAM and new pages can be placed there.

In comparison, AFAIK there is no such mechanism for zram. The priority value
simply determines which swap is used first; once it becomes full, new pages go
to the next swap with lower priority. Please correct me if I am completely
wrong and the Linux swap allocation algorithm actually moves pages between swap
devices based on priority. :)
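
To make the contrast concrete, here is a toy Python model of the two behaviours
as described above. It is only a sketch under simplifying assumptions, not
kernel code: pages are plain integers, capacities are counted in pages, and the
class and function names are invented for illustration.

```python
from collections import OrderedDict


class ZswapLikePool:
    """Toy model: a bounded RAM pool that evicts the least recently used page to disk."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pool = OrderedDict()   # page id -> None, ordered oldest first
        self.disk = []              # pages written back to the backing swap

    def store(self, page):
        if page in self.pool:
            # Touching a page makes it the most recently used one.
            self.pool.move_to_end(page)
            return
        if len(self.pool) >= self.capacity:
            oldest, _ = self.pool.popitem(last=False)
            self.disk.append(oldest)
        self.pool[page] = None


def priority_fill(pages, zram_capacity):
    """Toy model: fill the high-priority device first, then overflow to the next one."""
    zram, disk = [], []
    for page in pages:
        (zram if len(zram) < zram_capacity else disk).append(page)
    return zram, disk


pages = [1, 2, 3, 4, 5]
zswap = ZswapLikePool(capacity=3)
for p in pages:
    zswap.store(p)
zram, disk = priority_fill(pages, zram_capacity=3)
print("zswap-like pool:", list(zswap.pool), "on disk:", zswap.disk)
print("priority fill:  ", zram, "on disk:", disk)
```

In the first model the newest pages stay in RAM and the oldest ones end up on
disk; in the second the oldest pages stay in RAM and the newest ones go to disk.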

> 
> 
> What they do:
> Either strategy can help avoid swap thrashing, by moderating the
> transition from exclusively RAM based work, to heavy swapping on disk.
> In my testing, the most aggressive memory starved workloads still
> result in an unresponsive system. Neither are a complete solution,
> they really seem to just be moderators that kick the can down the
> road. But I do think it's an improvement especially in the incidental
> swap use case, where transition from memory to swap isn't noticeable.
> 
> 
> Which is better?
> I don't know. Seriously, that's what all of my testing has come down
> to. A user won't likely notice the difference. Both dynamically
> allocate memory to their "memory pools" on demand. But otherwise, they
> really are two very different implementations. Regardless, Fedora
> Workstation and probably even Fedora Server, should use one of them by
> default out of the box.
> 
> IoT folks are already using swap on ZRAM by default, in lieu of a disk
> based swap partition. And Anaconda folks are doing the same for low
> memory devices when the installer is launched. I've been using zswap
> on Fedora Workstation edition on my laptop, and Fedora Server on an
> Intel NUC, for maybe two years (earlier this summer I switched both of
> them to swap on ZRAM to compare).
> 
> How are they different?
> There are several "swap on ZRAM" implementations. The zram package in
> Fedora right now is what IoT folks are using which installs a systemd
> service unit to set up the ZRAM block device, mkswap on it, and then
> swapon, during system startup. Simple.
> 
> The ideal scenario is to get everyone on the same page, and so far it
> looks like systemd's zram-generator, built in Rust, meets all the
> requirements. That needs to be confirmed, but also right now there's a
> small problem, it's not working. So we kinda need someone familiar
> with Rust and systemd to take this on, if we want to use the same
> thing everywhere.
> https://github.com/systemd/zram-generator/issues/4
> 
> Whereas zswap is set up by using boot parameters, which we could have
> the installer set, contingent on a conventional swap partition being
> created.
> zswap.enabled=1 zswap.compressor=lz4 zswap.max_pool_percent=20
> zswap.zpool=zbud
> 
> Zswap upstream tells me they're close to dropping the experimental
> status, hopefully by the end of the summer. It might be a bit longer
> before they're as confident with zpool type z3fold.
Indeed, I&

Re: testing oomd, cgroupsv2, was: Better interactivity in low-memory situations

2019-09-01 Thread Chris Murphy
Per this suggestion [1] by hakavlad (Alexey Avramov), I did a single
test with earlyoom, a user space service that's already packaged for
Fedora. I have not yet tested nohang, also mentioned.

I chose a configuration that has fairly consistently (>80%) resulted
in a total hang for more than 30m. And the result with earlyoom is
that while responsiveness in the GUI was still bad, the longest GUI
freeze lasted ~8 minutes ending in oom which is certainly a lot better
than a 30+ minute hang. It's entirely subjective, but my opinion is
the system was legitimately lost within the 1st minute of hang, and a
reasonable user can choose to give up at that point. Is this an
improvement? Yes, the oom happens sooner, and maybe more testing will
prove it's more predictable. Is it good enough? No. Should Fedora
enable earlyoom in Fedora 32 Workstation? Maybe.
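
For readers unfamiliar with the approach: a user space OOM helper of this kind
boils down to polling /proc/meminfo and acting before the kernel's own OOM
killer would. A minimal Python sketch of such a check is below. It is not
earlyoom's code; the 10% thresholds, the function names, and the sample numbers
are assumptions chosen for illustration and are not data from this test.

```python
def parse_meminfo(text):
    """Return a dict of /proc/meminfo fields (values in KiB where a unit is given)."""
    fields = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[key.strip()] = int(parts[0])
    return fields


def under_pressure(info, mem_min_pct=10.0, swap_min_pct=10.0):
    """True when both available memory and free swap fall below the thresholds."""
    mem_pct = 100.0 * info["MemAvailable"] / info["MemTotal"]
    # A system without swap counts as having no free swap left.
    swap_total = info.get("SwapTotal", 0)
    swap_pct = 100.0 * info.get("SwapFree", 0) / swap_total if swap_total else 0.0
    return mem_pct < mem_min_pct and swap_pct < swap_min_pct


# Illustrative values only.
SAMPLE = """MemTotal:        8025088 kB
MemAvailable:     401254 kB
SwapTotal:       8024064 kB
SwapFree:         240721 kB
"""

info = parse_meminfo(SAMPLE)
print("under pressure:", under_pressure(info))
```

On a live system the same parser can be fed the contents of /proc/meminfo.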

Configuration:
CPU i7-2820QM
RAM: 7837M
swap on ZRAM (lz4): 7836M (there is no other swap)
BOOT_IMAGE=(hd5,gpt6)/boot/vmlinuz-5.3.0-0.rc6.git1.1.fc32.x86_64
root=UUID=72df6d5b-26d1-47ff-a9ab-33f6a0b2c4cf ro
rootflags=subvol=root log_buf_len=4M systemd.debug-shell=1
printk.devkmsg=on slub_debug=FZPU

Logs [2] [3] [4] all monotonic time, and screenshot [5]. The
screenshot monotonic time equivalent is ~ [ 4890.404675] which
coincides with GUI totally hung, and a sysrq+t issued via ssh. There's
a lot going on including some i915 weirdness that I've been informed
by upstream is swap related, but I don't know the significance of
these kernel/gnome-shell page allocation failure complaints (they
don't taint the kernel). But suffice to say there are questionable
things happening well before the oom kill is issued.


[1]
https://pagure.io/fedora-workstation/issue/98#comment-594295
[2]
Full journal
https://drive.google.com/open?id=1nnZupWFnGfqu41aYs-u3UL8ir180YiL_
[3]
dmesg only
https://drive.google.com/open?id=1FNc7e-XuIiIAzSdBkOg98x9jgYEQQWbr
[4]
earlyoom messages only
https://drive.google.com/open?id=1drd570PRbUiiSnCwoP26DMSgSo93SdAC
[5]
screenshot shows top and iotop at the time of heavy CPU, IO, memory
and swap pressure
https://drive.google.com/open?id=1unLv11HmHW3bYOJvlLSuZYSiv6gyw7L3
___
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: 
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org


Re: Better interactivity in low-memory situations

2019-09-01 Thread Chris Murphy
On Mon, Aug 12, 2019 at 5:47 PM Emery Berger  wrote:
>
> For what it's worth, my research group attacked basically exactly this 
> problem some time ago. We built a modified Linux kernel that we called 
> Redline that was utterly resilient to fork bombs, malloc bombs, and so on. No 
> process could take down the system, much less unprivileged ones. I think some 
> of the ideas we described back then would be worth adopting / adapting today 
> (the code is of course hopelessly out of date: we published our paper on this 
> at OSDI 2008).

I'm unable to find any concurring or dissenting opinions on this.  What
kind of peer review has it received? Was it ever raised with upstream
kernel developers? What were their responses?

I wonder if the question of interactivity is just not a priority
upstream still, as they see various competing user space solutions for
this problem and that this suggests a generic solution is either not
practical to incorporate into the kernel, or maybe it isn't desired?

-- 
Chris Murphy
___
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: 
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org


swap on ZRAM, zswap, and Rust was: Better interactivity in low-memory situations

2019-08-30 Thread Chris Murphy
Hi,
This is yet another follow-up for this thread:
https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/message/XUZLHJ5O32OX24LG44R7UZ2TMN6NY47N/


Basics:
"zswap" compresses swap and uses a defined memory pool as a cache,
with spill over (still compressed) going into a conventional swap
partition. The memory pool doesn't appear as a separate block device.
A conventional swap partition on a drive is required.
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/Documentation/vm/zswap.rst?h=v5.2.9

"swap on ZRAM" A ZRAM device appears as a block device, and is
effectively a compressed RAM disk. It's common for this to be the
exclusive swap device; of course it is volatile, so in that
configuration your system can't hibernate. But it's also possible to
use swap priority in fstab to cause the ZRAM device to be used with
higher priority, and a conventional swap partition on a drive with a
lower priority.
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/Documentation/blockdev/zram.txt?h=v5.2.9


What they do:
Either strategy can help avoid swap thrashing, by moderating the
transition from exclusively RAM based work, to heavy swapping on disk.
In my testing, the most aggressive memory starved workloads still
result in an unresponsive system. Neither are a complete solution,
they really seem to just be moderators that kick the can down the
road. But I do think it's an improvement especially in the incidental
swap use case, where transition from memory to swap isn't noticeable.


Which is better?
I don't know. Seriously, that's what all of my testing has come down
to. A user won't likely notice the difference. Both dynamically
allocate memory to their "memory pools" on demand. But otherwise, they
really are two very different implementations. Regardless, Fedora
Workstation and probably even Fedora Server, should use one of them by
default out of the box.

IoT folks are already using swap on ZRAM by default, in lieu of a disk
based swap partition. And Anaconda folks are doing the same for low
memory devices when the installer is launched. I've been using zswap
on Fedora Workstation edition on my laptop, and Fedora Server on an
Intel NUC, for maybe two years (earlier this summer I switched both of
them to swap on ZRAM to compare).

How are they different?
There are several "swap on ZRAM" implementations. The zram package in
Fedora right now is what IoT folks are using which installs a systemd
service unit to set up the ZRAM block device, mkswap on it, and then
swapon, during system startup. Simple.
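
The steps such a unit performs can be sketched as a dry-run command list. This
is a Python illustration, not the contents of the Fedora zram package; the
device name zram0, the size, the priority value, and the function name are all
assumptions.

```python
def zram_swap_commands(size_mib, algorithm="lz4", priority=100, device="zram0"):
    """Return the shell commands to set up swap on a ZRAM device (not executed here)."""
    sysfs = f"/sys/block/{device}"
    return [
        "modprobe zram",
        # The compression algorithm has to be chosen before the size is set.
        f"echo {algorithm} > {sysfs}/comp_algorithm",
        f"echo {size_mib}M > {sysfs}/disksize",
        f"mkswap /dev/{device}",
        f"swapon --priority {priority} /dev/{device}",
    ]


for command in zram_swap_commands(size_mib=4096):
    print(command)
```

The commands are only printed, so the sketch can be run without root and
without touching the system.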

The ideal scenario is to get everyone on the same page, and so far it
looks like systemd's zram-generator, built in Rust, meets all the
requirements. That needs to be confirmed, but also right now there's a
small problem, it's not working. So we kinda need someone familiar
with Rust and systemd to take this on, if we want to use the same
thing everywhere.
https://github.com/systemd/zram-generator/issues/4

Whereas zswap is set up by using boot parameters, which we could have
the installer set, contingent on a conventional swap partition being
created.
zswap.enabled=1 zswap.compressor=lz4 zswap.max_pool_percent=20 zswap.zpool=zbud
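
Whether those parameters took effect can be checked at run time under
/sys/module/zswap/parameters/. As a small aid, here is a Python sketch that
pulls the zswap.* settings out of a kernel command line. The helper name is
invented, and the sample string is simply the parameter line above with a
leading "ro" added as an assumption.

```python
def zswap_params(cmdline):
    """Extract zswap.* key/value pairs from a kernel command line string."""
    params = {}
    for token in cmdline.split():
        if token.startswith("zswap.") and "=" in token:
            key, value = token[len("zswap."):].split("=", 1)
            params[key] = value
    return params


cmdline = "ro zswap.enabled=1 zswap.compressor=lz4 zswap.max_pool_percent=20 zswap.zpool=zbud"
print(zswap_params(cmdline))
```

On a running system the same function can be fed the contents of /proc/cmdline.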

Zswap upstream tells me they're close to dropping the experimental
status, hopefully by the end of the summer. It might be a bit longer
before they're as confident with zpool type z3fold.

Hackfest anyone?



-- 
Chris Murphy


testing oomd, cgroupsv2, was: Better interactivity in low-memory situations

2019-08-30 Thread Chris Murphy
Hi,
This is a follow-up for this thread:
https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/message/XUZLHJ5O32OX24LG44R7UZ2TMN6NY47N/

Has anyone looked at oomd, or is anyone interested in testing and
comparing it to alternatives?

https://facebookmicrosites.github.io/oomd/docs/overview
https://news.ycombinator.com/item?id=17590858

The origin for oomd is servers, and the case study at Facebook is also
server centric. But oomd is also very flexible, with the option to
arrive over the medium term at a cooperative approach to resource
management.

However, my more immediate interest is to make heavy memory pressure
and swap usage (versus incidental use of swap) result in a more
predictable outcome. Right now this is all over the map, maybe the
process you would have picked to kill (if you could) is killed. Maybe
something else is killed and you don't notice, but it frees up just
enough memory to prevent anything else from being killed, and now
you're stuck in swap hell. It's a lot of maybes.

And the final implosion of a system isn't really what matters because
at this point, once it happens, the system is already in some kind of
tail spin. And what that means is we can't even really iterate on any
improvements because all the outcomes right now appear to suck. So how
about avoiding tail spins in the first place?

And from a quick search, Lennart mentions oomd in the 'raise fileno
limit...' thread from Oct 2018 during early discussions of cgroupsv2.

-- 
Chris Murphy


Re: Better interactivity in low-memory situations

2019-08-20 Thread John Harris
On Sunday, August 18, 2019 4:33:47 AM MST Gordan Bobic wrote:
> On Sun, Aug 11, 2019 at 10:36 AM  wrote:
> > This seems like a distraction from the real goal here, which is to
> > ensure Fedora remains responsive under heavy memory pressure,
> 
> I think this is an overwhelmingly important point, and as somebody
> regularly working with ARM machines with tiny amounts of RAM, it is of
> considerable interest to me.
> I typically use CentOS because stability is important to me, but most
> worthwhile things filter to there, so I hope what I'm about to say is not
> _too_ deprecated.
> 
> 1) Compile options
> From what I can tell from rpm macro options, default on C7 seems to be -O2.
> -Os seems to help in most cases.
> Adding -ffunction-sections -fdata-sections to defaults can help
> considerably in producing smaller binaries, and is not the default.
> Linking with -Wl,--gc-sections helps a lot and is not the default
> Extensive stripping seems to already be the default (--strip-unneeded,
> removal of .comment and .note sections)
> 
> 2) Runtime configuration
> Default stack size is 8192 (ulimit -s). This unnecessarily eats a
> considerable amount of memory. I have yet to see anything that actually
> experiences problems with 1M.
> 
> 3) zram
> This was mentioned earlier in the thread, and on most of my systems, memory
> constrained or otherwise, unless I have an overwhelming reason not to, I
> run with zram swap equal in size to RAM with lz4 compression and
> vm.swappiness=100. I typically see compression ratios between 2:1 and 3:1
> in zram, so on a system with, say, 10GB of RAM, it would provide 10GB of
> very fast swap at a cost of 3-5GB of RAM. This seems like a favourable
> trade off, especially on systems with extremely constrained RAM (e.g. ARM
> devices with 512MB of RAM).
> 
> I'm sure there is more that can be done, but this seems like a good start
> as far as the cost / benefit is concerned.

Python, Lua and a few other common programs can have issues with a stack size 
of 1MiB.
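
The limit in effect for a given process can be inspected from Python itself
with the standard resource module. A short sketch follows; the formatting
helper is an invention for this example, and the values printed depend on the
machine it runs on.

```python
import resource


def describe_limit(value):
    """Format an rlimit value given in bytes as KiB, the unit `ulimit -s` reports."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value // 1024} KiB"


# Soft and hard stack size limits of the current process.
soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
print("soft:", describe_limit(soft), "hard:", describe_limit(hard))
```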

-- 
John M. Harris, Jr. 
Splentity
https://splentity.com/



Re: Better interactivity in low-memory situations

2019-08-20 Thread Chris Murphy
On Tue, Aug 20, 2019 at 2:15 AM Lennart Poettering  wrote:
>
> On Mo, 19.08.19 13:58, Chris Murphy (li...@colorremedies.com) wrote:
>
> > I'm skeptical as well. But to further explore this:
> >
> > 1. Does the kernel know better than to write a hibernation image (all
> > or part) to a /dev/zram device? e.g. a system with: 8GiB RAM, 8GiB
> > swap on ZRAM, 8GiB swap partition. We can use swap priority to use the
> > ZRAM device first, and conventional swap partition second. If the
> > user, today, were to hibernate, what happens?
>
> Userspace takes care of this. It tells the kernel which swap device to
> hibernate to and it nowadays understands that zswap is not a
> candidate, and picks the largest swap with the highest prio these days:

For what it's worth, swap on /dev/zram is a totally different thing than zswap.

/dev/zram is just a compressed RAM disk. You can configure a size, but
it only consumes memory as it actually gets used, dynamic allocation.
This can be used for swap standalone, no conventional disk based swap
partition is needed. But if there is one, and it's set to a lower
priority than swap on /dev/zram, then it has the effect of spilling
over (but spill over is uncompressed).
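
The resulting ordering is visible in /proc/swaps, where the device with the
larger Priority value is used first. A sketch of reading it in Python follows;
the function name is invented and the sample table is made up for illustration,
not taken from a real machine.

```python
def parse_swaps(text):
    """Parse /proc/swaps content into a list of dicts, highest priority first."""
    entries = []
    for line in text.splitlines()[1:]:          # skip the header row
        fields = line.split()
        if len(fields) < 5:
            continue
        entries.append({
            "device": fields[0],
            "type": fields[1],
            "size_kib": int(fields[2]),
            "used_kib": int(fields[3]),
            "priority": int(fields[4]),
        })
    return sorted(entries, key=lambda e: e["priority"], reverse=True)


# Illustrative values only.
SAMPLE = """Filename                                Type            Size            Used            Priority
/dev/sda3                               partition       8388604         0               -2
/dev/zram0                              partition       8024064         1048576         100
"""

for entry in parse_swaps(SAMPLE):
    print(entry["device"], entry["priority"])
```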

zswap basically always compresses all of swap, with a predefined size
memory pool "cache", and requires a conventional disk based swap
partition as the spill over. Spill over is also compressed.

They superficially sound very similar but the strategies differ in the
details. I've been using both strategies (separately), but have
the most experience with zswap even though above I was referring to
swap on a ZRAM device. I know, so many Z's.  But the gist is, I can't
really discern any differences from a user point of view.

Zswap uses just a few kernel parameters to set it up. Whereas swap
on zram requires a service unit file to set up the block device,
mkswap, and then swapon.

The swap on ZRAM thing is further complicated by multiple
implementations, and the preferred systemd zram-generator is
apparently broken.
https://github.com/systemd/zram-generator/issues/4

IoT folks are using swap on ZRAM now, via the Fedora zram package
(systemd unit file to set everything up). Anaconda folks have their
own built-in swap on ZRAM setup that runs on low memory systems when
anaconda is launched. This happens on both Fedora netinstalls and
LiveOS. And it makes sense for those use cases where a disk based swap
partition doesn't exist, and maybe shouldn't.

Whereas for servers and workstations, zswap is well suited, as they're
perhaps more likely to have a conventional swap partition and have use
cases where spillover is likely.

https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/Documentation/vm/zswap.rst?h=v5.1.12
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/Documentation/vm/z3fold.rst

And

https://www.mjmwired.net/kernel/Documentation/blockdev/zram.txt

So why not zswap? Well, kernel documentation shows it as being
experimental still, but upstream considers it stable enough for
production use with the zbud allocator now, and they expect the same
for the z3fold allocator by the end of the summer.
https://bugzilla.kernel.org/show_bug.cgi?id=204563#c6

*shrug*


-- 
Chris Murphy


Re: Better interactivity in low-memory situations

2019-08-20 Thread Lennart Poettering
On Mo, 19.08.19 13:58, Chris Murphy (li...@colorremedies.com) wrote:

> I'm skeptical as well. But to further explore this:
>
> 1. Does the kernel know better than to write a hibernation image (all
> or part) to a /dev/zram device? e.g. a system with: 8GiB RAM, 8GiB
> swap on ZRAM, 8GiB swap partition. We can use swap priority to use the
> ZRAM device first, and conventional swap partition second. If the
> user, today, were to hibernate, what happens?

Userspace takes care of this. It tells the kernel which swap device to
hibernate to and it nowadays understands that zswap is not a
candidate, and picks the largest swap with the highest prio these days:

https://github.com/systemd/systemd/blob/master/src/shared/sleep-config.c#L189
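
As a rough illustration of that selection rule (skip RAM-backed devices, then
prefer higher priority, then larger size), here is a Python sketch. It
paraphrases the behaviour described above and is not a translation of
sleep-config.c; the tuple layout, the "/dev/zram" prefix test, and the sample
devices are assumptions.

```python
def pick_hibernation_swap(swaps):
    """Pick a hibernation target from (device, size_kib, priority) tuples."""
    candidates = [s for s in swaps if not s[0].startswith("/dev/zram")]
    if not candidates:
        return None
    # Highest priority wins; size breaks ties.
    return max(candidates, key=lambda s: (s[2], s[1]))


# Illustrative devices only.
swaps = [
    ("/dev/zram0", 8388608, 100),
    ("/dev/sda3", 2097152, -2),
    ("/dev/sda4", 6291456, -2),
]
print(pick_hibernation_swap(swaps))
```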

> 2. Are you suggesting it would be possible to build support for
> multiple swaps and have them dynamically enabled/disabled? e.g. the
> same system as above, but the 8GiB swap on disk is actually made
> across two partitions. i.e. a 2GiB partition and 6GiB partition.
> Normal operation would call for swapon for /dev/zram *and* the small
> on-disk swap. Only for hibernation would swapon happen for the larger
> on-disk swap partition (the 2GiB one always stays on).

Yes, that's what I was suggesting.

> That's... interesting. It sounds potentially complicated. I can't
> estimate if it could be fragile.

Yeah. It's an idea. Not sure it's a good one though.

> Let's consider something else: Hibernation is subject to kernel
> lockdown policy on UEFI Secure Boot enabled computers. What percentage
> of Fedora users these days are likely subject to this lockdown? Are we
> able to effectively support hibernation? On the one hand, Fedora does
> not block on hibernation bugs (kernel or firmware), thus not
> supported. But tacitly hibernation is supported because a bunch of
> users pushed an effort with Anaconda folks to make sure the swap
> device is set with "resume=" boot parameter with out of the box
> installations.

We probably should look into supporting hibernation to encrypted swap
with a key tied to the TPM. That way hibernation should be fully safe.

> Another complicating issue: the Workstation working group has an issue
> to explore better protecting user data by encrypting /home by default.
> Of course, user data absolutely can and does leak into swap. Therefore
> I think we're obligated to consider encrypting swap too. And if swap
> is encrypted, how does resume from hibernation work? I guess
> kernel+initramfs load, and plymouth asks for passphrase which unlocks
> encrypted swap, and the kernel knows to resume from that device-mapper
> device?

I am pretty sure swap encryption really should be tied to the TPM. In
fact, it's one of the very few cases where tying things to the TPM
exclusively really makes sense.

So far no one prepared convincing patches to do this though. If anyone
wants to look into this, I'd be happy to review a patch for
systemd-cryptsetup for example.

Lennart

--
Lennart Poettering, Berlin


Re: Better interactivity in low-memory situations

2019-08-19 Thread Chris Murphy
On Mon, Aug 12, 2019 at 10:20 AM Lennart Poettering
 wrote:
>
> On Mo, 12.08.19 09:40, Chris Murphy (li...@colorremedies.com) wrote:
>
> > Right now the only lever to avoid swap, is to not create a swap
> > partition at installation time. Or create a smaller one instead of 1:1
> > ratio with RAM. Or use a 1/4 RAM sized swap on ZRAM. A consequence of
> > each of these alternatives, is hibernation can't be used. Fedora
> > already explicitly does not support hibernation, but strictly that
> > means we don't block release on hibernation related bugs. Fedora does
> > still create a swap that meets the minimum size for hibernation, and
> > also inserts the required 'resume' kernel  parameter to locate the
> > hibernation image at the next boot. So we kinda sorta do support it.
>
> We could add a mode to systemd's hibernation support to only "swapon"
> a swap partition immediately before hibernating, and "swapoff" it
> right after coming back. This has been proposed before, but no one so
> far did the work on it. But quite frankly this feels just like taping
> over the fact that the Linux kernel is rubbish when it comes to
> swapping...

I'm skeptical as well. But to further explore this:

1. Does the kernel know better than to write a hibernation image (all
or part) to a /dev/zram device? e.g. a system with: 8GiB RAM, 8GiB
swap on ZRAM, 8GiB swap partition. We can use swap priority to use the
ZRAM device first, and conventional swap partition second. If the
user, today, were to hibernate, what happens?

2. Are you suggesting it would be possible to build support for
multiple swaps and have them dynamically enabled/disabled? e.g. the
same system as above, but the 8GiB swap on disk is actually made
across two partitions. i.e. a 2GiB partition and 6GiB partition.
Normal operation would call for swapon for /dev/zram *and* the small
on-disk swap. Only for hibernation would swapon happen for the larger
on-disk swap partition (the 2GiB one always stays on).

That's... interesting. It sounds potentially complicated. I can't
estimate if it could be fragile.

Let's consider something else: Hibernation is subject to kernel
lockdown policy on UEFI Secure Boot enabled computers. What percentage
of Fedora users these days are likely subject to this lockdown? Are we
able to effectively support hibernation? On the one hand, Fedora does
not block on hibernation bugs (kernel or firmware), thus not
supported. But tacitly hibernation is supported because a bunch of
users pushed an effort with Anaconda folks to make sure the swap
device is set with "resume=" boot parameter with out of the box
installations.

Another complicating issue: the Workstation working group has an issue
to explore better protecting user data by encrypting /home by default.
Of course, user data absolutely can and does leak into swap. Therefore
I think we're obligated to consider encrypting swap too. And if swap
is encrypted, how does resume from hibernation work? I guess
kernel+initramfs load, and plymouth asks for passphrase which unlocks
encrypted swap, and the kernel knows to resume from that device-mapper
device?

I'm really skeptical of pissing off users who want hibernation to
work. But I'm also very skeptical of compromising other priorities,
and diverting resources, just for hibernation.

If you wait long enough between replies, I will find another log to
throw on this fire, somewhere. :-D


-- 
Chris Murphy


Re: Better interactivity in low-memory situations

2019-08-19 Thread Gordan Bobic
This seems very reminiscent of what is often referred to as "swapping
insanity", often in the context of MySQL.
The root cause there is NUMA with memory allocation only happening on the
node that requests the memory, resulting in the possibility of there being
plenty of free memory on one node but the allocating node deciding to swap
out instead of allocating far memory. I don't think there was ever a
solution fielded, but the workaround was to set the memory heavy process up
with numa round-robin allocation. This results in even allocation across
the nodes, thus avoiding swapping, but it capitulates to the fact that half
of the memory is always going to be twice as far away in terms of latency.
I don't know if the underlying root cause is shared in this case (something
less than clever in the way memory allocation is handled).

On Mon, Aug 19, 2019 at 9:55 AM Florian Weimer  wrote:

> * Gordan Bobic:
>
> > That may be so, but this thread started off with memory pressure also
> > being an issue for regular desktop x86 use.
>
> I think the problem there is that the system has sufficient reclaimable
> memory, but cannot make that memory available to applications in a
> timely fashion.
>
> Reducing compiled application footprint will only increase the amount of
> reclaimable memory, probably not changing anything as far as the actual
> problem is concerned.
>
> Thanks,
> Florian
>


Re: Better interactivity in low-memory situations

2019-08-19 Thread Florian Weimer
* Gordan Bobic:

> That may be so, but this thread started off with memory pressure also
> being an issue for regular desktop x86 use.

I think the problem there is that the system has sufficient reclaimable
memory, but cannot make that memory available to applications in a
timely fashion.

Reducing compiled application footprint will only increase the amount of
reclaimable memory, probably not changing anything as far as the actual
problem is concerned.

Thanks,
Florian


Re: Better interactivity in low-memory situations

2019-08-18 Thread Emery Berger
For what it's worth, my research group attacked basically exactly this
problem quite some time ago. We built a modified Linux kernel that we
called Redline that was impervious to fork bombs, malloc bombs, and so on.
No process could take down the system, much less unprivileged ones. I think
some of the ideas we described back then would be worth adopting / adapting
today (the code is of course hopelessly out of date: we published our paper
on this at OSDI 2008).

We had a demo where we would run two identical systems, side by side, with
the same workloads (a number of videos playing simultaneously), but with
one running Redline, and the other running stock Linux. We would launch a
fork/malloc bomb on both. The Redline system barely hiccuped. The stock
Linux kernel would freeze and become totally unresponsive (or panic). It
was a great demo, but also a pain, since we invariably had to restart the
stock Linux box :).

Redline: first class support for interactivity in commodity operating
systems

While modern workloads are increasingly interactive and resource-intensive
(e.g., graphical user interfaces, browsers, and multimedia players),
current operating systems have not kept up. These operating systems, which
evolved from core designs that date to the 1970s and 1980s, provide good
support for batch and command-line applications, but their ad hoc attempts
to handle interactive workloads are poor. Their best-effort, priority-based
schedulers provide no bounds on delays, and their resource managers (e.g.,
memory managers and disk I/O schedulers) are mostly oblivious to response
time requirements. Pressure on any one of these resources can significantly
degrade application responsiveness.

We present Redline, a system that brings first-class support for
interactive applications to commodity operating systems. Redline works with
unaltered applications and standard APIs. It uses lightweight
specifications to orchestrate memory and disk I/O management so that they
serve the needs of interactive applications. Unlike realtime systems that
treat specifications as strict requirements and thus pessimistically limit
system utilization, Redline dynamically adapts to recent load, maximizing
responsiveness and system utilization. We show that Redline delivers
responsiveness to interactive applications even in the face of extreme
workloads including fork bombs, memory bombs and bursty, large disk I/O
requests, reducing application pauses by up to two orders of magnitude.

Paper here:

https://www.usenix.org/legacy/events/osdi08/tech/full_papers/yang/yang.pdf

And links to code here:

https://emeryberger.com/research/redline/

There has been some recent follow-on work in this direction: see this work
out of Remzi and Andrea's lab at Wisconsin:
http://pages.cs.wisc.edu/~remzi/Classes/739/Fall2016/Papers/splitio-sosp15.pdf

-- emery

--
Professor Emery Berger
College of Information and Computer Sciences
University of Massachusetts Amherst
www.emeryberger.org, @emeryberger


On Sun, Aug 18, 2019 at 2:53 PM Chris Murphy 
wrote:

> On Sun, Aug 18, 2019 at 2:55 PM Gordan Bobic  wrote:
> >
> > On Sun, Aug 18, 2019 at 9:07 PM Kevin Kofler 
> wrote:
> >>
> >> Gordan Bobic wrote:
> >> > Right, but is it better that _everything_ else suffers with more
> memory
> >> > pressure for the handful of relatively infrequent use cases for which
> >> > ulimit can be used to explicitly raise the limit?
> >>
> >> Well, as I wrote, a lower limit might actually make sense on ARM. But
> modern
> >> x86 computers have gigabytes of RAM, so 1 MiB is ridiculously small
> there.
> >> So this would have to be an architecture-specific setting for ARM.
> >
> >
> > That may be so, but this thread started off with memory pressure also
> being an issue for regular desktop x86 use.
> >
>
> I think optimizations like this have a lot of intrinsic value, and
> compile-time defaults should get smarter about applying them. But in
> any case, I think it's fair to say that we're in very broad agreement
> that no matter what options get used or what optimizations do or don't
> happen, unprivileged processes should not be able to effectively take
> down the system. That to me is really incredible to discover.
>
> Everything else: no swap at all and tolerate abrupt and random
> oom-killer killoffs, double the swap or use /dev/zram, or use 1/4 RAM
> for swap, or throw a metric f ton of RAM at it, all of those are
> different ways of dodging a cannon ball. Dodging the problem doesn't
> actually fix the problem. If your dodge doesn't work out, you get hit
> by a cannon ball. Not OK. It's an unprivileged task! I'm aghast.
>
>
> --
> Chris Murphy

Re: Better interactivity in low-memory situations

2019-08-18 Thread Chris Murphy
On Sun, Aug 18, 2019 at 2:55 PM Gordan Bobic  wrote:
>
> On Sun, Aug 18, 2019 at 9:07 PM Kevin Kofler  wrote:
>>
>> Gordan Bobic wrote:
>> > Right, but is it better that _everything_ else suffers with more memory
>> > pressure for the handful of relatively infrequent use cases for which
>> > ulimit can be used to explicitly raise the limit?
>>
>> Well, as I wrote, a lower limit might actually make sense on ARM. But modern
>> x86 computers have gigabytes of RAM, so 1 MiB is ridiculously small there.
>> So this would have to be an architecture-specific setting for ARM.
>
>
> That may be so, but this thread started off with memory pressure also being 
> an issue for regular desktop x86 use.
>

I think optimizations like this have a lot of intrinsic value, and
compile-time defaults should get smarter about applying them. But in
any case, I think it's fair to say that we're in very broad agreement
that no matter what options get used or what optimizations do or don't
happen, unprivileged processes should not be able to effectively take
down the system. That to me is really incredible to discover.

Everything else: no swap at all and tolerate abrupt and random
oom-killer killoffs, double the swap or use /dev/zram, or use 1/4 RAM
for swap, or throw a metric f ton of RAM at it, all of those are
different ways of dodging a cannon ball. Dodging the problem doesn't
actually fix the problem. If your dodge doesn't work out, you get hit
by a cannon ball. Not OK. It's an unprivileged task! I'm aghast.


-- 
Chris Murphy
___
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: 
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org


Re: Better interactivity in low-memory situations

2019-08-18 Thread Gordan Bobic
On Sun, Aug 18, 2019 at 9:07 PM Kevin Kofler  wrote:

> Gordan Bobic wrote:
> > Right, but is it better that _everything_ else suffers with more memory
> > pressure for the handful of relatively infrequent use cases for which
> > ulimit can be used to explicitly raise the limit?
>
> Well, as I wrote, a lower limit might actually make sense on ARM. But
> modern
> x86 computers have gigabytes of RAM, so 1 MiB is ridiculously small there.
> So this would have to be an architecture-specific setting for ARM.


That may be so, but this thread started off with memory pressure also being
an issue for regular desktop x86 use.


Re: Better interactivity in low-memory situations

2019-08-18 Thread Kevin Kofler
Gordan Bobic wrote:
> Right, but is it better that _everything_ else suffers with more memory
> pressure for the handful of relatively infrequent use cases for which
> ulimit can be used to explicitly raise the limit?

Well, as I wrote, a lower limit might actually make sense on ARM. But modern 
x86 computers have gigabytes of RAM, so 1 MiB is ridiculously small there. 
So this would have to be an architecture-specific setting for ARM.

Kevin Kofler


Re: Better interactivity in low-memory situations

2019-08-18 Thread Gordan Bobic
On Sun, Aug 18, 2019 at 8:51 PM Kevin Kofler  wrote:

> Gordan Bobic wrote:
> > It may be simpler to approach the question from the other side, i.e. is
> > there anything that actually ever needs more than 1MB of stack space? If
> > there is, I haven't seen it in the decade since I've been using this
> > tweak with various Fedora derived distributions.
>
> I've more than once had Java applications crash with a StackOverflowError
> because Java has such an unreasonably small 1 MiB default stack size,
> independent of the amount of available RAM. (You have to explicitly use
> the -Xss parameter to get more.) It happened at least once in Java code
> and at least once in C++ code interfaced through JNI.
>
> So I don't think 1 MiB is a reasonable default stack size for
> general-purpose computers, though it might make sense on ARM.
>

Right, but is it better that _everything_ else suffers with more memory
pressure for the handful of relatively infrequent use cases for which
ulimit can be used to explicitly raise the limit?


Re: Better interactivity in low-memory situations

2019-08-18 Thread Kevin Kofler
Gordan Bobic wrote:
> Adding -ffunction-sections -fdata-sections to defaults can help
> considerably in producing smaller binaries, and is not the default.
> Linking with -Wl,--gc-sections helps a lot and is not the default

Well, -ffunction-sections -fdata-sections -Wl,--gc-sections mostly helps if 
the binary contains a lot of unused code. This is not often the case in 
dynamically linked executables or shared libraries. Where it typically 
happens is statically linked binaries (where the static libraries contain 
APIs not used by the statically linked executable), and even there only if 
the library is not actually designed for static linking (in which case it 
would put separate APIs into separate compilation units to begin with). And 
static linking is a bad idea for RAM consumption anyway, at least on 
GNU/Linux where dynamically linked executables actually share the code 
sections of shared libraries in RAM too, not just on disk.

So I'm afraid you will probably find those options to be less useful in 
practice than you expect.
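
One way to settle this for a particular package is to build it with and without the flags and compare the two results with size(1). A minimal sketch that only assembles the two gcc command lines; main.c and the output names are placeholders of mine, not anything from this thread:

```python
import shlex

# Compiler flags that place each function and data object in its own section.
SECTION_FLAGS = ["-ffunction-sections", "-fdata-sections"]
# Linker flag that lets ld discard sections nothing references.
GC_LINK_FLAG = "-Wl,--gc-sections"


def build_command(sources, output, gc_sections, opt="-O2", cc="gcc"):
    """Return the argv list for one build, with or without section GC."""
    cmd = [cc, opt]
    if gc_sections:
        cmd += SECTION_FLAGS
    cmd += list(sources) + ["-o", output]
    if gc_sections:
        cmd.append(GC_LINK_FLAG)
    return cmd


if __name__ == "__main__":
    # Print both command lines so they can be run and compared by hand.
    for gc in (False, True):
        output = "app-gc" if gc else "app-plain"
        print(shlex.join(build_command(["main.c"], output, gc)))
```

Running both commands and then `size app-plain app-gc` should show whether the linker found anything to discard for that code base.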

Kevin Kofler


Re: Better interactivity in low-memory situations

2019-08-18 Thread Kevin Kofler
Gordan Bobic wrote:
> It may be simpler to approach the question from the other side, i.e. is
> there anything that actually ever needs more than 1MB of stack space? If
> there is, I haven't seen it in the decade since I've been using this tweak
> with various Fedora derived distributions.

I've more than once had Java applications crash with a StackOverflowError
because Java has such an unreasonably small 1 MiB default stack size,
independent of the amount of available RAM. (You have to explicitly use the
-Xss parameter to get more.) It happened at least once in Java code and at
least once in C++ code interfaced through JNI.

So I don't think 1 MiB is a reasonable default stack size for
general-purpose computers, though it might make sense on ARM.

Kevin Kofler



Re: Better interactivity in low-memory situations

2019-08-18 Thread Dridi Boukelmoune
> It may be simpler to approach the question from the other side, i.e. is there 
> anything that actually ever needs more than 1MB of stack space? If there is, 
> I haven't seen it in the decade since I've been using this tweak with various 
> Fedora derived distributions.

Any application allowing arbitrary regular expressions and/or regex
input using libpcre...

Dridi


Re: Better interactivity in low-memory situations

2019-08-18 Thread Gordan Bobic
On Sun, Aug 18, 2019 at 2:33 PM Hans de Goede  wrote:

>
> >  > Adding -ffunction-sections -fdata-sections to defaults can help
> considerably in producing smaller binaries, and is not the default.
> >  > Linking with -Wl,--gc-sections helps a lot and is not the default
> >
> > These OTOH are interesting I know that e.g. uboot combines these and
> it helps a lot to get smaller binaries,
> > and this should help with RAM size too, since if a page of a binary
> contains mostly unused things and 1 symbol
> > which is actually used it will still get paged in.
> >
> > Can you perhaps start a new devel list thread about just this ?
> Maybe with some binary size numbers for
> > some apps / libs build with and without these options?
> >
> >
> > It's pretty well documented in various articles, e.g.:
> > https://wiki.wxwidgets.org/Reducing_Executable_Size
> > It also covers how much difference -Os can make.
>
> Interesting, thank you for that link.
>
>
> >  > Extensive stripping seems to already be the default
> (--strip-unneeded, removal of .comment and .note sections)
> >  >
> >  > 2) Runtime configuration
> >  > Default stack size is 8192 (ulimit -s). This unnecessarily eats a
> >  > considerable amount of memory. I have yet to see anything that
> >  > actually experiences problems with 1M.
> >
> > Actually ulimit -s is the *maximum* stack size, I'm pretty sure the
> > stack will start much smaller and grow dynamically. So changing this is
> > not saving any RAM and it will make apps which do have high stack usage
> > crash when they hit the new lower limit.
> >
> >
> > Either way, it makes a noticeable difference to memory consumption on a
> very memory constrained system without any other obvious adverse effects.
>
> Interesting. Unless I'm reading the manpage wrong, "ulimit -s" sets the
> maximum stack-size.
> Maybe that also influences the initial sizing of the stack?
>

I believe it does. Or at least that is the only explanation I can come up
with for the observation.


>
> Can someone who knows more about this shed some light on this? Is there a
> way to go with
> a smaller initial stack-size without changing the maximum size?
>

It may be simpler to approach the question from the other side, i.e. is
there anything that actually ever needs more than 1MB of stack space? If
there is, I haven't seen it in the decade since I've been using this tweak
with various Fedora derived distributions.


Re: Better interactivity in low-memory situations

2019-08-18 Thread Hans de Goede

Hi,

On 18-08-19 15:25, Gordan Bobic wrote:
> On Sun, Aug 18, 2019 at 2:06 PM Hans de Goede <hdego...@redhat.com> wrote:
> > Hi,
> >
> > On 18-08-19 13:33, Gordan Bobic wrote:
> > > On Sun, Aug 11, 2019 at 10:36 AM <...@gnome.org> wrote:
> > > > This seems like a distraction from the real goal here, which is to
> > > > ensure Fedora remains responsive under heavy memory pressure,
> > >
> > > I think this is an overwhelmingly important point, and as somebody
> > > regularly working with ARM machines with tiny amounts of RAM, it is
> > > of considerable interest to me.
> > > I typically use CentOS because stability is important to me, but
> > > most worthwhile things filter to there, so I hope what I'm about to
> > > say is not _too_ deprecated.
> > >
> > > 1) Compile options
> > > From what I can tell from rpm macro options, default on C7 seems to
> > > be -O2. -Os seems to help in most cases.
> >
> > I don't think it is likely that Fedora will switch to -Os
>
> It is not my place to argue about whether it will. The thread was asking
> for things that might contribute toward alleviating the memory pressure
> problem. This can make a fairly dramatic difference and it would
> contribute toward alleviating the problem because smaller binaries mean
> less to mmap().
>
> > > Adding -ffunction-sections -fdata-sections to defaults can help
> > > considerably in producing smaller binaries, and is not the default.
> > > Linking with -Wl,--gc-sections helps a lot and is not the default
> >
> > These OTOH are interesting. I know that e.g. uboot combines these and
> > it helps a lot to get smaller binaries, and this should help with RAM
> > size too, since if a page of a binary contains mostly unused things
> > and 1 symbol which is actually used it will still get paged in.
> >
> > Can you perhaps start a new devel list thread about just this? Maybe
> > with some binary size numbers for some apps / libs built with and
> > without these options?
>
> It's pretty well documented in various articles, e.g.:
> https://wiki.wxwidgets.org/Reducing_Executable_Size
> It also covers how much difference -Os can make.

Interesting, thank you for that link.

> > > Extensive stripping seems to already be the default
> > > (--strip-unneeded, removal of .comment and .note sections)
> > >
> > > 2) Runtime configuration
> > > Default stack size is 8192 (ulimit -s). This unnecessarily eats a
> > > considerable amount of memory. I have yet to see anything that
> > > actually experiences problems with 1M.
> >
> > Actually ulimit -s is the *maximum* stack size, I'm pretty sure the
> > stack will start much smaller and grow dynamically. So changing this
> > is not saving any RAM and it will make apps which do have high stack
> > usage crash when they hit the new lower limit.
>
> Either way, it makes a noticeable difference to memory consumption on a
> very memory constrained system without any other obvious adverse effects.


Interesting. Unless I'm reading the manpage wrong, "ulimit -s" sets the
maximum stack-size.
Maybe that also influences the initial sizing of the stack?

Can someone who knows more about this shed some light on this? Is there a way 
to go with
a smaller initial stack-size without changing the maximum size?
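
For what it's worth, both numbers can be inspected from inside a process. A minimal sketch, Linux-only because it reads /proc/self/status: RLIMIT_STACK is the limit that "ulimit -s" reports, and the VmStk field is the size of the stack segment the kernel has set up for the process so far. I would expect VmStk to come out far below the 8192 KiB limit on a default setup, but that is an expectation, not something measured in this thread.

```python
import resource


def format_limit(value):
    """Render an rlimit value (bytes) the way `ulimit -s` does, in KiB."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value // 1024} KiB"


def current_stack_kib(status_path="/proc/self/status"):
    """Return the VmStk value (stack segment size in KiB), or None."""
    try:
        with open(status_path) as fh:
            for line in fh:
                if line.startswith("VmStk:"):
                    return int(line.split()[1])
    except OSError:
        pass  # not Linux, or /proc is not mounted
    return None


if __name__ == "__main__":
    soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
    print("soft limit (ulimit -s):", format_limit(soft))
    print("hard limit:            ", format_limit(hard))
    print("stack segment now:     ", current_stack_kib(), "KiB")
```

Running it once as-is and once after "ulimit -s 1024" in the same shell would show whether the lower limit changes the stack segment actually in use.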

Regards,

Hans


Re: Better interactivity in low-memory situations

2019-08-18 Thread Gordan Bobic
On Sun, Aug 18, 2019 at 2:06 PM Hans de Goede  wrote:

> Hi,
>
> On 18-08-19 13:33, Gordan Bobic wrote:
> > On Sun, Aug 11, 2019 at 10:36 AM <...@gnome.org> wrote:
> >  > This seems like a distraction from the real goal here, which is to
> >  > ensure Fedora remains responsive under heavy memory pressure,
> >
> > I think this is an overwhelmingly important point, and as somebody
> regularly working with ARM machines with tiny amounts of RAM, it is of
> considerable interest to me.
> > I typically use CentOS because stability is important to me, but most
> worthwhile things filter to there, so I hope what I'm about to say is not
> _too_ deprecated.
> >
> > 1) Compile options
> >  From what I can tell from rpm macro options, default on C7 seems to be
> -O2. -Os seems to help in most cases.
>
> I don't think it is likely that Fedora will switch to -Os
>

It is not my place to argue about whether it will. The thread was asking
for things that might contribute toward alleviating the memory pressure
problem. This can make a fairly dramatic difference and it would contribute
toward alleviating the problem because smaller binaries mean less to mmap().


> > Adding -ffunction-sections -fdata-sections to defaults can help
> considerably in producing smaller binaries, and is not the default.
> > Linking with -Wl,--gc-sections helps a lot and is not the default
>
> These OTOH are interesting I know that e.g. uboot combines these and it
> helps a lot to get smaller binaries,
> and this should help with RAM size too, since if a page of a binary
> contains mostly unused things and 1 symbol
> which is actually used it will still get paged in.
>
> Can you perhaps start a new devel list thread about just this ? Maybe with
> some binary size numbers for
> some apps / libs build with and without these options?
>

It's pretty well documented in various articles, e.g.:
https://wiki.wxwidgets.org/Reducing_Executable_Size
It also covers how much difference -Os can make.


> > Extensive stripping seems to already be the default (--strip-unneeded,
> removal of .comment and .note sections)
> >
> > 2) Runtime configuration
> > Default stack size is 8192 (ulimit -s). This unnecessarily eats a
> > considerable amount of memory. I have yet to see anything that actually
> > experiences problems with 1M.
>
> Actually ulimit -s is the *maximum* stack size, I'm pretty sure the stack
> will start much smaller and grow dynamically. So changing this is not
> saving any RAM and it will make apps which do have high stack usage
> crash when they hit the new lower limit.


Either way, it makes a noticeable difference to memory consumption on a
very memory constrained system without any other obvious adverse effects.


Re: Better interactivity in low-memory situations

2019-08-18 Thread Hans de Goede

Hi,

On 18-08-19 13:33, Gordan Bobic wrote:
> On Sun, Aug 11, 2019 at 10:36 AM <...@gnome.org> wrote:
> > This seems like a distraction from the real goal here, which is to
> > ensure Fedora remains responsive under heavy memory pressure,
>
> I think this is an overwhelmingly important point, and as somebody
> regularly working with ARM machines with tiny amounts of RAM, it is of
> considerable interest to me.
> I typically use CentOS because stability is important to me, but most
> worthwhile things filter to there, so I hope what I'm about to say is
> not _too_ deprecated.
>
> 1) Compile options
> From what I can tell from rpm macro options, default on C7 seems to be
> -O2. -Os seems to help in most cases.

I don't think it is likely that Fedora will switch to -Os

> Adding -ffunction-sections -fdata-sections to defaults can help
> considerably in producing smaller binaries, and is not the default.
> Linking with -Wl,--gc-sections helps a lot and is not the default

These OTOH are interesting. I know that e.g. uboot combines these and it
helps a lot to get smaller binaries, and this should help with RAM size
too, since if a page of a binary contains mostly unused things and 1
symbol which is actually used it will still get paged in.

Can you perhaps start a new devel list thread about just this? Maybe with
some binary size numbers for some apps / libs built with and without these
options?

> Extensive stripping seems to already be the default (--strip-unneeded,
> removal of .comment and .note sections)
>
> 2) Runtime configuration
> Default stack size is 8192 (ulimit -s). This unnecessarily eats a
> considerable amount of memory. I have yet to see anything that actually
> experiences problems with 1M.

Actually ulimit -s is the *maximum* stack size, I'm pretty sure the stack
will start much smaller and grow dynamically. So changing this is not
saving any RAM and it will make apps which do have high stack usage crash
when they hit the new lower limit.

Regards,

Hans


Re: Better interactivity in low-memory situations

2019-08-18 Thread Gordan Bobic
On Sun, Aug 11, 2019 at 10:36 AM  wrote:
> This seems like a distraction from the real goal here, which is to
> ensure Fedora remains responsive under heavy memory pressure,

I think this is an overwhelmingly important point, and as somebody
regularly working with ARM machines with tiny amounts of RAM, it is of
considerable interest to me.
I typically use CentOS because stability is important to me, but most
worthwhile things filter to there, so I hope what I'm about to say is not
_too_ deprecated.

1) Compile options
From what I can tell from rpm macro options, default on C7 seems to be -O2.
-Os seems to help in most cases.
Adding -ffunction-sections -fdata-sections to defaults can help
considerably in producing smaller binaries, and is not the default.
Linking with -Wl,--gc-sections helps a lot and is not the default
Extensive stripping seems to already be the default (--strip-unneeded,
removal of .comment and .note sections)

2) Runtime configuration
Default stack size is 8192 (ulimit -s). This unnecessarily eats a
considerable amount of memory. I have yet to see anything that actually
experiences problems with 1M.

3) zram
This was mentioned earlier in the thread, and on most of my systems, memory
constrained or otherwise, unless I have an overwhelming reason not to, I
run with zram swap equal in size to RAM with lz4 compression and
vm.swappiness=100. I typically see compression ratios between 2:1 and 3:1
in zram, so on a system with, say, 10GB of RAM, it would provide 10GB of
very fast swap at a cost of 3-5GB of RAM. This seems like a favourable
trade off, especially on systems with extremely constrained RAM (e.g. ARM
devices with 512MB of RAM).
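
The arithmetic behind the zram figures can be sketched as follows. This is a simplified model of my own that assumes the device is completely full and ignores zram's metadata overhead; the function names are just for illustration.

```python
def zram_ram_cost_gb(swap_gb, compression_ratio):
    """RAM consumed by a completely full zram device of swap_gb."""
    if compression_ratio <= 0:
        raise ValueError("compression ratio must be positive")
    return swap_gb / compression_ratio


def effective_capacity_gb(ram_gb, swap_gb, compression_ratio):
    """Uncompressed data that fits: RAM left over plus the full zram device."""
    return ram_gb - zram_ram_cost_gb(swap_gb, compression_ratio) + swap_gb


if __name__ == "__main__":
    # The 10GB example above, at the two compression ratios mentioned.
    for ratio in (2.0, 3.0):
        cost = zram_ram_cost_gb(10, ratio)
        total = effective_capacity_gb(10, 10, ratio)
        print(f"{ratio}:1 -> costs {cost:.2f} GB RAM, holds {total:.2f} GB")
```

At 2:1 the full device costs 5GB of RAM and at 3:1 about 3.3GB, which is where the 3-5GB range comes from.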

I'm sure there is more that can be done, but this seems like a good start
as far as the cost / benefit is concerned.


Re: Better interactivity in low-memory situations

2019-08-15 Thread David Airlie
On Fri, Aug 16, 2019 at 7:48 AM Chris Murphy  wrote:
>
> On Thu, Aug 15, 2019 at 2:19 PM Chris Murphy  wrote:
> >
> > [  718.068633] fmac.local kernel: SLUB: Unable to allocate memory on
> > node -1, gfp=0x900(GFP_NOWAIT|__GFP_ZERO)
> > [  718.068636] fmac.local kernel:   cache: page->ptl, object size: 72,
> > buffer size: 72, default order: 0, min order: 0
> > [  718.068639] fmac.local kernel:   node 0: slabs: 296, objs: 16576, free: 0
> > [  718.068704] fmac.local kernel: chronyd: page allocation failure:
> > order:0, mode:0x800(GFP_NOWAIT),
> > nodemask=(null),cpuset=/,mems_allowed=0
> >
> > Not sure what to make of that.
>
> Asked on #fedora-kernel, it's a known issue with 5.3.0-rc4 and drm.
>

Nope it's not that.

Something has leaked all your memory (not drm).

Dave.


Re: Better interactivity in low-memory situations

2019-08-15 Thread Chris Murphy
On Thu, Aug 15, 2019 at 2:19 PM Chris Murphy  wrote:
>
> [  718.068633] fmac.local kernel: SLUB: Unable to allocate memory on
> node -1, gfp=0x900(GFP_NOWAIT|__GFP_ZERO)
> [  718.068636] fmac.local kernel:   cache: page->ptl, object size: 72,
> buffer size: 72, default order: 0, min order: 0
> [  718.068639] fmac.local kernel:   node 0: slabs: 296, objs: 16576, free: 0
> [  718.068704] fmac.local kernel: chronyd: page allocation failure:
> order:0, mode:0x800(GFP_NOWAIT),
> nodemask=(null),cpuset=/,mems_allowed=0
>
> Not sure what to make of that.

Asked on #fedora-kernel, it's a known issue with 5.3.0-rc4 and drm.


-- 
Chris Murphy


Re: Better interactivity in low-memory situations

2019-08-15 Thread Chris Murphy
On Thu, Aug 15, 2019 at 1:51 AM Artem Tim  wrote:
>
> The BFQ scheduler helps a lot with this issue. I have been using it on
> Fedora since the 4.19 kernel. There was also a previous discussion about
> making it the default for Workstation:
> https://lists.fedoraproject.org/archives/list/ker...@lists.fedoraproject.org/message/I2OZWDD4QCDYUXJ5NHYTMGNAB4KLJN2K/

It's mentioned in the workstation issue as having no effect in this case.
https://pagure.io/fedora-workstation/issue/98

I just switched to it and repeated the test case, and the GUI still
hangs and is unresponsive, even without substantial pressure on the SSD,
and with less than half of swap in use.
https://drive.google.com/open?id=13_5XIBMu01HfOdzGVH-4qTgd-PLpFsaN

But I am getting something new in kernel messages:


542sysrq+t during a GUI freeze that lasted over 1 minute, and then:

[  718.068633] fmac.local kernel: SLUB: Unable to allocate memory on
node -1, gfp=0x900(GFP_NOWAIT|__GFP_ZERO)
[  718.068636] fmac.local kernel:   cache: page->ptl, object size: 72,
buffer size: 72, default order: 0, min order: 0
[  718.068639] fmac.local kernel:   node 0: slabs: 296, objs: 16576, free: 0
[  718.068704] fmac.local kernel: chronyd: page allocation failure:
order:0, mode:0x800(GFP_NOWAIT),
nodemask=(null),cpuset=/,mems_allowed=0

Not sure what to make of that. Complete 'journalctl -k' is here:
https://drive.google.com/open?id=1Z1jAjMrmdXAxuSELdFfd4IKdceeufmVu

-- 
Chris Murphy


Re: Better interactivity in low-memory situations

2019-08-15 Thread Artem Tim
The BFQ scheduler helps a lot with this issue. I have been using it on Fedora
since the 4.19 kernel. There was also a previous discussion about making it
the default for Workstation:
https://lists.fedoraproject.org/archives/list/ker...@lists.fedoraproject.org/message/I2OZWDD4QCDYUXJ5NHYTMGNAB4KLJN2K/
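
To check which I/O scheduler a block device is actually using, the kernel lists the available ones in /sys/block/<dev>/queue/scheduler and puts the active one in square brackets. A small sketch; the device name sda is an assumption and will differ per machine:

```python
import re
from pathlib import Path


def active_scheduler(text):
    """Return the scheduler name shown in [brackets], or None if absent."""
    match = re.search(r"\[([^\]]+)\]", text)
    return match.group(1) if match else None


def scheduler_for(device="sda"):
    """Read the active scheduler for a block device; None if unreadable."""
    path = Path("/sys/block") / device / "queue" / "scheduler"
    try:
        return active_scheduler(path.read_text())
    except OSError:
        return None


if __name__ == "__main__":
    print(scheduler_for("sda"))
```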


Re: Better interactivity in low-memory situations

2019-08-14 Thread S.

(Oops, sorry, re-post because I messed up the threading.)

I'm not a developer, nor do I pretend to understand the nuances of memory management. But 
I signed up for this list just to say "thanks" to all the devs and others that 
are finally discussing what I consider to be one of the biggest problems with Linux on 
the desktop.

My experience with desktop Linux distros with SSDs when a few processes start 
to leak memory, or if I launch a new program when my system is right at the 
limits, is a full system hang where only the mouse occasionally moves jerkily, 
and I can't switch to a virtual terminal. I recently learned the SysRq trick to
invoke the OOM killer, but I personally think that the kernel should deal with
that, not the user. As unfortunate as it is for the OOM killer to have to 
randomly kill something, I am of the opinion that the OS should *never* lock 
up, period. I would strongly prefer that one application get killed instead of 
losing all my applications and working data because of a necessary hard reboot.

I don't know if this helps or not, but anecdotally I started seeing this issue
*after* SSDs became more common, i.e. I don't think I ever experienced it with 
spinning rust. Maybe something to do with the vastly faster I/O of an SSD, 
which allows it to more quickly saturate the RAM before the OOM killer has time 
to react?

Also, I've had relatively low memory KVM guests running on a VPS under very high load, 
and they never lock up. The OOM killer does occasionally kick in, but the affected daemon 
or systemd service restarts and it's amazingly undramatic. It appears that this issue 
only occurs with Xorg (and I imagine Wayland) and "desktop" usage.

As for the problem of the randomness of the OOM killer, couldn't it be made to 
take into account the PID and/or how long the process has been running? 
Normally Xorg (and I assume Wayland stuff) gets started before the other 
desktop programs that tend to consume a lot of memory. So if it's a higher PID 
and/or has been running for less time, give it a higher score for killability.
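
To make that idea concrete, here is a rough sketch of such a score. It is not the kernel's actual badness heuristic; the `Proc` fields, the `age_weight` parameter, and the one-hour decay constant are all made up for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Proc:
    name: str
    rss_mib: float      # resident memory in MiB
    age_seconds: float  # how long the process has been running


def killability(proc: Proc, age_weight: float = 0.5) -> float:
    """Hypothetical score: more memory and less runtime give a higher score.

    This is NOT the kernel's heuristic; it only illustrates biasing the
    choice toward recently started processes.
    """
    # Young processes get a bonus that decays toward zero as they age
    # (the one-hour scale is an arbitrary assumption).
    youth_bonus = 1.0 + age_weight / (1.0 + proc.age_seconds / 3600.0)
    return proc.rss_mib * youth_bonus


def pick_victim(procs: List[Proc]) -> Proc:
    """Return the process with the highest hypothetical score."""
    return max(procs, key=killability)


if __name__ == "__main__":
    procs = [
        Proc("Xorg", rss_mib=900, age_seconds=8 * 3600),
        Proc("leaky-app", rss_mib=850, age_seconds=120),
    ]
    print(pick_victim(procs).name)
```

With these invented numbers the long-running display server uses slightly more memory, yet the recently started process ends up with the higher score.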

In my experience on a system with 8GB of RAM and an SSD, the amount of swap 
space makes no difference. I've tried with no swap space, with 2GB, with 8GB, 
etc, and it still hangs under high memory usage. I've also tried tuning a lot 
of sysctl parameters such as vm.swappiness, vm.vfs_cache_pressure, and 
vm.min_free_kbytes, to no avail.
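
For anyone who wants to compare notes, a small sketch like the following prints the current values of those three tunables. It assumes the usual /proc/sys/vm layout; the `read_vm_tunables` helper and its `root` parameter are just names chosen for this example.

```python
from pathlib import Path

TUNABLES = ("swappiness", "vfs_cache_pressure", "min_free_kbytes")


def read_vm_tunables(root="/proc/sys/vm", names=TUNABLES):
    """Return {"vm.<name>": int} for each tunable readable under root."""
    values = {}
    for name in names:
        path = Path(root) / name
        try:
            values[f"vm.{name}"] = int(path.read_text().strip())
        except (OSError, ValueError):
            # Missing or unreadable (for example, not on Linux): skip it.
            continue
    return values


if __name__ == "__main__":
    for key, value in read_vm_tunables().items():
        print(f"{key} = {value}")
```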

Don't know if this helps, but here are some additional discussions of Linux 
unresponsiveness under low memory situations from a layman's perspective:
- osnews.com/story/130117/kde-usability-and-productivity-are-we-there-yet/ (in 
the comments)
- 
unix.stackexchange.com/questions/373312/oom-killer-doesnt-work-properly-leads-to-a-frozen-os
- bbs.archlinux.org/viewtopic.php?id=233843
- 
askubuntu.com/questions/432809/why-is-kswapd0-running-on-a-computer-with-no-swap/432827#432827
- 
unix.stackexchange.com/questions/24625/how-to-completely-disable-swap/24646#24646

Thanks again to everyone for looking into this!


Re: Better interactivity in low-memory situations

2019-08-13 Thread Simon Farnsworth
> On 10 Aug 2019, at 17:56, Georg Sauthoff  wrote:
> 
> On Fri, Aug 09, 2019 at 03:50:43PM -0600, Chris Murphy wrote:
> [..]
>> Problem and thesis statement:
>> Certain workloads, such as building webkitGTK from source, results in
>> heavy swap usage eventually leading to the system becoming totally
>> unresponsive. Look into switching from disk based swap, to swap on a
>> ZRAM device.
>> 
>> Summary of findings (restated, but basically the same as found at [2]):
>> Test system, Macbook Pro, Intel Core i7-2820QM (4/8 cores), 8GiB RAM,
>> Samsung SSD 840 EVO, Fedora Rawhide Workstation.
>> Test case, build WebKitGTK from source.
> [..]
> 
> To avoid such issues I disable swap on my machines. I really don't see
> the point of having a swap partition if you have 16 or 32 GiB RAM. Even
> with 8 GiB I disable swap.
> 
https://chrisdown.name/2018/01/02/in-defence-of-swap.html is worth a read.
TL;DR: the kernel used to be awful about swap (pre 4.0), but modern kernels use
it to avoid paging out executable (file-backed) pages when memory is low. If any
paging is needed, lack of swap means that the kernel will page out active code
before it gets as far as an OOM kill, resulting in a longer time to recover from
memory contention (regardless of whether there's an OOM kill or the system
recovers naturally).

Further, a sensible amount of swap (say 2 GiB or so) means that unused 
anonymous pages (e.g. data that's left over from initialization, or data that 
will only be needed when a process exits) can be swapped out and left on disk, 
freeing up valuable RAM for useful work.

Basically, a sane amount of swap is healthy - old advice about large amounts of 
swap is not.

> With - say - 8 GiB the build of a large project might fail (e.g. llvm,
> e.g. during linking) but it then fails fast and I can just restart it
> with `ninja -j2` or something like that.
> 
> Another source of IO related unresponsiveness is buffer bloat - I thus
> apply this configuration on my machines:
> 
>    $ cat /etc/sysctl.d/01-disk-bufferbloat.conf
>    vm.dirty_background_bytes=107374182
>    vm.dirty_bytes=214748364
> 
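
For scale, those two byte counts can be converted to MiB with a few lines of arithmetic. This is only a unit conversion of the values quoted above; the `to_mib` helper is a name made up for this sketch.

```python
MIB = 1024 * 1024

# Values copied from the quoted sysctl configuration.
settings = {
    "vm.dirty_background_bytes": 107374182,
    "vm.dirty_bytes": 214748364,
}


def to_mib(n_bytes: int) -> float:
    """Convert a byte count to mebibytes, rounded to one decimal place."""
    return round(n_bytes / MIB, 1)


for name, value in settings.items():
    print(f"{name} = {value} bytes (about {to_mib(value)} MiB)")
```

So background writeback would start at roughly 102 MiB of dirty data, and writers would be throttled at roughly 205 MiB.
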
> Best regards
> Georg
> -- 
> 'Time your programs before making claims about efficiency'
>  (Bjarne Stroustrup, The C++ Programming Language, 4th ed., p. 132, 2013)


Re: Better interactivity in low-memory situations

2019-08-12 Thread Chris Murphy
On Mon, Aug 12, 2019 at 6:31 PM David Airlie  wrote:
>
> On Sun, Aug 11, 2019 at 2:57 AM Georg Sauthoff  wrote:
> >
> > On Fri, Aug 09, 2019 at 03:50:43PM -0600, Chris Murphy wrote:
> > [..]
> > > Problem and thesis statement:
> > > Certain workloads, such as building webkitGTK from source, results in
> > > heavy swap usage eventually leading to the system becoming totally
> > > unresponsive. Look into switching from disk based swap, to swap on a
> > > ZRAM device.
> > >
> > > Summary of findings (restated, but basically the same as found at [2]):
> > > Test system, Macbook Pro, Intel Core i7-2820QM (4/8 cores), 8GiB RAM,
> > > Samsung SSD 840 EVO, Fedora Rawhide Workstation.
> > > Test case, build WebKitGTK from source.
> > [..]
> >
> > To avoid such issues I disable swap on my machines. I really don't see
> > the point of having a swap partition if you have 16 or 32 GiB RAM. Even
> > with 8 GiB I disable swap.
>
> Disabling swap doesn't avoid the issues, it can in fact make them worse.
>
> If you have apps allocate memory they don't always OOM before the
> kernel tries to evict text pages, but since SSDs are fast it then
> tries to pull back in those text pages before realising (that is what
> most of the latest rounds of articles has been about). Something like
> firefox runs with no swap, starts to need more memory than the system
> has, parts of firefox executable get paged out, but then are needed
> for firefox to use the RAM, and round in circles it goes.
>
> Having swap is still in this day and age better for your system than
> not having it.

I agree that it's better to have swap for incidental swap purposes,
rather than random things just getting abruptly hit with oom. I say
random, because I see the oom_score_adj is the same for every process
other than systemd-udev, auditd, sshd, and dbus. Plausibly the shell
could get oom killed without warning, taking out the entire user
session, all apps, and all the build processes.
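
A quick way to check that observation on another machine is to tally oom_score_adj across /proc. The sketch below assumes the standard /proc/<pid>/oom_score_adj files; the `oom_adj_histogram` helper and its `proc_root` parameter are names invented for this example.

```python
from collections import Counter
from pathlib import Path


def oom_adj_histogram(proc_root="/proc"):
    """Count processes per oom_score_adj value found under proc_root."""
    counts = Counter()
    root = Path(proc_root)
    if not root.is_dir():
        return counts  # no procfs here (for example, not on Linux)
    for entry in root.iterdir():
        if not entry.name.isdigit():
            continue  # only numeric directories are processes
        try:
            adj = int((entry / "oom_score_adj").read_text().strip())
        except (OSError, ValueError):
            continue  # process exited meanwhile, or file not readable
        counts[adj] += 1
    return counts


if __name__ == "__main__":
    for adj, n in sorted(oom_adj_histogram().items()):
        print(f"oom_score_adj={adj:5d}: {n} processes")
```

If nearly everything lands in the bucket for 0, the kernel has no hint about which process matters least.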

I just discovered in the log from yesterday, that iotop was subject to
oom killer, rather than one of the large cc1plus processes, which is
what I'd previously consistently witnessed. So iotop and cc1plus must
be in the ballpark oom score wise and oom killer just so happens to
pick one or the other. iotop going away relieved just enough memory
that nothing else was subject to oom killer, and yet processes were
clearly resource starved nevertheless: the GUI was frozen, but then
also other processes had already been dying due to timeouts, for
example:

Aug 11 18:26:57 fmac.local systemd[1]: sssd-kcm.service: Control
process exited, code=killed, status=15/TERM
Aug 11 18:26:57 fmac.local systemd[1]: sssd-kcm.service: Failed with
result 'timeout'.

Aug 11 18:27:00 fmac.local systemd[1]: systemd-journald.service: State
'stop-sigterm' timed out. Killing.
Aug 11 18:27:00 fmac.local systemd[1]: systemd-journald.service:
Killing process 31010 (systemd-journal) with signal SIGKILL.
Aug 11 18:27:00 fmac.local systemd[1]: systemd-journald.service: Main
process exited, code=killed, status=9/KILL

This is like a train wreck where there are all sorts of interesting
sub failures happening. At one point I think, well we need better oom
scores so the truly least important process is killed off. But upon
big picture scrutiny, the system is failing before oom killer has been
triggered. Processes are dying with timeouts. The GUI including the
mouse pointer is frozen, even when swap is half full. Practically
speaking, it's a goner the moment the mouse pointer froze the very
first time. I might tolerate some stuttering here and there, but
minutes of frozen state? Nah - not interested in seeing if this is
another 5 minutes of choke, or 5 days.

And that's the bad side of swap: when the system is more than
incidentally using it, and is depending on it. And apparently nothing
is on a deadline timer if things can just start timing out on their
own, including the system journal! That was a surprise to see. If it
was that hung up, maybe I can't trust the journal entry times or
order, maybe important entries were lost.


-- 
Chris Murphy


Re: Better interactivity in low-memory situations

2019-08-12 Thread David Airlie
On Sun, Aug 11, 2019 at 2:57 AM Georg Sauthoff  wrote:
>
> On Fri, Aug 09, 2019 at 03:50:43PM -0600, Chris Murphy wrote:
> [..]
> > Problem and thesis statement:
> > Certain workloads, such as building webkitGTK from source, results in
> > heavy swap usage eventually leading to the system becoming totally
> > unresponsive. Look into switching from disk based swap, to swap on a
> > ZRAM device.
> >
> > Summary of findings (restated, but basically the same as found at [2]):
> > Test system, Macbook Pro, Intel Core i7-2820QM (4/8 cores), 8GiB RAM,
> > Samsung SSD 840 EVO, Fedora Rawhide Workstation.
> > Test case, build WebKitGTK from source.
> [..]
>
> To avoid such issues I disable swap on my machines. I really don't see
> the point of having a swap partition if you have 16 or 32 GiB RAM. Even
> with 8 GiB I disable swap.

Disabling swap doesn't avoid the issues, it can in fact make them worse.

If you have apps allocate memory they don't always OOM before the
kernel tries to evict text pages, but since SSDs are fast it then
tries to pull back in those text pages before realising (that is what
most of the latest rounds of articles have been about). Something like
firefox runs with no swap, starts to need more memory than the system
has, parts of firefox executable get paged out, but then are needed
for firefox to use the RAM, and round in circles it goes.

Having swap is still in this day and age better for your system than
not having it.

Dave.


Re: Better interactivity in low-memory situations

2019-08-12 Thread Chris Murphy
On Mon, Aug 12, 2019 at 11:07 AM Benjamin Kircher
 wrote:
>
> (… I definitely need to play around with Silverblue to learn what they are 
> doing.)

I'm pretty sure Silverblue will be rebased on Fedora CoreOS which
recently released a preview. I'm not sure what the time frame for that
is, but maybe that work will be concurrent with work on a release
version of Fedora CoreOS. The central means of installing/uninstalling
and running applications on a future immutable system is flatpak. But
you don't need to commit a system to Silverblue to use and test
flatpak applications on Fedora 29/30 Workstation. Containerization is
an option not a requirement of flatpaks, as is running it as a systemd
--user instance.

Since layering is permitted with rpm-ostree based systems, using
overlayfs, there still needs to be some way for the per-user service
manager to enforce limits on unprivileged programs. The use of the
word "limit" might be misleading. Perhaps instead it should be on
defining and preserving the user interface responsiveness, whether
that's CLI or GUI, so that control isn't lost. i.e. the unprivileged
program gets the leftover resources, it's not a peer with the user
interface. Promoting the active user interfaces relative to the
unprivileged task would provide a way of effectively containing the
unprivileged tasks, by one always being able to preempt the other.

-- 
Chris Murphy


Re: Better interactivity in low-memory situations

2019-08-12 Thread Lennart Poettering
On Mo, 12.08.19 19:06, Benjamin Kircher (benjamin.kirc...@gmail.com) wrote:

>
>
> > On 12. Aug 2019, at 18:16, Lennart Poettering  wrote:
> >
> > On Mo, 12.08.19 09:40, Chris Murphy (li...@colorremedies.com) wrote:
> >
> >> How to do this automatically? Could there be a mechanism for the
> >> system and the requesting application to negotiate resources?
> >
> > Ideally, GNOME would run all its apps as systemd --user services. We
> > could then set DefaultMemoryHigh= globally for the systemd --user
> > instance to some percentage value (which is taken relative to the
> > physical RAM size). This would then mean every user app individually
> > could use — let's say — 75% of the physical RAM size and when it wants
> > more it would be penalized during reclaim compared to apps using less.
> >
> > If GNOME would run all apps as user services we could do various other
> > nice things too. For example, it could dynamically assign the fg app
> > more CPU/IO weight than the bg apps, if the system is starved of
> > both.
>
> I really like the ideas. Why isn’t this done this way anyway?

Well, let's just say certain popular container managers blocked
switching to cgroupsv2, and only in cgroupsv2 delegating cgroup
subtrees to unprivileged users is safe. Hence doing this kind of
resource management wasn't really doable without ugly hacks.

But as it appears cgroupsv2 has a chance of becoming a reality on
Fedora now, so this opens a lot of doors.

> I don’t have a GNOME desktop at hand right now to investigate how
> GNOME starts applications and so on but aren’t new processes started
> by the user — GNOME or not — always children of the user.slice? Is
> there a difference if I start a GNOME application or a normal
> process from my shell?

Well, "user.slice" is a concept of the *system* service manager, but
desktop apps are if anything a concept of the *per-user* service
manager.

> And for the beginning, wouldn’t it be enough to differentiate
> between user slices and system slice and set DefaultMemoryHigh= in a
> way to make sure there is always some headroom left for the system?

From the system service manager's PoV all user apps together make up
the user's 'user@.service' instance; it doesn't look below.

i.e. cgroups is hierarchical, and various components can manage their
own subtrees. PID 1 manages the top of the tree, and the per-user
service manager a subtree of it that is below it and arranges per-user
apps below that. But from PID1's PoV each of those per-user subtrees
is opaque and it won't do resource management beneath that
boundary. It's the job of the per-user service manager to do resource
management there.
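
To see where a given process sits in that tree, one can parse /proc/<pid>/cgroup. The following is a minimal sketch assuming the documented "hierarchy-ID:controller-list:path" line format; the helper names and the sample path in the comment are illustrative only.

```python
def parse_cgroup(text):
    """Parse /proc/<pid>/cgroup text into (hierarchy_id, controllers, path)."""
    entries = []
    for line in text.splitlines():
        if not line:
            continue
        hier_id, controllers, path = line.split(":", 2)
        names = controllers.split(",") if controllers else []
        entries.append((int(hier_id), names, path))
    return entries


def path_levels(path):
    """Split a cgroup path into its levels, top of the tree first."""
    return [part for part in path.split("/") if part]


if __name__ == "__main__":
    try:
        with open("/proc/self/cgroup") as f:
            content = f.read()
    except OSError:
        content = ""  # no procfs available
    for hier_id, controllers, path in parse_cgroup(content):
        # On a cgroupsv2 system there is a single "0::<path>" entry, e.g.
        # /user.slice/user-1000.slice/user@1000.service/... (example only).
        print(hier_id, controllers, path_levels(path))
```

Everything below the user@.service level in the printed path is what the per-user service manager arranges.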

Lennart

--
Lennart Poettering, Berlin


Re: Better interactivity in low-memory situations

2019-08-12 Thread Benjamin Kircher


> On 12. Aug 2019, at 18:16, Lennart Poettering  wrote:
> 
> On Mo, 12.08.19 09:40, Chris Murphy (li...@colorremedies.com) wrote:
> 
>> How to do this automatically? Could there be a mechanism for the
>> system and the requesting application to negotiate resources?
> 
> Ideally, GNOME would run all its apps as systemd --user services. We
> could then set DefaultMemoryHigh= globally for the systemd --user
> instance to some percentage value (which is taken relative to the
> physical RAM size). This would then mean every user app individually
> could use — let's say — 75% of the physical RAM size and when it wants
> more it would be penalized during reclaim compared to apps using less.
> 
> If GNOME would run all apps as user services we could do various other
> nice things too. For example, it could dynamically assign the fg app
> more CPU/IO weight than the bg apps, if the system is starved of
> both.

I really like the ideas. Why isn’t this done this way anyway?

I don’t have a GNOME desktop at hand right now to investigate how GNOME starts 
applications and so on but aren’t new processes started by the user — GNOME or 
not — always children of the user.slice? Is there a difference if I start a 
GNOME application or a normal process from my shell?

And for the beginning, wouldn’t it be enough to differentiate between user 
slices and system slice and set DefaultMemoryHigh= in a way to make sure there 
is always some headroom left for the system?

BK

(… I definitely need to play around with Silverblue to learn what they are 
doing.)


Re: Better interactivity in low-memory situations

2019-08-12 Thread Benjamin Kircher

> On 12. Aug 2019, at 17:40, Chris Murphy  wrote:
> 
> If I just run the example program, let's say systemd MemoryLimit is
> set to /proc/meminfo MemAvailable, the program is still going to try
> and bust out of that and fail. The failure reason is also non-obvious.
> Yes this is definitely an improvement in that the system isn't taken
> down.
> 
> How to do this automatically? Could there be a mechanism for the
> system and the requesting application to negotiate resources?

Honestly, right now, doing this automatically is not possible.

Instead, we anticipate the workload or the nature of the work. Just as, when we 
connect remotely to a box and start some long running process, we anticipate 
trouble with the network and use a terminal multiplexer, right? Same thing with 
resource intensive processes.

But in future, I could imagine that this whole control group mechanism really 
pays off in a way where we distribute system resources automatically.

Isn’t that what Silverblue is all about? Having a base system and on top of 
that, everything is run in a container that could potentially be resource 
constrained?

BK


Re: Better interactivity in low-memory situations

2019-08-12 Thread Lennart Poettering
On Mo, 12.08.19 09:40, Chris Murphy (li...@colorremedies.com) wrote:

> How to do this automatically? Could there be a mechanism for the
> system and the requesting application to negotiate resources?

Ideally, GNOME would run all its apps as systemd --user services. We
could then set DefaultMemoryHigh= globally for the systemd --user
instance to some percentage value (which is taken relative to the
physical RAM size). This would then mean every user app individually
could use — let's say — 75% of the physical RAM size and when it wants
more it would be penalized during reclaim compared to apps using less.
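
To illustrate what a percentage relative to physical RAM would work out to, here is a small sketch. It only does the arithmetic; it does not configure systemd, and the `memory_high_bytes` helper is a name invented for this example.

```python
import os


def physical_ram_bytes():
    """Total physical RAM as reported by sysconf (works on Linux)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")


def memory_high_bytes(total_bytes, percent=75):
    """Byte threshold corresponding to a percentage of physical RAM."""
    return total_bytes * percent // 100


if __name__ == "__main__":
    try:
        total = physical_ram_bytes()
    except (ValueError, OSError):
        total = 8 * 1024**3  # fall back to an assumed 8 GiB machine
    print(f"75% of {total} bytes is {memory_high_bytes(total)} bytes")
```

On an 8 GiB machine the threshold would be 6 GiB per app, and it would scale up automatically on a bigger box.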

If GNOME would run all apps as user services we could do various other
nice things too. For example, it could dynamically assign the fg app
more CPU/IO weight than the bg apps, if the system is starved of
both.

> Right now the only lever to avoid swap, is to not create a swap
> partition at installation time. Or create a smaller one instead of 1:1
> ratio with RAM. Or use a 1/4 RAM sized swap on ZRAM. A consequence of
> each of these alternatives, is hibernation can't be used. Fedora
> already explicitly does not support hibernation, but strictly that
> means we don't block release on hibernation related bugs. Fedora does
> still create a swap that meets the minimum size for hibernation, and
> also inserts the required 'resume' kernel  parameter to locate the
> hibernation image at the next boot. So we kinda sorta do support it.

We could add a mode to systemd's hibernation support to only "swapon"
a swap partition immediately before hibernating, and "swapoff" it
right after coming back. This has been proposed before, but no one so
far did the work on it. But quite frankly this feels just like taping
over the fact that the Linux kernel is rubbish when it comes to
swapping...

> Another reality is, the example program, also doesn't have a good way
> of estimating the resources it needs. It has some levers, that just
> aren't being used by default, including -l option which reads "do not
> start new jobs if the load average is greater than N". But that's
> different than "tell me the box sizes you can use" and then the system
> supplying a matching box, and for the program to work within it.

As suggested above, I think DefaultMemoryHigh=75% would be an OK
approach which would allow us adjust to the "beefiness" of a machine
automatically.

Lennart

--
Lennart Poettering, Berlin


Re: Better interactivity in low-memory situations

2019-08-12 Thread Chris Murphy
On Mon, Aug 12, 2019 at 1:01 AM Florian Weimer  wrote:
>
> * Chris Murphy:
>
> > Summary of findings (restated, but basically the same as found at [2]):
> > Test system, Macbook Pro, Intel Core i7-2820QM (4/8 cores), 8GiB RAM,
> > Samsung SSD 840 EVO, Fedora Rawhide Workstation.
>
> Do you use the built-in Intel graphics?  Can you test with something
> else?

Only intel graphics. The AMD GPU on the test system is
non-functional/defective. Other systems only have Intel graphics. I
have tested this in a VM which I think is qxl graphics (?), and I get
the same results, with minimal sample size. It seems like the oom
happens more often and sooner on the VM, but that might be because the VM
is necessarily even more resource constrained than the host. But I
have reproduced the total and seemingly indefinite hang. The results
aren't completely deterministic, whether baremetal or VM. They're all
"failures" in one form or another, but how they fail does differ run
to run. And that's expected: how much I'm simultaneously browsing in
Firefox, how many tabs are open, and which other programs are in use
all vary, so the user is a cause of that non-determinism and is a
relevant factor.


-- 
Chris Murphy


Re: Better interactivity in low-memory situations

2019-08-12 Thread Chris Murphy
On Mon, Aug 12, 2019 at 12:30 AM Benjamin Kircher
 wrote:
>
>
>
> > On 11. Aug 2019, at 23:05, Chris Murphy  wrote:
> >
> > I think the point at which the mouse pointer has frozen, the user has
> > no practical means of controlling or interacting with the system, it's
> > a failure.
> >
> > In the short term, is it reasonable and possible, to get the oom
> > killer to trigger sooner and thereby avoid the system becoming
> > unresponsive in the first place? The oom score for most all processes
> > is 0, and niced processes have their oom score increased. I'm not
> > seeing levers to control how aggressive it is, only a way of hinting
> > at which processes can be more readily subject to being killed. In
> > fact, a requirement of oom killer is that swap is completely consumed,
> > which if swap is on anything other than a fast SSD, swapping creates
> > its own performance problems way before oom can be a rescuer. I think
> > I just argued against my own question.
>
> Yes you just did :-)
>
> From what I understand from this LKML thread [1] fast swap on NVMe is only 
> part of the issue (or adds to the issue). The kernel really really tries hard 
> not to OOM kill anything and keep the system going. And this overcommitment 
> is where it eventually gets unresponsive to the extent that the machine needs 
> to be hard rebooted.
>
> The LKML thread also mentions that user-space OOM handling could help.
>
> But what about cgroups? Isn’t there a systemd utility that helps me wrap 
> processes in resource constrained groups? Something along the line
>
> $ systemd-run -p MemoryLimit=1G firefox
>
> (Not tested.) I imagine that a well-behaved program will handle a bad malloc 
> by ending itself?
>
> BTW, this happens not only on Linux. I’m used to dealing with quite big files 
> during my day job and if you accidentally write some… em… very 
> unsophisticated code that attempts to read the entire file into memory at 
> once you can experience the same behavior on a recent macOS, too. You’re left 
> with nothing else than force rebooting your machine.
>
> [1] https://lkml.org/lkml/2019/8/4/15
>

If I just run the example program, let's say systemd MemoryLimit is
set to /proc/meminfo MemAvailable, the program is still going to try
and bust out of that and fail. The failure reason is also non-obvious.
Yes this is definitely an improvement in that the system isn't taken
down.
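
As a rough sketch of that experiment, the snippet below reads MemAvailable and builds the corresponding systemd-run command line without executing it. The `./example-program` path is a placeholder, and the helper names are made up here; MemoryLimit= is the property used in the quoted command, and I believe MemoryMax= is its counterpart on cgroupsv2.

```python
import shlex


def mem_available_bytes(meminfo_text):
    """Extract MemAvailable from /proc/meminfo text, in bytes."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            # Line format: "MemAvailable:    1234567 kB"
            return int(line.split()[1]) * 1024
    raise ValueError("MemAvailable not found")


def systemd_run_argv(limit_bytes, command):
    """Build (but do not execute) a systemd-run command line."""
    return ["systemd-run", "-p", f"MemoryLimit={limit_bytes}", *command]


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            limit = mem_available_bytes(f.read())
    except (OSError, ValueError):
        limit = 1024**3  # assumed 1 GiB fallback when meminfo is unavailable
    # "./example-program" is a placeholder for the real workload.
    print(shlex.join(systemd_run_argv(limit, ["./example-program"])))
```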

How to do this automatically? Could there be a mechanism for the
system and the requesting application to negotiate resources?

One reality is, the system isn't a good estimator of system
responsiveness from the user's point of view. Anytime swap is under
significant pressure (what's the definition of significant?) the
system is effectively lost at that point, *if* this is a desktop
system (includes laptops). In the example case, once swap is being
heavily used on either the SSD, or on ZRAM, the mouse pointer is
frozen variably 50%-90% of the time. It's not a usable system, well
before swap is full. How does the system learn that a light swap rate
is OK, but a heavy swap rate will lead to an angry user? And even
heavy swap might be OK on NVMe, or on a server.
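
One signal that might help draw that line is the kernel's pressure stall information, on kernels that expose /proc/pressure/memory. The sketch below parses that file; the 10% threshold on the "full" avg10 value is an arbitrary assumption for illustration, not a tested definition of "significant".

```python
def parse_pressure(text):
    """Parse PSI text into {"some": {...}, "full": {...}} with float values."""
    result = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        kind, pairs = fields[0], fields[1:]
        result[kind] = {k: float(v) for k, v in (p.split("=", 1) for p in pairs)}
    return result


def looks_unresponsive(pressure, threshold=10.0):
    """Assumed rule of thumb: flag when 'full' avg10 exceeds threshold percent."""
    return pressure.get("full", {}).get("avg10", 0.0) > threshold


if __name__ == "__main__":
    try:
        with open("/proc/pressure/memory") as f:
            pressure = parse_pressure(f.read())
    except OSError:
        pressure = {}  # PSI not available on this kernel
    print(pressure, looks_unresponsive(pressure))
```

A different threshold would probably be needed for NVMe or for a server, which is exactly the open question.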

Right now the only lever to avoid swap is to not create a swap
partition at installation time, or to create one smaller than a 1:1
ratio with RAM, or to use a swap on ZRAM sized at 1/4 of RAM. A
consequence of each of these alternatives is that hibernation can't be
used. Fedora already explicitly does not support hibernation, but
strictly that means we don't block release on hibernation related
bugs. Fedora does still create a swap that meets the minimum size for
hibernation, and also inserts the required 'resume' kernel parameter
to locate the hibernation image at the next boot. So we kinda sorta do
support it.

Another reality is that the example program also doesn't have a good
way of estimating the resources it needs. It has some levers that just
aren't being used by default, including the -l option, which reads "do
not start new jobs if the load average is greater than N". But that's
different from "tell me the box sizes you can use", followed by the
system supplying a matching box and the program working within it.
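
A rough sketch of what "working within a box" could look like from
the shell, capping the job count by available memory as well as by
CPU count. The helper name and the per-job memory figure are
assumptions for illustration, not anything ninja provides.

```shell
# Hypothetical helper: choose a parallel job count.
# Arguments: MemAvailable in KiB, CPU count, assumed peak KiB per job.
pick_jobs() {
    mem_kib=$1; ncpu=$2; per_job_kib=$3
    by_cpu=$((ncpu + 2))               # nrcpus + 2, as reported in this thread
    by_mem=$((mem_kib / per_job_kib))  # how many jobs fit in available RAM
    jobs=$by_cpu
    if [ "$by_mem" -lt "$jobs" ]; then jobs=$by_mem; fi
    if [ "$jobs" -lt 1 ]; then jobs=1; fi   # always allow one job
    echo "$jobs"
}
```

For example, with the assumed 1.5 GiB per job:
ninja -j "$(pick_jobs "$(awk '/^MemAvailable:/ {print $2}' /proc/meminfo)" "$(nproc)" 1572864)"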


-- 
Chris Murphy
___
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: 
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org


Re: Better interactivity in low-memory situations

2019-08-12 Thread Florian Weimer
* Petr Pisar:

> On 2019-08-12, Florian Weimer  wrote:
>> Do you use the built-in Intel graphics?  Can you test with something
>> else?
>>

> Does it have any effect? It happens to me even with a discrete GPU.

I expect that the GEM shrinker (or rather, the reason why it is needed)
radically alters kernel memory management.

Thanks,
Florian


Re: Better interactivity in low-memory situations

2019-08-12 Thread Petr Pisar
On 2019-08-12, Florian Weimer  wrote:
> Do you use the built-in Intel graphics?  Can you test with something
> else?
>
Does it have any effect? It happens to me even with a discrete GPU.

As far as I know, integrated graphics do not share physical memory
from the point of view of the CPU address space. The physical memory is
split between GPU and CPU regions, and the CPU never sees the GPU's
physical memory. The IOMMU can be asked to map the GPU's memory into
the CPU's virtual address space, as can be done with any PCI card, but
the physical memory is always separate (although it lives in the same
memory chip). Some BIOSes allow defining the UMA split (the ratio
between GPU and CPU memory), but that is outside the control of the
operating system and cannot be changed until reset.

What actually happens is that some CPU physical memory is used for GUI
program text and some for the block device I/O cache. Both purposes
are handled uniformly by Linux. When physical memory is exhausted, the
memory allocator starts paging to a swap device. The evil thing is how
memory pages are selected to be swapped out. The algorithm is to swap
out the least recently used ones, and that is often the program text,
not the block cache. As a result your GUI becomes unresponsive because
all the physical memory is filled with block cache and the program
text has to be reloaded from a block device. What's worse, this
happens even without swap space, because program text pages are backed
by a file and thus can be dropped and loaded from the file system
later. I.e. program text is always swappable.

A cure would be a fairer memory allocator that could magically
discover that a user is more interested in the few megabytes of his
window manager than in the gigabytes of a transferred file. The issue
is that the allocator does not discriminate. A process can actually
provide some hints using madvise(2) and mlock(2), but that applies
neither to the program text nor to the block cache in kernel space.
And even if processes provided hints, there could always be some
adversarial program abusing the others. Maybe it would help if ulimit
were augmented with a maximal block cache usage and an I/O scheduler
accounted for that.

-- Petr


Re: Better interactivity in low-memory situations

2019-08-12 Thread Florian Weimer
* Chris Murphy:

> Summary of findings (restated, but basically the same as found at [2]):
> Test system, Macbook Pro, Intel Core i7-2820QM (4/8 cores), 8GiB RAM,
> Samsung SSD 840 EVO, Fedora Rawhide Workstation.

Do you use the built-in Intel graphics?  Can you test with something
else?

Thanks,
Florian


Re: Better interactivity in low-memory situations

2019-08-11 Thread Benjamin Kircher


> On 11. Aug 2019, at 23:05, Chris Murphy  wrote:
> 
> I think the point at which the mouse pointer has frozen, the user has
> no practical means of controlling or interacting with the system, it's
> a failure.
> 
> In the short term, is it reasonable and possible, to get the oom
> killer to trigger sooner and thereby avoid the system becoming
> unresponsive in the first place? The oom score for most all processes
> is 0, and niced processes have their oom score increased. I'm not
> seeing levers to control how aggressive it is, only a way of hinting
> at which processes can be more readily subject to being killed. In
> fact, a requirement of oom killer is that swap is completely consumed,
> which if swap is on anything other than a fast SSD, swapping creates
> its own performance problems way before oom can be a rescuer. I think
> I just argued against my own question.

Yes you just did :-)

From what I understand from this LKML thread [1], fast swap on NVMe is only
part of the issue (or adds to the issue). The kernel tries really hard not to
OOM kill anything and to keep the system going. And this overcommitment is
where it eventually gets unresponsive to the extent that the machine needs to
be hard rebooted.

The LKML thread also mentions that user-space OOM handling could help.

But what about cgroups? Isn’t there a systemd utility that helps me wrap
processes in resource constrained groups? Something along the lines of

$ systemd-run -p MemoryLimit=1G firefox

(Not tested.) I imagine that a well-behaved program will handle a bad malloc by 
ending itself?
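
A slightly fuller sketch of the same idea, also not tested against a
real workload. It assumes a cgroup-v2 system, where MemoryMax= is the
property name (MemoryLimit= is the older cgroup-v1 spelling), and it
assumes the user session is allowed to set memory limits. The wrapper
name is hypothetical, and it only prints the command so it can be
inspected before being run.

```shell
# Hypothetical wrapper: print a systemd-run command that would start a
# program in a transient scope with a memory cap.
limited_cmd() {
    limit=$1; shift
    # --user --scope runs the program as a child of the calling shell,
    # inside a transient scope unit of the user's service manager.
    echo "systemd-run --user --scope -p MemoryMax=$limit $*"
}
```

Running the printed line for "limited_cmd 1G firefox" would then
start firefox under the 1G cap.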

BTW, this happens not only on Linux. I’m used to dealing with quite big files
in my day job, and if you accidentally write some… em… very unsophisticated
code that attempts to read the entire file into memory at once, you can
experience the same behavior on a recent macOS, too. You’re left with no
option other than force rebooting your machine.

[1] https://lkml.org/lkml/2019/8/4/15

BK


Re: Better interactivity in low-memory situations

2019-08-11 Thread Chris Murphy
On Sun, Aug 11, 2019 at 1:02 PM Jan Kratochvil
 wrote:
>
> On Sun, 11 Aug 2019 20:54:28 +0200, Chris Murphy wrote:
> > and likely experiences data loss and possibly even file system
> > corruption as a direct consequence of having to force power off on the
> > machine because for all practical purposes normal control has been
> > lost.
>
> Not really, this is what journaling filesystem is there for.

Successful journal replay obviates the need for fsck, it has nothing
to do with avoiding corruption. And in any case, anything the user is
working on that isn't already saved and committed to stable media,
isn't going to survive the poweroff.

> But then there still can be an application-level data corruptions if an
> application does not handle its sudden termination properly.
> Which should be rare but IIRC I did see it for example with Firefox.

I think that at the point where the mouse pointer has frozen and the
user has no practical means of controlling or interacting with the
system, it's a failure.

In the short term, is it reasonable and possible to get the oom
killer to trigger sooner and thereby avoid the system becoming
unresponsive in the first place? The oom score for almost all
processes is 0, and niced processes have their oom score increased.
I'm not seeing levers to control how aggressive it is, only a way of
hinting at which processes can be more readily subject to being
killed. In fact, a requirement of the oom killer is that swap is
completely consumed, and if swap is on anything other than a fast SSD,
swapping creates its own performance problems way before oom can be a
rescuer. I think I just argued against my own question.
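
For reference, the hinting mechanism is visible per process under
/proc: oom_score is the kernel's current badness value and
oom_score_adj (range -1000 to 1000) is the adjustable hint. A minimal
sketch for inspecting them follows; the helper name and the optional
second argument are hypothetical, and adjusting the hint only changes
which process is picked, not when the killer fires.

```shell
# Hypothetical helper: print oom_score and oom_score_adj for one PID.
# The second argument defaults to /proc; it is a parameter only so the
# sketch can be exercised against a fake directory tree.
oom_report() {
    pid=$1; proc_root=${2:-/proc}
    score=$(cat "$proc_root/$pid/oom_score")
    adj=$(cat "$proc_root/$pid/oom_score_adj")
    echo "pid=$pid oom_score=$score oom_score_adj=$adj"
}
```

For example, "oom_report $$" reports on the current shell.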


-- 
Chris Murphy


Re: Better interactivity in low-memory situations

2019-08-11 Thread Jan Kratochvil
On Sun, 11 Aug 2019 20:54:28 +0200, Chris Murphy wrote:
> and likely experiences data loss and possibly even file system
> corruption as a direct consequence of having to force power off on the
> machine because for all practical purposes normal control has been
> lost.

Not really, this is what a journaling filesystem is there for.

But there can still be application-level data corruption if an
application does not handle its sudden termination properly.
That should be rare, but IIRC I did see it, for example, with Firefox.


Jan


Re: Better interactivity in low-memory situations

2019-08-11 Thread Chris Murphy
On Sun, Aug 11, 2019 at 10:36 AM  wrote:
>
> On Sun, Aug 11, 2019 at 10:50 AM, Chris Murphy
>  wrote:
> > Let's take another argument. If the user manually specifies 'ninja -j
> > 64' on this same system, is that sabotage? I'd say it is. And
> > therefore why isn't it sabotage that the ninja default computes N jobs
> > as nrcpus + 2?  And also doesn't take available memory into account
> > when deciding what resources to demand? I can build linux all day long
> > on this system with its defaults and never run into a concurrent
> > usability problem.
> >
> > There does seem to be a dual responsibility, somehow, between the
> > operating system and the application, to make sure sane requests are
> > made and honored.
>
> This seems like a distraction from the real goal here, which is to
> ensure Fedora remains responsive under heavy memory pressure, and to
> ensure unprivileged processes cannot take down the system by allocating
> large amounts of memory. Fixing ninja and make to dynamically scale the
> number of parallel build processes based on memory pressure would be
> wonderful, but it's not going to solve the underlying issue here, which
> is that random user processes should never be able to hang the system.

That's fair.

-- 
Chris Murphy


Re: Better interactivity in low-memory situations

2019-08-11 Thread Chris Murphy
On Sun, Aug 11, 2019 at 11:21 AM Jan Kratochvil
 wrote:
>
> On Sun, 11 Aug 2019 17:50:17 +0200, Chris Murphy wrote:
> > I don't follow. You're saying RelWithDebInfo is never suitable for a
> > local build?
>
> Most of the time. What is your use case for it?

My use case is testing the responsiveness of Fedora Workstation under
CPU and memory pressure, as experienced by an ordinary user.


> > In file included from Source/JavaScriptCore/config.h:32,
> >  from 
> > Source/JavaScriptCore/llint/LLIntSettingsExtractor.cpp:26:
> > Source/JavaScriptCore/runtime/JSExportMacros.h:32:10: fatal error: 
> > wtf/ExportMacros.h: No such file or directory
>
> You are reinventing the wheel Fedora packager has already done for this
> package.

That's out of scope.

I said from the outset this is an example. The central topic is that
an unprivileged program is able to ask for resources that do not
exist, and the operating system tries and fails to supply those
resources, resulting not only in task failure, but the entire system
is lost. In this example the user is doing other things concurrently
and likely experiences data loss and possibly even file system
corruption as a direct consequence of having to force power off on the
machine because for all practical purposes normal control has been
lost.


> > Let's take another argument. If the user manually specifies 'ninja -j
> > 64' on this same system, is that sabotage?
>
> For untrusted users Linux has given up for that, it is too big can of worms.
> Use virtual machine (KVM) with specified resources (memory size).  Nowadays it
> should be also possible with less overhead by using Docker containers.
>
> If you mean some local builds of your own causing runaway then
> (1) Turn off swap as RAM is cheap enough today.
> If something really runs out of the RAM it gets killed by kernel OOM.
> (2) Have the swap on NVMe, it from my experience does not kill the machine.
> (3) Use some reasonable ulimits in your ~/.bash_profile.
> (4) When the machine is really unresponsible login there from a different box
> and kill the culprits. From my own experience the machine is still able to
> accept new SSH connection, despite a bit slowly.
> But yes, I agree this problem has AFAIK no perfect solution.


I don't think it's acceptable in 2019 that an unprivileged task takes
out the entire operating system. As I mention in the very first post,
remote ssh was not responsive for 30 minutes, at which point I gave up
and forced power off. It's a bit of a trap, though, to suggest the
user needs the ability and skill to remote ssh in and kill off runaway
programs; I refuse that premise.

It's completely sane for an ordinary user to consider that control of
the system has been lost immediately upon experiencing a frozen mouse
arrow.



-- 
Chris Murphy


Re: Better interactivity in low-memory situations

2019-08-11 Thread Jan Kratochvil
On Sun, 11 Aug 2019 17:50:17 +0200, Chris Murphy wrote:
> I don't follow. You're saying RelWithDebInfo is never suitable for a
> local build?

Most of the time. What is your use case for it?


> isn't relevant to getting a successful build.

With a powerful enough machine everything is possible.  Just be aware that
RelWithDebInfo is the most resource demanding option compared to Release and
Debug, and at the same time it is the least useful one for local builds.


> In file included from Source/JavaScriptCore/config.h:32,
>  from 
> Source/JavaScriptCore/llint/LLIntSettingsExtractor.cpp:26:
> Source/JavaScriptCore/runtime/JSExportMacros.h:32:10: fatal error: 
> wtf/ExportMacros.h: No such file or directory

You are reinventing the wheel: the Fedora packager has already done this work
for this package.  I guess you are missing some dependency.  If you have a
problem, stick to the proven build (unless it is temporarily FTBFS, which this
package is not now).  I think Fedora recommends mock for such a rebuild, but I
find mock inconvenient for local development, so I use the following (I have
some scripts for that):
dnf download --source webkit2gtk3
mkdir webkit2gtk3-2.24.3-1.fc30.src
cd webkit2gtk3-2.24.3-1.fc30.src
rpm2cpio ../webkit2gtk3-2.24.3-1.fc30.src.rpm | cpio -id
function rpmbuildlocal {
  time MAKEFLAGS= rpmbuild \
    --define "_topdir $PWD" --define "_builddir $PWD" \
    --define "_rpmdir $PWD" --define "_sourcedir $PWD" \
    --define "_specdir $PWD" --define "_srcrpmdir $PWD" \
    --define "_build_name_fmt %%{NAME}-%%{VERSION}-%%{RELEASE}.%%{ARCH}.rpm" \
    "$@"
  rmdir &>/dev/null BUILDROOT
}
# Is the .src.rpm rebuild still needed?
# https://bugzilla.redhat.com/show_bug.cgi?id=1210276
rpmbuildlocal -bs *.spec
sudo dnf builddep webkit2gtk3-2.24.3-1.fc30.src.rpm
rm webkit2gtk3-2.24.3-1.fc30.src.rpm
rpmbuildlocal -bc webkit2gtk3.spec 2>&1 | tee log
# or -bb or what do you want.
It has built fine for me here now.


> Let's take another argument. If the user manually specifies 'ninja -j
> 64' on this same system, is that sabotage?

Linux has given up on handling untrusted users here; it is too big a can of
worms.  Use a virtual machine (KVM) with specified resources (memory size).
Nowadays it should also be possible with less overhead by using Docker
containers.

If you mean some local builds of your own causing a runaway then
(1) Turn off swap, as RAM is cheap enough today.
If something really runs out of RAM it gets killed by the kernel OOM killer.
(2) Have the swap on NVMe; in my experience it does not kill the machine.
(3) Use some reasonable ulimits in your ~/.bash_profile.
(4) When the machine is really unresponsive, log in from a different box
and kill the culprits. In my own experience the machine is still able to
accept a new SSH connection, albeit a bit slowly.
But yes, I agree this problem has AFAIK no perfect solution.
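
Point (3) can also be applied per command instead of session-wide. A
minimal sketch, with a hypothetical helper name: it caps virtual
memory (ulimit -v, in KiB) in a subshell, so the calling shell keeps
its own limits. As far as I know the resident-set limit (-m) is not
enforced by current Linux kernels, which is why -v is used here.

```shell
# Hypothetical helper: run a command with a cap on virtual memory.
# First argument is the limit in KiB; the rest is the command.
run_limited() {
    limit_kib=$1; shift
    # The subshell confines the new limit to this one command.
    ( ulimit -v "$limit_kib" && exec "$@" )
}
```

For example, "run_limited 4194304 ninja" would make allocations fail
once the build process itself exceeds 4 GiB of address space. Note
the limit is per process, not for the whole job tree.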


Jan


Re: Better interactivity in low-memory situations

2019-08-11 Thread mcatanzaro
On Sun, Aug 11, 2019 at 10:50 AM, Chris Murphy 
 wrote:

> Let's take another argument. If the user manually specifies 'ninja -j
> 64' on this same system, is that sabotage? I'd say it is. And
> therefore why isn't it sabotage that the ninja default computes N jobs
> as nrcpus + 2?  And also doesn't take available memory into account
> when deciding what resources to demand? I can build linux all day long
> on this system with its defaults and never run into a concurrent
> usability problem.
>
> There does seem to be a dual responsibility, somehow, between the
> operating system and the application, to make sure sane requests are
> made and honored.


This seems like a distraction from the real goal here, which is to 
ensure Fedora remains responsive under heavy memory pressure, and to 
ensure unprivileged processes cannot take down the system by allocating 
large amounts of memory. Fixing ninja and make to dynamically scale the 
number of parallel build processes based on memory pressure would be 
wonderful, but it's not going to solve the underlying issue here, which 
is that random user processes should never be able to hang the system.


Michael



Re: Better interactivity in low-memory situations

2019-08-11 Thread Chris Murphy
On Sat, Aug 10, 2019 at 3:07 AM Jan Kratochvil
 wrote:
>
> On Fri, 09 Aug 2019 23:50:43 +0200, Chris Murphy wrote:
> > $ cmake -DPORT=GTK -DCMAKE_BUILD_TYPE=RelWithDebInfo -GNinja
>
> RelWithDebInfo is -O2 -g build.  That is not suitable for debugging, for
> debugging you should use -DCMAKE_BUILD_TYPE=Debug (that is -g).
> RelWithDebInfo is useful for final rpm packages but those are build in Koji.

I don't follow. You're saying RelWithDebInfo is never suitable for a
local build?

I'm not convinced that matters, because what the user-developer is
trying to accomplish post-build isn't relevant to getting a successful
build. And also, this is just one example of how apparently easy it is
to take down a system with an unprivileged task, per the various
discussions I've had with members of the Workstation WG.

Anyway, the build fails for a different reason when I use Debug
instead of RelWithDebInfo so I can't test it.

In file included from Source/JavaScriptCore/config.h:32,
 from Source/JavaScriptCore/llint/LLIntSettingsExtractor.cpp:26:
Source/JavaScriptCore/runtime/JSExportMacros.h:32:10: fatal error:
wtf/ExportMacros.h: No such file or directory
   32 | #include <wtf/ExportMacros.h>
      |          ^~~~~~~~~~~~~~~~~~~~
compilation terminated.
[1131/2911] Building CXX object Sourc...er/preprocessor/DiagnosticsBase.cpp.o
ninja: build stopped: subcommand failed.



> Debug build will have smaller debug info so the problem may go away.
>
> If it does not go away then tune the parallelism. Low -j makes the build
> needlessly slow during compilation phase while high -j (up to about #cpus
> + 2 or so) will make the final linking phase with debug info to run out of
> memory. This is why LLVM has separate "-j" for the linking phase but that is
> implemented only in LLVM CMakeLists.txt files:
> https://llvm.org/docs/CMake.html
> LLVM_PARALLEL_LINK_JOBS
> So that you leave the default -j high but set LLVM_PARALLEL_LINK_JOBS to 1 or 
> 2.
>
> Other options for faster build times are also LLVM specific:
> -DLLVM_USE_LINKER=gold (maybe also lld now?)
>  - as ld.gold or ld.lld are faster than ld.bfd
> -DLLVM_USE_SPLIT_DWARF=ON
>  - Linking phase no longer deals with the huge debug info
>
> Which should be applicable for other projects by something like (untested!):
> -DCMAKE_C_FLAGS="-gsplit-dwarf"
> -DCMAKE_CXX_FLAGS="-gsplit-dwarf"
> -DCMAKE_EXE_LINKER_FLAGS="-fuse-ld=gold -Wl,--gdb-index"
> -DCMAKE_SHARED_LINKER_FLAGS="-fuse-ld=gold -Wl,--gdb-index"
>
> (That gdb-index is useful if you are really going to debug it using GDB as
> I expect you are going to do when you want RelWithDebInfo and not Release; but
> then I would recommend Debug in such case anyway as debugging optimized code
> is very difficult.)
>
>
> > is there a practical way right now of enforcing CPU
> > and memory limits on unprivileged applications?
>
> $ help ulimit
>   -m  the maximum resident set size
>   -u  the maximum number of user processes
>   -v  the size of virtual memory
>
> One can also run it with 'nice -n19', 'ionice -c3'
> and/or "cgclassify -g '*':hammock" (config attached).

Thanks. I'll have to defer to others about how to incorporate this so
the default build is more intelligently taking actual resources into
account. My strong bias is that the user-developer can't be burdened
with knowing esoteric things. The defaults should just work.

Let's take another argument. If the user manually specifies 'ninja -j
64' on this same system, is that sabotage? I'd say it is. And
therefore why isn't it sabotage that the ninja default computes N jobs
as nrcpus + 2?  And also doesn't take available memory into account
when deciding what resources to demand? I can build linux all day long
on this system with its defaults and never run into a concurrent
usability problem.

There does seem to be a dual responsibility, somehow, between the
operating system and the application, to make sure sane requests are
made and honored.

> But after all I recommend just more memory, it is cheap nowadays and I find
> 64GB just about the right size.

That's an optimization. It can't be used as an excuse for an
unprivileged task taking down a system.


-- 
Chris Murphy


Re: Better interactivity in low-memory situations

2019-08-10 Thread Georg Sauthoff
On Fri, Aug 09, 2019 at 03:50:43PM -0600, Chris Murphy wrote:
[..]
> Problem and thesis statement:
> Certain workloads, such as building webkitGTK from source, results in
> heavy swap usage eventually leading to the system becoming totally
> unresponsive. Look into switching from disk based swap, to swap on a
> ZRAM device.
> 
> Summary of findings (restated, but basically the same as found at [2]):
> Test system, Macbook Pro, Intel Core i7-2820QM (4/8 cores), 8GiB RAM,
> Samsung SSD 840 EVO, Fedora Rawhide Workstation.
> Test case, build WebKitGTK from source.
[..]

To avoid such issues I disable swap on my machines. I really don't see
the point of having a swap partition if you have 16 or 32 GiB RAM. Even
with 8 GiB I disable swap.

With - say - 8 GiB the build of a large project might fail (e.g. llvm,
e.g. during linking) but it then fails fast and I can just restart it
with `ninja -j2` or something like that.

Another source of IO related unresponsiveness is buffer bloat - I thus
apply this configuration on my machines:

$ cat /etc/sysctl.d/01-disk-bufferbloat.conf
vm.dirty_background_bytes=107374182
vm.dirty_bytes=214748364
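
The two values look like one tenth of 1 GiB and one tenth of 2 GiB,
rounded down; that reading is inferred from the numbers only. The
first setting is the amount of dirty data at which background
writeback starts, the second the amount at which a writing process
is made to write back itself. A small sketch of the arithmetic, with
a hypothetical helper name:

```shell
# Hypothetical helper: one tenth of N GiB in bytes (integer division).
tenth_of_gib() {
    echo $(( $1 * 1024 * 1024 * 1024 / 10 ))
}
```

Here "tenth_of_gib 1" gives the background value and "tenth_of_gib 2"
the foreground one.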

Best regards
Georg
-- 
'Time your programs before making claims about efficiency'
  (Bjarne Stroustrup, The C++ Programming Language, 4th ed., p. 132, 2013)


Re: Better interactivity in low-memory situations

2019-08-10 Thread Jan Kratochvil
On Fri, 09 Aug 2019 23:50:43 +0200, Chris Murphy wrote:
> $ cmake -DPORT=GTK -DCMAKE_BUILD_TYPE=RelWithDebInfo -GNinja

RelWithDebInfo is an -O2 -g build.  That is not suitable for debugging; for
debugging you should use -DCMAKE_BUILD_TYPE=Debug (that is -g).
RelWithDebInfo is useful for final rpm packages, but those are built in Koji.

Debug build will have smaller debug info so the problem may go away.

If it does not go away then tune the parallelism. A low -j makes the build
needlessly slow during the compilation phase, while a high -j (up to about
#cpus + 2 or so) will make the final linking phase with debug info run out of
memory. This is why LLVM has a separate "-j" for the linking phase, but that
is implemented only in the LLVM CMakeLists.txt files:
https://llvm.org/docs/CMake.html
LLVM_PARALLEL_LINK_JOBS
So that you leave the default -j high but set LLVM_PARALLEL_LINK_JOBS to 1 or 2.
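
For projects other than LLVM, a similar effect may be reachable with
the Ninja generator's job pools. This is an untested sketch, assuming
CMake 3.11 or later (for the CMAKE_JOB_POOLS variable) and a project
that does not set its own pools. The helper and the pool name
"link_pool" are hypothetical, and the helper only prints the command
line.

```shell
# Hypothetical helper: print a cmake command line that caps concurrent
# link steps with a Ninja job pool, leaving compile parallelism alone.
cmake_link_pool_cmd() {
    link_jobs=$1
    # CMAKE_JOB_POOLS defines the pool and its size;
    # CMAKE_JOB_POOL_LINK assigns all link steps to that pool.
    echo "cmake -GNinja -DCMAKE_JOB_POOLS=link_pool=$link_jobs -DCMAKE_JOB_POOL_LINK=link_pool"
}
```

With "cmake_link_pool_cmd 2", ninja would still compile with its
default -j but run at most two link steps at a time.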

Other options for faster build times are also LLVM specific:
-DLLVM_USE_LINKER=gold (maybe also lld now?)
 - as ld.gold or ld.lld are faster than ld.bfd
-DLLVM_USE_SPLIT_DWARF=ON
 - Linking phase no longer deals with the huge debug info

Which should be applicable for other projects by something like (untested!):
-DCMAKE_C_FLAGS="-gsplit-dwarf"
-DCMAKE_CXX_FLAGS="-gsplit-dwarf"
-DCMAKE_EXE_LINKER_FLAGS="-fuse-ld=gold -Wl,--gdb-index"
-DCMAKE_SHARED_LINKER_FLAGS="-fuse-ld=gold -Wl,--gdb-index"

(That gdb-index is useful if you are really going to debug it using GDB as
I expect you are going to do when you want RelWithDebInfo and not Release; but
then I would recommend Debug in such case anyway as debugging optimized code
is very difficult.)


> is there a practical way right now of enforcing CPU
> and memory limits on unprivileged applications?

$ help ulimit
  -m  the maximum resident set size
  -u  the maximum number of user processes
  -v  the size of virtual memory

One can also run it with 'nice -n19', 'ionice -c3'
and/or "cgclassify -g '*':hammock" (config attached).

But after all I recommend just more memory, it is cheap nowadays and I find
64GB just about the right size.


Jan
mount {
    cpu     = /cgroup/cpu;
    memory  = /cgroup/memory;
    blkio   = /cgroup/blkio;
}

group hammock {
    perm {
        task {
            uid = jkratoch;
            gid = jkratoch;
        }
        admin {
            uid = jkratoch;
            gid = jkratoch;
        }
    }
    cpu {
        cpu.shares = 2;
    }
    memory {
        memory.limit_in_bytes = 2G;
    }
    blkio {
        blkio.weight = 100;
    }
}


Re: Better interactivity in low-memory situations

2019-08-09 Thread Chris Murphy
Just in case anyone wants to try to reproduce this particular example:

1. Grab latest stable from here and untar it
https://webkitgtk.org/releases/
2. Run this included script, which is dnf aware, to install dependencies
./Tools/gtk/install-dependencies
3. Additional packages I had to install to get it to build
sudo dnf install ruby-devel openjpeg2-devel woff2-devel


Re: Better interactivity in low-memory situations

2019-08-09 Thread Omair Majid

Hi,

Chris Murphy  writes:

> Certain workloads, such as building webkitGTK from source, results in
> heavy swap usage eventually leading to the system becoming totally
> unresponsive. Look into switching from disk based swap, to swap on a
> ZRAM device.

It sounds like the same issue that has been in the news recently:

- https://www.phoronix.com/scan.php?page=news_item&px=Linux-Does-Bad-Low-RAM
- https://news.ycombinator.com/item?id=20620545

Older sources with more information:

- https://lwn.net/Articles/759658/
- 
https://superuser.com/questions/1115983/prevent-system-freeze-unresponsiveness-due-to-swapping-run-away-memory-usage

(I learned about this bug the hard way; my machine experienced this bug
in the middle of a public presentation a few years ago.)

Regards,
Omair

--
PGP Key: B157A9F0 (http://pgp.mit.edu/)
Fingerprint = 9DB5 2F0B FD3E C239 E108  E7BD DF99 7AF8 B157 A9F0


Better interactivity in low-memory situations

2019-08-09 Thread Chris Murphy
This subject matches a Fedora Workstation Working Group issue of the
same name [1]. This post is intended to be an independent summary
of the findings so far, and a call for additional testing and
discussion, in particular from subject matter experts.

Problem and thesis statement:
Certain workloads, such as building WebKitGTK from source, result in
heavy swap usage, eventually leading to the system becoming totally
unresponsive. Look into switching from disk-based swap to swap on a
ZRAM device.

Summary of findings (restated, but basically the same as found at [2]):
Test system, Macbook Pro, Intel Core i7-2820QM (4/8 cores), 8GiB RAM,
Samsung SSD 840 EVO, Fedora Rawhide Workstation.
Test case, build WebKitGTK from source.

$ cmake -DPORT=GTK -DCMAKE_BUILD_TYPE=RelWithDebInfo -GNinja
$ ninja

Case 1: 8GiB swap on SSD plain partition (not encrypted, not on LVM)
Case 2: 8GiB swap on /dev/zram0
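
For anyone reproducing Case 2, here is a minimal sketch of one way to
put swap on /dev/zram0 using the kernel's sysfs interface. It is not
necessarily how the test system was configured. zram_swap_cmds is a
hypothetical helper that only prints the commands, so they can be
reviewed before being run as root. Any existing disk swap would need a
separate swapoff to make the zram device the only swap.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the root commands that create
# a swap device of the given size on /dev/zram0.
# Assumes the zram module is available and zram0 is not already in use.
zram_swap_cmds() {
    size=$1
    cat <<EOF
modprobe zram num_devices=1
echo $size > /sys/block/zram0/disksize
mkswap /dev/zram0
swapon /dev/zram0
EOF
}

# Print the commands for the 8GiB device used in Case 2.
zram_swap_cmds 8G
```

After review, the output could be piped to `sudo sh`; `swapon --show`
then lists the active swap devices.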

In each case, that swap device is exclusive; there are no other swap
devices. Within ~30 minutes in the first case, and ~10 minutes in the
second case, the GUI is completely unresponsive: the mouse pointer has
frozen and doesn't recover after more than 30 minutes of waiting.

By remote ssh, the first case is semi-responsive: updates should arrive
every 5 seconds but are instead received every 2-5 minutes, and
cancelling the build process did not bring about recovery after another
30 minutes. By remote ssh, the second case is totally unresponsive,
with no updates for 30 minutes.

I manually forced the system to power off at that point, in both cases.
The oom killer never triggered.

NOTE: ninja, by default on this system, sets N concurrent jobs to
nrcpus + 2, which is 10 on this system. If I reboot with nr_cpus=4,
ninja sets N jobs to 6.
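
The arithmetic in the note above can be sketched as follows.
default_jobs is a hypothetical helper, not part of ninja, and it only
models the "CPUs + 2" rule as stated here.

```shell
#!/bin/sh
# Hypothetical helper modelling the rule described above:
# default job count = number of CPUs + 2.
default_jobs() {
    ncpus=$1
    echo $((ncpus + 2))
}

# On the running machine; nproc (GNU coreutils) reports available CPUs.
echo "expected ninja default: -j $(default_jobs "$(nproc)")"
```

With 8 CPUs this gives 10, and with nr_cpus=4 it gives 6, matching the
numbers above.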

Case 3: 2GiB swap on /dev/zram0
In one test this resulted in a system hang (no pointer movement) within
5 minutes of executing ninja. Within another 6 minutes the oom killer
was invoked on a cc1plus process, which is fatal to the build: the
remaining build-related processes quit on their own, and the system
eventually recovered.

But in two subsequent tests in this same configuration, the oom killer
wasn't invoked, and the system alternated between being responsive for
~1 minute and totally frozen for 5-6 minutes, in a cycle lasting beyond
1 hour without ever triggering the oom killer.

Screenshot taken during one of the moments the remote ssh session updated
https://drive.google.com/open?id=1IDboR1fzP4onu_tzyZxsx7M5cT_RJ7Iz

The state had not changed 45 minutes after the above screenshot, so I
forced power off on that system. The point here is that this slightly
different configuration has some non-determinism to it, even though in
the end it's still a bad UX: the default, unprivileged build command
effectively takes down the system all the same.

Case 4: 8GiB swap on SSD plain partition, `ninja -j 4`
This is the same setup as Case 1, except I manually set N jobs to 4.
The build succeeds and, except for a few mouse pointer stutters, the
system remains responsive, even with Firefox running with multiple
tabs open and a YouTube video playing. This is exactly the experience
we'd like to see, although not all CPU resources are used for the
build. Clearly the limiting factor is that this particular package
requires more than ~14GiB to build successfully, and the system +
shell + Firefox just doesn't have that.
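
One way to pick a job count like the one in Case 4 without hardcoding
it is to cap jobs by RAM as well as by CPUs. This is only a sketch:
jobs_for_memory is a hypothetical helper, and the 2GiB-per-job budget
is an assumption for illustration, not a measurement from these tests.

```shell
#!/bin/sh
# Hypothetical helper: jobs_for_memory MEM_KIB PER_JOB_KIB NCPUS
# Returns min(NCPUS, MEM_KIB / PER_JOB_KIB), but never less than 1.
jobs_for_memory() {
    mem_kib=$1
    per_job_kib=$2
    ncpus=$3
    jobs=$((mem_kib / per_job_kib))
    if [ "$jobs" -lt 1 ]; then jobs=1; fi
    if [ "$jobs" -gt "$ncpus" ]; then jobs=$ncpus; fi
    echo "$jobs"
}

# On Linux, read MemTotal (in KiB) and suggest a -j value.
# The 2GiB-per-job budget below is an assumed figure.
if [ -r /proc/meminfo ]; then
    mem_kib=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
    echo "suggested: ninja -j $(jobs_for_memory "$mem_kib" $((2 * 1024 * 1024)) "$(nproc)")"
fi
```

Note that MemTotal on a machine with 8GiB installed is somewhat below
8GiB, so the integer division rounds down to fewer jobs than the
nominal figure would suggest.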

Starter questions:
- To what degree, and why, is this problem instigated by the build
  application (ninja in this example) or its supporting configuration
  files, including cmake? Or the kernel? Or the system configuration?
- Is it a straightforward problem, or is this actually somewhat
  nuanced, with multiple components in suboptimal configuration coming
  together as the cause?
- Is it expected that an unprivileged user can run a command whose
  defaults eventually lead to a totally unrecoverable system?
- From a security risk standpoint, the blame can't be entirely on the
  user or the application configuration, but how should application
  containment be enforced?
- Other than containerizing the build programs, is there a practical
  way right now of enforcing CPU and memory limits on unprivileged
  applications? Other alternatives?

At the very least, it seems like getting to the oom killer sooner would
result in a better experience: fail the process before the GUI becomes
unresponsive and hangs for 30+ minutes (possibly many hours).
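
As one possible answer to the limits question, systemd can run a
command in a transient scope with resource-control properties. This is
a sketch, not something tested here. It assumes cgroup v2 with the
memory and cpu controllers delegated to the user session, which may
not hold on the test system. limited_cmd is a hypothetical helper that
only prints the command line, and the 6G and 400% values are arbitrary
examples.

```shell
#!/bin/sh
# Hypothetical helper: limited_cmd MEM_MAX CPU_QUOTA CMD...
# Prints a systemd-run invocation that would run CMD in a user scope
# with a hard memory cap, no swap for that scope, and a CPU quota.
limited_cmd() {
    mem_max=$1
    cpu_quota=$2
    shift 2
    echo systemd-run --user --scope \
        -p "MemoryMax=$mem_max" \
        -p "MemorySwapMax=0" \
        -p "CPUQuota=$cpu_quota" \
        "$@"
}

# Example values only: 6G of memory and four CPUs' worth of time.
limited_cmd 6G 400% ninja
```

With MemorySwapMax=0 the scope cannot push its pages to swap, so when
it reaches MemoryMax the oom killer should act inside that scope
rather than the whole system thrashing first.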

[1]
https://pagure.io/fedora-workstation/issue/98
[2]
https://pagure.io/fedora-workstation/issue/98#comment-588713

Thanks,

-- 
Chris Murphy