Greg Price <[email protected]> added the comment:
> About the RSS memory, I'm not sure how Linux accounts the Unicode databases
> before they are accessed. Is it like read-only memory loaded on demand when
> accessed?
RSS stands for "resident set size", as in "resident in memory"; it only
counts pages of real physical memory. The intention is to count up pages that
the process is somehow using.
Where the definition potentially gets fuzzy is if this process and another are
sharing some memory. I don't know much about how that kind of edge case is
handled. But one thing I think it's pretty consistently good at is not
counting pages that you've nominally mapped from a file, but haven't actually
forced to be loaded physically into memory by actually looking at them.
That is: say you ask for a file (or some range of it) to be mapped into memory
for you. This means it's now there in the address space, and if the process
does a load instruction from any of those addresses, the kernel will ensure the
load instruction works seamlessly. But: most of it won't be eagerly read from
disk or loaded physically into RAM. Rather, the kernel's counting on that load
instruction causing a page fault; and its page-fault handler will take care of
reading from the disk and sticking the data physically into RAM. So until you
actually execute some loads from those addresses, the data in that mapping
doesn't contribute to the genuine demand for scarce physical RAM on the
machine; and it also isn't counted in the RSS number.
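The RSS number can also be read straight from /proc without spawning grep. A minimal sketch, assuming Linux with /proc mounted; the helper name read_rss_fields is made up for illustration, and the RssAnon/RssFile/RssShmem lines only show up on newer kernels (4.5 and later, as far as I know), so the result may contain just VmRSS:

```python
def read_rss_fields(pid="self"):
    """Return {field: kiB} for the RSS-related lines of /proc/<pid>/status.

    Hypothetical helper for illustration; returns {} where /proc is absent.
    """
    wanted = ("VmRSS", "RssAnon", "RssFile", "RssShmem")
    fields = {}
    try:
        with open(f"/proc/{pid}/status") as status:
            for line in status:
                name, _, rest = line.partition(":")
                if name in wanted:
                    # Lines look like "VmRSS:     9968 kB".
                    fields[name] = int(rest.split()[0])
    except OSError:
        pass
    return fields


if __name__ == "__main__":
    print(read_rss_fields())
```

Where the breakdown is available, RssFile is the part contributed by file-backed mappings, which is the part a mapped-but-untouched file should leave alone.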
Here's a demo! This 262392 kiB (269 MB) Git packfile is the biggest file lying
around in my CPython directory:
$ du -k .git/objects/pack/pack-0e4acf3b2d8c21849bb11d875bc14b4d62dc7ab1.pack
262392 .git/objects/pack/pack-0e4acf3b2d8c21849bb11d875bc14b4d62dc7ab1.pack
Open it for read -- adds 100 kiB, not sure why:
$ python
Python 3.7.3 (default, Apr 3 2019, 05:39:12)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import os, mmap
>>> os.system(f"grep ^VmRSS /proc/{os.getpid()}/status")
VmRSS: 9968 kB
>>> fd = os.open('.git/objects/pack/pack-0e4acf3b2d8c21849bb11d875bc14b4d62dc7ab1.pack',
...              os.O_RDONLY)
>>> os.system(f"grep ^VmRSS /proc/{os.getpid()}/status")
VmRSS: 10068 kB
Map it into our address space -- RSS doesn't budge:
>>> m = mmap.mmap(fd, 0, prot=mmap.PROT_READ)
>>> m
<mmap.mmap object at 0x7f185b5379c0>
>>> len(m)
268684419
>>> os.system(f"grep ^VmRSS /proc/{os.getpid()}/status")
VmRSS: 10068 kB
Cause the process to actually look at all the data (this takes about 10s,
too)...
>>> sum(len(l) for l in m)
268684419
>>> os.system(f"grep ^VmRSS /proc/{os.getpid()}/status")
VmRSS: 271576 kB
RSS goes way up, by 261508 kiB! Oddly, that's slightly less (by ~1 MB) than the
file's size.
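The arithmetic behind those figures, spelled out as a quick check. This only restates the numbers from the transcript above; the 4 kiB page size is an assumption (the usual value on x86-64 Linux), and it says nothing about where the missing pages went:

```python
PAGE_KIB = 4  # assumption: 4 kiB pages, the usual size on x86-64 Linux

rss_before_kib = 10068   # VmRSS after mmap, before reading
rss_after_kib = 271576   # VmRSS after reading every line
file_bytes = 268684419   # len(m)
du_kib = 262392          # size reported by du -k

rss_growth_kib = rss_after_kib - rss_before_kib
# Round the byte length up to whole pages, since RSS is counted in pages.
file_pages = -(-file_bytes // (PAGE_KIB * 1024))
mapped_kib = file_pages * PAGE_KIB

print("RSS growth:      ", rss_growth_kib, "kiB")
print("mapping size:    ", mapped_kib, "kiB")
print("shortfall vs du: ", du_kib - rss_growth_kib, "kiB")
print("shortfall vs map:", mapped_kib - rss_growth_kib, "kiB")
```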
But wait, there's more. Drop that mapping, and RSS goes right back down (OK,
keeps 8 kiB extra):
>>> del m
>>> os.system(f"grep ^VmRSS /proc/{os.getpid()}/status")
VmRSS: 10076 kB
... and then map the exact same file again, and it's *still* down:
>>> m = mmap.mmap(fd, 0, prot=mmap.PROT_READ)
>>> os.system(f"grep ^VmRSS /proc/{os.getpid()}/status")
VmRSS: 10076 kB
This last step is interesting because it's a certainty that the data is still
physically in memory -- this is my desktop, with plenty of free RAM. And it's
even in our address space. But because we haven't actually loaded from those
addresses, it's still in memory only at the kernel's caching whim, and so
apparently our process doesn't get "charged" or "blamed" for its presence there.
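For anyone who wants to repeat the experiment without a big packfile handy, here is a rough sketch of the same steps as a script. It assumes Linux with /proc mounted; the names vm_rss_kib and rss_demo and the 16 MiB scratch file are my own choices for illustration, and the exact readings will vary by machine and kernel:

```python
import mmap
import tempfile


def vm_rss_kib():
    """VmRSS of this process in kiB, or None where /proc is unavailable."""
    try:
        with open("/proc/self/status") as status:
            for line in status:
                if line.startswith("VmRSS:"):
                    return int(line.split()[1])
    except OSError:
        pass
    return None


def rss_demo(size=16 * 1024 * 1024):
    """Map a scratch file, touch it, unmap, remap; record VmRSS each step."""
    if vm_rss_kib() is None:
        return None
    readings = {}
    chunk = b"\0" * 65536
    with tempfile.TemporaryFile() as f:
        # Write in small chunks so the file data never sits in one big buffer.
        for _ in range(size // len(chunk)):
            f.write(chunk)
        f.flush()
        readings["before"] = vm_rss_kib()
        m = mmap.mmap(f.fileno(), 0, prot=mmap.PROT_READ)
        readings["mapped"] = vm_rss_kib()
        # Load one byte from every page so each page gets faulted in.
        for offset in range(0, size, mmap.PAGESIZE):
            m[offset]
        readings["touched"] = vm_rss_kib()
        m.close()
        readings["unmapped"] = vm_rss_kib()
        m = mmap.mmap(f.fileno(), 0, prot=mmap.PROT_READ)
        readings["remapped"] = vm_rss_kib()
        m.close()
    return readings


if __name__ == "__main__":
    print(rss_demo())
```

Touching one byte per page is enough to fault every page in, which is much quicker than iterating over the mapping line by line.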
In the case of running an executable with a bunch of data in it, I expect that
the bulk of the data (and of the code for that matter) winds up treated very
much like the file contents we mmap'd in. It's mapped but not eagerly
physically loaded; so it doesn't contribute to the RSS number, nor to the
genuine demand for scarce physical RAM on the machine.
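One way to check that expectation is to compare how much of the running executable is mapped with how much of it is resident. A minimal sketch, assuming Linux and the usual /proc/self/smaps layout (a header line per mapping followed by "Size:" and "Rss:" lines in kB); mapped_vs_resident is a hypothetical helper, and on builds where most of the interpreter lives in a shared libpython the executable itself will be small:

```python
import os
import re
import sys

# A mapping header starts with an address range like "55d0a1c00000-55d0a1e00000 ".
_HEADER = re.compile(r"^[0-9a-f]+-[0-9a-f]+\s")


def mapped_vs_resident(path):
    """Sum Size and Rss (kiB) over this process's mappings of `path`.

    Returns None if /proc/self/smaps cannot be read.
    """
    size_kib = rss_kib = 0
    in_target = False
    try:
        with open("/proc/self/smaps") as smaps:
            for line in smaps:
                if _HEADER.match(line):
                    # Header: address range, perms, offset, dev, inode, pathname
                    parts = line.split(None, 5)
                    in_target = len(parts) == 6 and parts[5].strip() == path
                elif in_target and line.startswith("Size:"):
                    size_kib += int(line.split()[1])
                elif in_target and line.startswith("Rss:"):
                    rss_kib += int(line.split()[1])
    except OSError:
        return None
    return size_kib, rss_kib


if __name__ == "__main__":
    print(mapped_vs_resident(os.path.realpath(sys.executable)))
```

The first number is what's mapped from the file and the second is what's actually resident, so any gap between them is data or code that hasn't been touched yet.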
That's a bit long :-), but hopefully informative. In short, I think for us RSS
should work well as a pretty faithful measure of the real memory consumption
that we want to be frugal with.
----------
_______________________________________
Python tracker <[email protected]>
<https://bugs.python.org/issue32771>
_______________________________________