Re: [PATCH 2/2] open_issues/gnumach_vm_map_entry_forward_merging.mdwn: edited one of sergey's emails into this wiki page.
October 10, 2023 9:20 PM, jbra...@dismail.de wrote:

I edited an old email from Sergey to make this wiki edit. I hope
Sergey doesn't mind.

---
 .../gnumach_vm_map_entry_forward_merging.mdwn | 187 ++
 1 file changed, 187 insertions(+)

diff --git a/open_issues/gnumach_vm_map_entry_forward_merging.mdwn b/open_issues/gnumach_vm_map_entry_forward_merging.mdwn
index 7739f4d1..b34bd61e 100644
--- a/open_issues/gnumach_vm_map_entry_forward_merging.mdwn
+++ b/open_issues/gnumach_vm_map_entry_forward_merging.mdwn
@@ -10,6 +10,193 @@ License|/fdl]]."]]"""]]

[[!tag open_issue_gnumach]]

Mach is not always able to merge/coalesce mappings (VM entries) that
are made next to each other, leading to potentially very large numbers
of VM entries, which may slow down the VM functionality. This is said
to particularly affect ext2fs and bash.

The basic idea of the Mach designers is that entry coalescing is only
an optimization, not a hard guarantee: apply it in the common simple
case, and simply refuse to do it in any remotely complex case (copies,
shadows, multiply referenced objects, pageout in progress, ...).

One way to observe this is with a dedicated test program that
intentionally maps parts of a file next to each other and watches the
resulting VM map entries; another is simply to run a full Hurd system
and observe the results.

One can stress test ext2fs in particular to check for VM entry
merging:

    # grep NR -r /usr &> /dev/null
    # vminfo 8 | wc -l

That grep opens and reads lots of files to simulate a long-running
machine (perhaps a build server); afterwards, one can look at the
number of mappings in ext2fs. Depending on how populated your /usr is,
you will get different numbers. On an older Hurd from, say, 2022, the
above command would result in 5,000-20,000 entries depending on the
machine!
In June 2023, GNU Mach gained some forward merging functionality,
which lowered the number of mappings down to 93 entries!

(It is a separate question why ext2fs makes that many mappings in the
first place. There could possibly be a leak in ext2fs responsible for
this, but none has been found so far. Another possible problem is the
unbounded node cache in libdiskfs, combined with Mach caching VM
objects, which also keeps the nodes alive.)

These are the simple forward merging cases that GNU Mach now supports:

- Forward merging: in `vm_map_enter`, merging with the next entry, in
  addition to merging with the previous entry as was already done;

- For forward merging, a `VM_OBJECT_NULL` can be merged in front of a
  non-null VM object, provided the second entry has a large enough
  offset into the object to 'mount' the first entry in front of it;

- A VM object can always be merged with itself (provided offsets and
  sizes match) -- this allows merging entries referencing
  non-anonymous VM objects too, such as file mappings;

- Operations such as `vm_protect` do "clipping", which means splitting
  up VM map entries when the specified region lands in the middle of
  an entry -- but they never did any "gluing" (merging, coalescing) of
  entries back together when the region was later vm_protect'ed back.
  Now this is done (and we try to coalesce in some other cases too).
  This should particularly help with the "program break" (brk) in
  glibc, which vm_protect's the pages allocated for the brk back and
  forth all the time.

- As another optimization, unmapped physical pages are now thrown away
  when there are no other references to the object (provided there is
  no pager). Previously the pages would remain in core until the
  object was either unmapped completely, or until another mapping was
  created in place of the unmapped one and coalescing kicked in.
- Also, the size of `struct vm_page` was shrunk somewhat. This was
  low-hanging fruit.

`vm_map_coalesce_entry()` is analogous to `vm_map_simplify_entry()` in
other versions of Mach, but different enough to warrant a different
name. The same "coalesce" wording was used as in
`vm_object_coalesce()`, which is appropriate given that the former is
a wrapper for the latter.

### The following clarifies some inaccuracies in old IRC logs:

    any request, be it e.g. `mmap()`, or `mprotect()`, can easily split
    entries

`mmap ()` cannot split entries to my knowledge, unless we're talking
about `MAP_FIXED` and unmapping parts of the existing mappings.

    my ext2fs has ~6500 entries, but I guess this is related to
    mapping blocks from the filesystem, right?

No. Neither libdiskfs nor ext2fs ever map the store contents into
memory (arguably maybe they should); they just read them with
`store_read ()`, and then dispose of the read buffers properly. The
excessive number of VM map entries, as far as I can
see, is just heap memory.

    (I'm perplexed about how the kernel can merge two memory objects
    if distinct port names exist in the tasks' name space -- that's
    what `mem_obj` is, right?)

if, say, 584 and 585 above are port names which the task expects to be
able to access