Re: [PATCH 2/2] open_issues/gnumach_vm_map_entry_forward_merging.mdwn: edited one of sergey's emails into this wiki page.

2023-10-10 Thread jbranso
October 10, 2023 9:20 PM, jbra...@dismail.de wrote:

I edited an old email from Sergey to make this wiki edit.

I hope Sergey doesn't mind.

---
 .../gnumach_vm_map_entry_forward_merging.mdwn | 187 ++
 1 file changed, 187 insertions(+)

diff --git a/open_issues/gnumach_vm_map_entry_forward_merging.mdwn 
b/open_issues/gnumach_vm_map_entry_forward_merging.mdwn
index 7739f4d1..b34bd61e 100644
--- a/open_issues/gnumach_vm_map_entry_forward_merging.mdwn
+++ b/open_issues/gnumach_vm_map_entry_forward_merging.mdwn
@@ -10,6 +10,193 @@ License|/fdl]]."]]"""]]
 
 [[!tag open_issue_gnumach]]
 
+Mach is not always able to merge/coalesce mappings (VM entries) that
+are made next to each other, leading to potentially very large numbers
+of VM entries, which may slow down the VM functionality. This is said
+to particularly affect ext2fs and bash.
+
+The basic idea of Mach designers is that entry coalescing is only an
+optimization anyway, not a hard guarantee. We can apply it in the
+common simple case, and just refuse to do it in any remotely complex
+cases (copies, shadows, multiply referenced objects, pageout in
+progress, ...).
+
+One could write a special test program that intentionally maps parts
+of a file next to each other and watches the resulting VM map entries;
+alternatively, one can simply run a full Hurd system and observe the
+results.
+
+One can stress test ext2fs in particular to check for VM entry
+merging:
+
+ # grep NR -r /usr &> /dev/null
+ # vminfo 8 | wc -l
+
+That grep opens and reads lots of files to simulate a long-running
+machine (perhaps a build server); then one can look at the number of
+mappings in ext2fs afterwards. Depending on how much your /usr is
+populated, you will get different numbers. On an older Hurd from, say,
+2022, the above command would result in 5,000-20,000 entries depending
+on the machine! In June 2023, GNUMach gained some forward merging
+functionality, which lowered the number of mappings down to 93 entries!
+
+(It is a separate question why ext2fs makes that many mappings in
+the first place. There could possibly be a leak in ext2fs that is
+responsible for this, but none has been found so far. Another possible
+problem is that we have an unbounded node cache in libdiskfs, combined
+with Mach caching VM objects, which also keeps the nodes alive.)
+
+These are the simple forward merging cases that GNUMach now supports:
+
+- Forward merging: in `vm_map_enter`, merging with the next entry, in
+  addition to merging with the previous entry that was already there;
+
+- For forward merging, a `VM_OBJECT_NULL` can be merged in front of a
+  non-null VM object, provided the second entry has a large enough
+  offset into the object to 'mount' the first entry in front of
+  it;
+
+- A VM object can always be merged with itself (provided offsets/sizes
+  match) -- this allows merging entries referencing non-anonymous VM
+  objects too, such as file mappings;
+
+- Operations such as `vm_protect` do "clipping", which means splitting
+  up VM map entries, in case the specified region lands in the middle
+  of an entry -- but they were never "gluing" (merging, coalescing)
+  entries back together if the region is later vm_protect'ed back. Now
+  this is done (and we try to coalesce in some other cases too). This
+  should particularly help with "program break" (brk) in glibc, which
+  vm_protect's the pages allocated for the brk back and forth all the
+  time.
+
+- As another optimization, throw away unmapped physical pages when
+  there are no other references to the object (provided there is no
+  pager). Previously the pages would remain in core until the object
+  was either unmapped completely, or until another mapping was to be
+  created in place of the unmapped one and coalescing kicked in.
+
+- Also shrink the size of `struct vm_page` somewhat. This was a low
+  hanging fruit.
+
+`vm_map_coalesce_entry()` is analogous to `vm_map_simplify_entry()` in
+other versions of Mach, but different enough to warrant a different
+name. The same "coalesce" wording was used as in
+`vm_object_coalesce()`, which is appropriate given that the former is
+a wrapper for the latter.
+
+### The following clarifies some inaccuracies in old IRC logs:
+
+    any request, be it e.g. `mmap()`, or `mprotect()`, can easily split
+    entries
+
+`mmap ()` cannot split entries to my knowledge, unless we're talking about
+`MAP_FIXED` and unmapping parts of the existing mappings.
+
+    my ext2fs has ~6500 entries, but I guess this is related to
+    mapping blocks from the filesystem, right?
+
+No. Neither libdiskfs nor ext2fs ever map the store contents into memory
+(arguably maybe they should); they just read them with `store_read ()`,
+and then dispose of the read buffers properly. The excessive number
+of VM map entries, as far as I can see, is just heap memory.
+
+    (I'm perplexed about how the kernel can merge two memory objects if
+    distinct port names exist in the tasks' name space -- that's what
+    `mem_obj` is, right?)
+
+if, say, 584 and 585 above are port names which the task expects to be
+able to access