(2013/04/23 20:45), Cliff Wickman wrote:
On Tue, Apr 23, 2013 at 09:38:57AM +0900, HATAYAMA Daisuke wrote:
(2013/04/23 2:55), Cliff Wickman wrote:
Hello Mr. Hatayama and Mr. Kumagai,

I have been playing with the v4 patches
       kdump, vmcore: support mmap() on /proc/vmcore
and find the mmap interface to /proc/vmcore potentially about 80x faster than
the read interface.

But in practice (using a makedumpfile that uses mmap instead of read) I find
it about 10x slower.

It looks like makedumpfile's usage of the interface is very inefficient.
It will mmap an area, read a page, then back up the offset to a previous
page.  It has to munmap and mmap on virtually every read.
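
Roughly the pattern I see, as an illustrative sketch (this is not the actual
makedumpfile code; the window handling is simplified and the offset is
assumed page-aligned):

===
/* Illustrative sketch of the pattern described above.  The window begins
 * at whatever offset was requested, so reading page x and then page x - 1
 * lands below the window and forces a munmap() + mmap() again. */
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>

#define PAGE_SZ 4096UL

static char  *win;                          /* current mmap window        */
static off_t  win_start, win_end;
static size_t win_size = 4 * 1024 * 1024;   /* e.g. --map-size 4096 (KB)  */

static int read_page(int fd, off_t offset, void *buf)
{
	if (win == NULL || offset < win_start ||
	    offset + (off_t)PAGE_SZ > win_end) {
		if (win != NULL)
			munmap(win, win_size);
		win_start = offset;         /* window starts at the request */
		win = mmap(NULL, win_size, PROT_READ, MAP_PRIVATE, fd, win_start);
		if (win == MAP_FAILED) {
			win = NULL;
			return -1;
		}
		win_end = win_start + win_size;
	}
	memcpy(buf, win + (offset - win_start), PAGE_SZ);
	return 0;
}
===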

You can change the size of the mapped window with the command-line option
--map-size <size in KB>.
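
For example, something like this (4096 is only an illustrative value, in KB;
combine it with whatever other options you normally pass):

    makedumpfile --map-size 4096 -d 31 /proc/vmcore dumpfile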

This version of makedumpfile is experimental. The design should be
changed if it turns out to be problematic.

Yes, I'm using --map-size <some KB>, but the bigger I make the mapping
size, the worse makedumpfile performs.  The typical pattern is to map and
read page x, then map and read page x - 1, so every read has to unmap
and remap.  The bigger the mapping, the slower it goes.

Do you have a re-worked makedumpfile that predicts a large range of
pages and mmaps the whole range just once?
It seems that makedumpfile should have the information available to do
that.
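
Roughly what I have in mind, as an illustrative sketch (it assumes the
region bounds, e.g. taken from the PT_LOAD headers of /proc/vmcore, are
known in advance and page-aligned):

===
/* Illustrative only: map one whole region of old-kernel memory in a single
 * mmap() and walk it, instead of remapping around every page read. */
#include <sys/mman.h>
#include <sys/types.h>

static int scan_region(int fd, off_t region_off, size_t region_len)
{
	char  *base;
	size_t off;

	base = mmap(NULL, region_len, PROT_READ, MAP_PRIVATE, fd, region_off);
	if (base == MAP_FAILED)
		return -1;

	for (off = 0; off < region_len; off += 4096) {
		/* examine the page at base + off (filter it, compress it, ...) */
	}

	munmap(base, region_len);
	return 0;
}
===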


The benchmark results have already shown that, with a large enough map size,
the current implementation performs as well as a kernel-space
implementation that maps the whole range of memory at once.

I must be missing some part of that benchmark.  I see that the interface
is much faster, but makedumpfile itself is much slower in my benchmarks
when using mmap.
Can you point me to the makedumpfile source that you are using?


I used the mmap branch at

git://git.code.sf.net/p/makedumpfile/code

with the following patch applied:

===
diff --git a/makedumpfile.c b/makedumpfile.c
index 7acbf72..9dc6aee 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -290,8 +290,10 @@ read_with_mmap(off_t offset, void *bufptr, unsigned long size) {

 next_region:

-       if (!is_mapped_with_mmap(offset))
-               update_mmap_range(offset);
+       if (!is_mapped_with_mmap(offset)) {
+               if (!update_mmap_range(offset))
+                       return FALSE;
+       }

        read_size = MIN(info->mmap_end_offset - offset, size);
===
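
For context, the window update that this hunk guards amounts to roughly the
following (a simplified sketch; apart from mmap_end_offset, which appears in
the hunk above, the field names are illustrative rather than the exact ones
in the mmap branch):

===
/* Simplified sketch, not the exact mmap-branch code: drop the old window
 * and map a new map-size-sized window that covers offset. */
static int update_mmap_range(off_t offset)
{
	off_t start = offset - (offset % info->mmap_region_size);

	if (info->mmap_buf != NULL)
		munmap(info->mmap_buf,
		       info->mmap_end_offset - info->mmap_start_offset);

	info->mmap_buf = mmap(NULL, info->mmap_region_size, PROT_READ,
			      MAP_PRIVATE, info->fd_memory, start);
	if (info->mmap_buf == MAP_FAILED) {
		info->mmap_buf = NULL;
		return FALSE;	/* callers must check this -- hence the patch */
	}

	info->mmap_start_offset = start;
	info->mmap_end_offset   = start + info->mmap_region_size;
	return TRUE;
}
===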

In addition, the current implementation of remap_pfn_range() uses 4KB
pages only.  This means the page tables alone amount to 2GB per 1TB of
mapped memory (1TB / 4KB is 256M PTEs, at 8 bytes each), so it's better
to map the memory a little at a time to keep the memory footprint small.

Agreed, we need a way to map with 2M pages.  And I am not suggesting that
you map all of the old kernel memory at once.  Just one region of page
structures at a time.

Ideally so, but the benchmark showed good performance even with the current
implementation, so I now think that modifying remap_pfn_range() is not
strictly necessary.

--
Thanks.
HATAYAMA, Daisuke


_______________________________________________
kexec mailing list
[email protected]
http://lists.infradead.org/mailman/listinfo/kexec
