RE: [PATCH] mm/mmap: fix the adjusted length error
Thank you for your reply!

> How significant is this problem in real-world use cases? How much trouble is
> it causing?

In my opinion, this problem is rare in real-world use cases. In an arm64 or x86 environment the virtual address space is large enough. In an arm32 environment each process has only 3G or 4G or less, but we seldom use up all of the virtual memory; that only happens in some special environments. Those workloads use up almost all of the virtual memory, and at some point they change their working mode, so they release memory and allocate it again. The current length limitation then causes this problem. I explained that it is a memory length limitation, but the users can't accept that reason; it is unreasonable that we fail to allocate memory even though the memory gap is large enough.

> Have you looked further into this? Michel is concerned about the performance
> cost of the current solution.

The current algorithm (before my change) works well and has been used for a long time; I don't think it is worth changing the whole algorithm in order to fix this problem. Therefore, I just adjust the gap_start and gap_end values instead of the length. My change does affect performance, because I recalculate the gap_start and gap_end values again and again. Does it cost too much? I have no complex environment, so I can't measure it, but I don't think it will cause too much performance loss. First, I don't change the whole algorithm. Second, the unmapped_area and unmapped_area_topdown functions aren't used frequently. Maybe there are big performance problems I haven't considered, but if this is not going to be fixed, there should at least be a description of the limitation.

-----Original Message-----
From: Andrew Morton [mailto:a...@linux-foundation.org]
Sent: Friday, July 12, 2019 9:20 AM
To: chenjianhong (A)
Cc: Michel Lespinasse; Greg Kroah-Hartman; mho...@suse.com; Vlastimil Babka; Kirill A.
Shutemov; Yang Shi; ja...@google.com; steve.cap...@arm.com; tiny.win...@gmail.com; LKML; linux-mm; sta...@vger.kernel.org; wi...@infradead.org
Subject: Re: [PATCH] mm/mmap: fix the adjusted length error

On Sat, 18 May 2019 07:05:07 +0000 "chenjianhong (A)" wrote:

> I explain my test code and the problem in detail. This problem was
> found in a 32-bit user process, because its virtual address space is
> limited to 3G or 4G.
>
> First, I explain the bug I found. The functions unmapped_area and
> unmapped_area_topdown adjust the search length to account for the
> worst-case alignment overhead; the code is
> 'length = info->length + info->align_mask;'.
> The variable info->length is the length we want to allocate and the
> variable info->align_mask accounts for the alignment, because the
> gap_start or gap_end value also has to be an aligned address, but we
> can't know the alignment offset in advance. So the current algorithm
> uses the maximum alignment offset; this value may be zero or something
> else (0x1ff000 for the shmat function).
> Is that a reasonable approach? The required length is longer than what
> I actually allocate. What's more, why can I allocate the memory
> successfully via shmat the first time, but after releasing it via
> shmdt, attaching it again fails? This is not acceptable for many
> people.
>
> Second, I explain my test code, which I sent in an earlier email. The
> following are the steps. I don't think they are unusual or
> unreasonable, because the virtual memory space is large enough, yet
> the process cannot allocate from it. And we can't pass explicit
> addresses to mmap or shmat; the address is taken from the remaining
> vma gap.
> 1. we allocate a large amount of virtual memory;
> 2. we allocate hugepage memory via shmat, and release one of the
>    hugepage memory blocks;
> 3. we allocate hugepage memory via shmat again; this fails.

How significant is this problem in real-world use cases? How much trouble is it causing?

> Third, I want to introduce my change to the current algorithm. I don't
> change the current algorithm itself. Also, there may be a better way
> to fix this error. For now, I just adjust the gap_start value.

Have you looked further into this? Michel is concerned about the performance cost of the current solution.
RE: [PATCH] mm/mmap: fix the adjusted length error
hm[i] == (void *)-1) {
		fprintf(stderr, "funa shmat[%d] size(0x%08x) failed %d\n",
			i, seg_size[i], errno);
		return -1;
	}
	system("pid=`ps -e | grep memory | awk '{print $1}'`; cat /proc/$pid/maps");
	sleep(2);
	shmdt(shm[1]);
	printf("---fun_A shmdt---\n");
	system("pid=`ps -e | grep memory | awk '{print $1}'`; cat /proc/$pid/maps");
	printf("---fun_A ok---\n");
	return 0;
}

/*
 * first, we allocate a large amount of virtual memory;
 * second, we allocate hugepage memory via shmat, and release one
 * of the hugepage memory blocks;
 * third, we allocate hugepage memory via shmat again; this will fail.
 */
int main(int argc, char *argv[])
{
	int i;
	int ret = 0;

	for (i = 0; i < 52; i++)
		malloc(size);	/* first */

	if (init_memory() != 0) {
		ret = -1;
		goto failed_memory;
	}
	fun_C();	/* second */
	sleep(5);
	ret = fun_A();	/* third */
	if (ret != 0)
		goto failed_memory;
	sleep(3);

failed_memory:
	del_segmem();
	return ret;
}

-----Original Message-----
From: Michel Lespinasse [mailto:wal...@google.com]
Sent: Saturday, May 18, 2019 8:13 AM
To: chenjianhong (A)
Cc: Greg Kroah-Hartman; Andrew Morton; mho...@suse.com; Vlastimil Babka; Kirill A. Shutemov; Yang Shi; ja...@google.com; steve.cap...@arm.com; tiny.win...@gmail.com; LKML; linux-mm; sta...@vger.kernel.org
Subject: Re: [PATCH] mm/mmap: fix the adjusted length error

I worry that the proposed change turns the search from an O(log N) worst case into an O(N) one.

To see why the current search is O(log N), it is easiest to start by imagining a simplified search algorithm that wouldn't include the low and high address limits. In that algorithm, backtracking through the vma tree is never necessary - the tree walk can always know, prior to going left or right, whether a suitable gap will be found in the corresponding subtree.
The code we have today does have to respect the low and high address limits, so it does need to implement backtracking - but this backtracking only occurs to back out of subtrees that include the low address limit (the search went 'left' into a subtree that has a large enough gap, but the gap turns out to be below the limit, so it can't be used and the search needs to go 'right' instead). Because of this, the amount of backtracking that can occur is very limited, and the search is still O(log N) in the worst case.

With your proposed change, backtracking could occur not only around the low address limit, but also at any node within the search tree, when it turns out that a gap that seemed large enough actually isn't, due to alignment constraints. So, the code should still work, but it could backtrack more in the worst case, turning the worst-case search into an O(N) thing.

I am not sure what to do about this. First I would want to understand more about your test case; is this something that you stumbled upon without expecting it, or was it an artificially constructed case to show the limitations of the current search algorithm? Also, if your process does something unusual and expects to be able to map (close to) the entirety of its address space, would it be reasonable for it to manually manage the address space and pass explicit addresses to mmap / shmat?

On Thu, May 16, 2019 at 11:02 PM jianhong chen wrote:

> In linux version 4.4, a 32-bit process may fail to allocate 64M hugepage
> memory via the shmat function even though there is a 64M memory gap in
> the process.
>
> It is the adjusted length that causes the problem, introduced in
> commit db4fbfb9523c935 ("mm: vm_unmapped_area() lookup function").
> To account for the worst-case alignment overhead, the functions
> unmapped_area and unmapped_area_topdown adjust the search length
> before searching for an available vma gap. This is an estimated
> length, the sum of the desired length and the longest alignment
> offset, which can cause misjudgement if the system has very little
> virtual memory left. For example, if the longest memory gap available
> is 64M, we can't get it from the system by allocating 64M of hugepage
> memory via the shmat function. The reason is that shmat requires a
> longer length: the sum of the desired length (64M) and the longest
> alignment offset.
>
> To fix this error, we can calculate the alignment offset of gap_start
> or gap_end to get a desired gap_start or gap_end value before
> searching for the available gap. In this way, we don't need to adjust
> the search length.
>
> Problem reproduction procedure:
> 1. allocate a lot of virtual memory segments via shmat and malloc
> 2. release one of the biggest memory segments
RE: my test code and result///[PATCH] mm/mmap: fix the adjusted length error
00:0e 458756 /SYSV00ee (deleted)
ff9df000-ffa0 rw-p 00:00 0 [stack]
-1000 r-xp 00:00 0 [vectors]
---after fun_C shmdt---
8000-9000 r-xp 00:12 290 /tmp/memory_mmap
00011000-00012000 rw-p 1000 00:12 290 /tmp/memory_mmap
27589000-f75bd000 rw-p 00:00 0
f75bd000-f76e4000 r-xp 01:00 560 /lib/libc-2.11.1.so
f76e4000-f76ec000 ---p 00127000 01:00 560 /lib/libc-2.11.1.so
f76ec000-f76ee000 r--p 00127000 01:00 560 /lib/libc-2.11.1.so
f76ee000-f76ef000 rw-p 00129000 01:00 560 /lib/libc-2.11.1.so
f76ef000-f76f2000 rw-p 00:00 0
f76f2000-f7713000 r-xp 01:00 583 /lib/libgcc_s.so.1
f7713000-f771a000 ---p 00021000 01:00 583 /lib/libgcc_s.so.1
f771a000-f771b000 rw-p 0002 01:00 583 /lib/libgcc_s.so.1
f771b000-f7738000 r-xp 01:00 543 /lib/ld-2.11.1.so
f773c000-f773d000 rw-p 00:00 0
f773d000-f773f000 rw-p 00:00 0
f773f000-f774 r--p 0001c000 01:00 543 /lib/ld-2.11.1.so
f774-f7741000 rw-p 0001d000 01:00 543 /lib/ld-2.11.1.so
f780-f7a0 rw-s 00:0e 327680 /SYSV00ea (deleted)
fba0-fca0 rw-s 00:0e 393218 /SYSV00ec (deleted)
fca0-fce0 rw-s 00:0e 425987 /SYSV00ed (deleted)
fce0-fd80 rw-s 00:0e 458756 /SYSV00ee (deleted)
ff9df000-ffa0 rw-p 00:00 0 [stack]
-1000 r-xp 00:00 0 [vectors]
---fun_C ok---
---fun_A shmat---
funa shmat[1] size(0x0400) failed 12

-----Original Message-----
From: chenjianhong (A)
Sent: Friday, May 17, 2019 2:07 PM
To: gre...@linuxfoundation.org; a...@linux-foundation.org; mho...@suse.com; vba...@suse.cz; kirill.shute...@linux.intel.com; yang@linux.alibaba.com; ja...@google.com; steve.cap...@arm.com; tiny.win...@gmail.com; wal...@google.com
Cc: chenjianhong (A); linux-kernel@vger.kernel.org; linux...@kvack.org; sta...@vger.kernel.org
Subject: [PATCH] mm/mmap: fix the adjusted length error

In linux version 4.4, a 32-bit process may fail to allocate 64M hugepage memory via the shmat function even though there is a 64M memory gap in the process.

It is the adjusted length that causes the problem, introduced in commit db4fbfb9523c935 ("mm: vm_unmapped_area() lookup function").
To account for the worst-case alignment overhead, the functions unmapped_area and unmapped_area_topdown adjust the search length before searching for an available vma gap. This is an estimated length, the sum of the desired length and the longest alignment offset, which can cause misjudgement if the system has very little virtual memory left. For example, if the longest memory gap available is 64M, we can't get it from the system by allocating 64M of hugepage memory via the shmat function. The reason is that shmat requires a longer length: the sum of the desired length (64M) and the longest alignment offset.

To fix this error, we can calculate the alignment offset of gap_start or gap_end to get a desired gap_start or gap_end value before searching for the available gap. In this way, we don't need to adjust the search length.

Problem reproduction procedure:
1. allocate a lot of virtual memory segments via shmat and malloc
2. release one of the biggest memory segments via shmdt
3. attach the biggest memory segment via shmat

e.g.
process maps:
8000-9000 r-xp 00:12 3385 /tmp/memory_mmap
00011000-00012000 rw-p 1000 00:12 3385 /tmp/memory_mmap
27536000-f756a000 rw-p 00:00 0
f756a000-f7691000 r-xp 01:00 560 /lib/libc-2.11.1.so
f7691000-f7699000 ---p 00127000 01:00 560 /lib/libc-2.11.1.so
f7699000-f769b000 r--p 00127000 01:00 560 /lib/libc-2.11.1.so
f769b000-f769c000 rw-p 00129000 01:00 560 /lib/libc-2.11.1.so
f769c000-f769f000 rw-p 00:00 0
f769f000-f76c r-xp 01:00 583 /lib/libgcc_s.so.1
f76c-f76c7000 ---p 00021000 01:00 583 /lib/libgcc_s.so.1
f76c7000-f76c8000 rw-p 0002 01:00 583 /lib/libgcc_s.so.1
f76c8000-f76e5000 r-xp 01:00 543 /lib/ld-2.11.1.so
f76e9000-f76ea000 rw-p 00:00 0
f76ea000-f76ec000 rw-p 00:00 0
f76ec000-f76ed000 r--p 0001c000 01:00 543 /lib/ld-2.11.1.so
f76ed000-f76ee000 rw-p 0001d000 01:00 543 /lib/ld-2.11.1.so
f780-f7a0 rw-s 00:0e 0 /SYSV00ea (deleted)
fba0-fca0 rw-s 00:0e 65538 /SYSV00ec (deleted)
fca0-fce0 rw-s 00:0e 98307 /SYSV00ed (deleted)
fce0-fd80 rw-s 00:0e 131076 /SYSV00ee (deleted)
ff913000-ff934000 rw-p 00:00 0 [stack]
-1000 r-xp 00:00 0 [vectors]
from 0xf