Re: [PATCH v4] Improve the performance of --num-threads -d 31
Hi Minfei and Minoru, The bug has been fixed. -- Thanks Zhou On 03/09/2016 08:27 AM, Zhou Wenjian wrote: v4: 1. fix a bug caused by the logic v3: 1. remove some unused variables 2. fix a bug caused by the wrong logic 3. fix a bug caused by optimising 4. improve more performance by using Minoru Usui's code multi-threads implementation will introduce extra cost when handling each page. The origin implementation will also do the extra work for filtered pages. So there is a big performance degradation in --num-threads -d 31. The new implementation won't do the extra work for filtered pages any more. So the performance of -d 31 is close to that of serial processing. The new implementation is just like the following: * The basic idea is producer producing page and consumer writing page. * Each producer have a page_flag_buf list which is used for storing page's description. * The size of page_flag_buf is little so it won't take too much memory. * And all producers will share a page_data_buf array which is used for storing page's compressed data. * The main thread is the consumer. It will find the next pfn and write it into file. * The next pfn is smallest pfn in all page_flag_buf. Signed-off-by: Minoru UsuiSigned-off-by: Zhou Wenjian --- makedumpfile.c | 298 +++-- makedumpfile.h | 35 --- 2 files changed, 202 insertions(+), 131 deletions(-) diff --git a/makedumpfile.c b/makedumpfile.c index fa0b779..2b0864a 100644 --- a/makedumpfile.c +++ b/makedumpfile.c @@ -3483,7 +3483,8 @@ initial_for_parallel() unsigned long page_data_buf_size; unsigned long limit_size; int page_data_num; - int i; + struct page_flag *current; + int i, j; len_buf_out = calculate_len_buf_out(info->page_size); @@ -3560,10 +3561,16 @@ initial_for_parallel() limit_size = (get_free_memory_size() - MAP_REGION * info->num_threads) * 0.6; + if (limit_size < 0) { + MSG("Free memory is not enough for multi-threads\n"); + return FALSE; + } page_data_num = limit_size / page_data_buf_size; + info->num_buffers = 3 * info->num_threads; - info->num_buffers = MIN(NUM_BUFFERS, page_data_num); + info->num_buffers = MAX(info->num_buffers, NUM_BUFFERS); + info->num_buffers = MIN(info->num_buffers, page_data_num); DEBUG_MSG("Number of struct page_data for produce/consume: %d\n", info->num_buffers); @@ -3588,6 +3595,36 @@ initial_for_parallel() } /* +* initial page_flag for each thread +*/ + if ((info->page_flag_buf = malloc(sizeof(void *) * info->num_threads)) + == NULL) { + MSG("Can't allocate memory for page_flag_buf. %s\n", + strerror(errno)); + return FALSE; + } + memset(info->page_flag_buf, 0, sizeof(void *) * info->num_threads); + + for (i = 0; i < info->num_threads; i++) { + if ((info->page_flag_buf[i] = calloc(1, sizeof(struct page_flag))) == NULL) { + MSG("Can't allocate memory for page_flag. %s\n", + strerror(errno)); + return FALSE; + } + current = info->page_flag_buf[i]; + + for (j = 1; j < NUM_BUFFERS; j++) { + if ((current->next = calloc(1, sizeof(struct page_flag))) == NULL) { + MSG("Can't allocate memory for page_flag. %s\n", + strerror(errno)); + return FALSE; + } + current = current->next; + } + current->next = info->page_flag_buf[i]; + } + + /* * initial fd_memory for threads */ for (i = 0; i < info->num_threads; i++) { @@ -3612,7 +3649,8 @@ initial_for_parallel() void free_for_parallel() { - int i; + int i, j; + struct page_flag *current; if (info->threads != NULL) { for (i = 0; i < info->num_threads; i++) { @@ -3655,6 +3693,19 @@ free_for_parallel() free(info->page_data_buf); } + if (info->page_flag_buf != NULL) { + for (i = 0; i < info->num_threads; i++) { + for (j = 0; j < NUM_BUFFERS; j++) { + if (info->page_flag_buf[i] != NULL) { + current = info->page_flag_buf[i]; + info->page_flag_buf[i] = current->next; + free(current); + } +
[PATCH] makedumpfile: Fix several issues with reading ELF pages
While adopting the algorithm for libkdumpfile, several corner cases were found by a test case: 1. If the last part of a page is not present in the ELF file, it should be replaced with zeroes. However, the check is incorrect. 2. If the beginning of a page is not present, following data is read from an incorrect file offset. 3. If the page is split among several segments, the current position in the buffer is not always taken into account. Case 1 is a simple typo/braino (writing bufptr instead of endp). To fix cases 2 and 3, it is best to update the paddr variable so that it always corresponds to the current read position in the buffer. I have tested the new code with a specially crafted ELF dump where one page had the following (artificial) layout: # PAGE RANGESTORED IN DATA -- 1 0x000-0x007 nowhere fake zero 2 0x008-0x067 LOAD #1 read from file 3 0x068-0x06f nowhere fake zero 4 0x070-0x13f LOAD #2 read from file 5 0x140-0x147 LOAD #2 zero (memsz > filesz) 6 0x148-0xff7 LOAD #3 read from file 7 0xff8-0xfff nowhere fake zero Case 1 tests the conditional right after get_pt_load_extents(). Case 2 tests file read after missing data. Case 3 tests the conditional from case 1 with non-zero buffer position. Case 5 tests the last conditional in the loop (p < endp). Case 6 tests exact match in get_pt_load_extents(). Case 7 tests the final conditional after the loop. Signed-off-by: Petr Tesarik--- makedumpfile.c | 18 -- 1 file changed, 12 insertions(+), 6 deletions(-) --- a/makedumpfile.c +++ b/makedumpfile.c @@ -656,14 +656,15 @@ readpage_elf(unsigned long long paddr, v p = bufptr; endp = p + info->page_size; while (p < endp) { - idx = closest_pt_load(paddr + (p - bufptr), endp - p); + idx = closest_pt_load(paddr, endp - p); if (idx < 0) break; get_pt_load_extents(idx, _start, _end, , ); - if (phys_start > paddr + (p - bufptr)) { + if (phys_start > paddr) { memset(p, 0, phys_start - paddr); p += phys_start - paddr; + paddr = phys_start; } offset += paddr - phys_start; @@ -677,6 +678,7 @@ readpage_elf(unsigned long long paddr, v return FALSE; } p += size; + paddr += size; } if (p < endp) { size = phys_end - paddr; @@ -684,6 +686,7 @@ readpage_elf(unsigned long long paddr, v size = endp - p; memset(p, 0, size); p += size; + paddr += size; } } @@ -691,7 +694,7 @@ readpage_elf(unsigned long long paddr, v ERRMSG("Attempt to read non-existent page at 0x%llx.\n", paddr); return FALSE; - } else if (p < bufptr) + } else if (p < endp) memset(p, 0, endp - p); return TRUE; @@ -708,14 +711,15 @@ readpage_elf_parallel(int fd_memory, uns p = bufptr; endp = p + info->page_size; while (p < endp) { - idx = closest_pt_load(paddr + (p - bufptr), endp - p); + idx = closest_pt_load(paddr, endp - p); if (idx < 0) break; get_pt_load_extents(idx, _start, _end, , ); - if (phys_start > paddr + (p - bufptr)) { + if (phys_start > paddr) { memset(p, 0, phys_start - paddr); p += phys_start - paddr; + paddr = phys_start; } offset += paddr - phys_start; @@ -730,6 +734,7 @@ readpage_elf_parallel(int fd_memory, uns return FALSE; } p += size; + paddr += size; } if (p < endp) { size = phys_end - paddr; @@ -737,6 +742,7 @@ readpage_elf_parallel(int fd_memory, uns size = endp - p; memset(p, 0, size); p += size; + paddr += size; } } @@ -744,7 +750,7 @@ readpage_elf_parallel(int fd_memory, uns ERRMSG("Attempt to read non-existent page at 0x%llx.\n", paddr); return FALSE; - } else if (p < bufptr) + } else if (p < endp) memset(p, 0, endp - p); return TRUE; ___ kexec mailing list