On 12/01/2015 04:39 PM, Chao Fan wrote:
Hi Zhou Wenjian,
I ran some tests based on your tables and hit a problem when I set
dump_level to 31. The machine has 1T of memory, and with dump_level set
to 31, the size of the vmcore is 17G. The kernel is 3.10.0-327.el7.x86_64.
The kexec-tools version is kexec-tools-2.0.7-38.el7.x86_64.
If I use
core_collector time makedumpfile -l --message-level 1 -d 31
in kdump based on makedumpfile 1.5.7, the time is
63 seconds (the average of many tests).
Then I used kdump based on makedumpfile 1.5.9.
core_collector time makedumpfile -l --message-level 1 -d 31
the time is 58 seconds.
core_collector time makedumpfile --num-threads 1 -l --message-level 1 -d 31
the time is 240 seconds.
core_collector time makedumpfile --num-threads 2 -l --message-level 1 -d 31
the time is 189 seconds.
core_collector time makedumpfile --num-threads 4 -l --message-level 1 -d 31
the time is 220 seconds.
core_collector time makedumpfile --num-threads 8 -l --message-level 1 -d 31
the time is 417 seconds.
core_collector time makedumpfile --num-threads 12 -l --message-level 1 -d 31
the time is 579 seconds.
core_collector time makedumpfile --num-threads 16 -l --message-level 1 -d 31
the time is 756 seconds.
So I do not understand why makedumpfile takes more time with
"--num-threads" than without it. Your tables show the same thing: with
makedumpfile -d 31, makedumpfile is fastest when threads_num is 0.
If there is any problem with my tests, please tell me.
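To put the makedumpfile 1.5.9 timings above side by side, here is a
small sketch that expresses each "--num-threads" run as a multiple of
the 58-second run without the option. The numbers are the ones reported
in this message; the variable and function names are only illustrative.

```python
# Timings in seconds reported above for makedumpfile 1.5.9 with "-l -d 31".
BASELINE_SECONDS = 58  # run without --num-threads
threaded_seconds = {1: 240, 2: 189, 4: 220, 8: 417, 12: 579, 16: 756}


def slowdown(seconds, baseline=BASELINE_SECONDS):
    """Return how many times longer a run took than the baseline run."""
    return seconds / baseline


slowdowns = {n: slowdown(t) for n, t in threaded_seconds.items()}

# Thread count with the shortest time among the threaded runs.
best_threads = min(threaded_seconds, key=threaded_seconds.get)

for n, ratio in slowdowns.items():
    print(f"--num-threads {n:>2}: {threaded_seconds[n]:>3} s "
          f"({ratio:.1f}x the baseline)")
```

Even the best threaded run (2 threads) takes more than three times as
long as the run without "--num-threads".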
Hello,
I think there is no problem, as long as the other test results are as expected.
--num-threads mainly reduces the time spent compressing,
so with lzo it does not help much most of the time.
When "-d 31" is specified, however, it gets worse.
Fewer than 50 buffers are used to cache compressed pages,
and a page takes a buffer even if it has been filtered out.
So with "-d 31", the filtered pages occupy a lot of the
buffers, and the pages that do need compressing cannot be
compressed in parallel.
So it is not strange that "--num-threads" takes more time
with "-l -d 31".
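The buffer argument above can be illustrated with a toy model. This is
not makedumpfile's code: it only assumes that every page, filtered or
not, occupies one slot in a fixed window of buffers, and counts how many
pages per window actually need compressing. The window size of 50 and
both page maps are hypothetical.

```python
def mean_work_per_window(needs_compression, window_size):
    """Average number of pages that need compressing in each run of
    `window_size` consecutive pages (one buffer per page, filtered or not)."""
    windows = [needs_compression[i:i + window_size]
               for i in range(0, len(needs_compression), window_size)]
    return sum(sum(w) for w in windows) / len(windows)


WINDOW = 50  # hypothetical; the message only says "fewer than 50 buffers"

# Hypothetical page maps: True = page must be compressed, False = filtered.
unfiltered = [True] * 10_000                            # every page is kept
mostly_filtered = [i % 50 == 0 for i in range(10_000)]  # 1 page in 50 is kept

work_unfiltered = mean_work_per_window(unfiltered, WINDOW)
work_filtered = mean_work_per_window(mostly_filtered, WINDOW)

print(work_unfiltered)  # pages available to compress per window, all kept
print(work_filtered)    # pages available to compress per window, mostly filtered
```

In this model, when most pages are filtered there is only about one
page to compress per window, so extra threads have nothing to work on
while they still pay the coordination cost.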
--
Thanks
Zhou
Thanks,
Chao Fan
----- Original Message -----
From: "Wenjian Zhou/周文剑" <[email protected]>
To: [email protected]
Sent: Monday, June 8, 2015 11:55:41 AM
Subject: Re: [PATCH RFC 00/11] makedumpfile: parallel processing
Hello all,
I tested this patch set on two machines; the benchmark results follow.
The tables show the time that makedumpfile takes, in seconds.
"core-data" in the tables describes the content of the vmcore.
For example, a core-data value of 256 means that, in the vmcore,
256 * 8 bits of each page are set to 1.
"-l" in the tables means producing an lzo format vmcore.
"-c" in the tables means producing a kdump-compressed format vmcore.
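As an illustration of the "core-data" definition above, the following
sketch builds one page with core-data * 8 bits set to 1. It assumes a
4096-byte page (the core-data values in the tables run from 0 to 4096)
and assumes the set bits sit at the start of the page; the message does
not say where in the page they are placed.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


def make_page(core_data, page_size=PAGE_SIZE):
    """Return a page whose first `core_data` bytes are 0xFF, that is,
    core_data * 8 bits set to 1. The placement is an assumption."""
    if not 0 <= core_data <= page_size:
        raise ValueError("core_data must be between 0 and the page size")
    return b"\xff" * core_data + b"\x00" * (page_size - core_data)


page = make_page(256)

# Count the bits that are set to 1 in the page.
ones = sum(bin(byte).count("1") for byte in page)
print(len(page), ones)
```

With core-data = 256 this gives 2048 one-bits in a 4096-byte page.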
###################################machine with 128G memory
************ makedumpfile -d 0 ******************
core-data     256 1280
threads_num
-l
0             758  881
8             932 1014
16            973 1085
-c
0            3994 4071
8             966 1007
16           1053 1192
************ makedumpfile -d 3 ******************
core-data     256 1280
threads_num
-l
0             764  847
8             948 1058
16            943 1069
-c
0            4021 4050
8             949 1029
16           1051 1190
************ makedumpfile -d 31 ******************
core-data     256 1280
threads_num
-l
0               4    4
8             639  610
16            680  680
-c
0              14   13
8             607  610
16            631  662
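As a rough reading aid for the 128G results, this sketch computes the
speedup of the threaded runs over threads_num = 0, using the "-d 0"
figures for core-data = 256 from the table above. The dictionary layout
and function name are only illustrative.

```python
# Seconds from the 128G "makedumpfile -d 0" table, core-data = 256.
# Outer key: compression option; inner key: threads_num.
times = {
    "-l": {0: 758, 8: 932, 16: 973},
    "-c": {0: 3994, 8: 966, 16: 1053},
}


def speedup(option, threads):
    """Time with threads_num = 0 divided by time with `threads` threads.
    Values above 1 mean the threaded run was faster."""
    return times[option][0] / times[option][threads]


for option in times:
    for threads in (8, 16):
        print(f"{option} threads_num={threads}: "
              f"{speedup(option, threads):.2f}x")
```

For these figures, "-c" gains roughly a factor of four with 8 threads,
while "-l" is slightly slower with threads than without.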
###################################machine with 24G memory
************ makedumpfile -d 0 ******************
core-data       0  256  512  768 1024 1280 1536 1792 2048 2304 2560 2816 3072 3328 3584 3840 4096
threads_num
-l
0              15  140  186  196  196  196  196  197  197  197  195  195  195  195  186  131   15
4               9  136  189  204  204  202  201  200  201  200  200  202  204  203  189  136    9
8              11  131  193  198  198  202  206  205  206  205  205  202  198  197  193  132   11
12             18  137  194  202  203  197  201  203  204  202  201  196  202  202  194  136   17
-c
0              80  786  967 1031  874  849  700  608  652  603  764  768  873 1031 1016  776   80
4              82  262  315  321  296  256  255  220  218  221  241  268  303  320  319  259   84
8              58  148  174  189  179  189  196  198  199  198  196  190  178  174  170  145   57
12             56  112  131  157  170  189  200  204  204  203  199  191  170  157  132  111   59
************ makedumpfile -d 1 ******************
core-data       0  256  512  768 1024 1280 1536 1792 2048 2304 2560 2816 3072 3328 3584 3840 4096
threads_num
-l
0              16  134  194  204  204  205  205  206  205  207  204  203  204  204  193  134   15
4               9  132  193  197  196  198  199  200  200  200  199  197  196  197  192  132    9
8              12  135  189  202  204  200  197  196  197  195  196  199  203  202  189  136   12
12             16  130  190  200  200  205  202  201  200  201  202  205  199  200  189  131   17
-c
0              77  775 1009 1032  872  853  699  606  643  602  758  765  870 1026 1014  774   78
4              80  262  316  322  332  257  247  217  223  218  288  256  322  322  315  258   81
8              56  146  173  176  170  184  198  205  207  203  198  185  169  180  169  149   56
12             56  110  133  152  175  185  194  202  202  202  193  184  176  152  135  114   56
************ makedumpfile -d 7 ******************
core-data       0  256  512  768 1024 1280 1536 1792 2048 2304 2560 2816 3072 3328 3584 3840 4096
threads_num
-l
0              16  138  188  197  197  197  197  197  197  198  196  197  197  197  189  137   16
4              10  131  187  202  205  203  202  202  203  203  201  203  204  201  187  131    8
8              11  135  191  199  197  201  203  205  206  204  203  200  197  199  192  134   11
12             18  134  195  201  203  197  199  202  202  201  199  196  203  201  197  134   19
-c
0              77  770 1011 1032  871  841  698  621  645  601  763  765  870 1025 1014  773   78
4              81  263  311  320  319  255  240  216  242  214  240  257  300  319  314  255   80
8              57  157  176  172  174  191  196  199  199  199  195  191  173  171  167  146   57
12             55  111  136  156  170  188  203  204  204  203  201  186  168  156  136  112   56
************ makedumpfile -d 31 ******************
core-data       0  256  512  768 1024 1280 1536 1792 2048 2304 2560 2816 3072 3328 3584 3840 4096
threads_num
-l
0               1    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1
4               7    8    8    8    8    8    8    8    8    8    8    8    8    8    7    8    8
8              11   11   11   10   11   11   11   11   11   11   10   11   11   11   11   11   11
12             14   13   14   13   13   15   15   13   15   13   14   14   13   15   15   15   16
-c
0               4    4    5    4    4    4    4    4    4    4    4    4    4    4    4    4    4
4              10   10   10   10   10   10   10   10   10   10   10   10   10   10   10   10   10
8              12   12   12   13   12   12   12   12   12   12   13   12   14   13   12   12   13
12             14   16   14   14   13   15   15   15   14   14   14   14   16   14   15   15   14
On 06/05/2015 03:56 PM, Zhou Wenjian wrote:
This patch set implements parallel processing by means of multiple threads.
With this patch set, multiple threads can be used to read
and compress pages. This parallel processing will save time.
This feature only supports creating a dumpfile in kdump-compressed format
from a vmcore in kdump-compressed format or elf format. Currently, sadump
and xen kdump are not supported.
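The general idea of compressing pages with several threads can be
sketched as below. This is only an illustration of the technique and
not the code in this patch set: it uses Python's standard thread pool
and zlib, and the page size, page contents and function name are
assumptions.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

PAGE_SIZE = 4096  # assumed page size in bytes


def compress_pages(pages, num_threads):
    """Compress pages with a pool of worker threads, keeping page order."""
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # map() yields results in input order, so the output can still be
        # written sequentially even though the compression work overlaps.
        return list(pool.map(zlib.compress, pages))


# Hypothetical input: 64 pages, each filled with one repeated byte value.
pages = [bytes([i % 256]) * PAGE_SIZE for i in range(64)]
compressed = compress_pages(pages, num_threads=4)

print(len(compressed), "pages compressed")
```

Keeping the results in page order is what lets a single writer emit
them one after another once the workers have finished.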
Qiao Nuohan (11):
Add readpage_kdump_compressed_parallel
Add mappage_elf_parallel
Add readpage_elf_parallel
Add read_pfn_parallel
Add function to initial bitmap for parallel use
Add filter_data_buffer_parallel
Add write_kdump_pages_parallel to allow parallel process
Add write_kdump_pages_parallel_cyclic to allow parallel process in cyclic_mode
Initial and free data used for parallel process
Make makedumpfile available to read and compress pages parallelly
Add usage and manual about multiple threads process
Makefile | 2 +
erase_info.c | 29 +-
erase_info.h | 2 +
makedumpfile.8 | 24 +
makedumpfile.c | 1505 +++++++++++++++++++++++++++++++++++++++++++++++++++++++-
makedumpfile.h | 79 +++
print_info.c | 16 +
7 files changed, 1652 insertions(+), 5 deletions(-)
_______________________________________________
kexec mailing list
[email protected]
http://lists.infradead.org/mailman/listinfo/kexec
--
Thanks
Zhou Wenjian