[PATCH V1] block: Add blk_rq_pos(rq) to sort rq when flushing plug-list.

2012-10-16 Thread Jianpeng Ma
My workload is a raid5 array with 16 disks, written through our filesystem
in direct-io mode.
I used blktrace and found these messages:
8,16   0 6647 2.453665504  2579  M   W 7493152 + 8 [md0_raid5]
8,16   0 6648 2.453672411  2579  Q   W 7493160 + 8 [md0_raid5]
8,16   0 6649 2.453672606  2579  M   W 7493160 + 8 [md0_raid5]
8,16   0 6650 2.453679255  2579  Q   W 7493168 + 8 [md0_raid5]
8,16   0 6651 2.453679441  2579  M   W 7493168 + 8 [md0_raid5]
8,16   0 6652 2.453685948  2579  Q   W 7493176 + 8 [md0_raid5]
8,16   0 6653 2.453686149  2579  M   W 7493176 + 8 [md0_raid5]
8,16   0 6654 2.453693074  2579  Q   W 7493184 + 8 [md0_raid5]
8,16   0 6655 2.453693254  2579  M   W 7493184 + 8 [md0_raid5]
8,16   0 6656 2.453704290  2579  Q   W 7493192 + 8 [md0_raid5]
8,16   0 6657 2.453704482  2579  M   W 7493192 + 8 [md0_raid5]
8,16   0 6658 2.453715016  2579  Q   W 7493200 + 8 [md0_raid5]
8,16   0 6659 2.453715247  2579  M   W 7493200 + 8 [md0_raid5]
8,16   0 6660 2.453721730  2579  Q   W 7493208 + 8 [md0_raid5]
8,16   0 6661 2.453721974  2579  M   W 7493208 + 8 [md0_raid5]
8,16   0 6662 2.453728202  2579  Q   W 7493216 + 8 [md0_raid5]
8,16   0 6663 2.453728436  2579  M   W 7493216 + 8 [md0_raid5]
8,16   0 6664 2.453734782  2579  Q   W 7493224 + 8 [md0_raid5]
8,16   0 6665 2.453735019  2579  M   W 7493224 + 8 [md0_raid5]
8,16   0 6666 2.453741401  2579  Q   W 7493232 + 8 [md0_raid5]
8,16   0 6667 2.453741632  2579  M   W 7493232 + 8 [md0_raid5]
8,16   0 6668 2.453748148  2579  Q   W 7493240 + 8 [md0_raid5]
8,16   0 6669 2.453748386  2579  M   W 7493240 + 8 [md0_raid5]
8,16   0 6670 2.453851843  2579  I   W 7493144 + 104 [md0_raid5]
8,16   00 2.453853661 0  m   N cfq2579 insert_request
8,16   0 6671 2.453854064  2579  I   W 7493120 + 24 [md0_raid5]
8,16   00 2.453854439 0  m   N cfq2579 insert_request
8,16   0 6672 2.453854793  2579  U   N [md0_raid5] 2
8,16   00 2.453855513 0  m   N cfq2579 Not idling.st->count:1
8,16   00 2.453855927 0  m   N cfq2579 dispatch_insert
8,16   00 2.453861771 0  m   N cfq2579 dispatched a request
8,16   00 2.453862248 0  m   N cfq2579 activate rq,drv=1
8,16   0 6673 2.453862332  2579  D   W 7493120 + 24 [md0_raid5]
8,16   00 2.453865957 0  m   N cfq2579 Not idling.st->count:1
8,16   00 2.453866269 0  m   N cfq2579 dispatch_insert
8,16   00 2.453866707 0  m   N cfq2579 dispatched a request
8,16   00 2.453867061 0  m   N cfq2579 activate rq,drv=2
8,16   0 6674 2.453867145  2579  D   W 7493144 + 104 [md0_raid5]
8,16   0 6675 2.454147608 0  C   W 7493120 + 24 [0]
8,16   00 2.454149357 0  m   N cfq2579 complete rqnoidle 0
8,16   0 6676 2.454791505 0  C   W 7493144 + 104 [0]
8,16   00 2.454794803 0  m   N cfq2579 complete rqnoidle 0
8,16   00 2.454795160 0  m   N cfq schedule dispatch

From the messages above, we can see that rq[W 7493144 + 104] and
rq[W 7493120 + 24] were not merged.
Because the bio order is:
  8,16   0 6638 2.453619407  2579  Q   W 7493144 + 8 [md0_raid5]
  8,16   0 6639 2.453620460  2579  G   W 7493144 + 8 [md0_raid5]
  8,16   0 6640 2.453639311  2579  Q   W 7493120 + 8 [md0_raid5]
  8,16   0 6641 2.453639842  2579  G   W 7493120 + 8 [md0_raid5]
bio(7493144) was queued first and bio(7493120) later, so the subsequent
bios were split across two requests. When the plug list is flushed,
elv_attempt_insert_merge only attempts a back merge, not a front merge,
so rq[7493120 + 24] cannot be merged with rq[7493144 + 104], even though
7493120 + 24 ends exactly where sector 7493144 begins.

In my tests this situation accounted for about 25% of requests on our
system. With this patch applied, it no longer occurs.
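
To illustrate the effect, here is a minimal user-space sketch (my
illustration, not kernel code; the request struct and the adjacency test
are simplified stand-ins) of sorting by (queue, start sector) the way the
patched plug_rq_cmp() does, then checking whether a back merge becomes
possible:

/* Sketch: sort "requests" as the patched plug_rq_cmp() would, then
 * test back-merge adjacency.  Simplified stand-in for the kernel's
 * structures; compile with gcc and run. */
#include <stdio.h>
#include <stdlib.h>

struct req {
	void *q;			/* the request_queue this rq belongs to */
	unsigned long long pos;		/* start sector, cf. blk_rq_pos() */
	unsigned int sectors;		/* length in sectors */
};

static int plug_rq_cmp(const void *a, const void *b)
{
	const struct req *rqa = a, *rqb = b;

	if (rqa->q != rqb->q)
		return rqa->q < rqb->q ? -1 : 1;
	if (rqa->pos != rqb->pos)
		return rqa->pos < rqb->pos ? -1 : 1;
	return 0;
}

int main(void)
{
	int queue;			/* stand-in for a request_queue */
	struct req plug[] = {		/* plug-list order from the trace */
		{ &queue, 7493144ULL, 104 },
		{ &queue, 7493120ULL,  24 },
	};
	size_t i, n = sizeof(plug) / sizeof(plug[0]);

	qsort(plug, n, sizeof(plug[0]), plug_rq_cmp);

	for (i = 0; i + 1 < n; i++) {
		int ok = plug[i].q == plug[i + 1].q &&
			 plug[i].pos + plug[i].sectors == plug[i + 1].pos;
		printf("rq[%llu + %u] -> rq[%llu + %u]: back merge %s\n",
		       plug[i].pos, plug[i].sectors, plug[i + 1].pos,
		       plug[i + 1].sectors, ok ? "possible" : "impossible");
	}
	return 0;
}

After sorting, 7493120 + 24 ends exactly at sector 7493144, so the second
request can be back merged into the first; in the original unsorted order
only a front merge could have joined them.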

Signed-off-by: Jianpeng Ma 
Cc: Shaohua Li
---
 block/blk-core.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index a33870b..3c95c4d 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -2868,7 +2868,8 @@ static int plug_rq_cmp(void *priv, struct list_head *a, struct list_head *b)
struct request *rqa = container_of(a, struct request, queuelist);
struct request *rqb = container_of(b, struct request, queuelist);
 
-   return !(rqa->q <= rqb->q);
+   return !(rqa->q < rqb->q ||
+   (rqa->q == rqb->q && blk_rq_pos(rqa) < blk_rq_pos(rqb)));
 }
 
 /*
-- 
1.7.9.5


Re: Re: [PATCH] block: Add blk_rq_pos(rq) to sort rq when flushing plug-list.

2012-10-16 Thread Jianpeng Ma
On 2012-10-16 15:48 Shaohua Li  Wrote:
>2012/10/16 Jianpeng Ma :
>> On 2012-10-15 21:18 Shaohua Li  Wrote:
>>>2012/10/15 Shaohua Li :
>>>> 2012/10/15 Jianpeng Ma :
>>>>> My workload is a raid5 which had 16 disks. And used our filesystem to
>>>>> write using direct-io mode.
>>>>> I used the blktrace to find those message:
>>>>>
>>>>> 8,16   0 3570 1.083923979  2519  I   W 144323176 + 24 
>>>>> [md127_raid5]
>>>>> 8,16   00 1.083926214 0  m   N cfq2519 insert_request
>>>>> 8,16   0 3571 1.083926586  2519  I   W 144323072 + 104 
>>>>> [md127_raid5]
>>>>> 8,16   00 1.083926952 0  m   N cfq2519 insert_request
>>>>> 8,16   0 3572 1.083927180  2519  U   N [md127_raid5] 2
>>>>> 8,16   00 1.083927870 0  m   N cfq2519 Not 
>>>>> idling.st->count:1
>>>>> 8,16   00 1.083928320 0  m   N cfq2519 dispatch_insert
>>>>> 8,16   00 1.083928951 0  m   N cfq2519 dispatched a 
>>>>> request
>>>>> 8,16   00 1.083929443 0  m   N cfq2519 activate rq,drv=1
>>>>> 8,16   0 3573 1.083929530  2519  D   W 144323176 + 24 
>>>>> [md127_raid5]
>>>>> 8,16   00 1.083933883 0  m   N cfq2519 Not 
>>>>> idling.st->count:1
>>>>> 8,16   00 1.083934189 0  m   N cfq2519 dispatch_insert
>>>>> 8,16   00 1.083934654 0  m   N cfq2519 dispatched a 
>>>>> request
>>>>> 8,16   00 1.083935014 0  m   N cfq2519 activate rq,drv=2
>>>>> 8,16   0 3574 1.083935101  2519  D   W 144323072 + 104 
>>>>> [md127_raid5]
>>>>> 8,16   0 3575 1.084196179 0  C   W 144323176 + 24 [0]
>>>>> 8,16   00 1.084197979 0  m   N cfq2519 complete rqnoidle 0
>>>>> 8,16   0 3576 1.084769073 0  C   W 144323072 + 104 [0]
>>>>>   ..
>>>>> 8,16   1 3596 1.091394357  2519  I   W 144322544 + 16 
>>>>> [md127_raid5]
>>>>> 8,16   10 1.091396181 0  m   N cfq2519 insert_request
>>>>> 8,16   1 3597 1.091396571  2519  I   W 144322520 + 24 
>>>>> [md127_raid5]
>>>>> 8,16   10 1.091396934 0  m   N cfq2519 insert_request
>>>>> 8,16   1 3598 1.091397165  2519  I   W 144322488 + 32 
>>>>> [md127_raid5]
>>>>> 8,16   10 1.091397477 0  m   N cfq2519 insert_request
>>>>> 8,16   1 3599 1.091397708  2519  I   W 144322432 + 56 
>>>>> [md127_raid5]
>>>>> 8,16   10 1.091398023 0  m   N cfq2519 insert_request
>>>>> 8,16   1 3600 1.091398284  2519  U   N [md127_raid5] 4
>>>>> 8,16   10 1.091398986 0  m   N cfq2519 Not idling. 
>>>>> st->count:1
>>>>> 8,16   10 1.091399511 0  m   N cfq2519 dispatch_insert
>>>>> 8,16   10 1.091400217 0  m   N cfq2519 dispatched a 
>>>>> request
>>>>> 8,16   10 1.091400688 0  m   N cfq2519 activate rq,drv=1
>>>>> 8,16   1 3601 1.091400766  2519  D   W 144322544 + 16 
>>>>> [md127_raid5]
>>>>> 8,16   10 1.091406151 0  m   N cfq2519 Not 
>>>>> idling.st->count:1
>>>>> 8,16   10 1.091406460 0  m   N cfq2519 dispatch_insert
>>>>> 8,16   10 1.091406931 0  m   N cfq2519 dispatched a 
>>>>> request
>>>>> 8,16   10 1.091407291 0  m   N cfq2519 activate rq,drv=2
>>>>> 8,16   1 3602 1.091407378  2519  D   W 144322520 + 24 
>>>>> [md127_raid5]
>>>>> 8,16   10 1.091414006 0  m   N cfq2519 Not 
>>>>> idling.st->count:1
>>>>> 8,16   10 1.091414297 0  m   N cfq2519 dispatch_insert
>>>>> 8,16   10 1.091414702 0  m   N cfq2519 dispatched a 
>>>>> request
>>>>> 8,16   10 1.091415047 0  m   N cfq2519 activate rq, drv=3
>>>>> 8,16   1 3603 1.091415125  2519  D   W 144322488 + 32 
>>>>> [md127_raid5]
>>>>> 8,16   10 1.091416469 0  m   N cfq2519 Not 
>>>>> id

Re: Re: [PATCH] block: Add blk_rq_pos(rq) to sort rq when flushing plug-list.

2012-10-16 Thread Jianpeng Ma
On 2012-10-15 21:18 Shaohua Li  Wrote:
>2012/10/15 Shaohua Li :
>> 2012/10/15 Jianpeng Ma :
>>> My workload is a raid5 which had 16 disks. And used our filesystem to
>>> write using direct-io mode.
>>> I used the blktrace to find those message:
>>>
>>> 8,16   0 3570 1.083923979  2519  I   W 144323176 + 24 [md127_raid5]
>>> 8,16   00 1.083926214 0  m   N cfq2519 insert_request
>>> 8,16   0 3571 1.083926586  2519  I   W 144323072 + 104 [md127_raid5]
>>> 8,16   00 1.083926952 0  m   N cfq2519 insert_request
>>> 8,16   0 3572 1.083927180  2519  U   N [md127_raid5] 2
>>> 8,16   00 1.083927870 0  m   N cfq2519 Not 
>>> idling.st->count:1
>>> 8,16   00 1.083928320 0  m   N cfq2519 dispatch_insert
>>> 8,16   00 1.083928951 0  m   N cfq2519 dispatched a request
>>> 8,16   00 1.083929443 0  m   N cfq2519 activate rq,drv=1
>>> 8,16   0 3573 1.083929530  2519  D   W 144323176 + 24 [md127_raid5]
>>> 8,16   00 1.083933883 0  m   N cfq2519 Not 
>>> idling.st->count:1
>>> 8,16   00 1.083934189 0  m   N cfq2519 dispatch_insert
>>> 8,16   00 1.083934654 0  m   N cfq2519 dispatched a request
>>> 8,16   00 1.083935014 0  m   N cfq2519 activate rq,drv=2
>>> 8,16   0 3574 1.083935101  2519  D   W 144323072 + 104 [md127_raid5]
>>> 8,16   0 3575 1.084196179 0  C   W 144323176 + 24 [0]
>>> 8,16   00 1.084197979 0  m   N cfq2519 complete rqnoidle 0
>>> 8,16   0 3576 1.084769073 0  C   W 144323072 + 104 [0]
>>>   ..
>>> 8,16   1 3596 1.091394357  2519  I   W 144322544 + 16 [md127_raid5]
>>> 8,16   10 1.091396181 0  m   N cfq2519 insert_request
>>> 8,16   1 3597 1.091396571  2519  I   W 144322520 + 24 [md127_raid5]
>>> 8,16   10 1.091396934 0  m   N cfq2519 insert_request
>>> 8,16   1 3598 1.091397165  2519  I   W 144322488 + 32 [md127_raid5]
>>> 8,16   10 1.091397477 0  m   N cfq2519 insert_request
>>> 8,16   1 3599 1.091397708  2519  I   W 144322432 + 56 [md127_raid5]
>>> 8,16   10 1.091398023 0  m   N cfq2519 insert_request
>>> 8,16   1 3600 1.091398284  2519  U   N [md127_raid5] 4
>>> 8,16   10 1.091398986 0  m   N cfq2519 Not idling. 
>>> st->count:1
>>> 8,16   10 1.091399511 0  m   N cfq2519 dispatch_insert
>>> 8,16   10 1.091400217 0  m   N cfq2519 dispatched a request
>>> 8,16   10 1.091400688 0  m   N cfq2519 activate rq,drv=1
>>> 8,16   1 3601 1.091400766  2519  D   W 144322544 + 16 [md127_raid5]
>>> 8,16   10 1.091406151 0  m   N cfq2519 Not 
>>> idling.st->count:1
>>> 8,16   10 1.091406460 0  m   N cfq2519 dispatch_insert
>>> 8,16   10 1.091406931 0  m   N cfq2519 dispatched a request
>>> 8,16   10 1.091407291 0  m   N cfq2519 activate rq,drv=2
>>> 8,16   1 3602 1.091407378  2519  D   W 144322520 + 24 [md127_raid5]
>>> 8,16   10 1.091414006 0  m   N cfq2519 Not 
>>> idling.st->count:1
>>> 8,16   10 1.091414297 0  m   N cfq2519 dispatch_insert
>>> 8,16   10 1.091414702 0  m   N cfq2519 dispatched a request
>>> 8,16   10 1.091415047 0  m   N cfq2519 activate rq, drv=3
>>> 8,16   1 3603 1.091415125  2519  D   W 144322488 + 32 [md127_raid5]
>>> 8,16   10 1.091416469 0  m   N cfq2519 Not 
>>> idling.st->count:1
>>> 8,16   10 1.091416754 0  m   N cfq2519 dispatch_insert
>>> 8,16   10 1.091417186 0  m   N cfq2519 dispatched a request
>>> 8,16   10 1.091417535 0  m   N cfq2519 activate rq,drv=4
>>> 8,16   1 3604 1.091417628  2519  D   W 144322432 + 56 [md127_raid5]
>>> 8,16   1 3605 1.091857225  4393  C   W 144322544 + 16 [0]
>>> 8,16   10 1.091858753 0  m   N cfq2519 complete rqnoidle 0
>>> 8,16   1 3606 1.092068456  4393  C   W 144322520 + 24 [0]
>>> 8,16   10 1.092069851 0  m   N cfq2519 complete rqnoidle 0
>>> 8,16   1 3607 1.092350440  4393  C   W 144322488 + 32 [0]
>>> 8,16   10 1.092351688 0  m   N cfq2519 complete rq

Re: Re: [PATCH] block: Add blk_rq_pos(rq) to sort rq when flushing plug-list.

2012-10-15 Thread Jianpeng Ma
On 2012-10-15 21:18 Shaohua Li  Wrote:
>2012/10/15 Shaohua Li :
>> 2012/10/15 Jianpeng Ma :
>>> My workload is a raid5 which had 16 disks. And used our filesystem to
>>> write using direct-io mode.
>>> I used the blktrace to find those message:
>>>
>>> 8,16   0 3570 1.083923979  2519  I   W 144323176 + 24 [md127_raid5]
>>> 8,16   00 1.083926214 0  m   N cfq2519 insert_request
>>> 8,16   0 3571 1.083926586  2519  I   W 144323072 + 104 [md127_raid5]
>>> 8,16   00 1.083926952 0  m   N cfq2519 insert_request
>>> 8,16   0 3572 1.083927180  2519  U   N [md127_raid5] 2
>>> 8,16   00 1.083927870 0  m   N cfq2519 Not 
>>> idling.st->count:1
>>> 8,16   00 1.083928320 0  m   N cfq2519 dispatch_insert
>>> 8,16   00 1.083928951 0  m   N cfq2519 dispatched a request
>>> 8,16   00 1.083929443 0  m   N cfq2519 activate rq,drv=1
>>> 8,16   0 3573 1.083929530  2519  D   W 144323176 + 24 [md127_raid5]
>>> 8,16   00 1.083933883 0  m   N cfq2519 Not 
>>> idling.st->count:1
>>> 8,16   00 1.083934189 0  m   N cfq2519 dispatch_insert
>>> 8,16   00 1.083934654 0  m   N cfq2519 dispatched a request
>>> 8,16   00 1.083935014 0  m   N cfq2519 activate rq,drv=2
>>> 8,16   0 3574 1.083935101  2519  D   W 144323072 + 104 [md127_raid5]
>>> 8,16   0 3575 1.084196179 0  C   W 144323176 + 24 [0]
>>> 8,16   00 1.084197979 0  m   N cfq2519 complete rqnoidle 0
>>> 8,16   0 3576 1.084769073 0  C   W 144323072 + 104 [0]
>>>   ..
>>> 8,16   1 3596 1.091394357  2519  I   W 144322544 + 16 [md127_raid5]
>>> 8,16   10 1.091396181 0  m   N cfq2519 insert_request
>>> 8,16   1 3597 1.091396571  2519  I   W 144322520 + 24 [md127_raid5]
>>> 8,16   10 1.091396934 0  m   N cfq2519 insert_request
>>> 8,16   1 3598 1.091397165  2519  I   W 144322488 + 32 [md127_raid5]
>>> 8,16   10 1.091397477 0  m   N cfq2519 insert_request
>>> 8,16   1 3599 1.091397708  2519  I   W 144322432 + 56 [md127_raid5]
>>> 8,16   10 1.091398023 0  m   N cfq2519 insert_request
>>> 8,16   1 3600 1.091398284  2519  U   N [md127_raid5] 4
>>> 8,16   10 1.091398986 0  m   N cfq2519 Not idling. 
>>> st->count:1
>>> 8,16   10 1.091399511 0  m   N cfq2519 dispatch_insert
>>> 8,16   10 1.091400217 0  m   N cfq2519 dispatched a request
>>> 8,16   10 1.091400688 0  m   N cfq2519 activate rq,drv=1
>>> 8,16   1 3601 1.091400766  2519  D   W 144322544 + 16 [md127_raid5]
>>> 8,16   10 1.091406151 0  m   N cfq2519 Not 
>>> idling.st->count:1
>>> 8,16   10 1.091406460 0  m   N cfq2519 dispatch_insert
>>> 8,16   10 1.091406931 0  m   N cfq2519 dispatched a request
>>> 8,16   10 1.091407291 0  m   N cfq2519 activate rq,drv=2
>>> 8,16   1 3602 1.091407378  2519  D   W 144322520 + 24 [md127_raid5]
>>> 8,16   10 1.091414006 0  m   N cfq2519 Not 
>>> idling.st->count:1
>>> 8,16   10 1.091414297 0  m   N cfq2519 dispatch_insert
>>> 8,16   10 1.091414702 0  m   N cfq2519 dispatched a request
>>> 8,16   10 1.091415047 0  m   N cfq2519 activate rq, drv=3
>>> 8,16   1 3603 1.091415125  2519  D   W 144322488 + 32 [md127_raid5]
>>> 8,16   10 1.091416469 0  m   N cfq2519 Not 
>>> idling.st->count:1
>>> 8,16   10 1.091416754 0  m   N cfq2519 dispatch_insert
>>> 8,16   10 1.091417186 0  m   N cfq2519 dispatched a request
>>> 8,16   10 1.091417535 0  m   N cfq2519 activate rq,drv=4
>>> 8,16   1 3604 1.091417628  2519  D   W 144322432 + 56 [md127_raid5]
>>> 8,16   1 3605 1.091857225  4393  C   W 144322544 + 16 [0]
>>> 8,16   10 1.091858753 0  m   N cfq2519 complete rqnoidle 0
>>> 8,16   1 3606 1.092068456  4393  C   W 144322520 + 24 [0]
>>> 8,16   10 1.092069851 0  m   N cfq2519 complete rqnoidle 0
>>> 8,16   1 3607 1.092350440  4393  C   W 144322488 + 32 [0]
>>> 8,16   10 1.092351688 0  m   N cfq2519 complete rq

[PATCH] block: Add blk_rq_pos(rq) to sort rq when flushing plug-list.

2012-10-15 Thread Jianpeng Ma
My workload is a raid5 array with 16 disks, written through our filesystem
in direct-io mode.
I used blktrace and found these messages:

8,16   0 3570 1.083923979  2519  I   W 144323176 + 24 [md127_raid5]
8,16   00 1.083926214 0  m   N cfq2519 insert_request
8,16   0 3571 1.083926586  2519  I   W 144323072 + 104 [md127_raid5]
8,16   00 1.083926952 0  m   N cfq2519 insert_request
8,16   0 3572 1.083927180  2519  U   N [md127_raid5] 2
8,16   00 1.083927870 0  m   N cfq2519 Not idling.st->count:1
8,16   00 1.083928320 0  m   N cfq2519 dispatch_insert
8,16   00 1.083928951 0  m   N cfq2519 dispatched a request
8,16   00 1.083929443 0  m   N cfq2519 activate rq,drv=1
8,16   0 3573 1.083929530  2519  D   W 144323176 + 24 [md127_raid5]
8,16   00 1.083933883 0  m   N cfq2519 Not idling.st->count:1
8,16   00 1.083934189 0  m   N cfq2519 dispatch_insert
8,16   00 1.083934654 0  m   N cfq2519 dispatched a request
8,16   00 1.083935014 0  m   N cfq2519 activate rq,drv=2
8,16   0 3574 1.083935101  2519  D   W 144323072 + 104 [md127_raid5]
8,16   0 3575 1.084196179 0  C   W 144323176 + 24 [0]
8,16   00 1.084197979 0  m   N cfq2519 complete rqnoidle 0
8,16   0 3576 1.084769073 0  C   W 144323072 + 104 [0]
  ..
8,16   1 3596 1.091394357  2519  I   W 144322544 + 16 [md127_raid5]
8,16   10 1.091396181 0  m   N cfq2519 insert_request
8,16   1 3597 1.091396571  2519  I   W 144322520 + 24 [md127_raid5]
8,16   10 1.091396934 0  m   N cfq2519 insert_request
8,16   1 3598 1.091397165  2519  I   W 144322488 + 32 [md127_raid5]
8,16   10 1.091397477 0  m   N cfq2519 insert_request
8,16   1 3599 1.091397708  2519  I   W 144322432 + 56 [md127_raid5]
8,16   10 1.091398023 0  m   N cfq2519 insert_request
8,16   1 3600 1.091398284  2519  U   N [md127_raid5] 4
8,16   10 1.091398986 0  m   N cfq2519 Not idling. st->count:1
8,16   10 1.091399511 0  m   N cfq2519 dispatch_insert
8,16   10 1.091400217 0  m   N cfq2519 dispatched a request
8,16   10 1.091400688 0  m   N cfq2519 activate rq,drv=1
8,16   1 3601 1.091400766  2519  D   W 144322544 + 16 [md127_raid5]
8,16   10 1.091406151 0  m   N cfq2519 Not idling.st->count:1
8,16   10 1.091406460 0  m   N cfq2519 dispatch_insert
8,16   10 1.091406931 0  m   N cfq2519 dispatched a request
8,16   10 1.091407291 0  m   N cfq2519 activate rq,drv=2
8,16   1 3602 1.091407378  2519  D   W 144322520 + 24 [md127_raid5]
8,16   10 1.091414006 0  m   N cfq2519 Not idling.st->count:1
8,16   10 1.091414297 0  m   N cfq2519 dispatch_insert
8,16   10 1.091414702 0  m   N cfq2519 dispatched a request
8,16   10 1.091415047 0  m   N cfq2519 activate rq, drv=3
8,16   1 3603 1.091415125  2519  D   W 144322488 + 32 [md127_raid5]
8,16   10 1.091416469 0  m   N cfq2519 Not idling.st->count:1
8,16   10 1.091416754 0  m   N cfq2519 dispatch_insert
8,16   10 1.091417186 0  m   N cfq2519 dispatched a request
8,16   10 1.091417535 0  m   N cfq2519 activate rq,drv=4
8,16   1 3604 1.091417628  2519  D   W 144322432 + 56 [md127_raid5]
8,16   1 3605 1.091857225  4393  C   W 144322544 + 16 [0]
8,16   10 1.091858753 0  m   N cfq2519 complete rqnoidle 0
8,16   1 3606 1.092068456  4393  C   W 144322520 + 24 [0]
8,16   10 1.092069851 0  m   N cfq2519 complete rqnoidle 0
8,16   1 3607 1.092350440  4393  C   W 144322488 + 32 [0]
8,16   10 1.092351688 0  m   N cfq2519 complete rqnoidle 0
8,16   1 3608 1.093629323 0  C   W 144322432 + 56 [0]
8,16   10 1.093631151 0  m   N cfq2519 complete rqnoidle 0
8,16   10 1.093631574 0  m   N cfq2519 will busy wait
8,16   10 1.093631829 0  m   N cfq schedule dispatch

Because func "elv_attempt_insert_merge" only tries a back merge, the four
requests above cannot be merged at all in this dispatch order.
I traced for ten minutes and counted: this situation accounts for about
25% of requests.

With the patch applied, I tested again and this situation no longer
appears; see the sketch below.
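
The four requests are in fact contiguous: 144322432 + 56 ends at
144322488, 144322488 + 32 ends at 144322520, and 144322520 + 24 ends at
144322544. A short sketch (illustrative only, not kernel code) showing
that back merges alone would coalesce them into one 128-sector write once
they are visited in ascending sector order:

/* Sketch: coalesce the four trace requests by repeated back merges.
 * Works only because they are visited in ascending sector order. */
#include <stdio.h>

int main(void)
{
	unsigned long long pos[] = { 144322432ULL, 144322488ULL,
				     144322520ULL, 144322544ULL };
	unsigned int len[] = { 56, 32, 24, 16 };
	unsigned long long merged_pos = pos[0];
	unsigned int merged_len = len[0];
	int i;

	for (i = 1; i < 4; i++) {
		if (merged_pos + merged_len != pos[i]) {
			printf("gap before %llu, no back merge\n", pos[i]);
			return 1;
		}
		merged_len += len[i];	/* back merge the next request */
	}
	printf("one request: W %llu + %u\n", merged_pos, merged_len);
	return 0;
}

This prints "one request: W 144322432 + 128".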

Signed-off-by: Jianpeng Ma 
---
 block/blk-core.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index a33870b..3c95c4d 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -2868,7 +2868,8 @@ static int plug_rq_cmp(void *priv, struct list_head *a, struct list_head *b)
struct request *rqa = container_of(a, struct request, queuelist);

About reuse address space when __init func removed

2012-10-15 Thread Jianpeng Ma
Hi all,
Today I found some kernel messages about a memory leak, as follows:
unreferenced object 0x8800b6e6b980 (size 64):
  comm "modprobe", pid 1137, jiffies 4294676166 (age 7326.499s)
  hex dump (first 32 bytes):
01 04 01 00 00 00 00 00 00 00 98 b5 00 88 ff ff  
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
  backtrace:
    [<816a3f16>] kmemleak_alloc+0x56/0xc0
    [<8113bd43>] __kmalloc+0x173/0x310
    [<a009a78a>] 0xa009a78a
    [<a009ad95>] 0xa009ad95
    [<81300985>] pci_device_probe+0x75/0xa0
    [<814078c4>] driver_probe_device+0x84/0x380
    [<81407c63>] __driver_attach+0xa3/0xb0
    [<81405a96>] bus_for_each_dev+0x56/0x90
    [<81407359>] driver_attach+0x19/0x20
    [<81406e80>] bus_add_driver+0x1a0/0x2c0
    [<81408195>] driver_register+0x75/0x150
    [<812ffa1c>] __pci_register_driver+0x5c/0x70
    [<a00a20a7>] nfsd_last_thread+0x47/0x70 [nfsd]
    [<810001fa>] do_one_initcall+0x3a/0x170
    [<810a7d1c>] sys_init_module+0x8c/0x200
    [<816cc352>] system_call_fastpath+0x16/0x1b

But the problem is not the memory leak; it is the stack trace.
I noticed "[<a00a20a7>] nfsd_last_thread+0x47/0x70 [nfsd]", but the module
actually being probed is mvsas.
Why does the kernel print nfsd?

I added some debuginfo in func mvs_init.
diff --git a/drivers/scsi/mvsas/mv_init.c b/drivers/scsi/mvsas/mv_init.c
index cc59dff..d34ce01 100644
--- a/drivers/scsi/mvsas/mv_init.c
+++ b/drivers/scsi/mvsas/mv_init.c
@@ -821,6 +821,7 @@ static int __init mvs_init(void)
 {
int rc;
mvs_stt = sas_domain_attach_transport(&mvs_transport_ops);
+   printk(KERN_ERR"%s:0x%lx\n", __func__, _THIS_IP_);
if (!mvs_stt)
return -ENOMEM;

The result is "[3.781487] mvs_init:0xa00a2000"

I think this is because of the __init attribute: the backtrace recorded an
address inside mvs_init (mvs_init is at 0xa00a2000, and the recorded entry
is a00a20a7). After mvs_init ran, that init memory was freed and its
address range released; the nfsd module's init code was later loaded
there, so the stale address now resolves to nfsd_last_thread.
Is it a bug?
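
For what it's worth, a rough user-space analogy (my sketch; the library
names are arbitrary and whether the loader actually reuses the address
range is not guaranteed): record a symbol's address, unload its library,
load another, then resolve the stale address:

/* Sketch: a stale code address resolving to whatever object now
 * occupies that range.  Analogy only - the kernel case is freed
 * __init text, not dlopen(); address reuse depends on the loader. */
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdio.h>

int main(void)
{
	void *h1 = dlopen("libm.so.6", RTLD_NOW);
	if (!h1)
		return 1;
	void *stale = dlsym(h1, "cos");	/* remember an address */
	dlclose(h1);			/* like freeing __init memory */

	void *h2 = dlopen("libz.so.1", RTLD_NOW);  /* may reuse the range */
	Dl_info info;
	if (dladdr(stale, &info) && info.dli_fname)
		printf("stale %p now resolves into %s (%s)\n", stale,
		       info.dli_fname, info.dli_sname ? info.dli_sname : "?");
	else
		printf("stale %p no longer resolves\n", stale);
	if (h2)
		dlclose(h2);
	return 0;
}

(Build with: gcc analogy.c -ldl)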

Thanks!

Re: [PATCH 0/3] Fix problems about handling bio to plug when bio merged failed.

2012-09-18 Thread Jianpeng Ma
On 2012-08-10 19:44 Jianpeng Ma  Wrote:
>There are some problems with handling a bio whose merge into the plug list
>failed.
>Patch1 avoids an unnecessary plug should_sort test, although it's not a bug.
>Patch2 fixes a bug when handling multiple devices: some devices were missed
>when tracing plug operations.
>
>Because of patch2 it is no longer necessary to sort when flushing the plug.
>Although patch2 has O(n*n) complexity, which is worse than list_sort's
>O(nlog(n)), the plug list is unlikely to be long, so I think patch3 is
>acceptable.
>
>
>Jianpeng Ma (3):
>  block: avoid unnecessary plug should_sort test.
>  block: Fix not tracing all device plug-operation.
>  block: Remove unnecessary requests sort.
>
> block/blk-core.c |   35 ++-
> 1 file changed, 18 insertions(+), 17 deletions(-)
>
>-- 
>1.7.9.5
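
As a side note on the O(n*n)-versus-O(nlog(n)) point in the quoted cover
letter: keeping the plug list sorted at insert time costs one O(n) walk
per insert, n times, instead of one O(nlog(n)) list_sort() at flush. A
minimal sketch (illustrative only; the list and request types are
simplified stand-ins, not the kernel's):

/* Sketch: sorted insertion into a singly linked "plug list".
 * O(n) per insert, O(n*n) for n inserts; acceptable when the
 * plug list stays short. */
#include <stdio.h>
#include <stdlib.h>

struct req {
	unsigned long long pos;
	struct req *next;
};

static void plug_insert_sorted(struct req **head, struct req *rq)
{
	struct req **p = head;

	/* walk until the next element starts at a higher sector */
	while (*p && (*p)->pos <= rq->pos)
		p = &(*p)->next;
	rq->next = *p;
	*p = rq;
}

int main(void)
{
	unsigned long long pos[] = { 144322544ULL, 144322488ULL,
				     144322432ULL, 144322520ULL };
	struct req *head = NULL, *r;
	size_t i;

	for (i = 0; i < 4; i++) {
		r = malloc(sizeof(*r));
		r->pos = pos[i];
		plug_insert_sorted(&head, r);
	}
	for (r = head; r; r = r->next)
		printf("%llu\n", r->pos);	/* prints in ascending order */
	return 0;
}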
Hi axboe:
	Sorry for asking again, but I ran into a problem in code that this
patchset touches, so I am asking about its status once more.
If you discard these patches, I will send my fix against the old code;
otherwise I will wait for the patchset to be merged and continue from there.

The problem is about blk_plug. 
My workload is a raid5 array with 16 disks, written through our filesystem
in direct-io mode.
I used the blktrace to find those message:

  8,16   0 3570 1.083923979  2519  I   W 144323176 + 24 [md127_raid5]
  8,16   00 1.083926214 0  m   N cfq2519 insert_request
  8,16   0 3571 1.083926586  2519  I   W 144323072 + 104 [md127_raid5]
  8,16   00 1.083926952 0  m   N cfq2519 insert_request
  8,16   0 3572 1.083927180  2519  U   N [md127_raid5] 2
  8,16   00 1.083927870 0  m   N cfq2519 Not idling. st->count:1
  8,16   00 1.083928320 0  m   N cfq2519 dispatch_insert
  8,16   00 1.083928951 0  m   N cfq2519 dispatched a request
  8,16   00 1.083929443 0  m   N cfq2519 activate rq, drv=1
  8,16   0 3573 1.083929530  2519  D   W 144323176 + 24 [md127_raid5]
  8,16   00 1.083933883 0  m   N cfq2519 Not idling. st->count:1
  8,16   00 1.083934189 0  m   N cfq2519 dispatch_insert
  8,16   00 1.083934654 0  m   N cfq2519 dispatched a request
  8,16   00 1.083935014 0  m   N cfq2519 activate rq, drv=2
  8,16   0 3574 1.083935101  2519  D   W 144323072 + 104 [md127_raid5]
  8,16   0 3575 1.084196179 0  C   W 144323176 + 24 [0]
  8,16   00 1.084197979 0  m   N cfq2519 complete rqnoidle 0
  8,16   0 3576 1.084769073 0  C   W 144323072 + 104 [0]
  ..
  8,16   1 3596 1.091394357  2519  I   W 144322544 + 16 [md127_raid5]
  8,16   10 1.091396181 0  m   N cfq2519 insert_request
  8,16   1 3597 1.091396571  2519  I   W 144322520 + 24 [md127_raid5]
  8,16   10 1.091396934 0  m   N cfq2519 insert_request
  8,16   1 3598 1.091397165  2519  I   W 144322488 + 32 [md127_raid5]
  8,16   10 1.091397477 0  m   N cfq2519 insert_request
  8,16   1 3599 1.091397708  2519  I   W 144322432 + 56 [md127_raid5]
  8,16   10 1.091398023 0  m   N cfq2519 insert_request
  8,16   1 3600 1.091398284  2519  U   N [md127_raid5] 4
  8,16   10 1.091398986 0  m   N cfq2519 Not idling. st->count:1
  8,16   10 1.091399511 0  m   N cfq2519 dispatch_insert
  8,16   10 1.091400217 0  m   N cfq2519 dispatched a request
  8,16   10 1.091400688 0  m   N cfq2519 activate rq, drv=1
  8,16   1 3601 1.091400766  2519  D   W 144322544 + 16 [md127_raid5]
  8,16   10 1.091406151 0  m   N cfq2519 Not idling. st->count:1
  8,16   10 1.091406460 0  m   N cfq2519 dispatch_insert
  8,16   10 1.091406931 0  m   N cfq2519 dispatched a request
  8,16   10 1.091407291 0  m   N cfq2519 activate rq, drv=2
  8,16   1 3602 1.091407378  2519  D   W 144322520 + 24 [md127_raid5]
  8,16   10 1.091414006 0  m   N cfq2519 Not idling. st->count:1
  8,16   10 1.091414297 0  m   N cfq2519 dispatch_insert
  8,16   10 1.091414702 0  m   N cfq2519 dispatched a request
  8,16   10 1.091415047 0  m   N cfq2519 activate rq, drv=3
  8,16   1 3603 1.091415125  2519  D   W 144322488 + 32 [md127_raid5]
  8,16   10 1.091416469 0  m   N cfq2519 Not idling. st->count:1
  8,16   10 1.091416754 0  m   N cfq2519 dispatch_insert
  8,16   10 1.091417186 0  m   N cfq2519 dispatched a request
  8,16   10 1.091417535 0  m   N cfq2519 activate rq, drv=4
  8,16   1 3604 1.091417628  2519  D   W 144322432 + 56 [md127_raid5]
  8,16   1 3605 1.091857225  4393  C   W 144322544 + 16 [0]
  8

Re: Re: Why blktrace didn't trace requests merge?

2012-09-18 Thread Jianpeng Ma
On 2012-09-18 13:49 Jens Axboe  Wrote:
>On 2012-09-18 02:30, Jianpeng Ma wrote:
>> On 2012-09-18 02:27 Jens Axboe  Wrote:
>>> On 2012-09-17 19:55, Tejun Heo wrote:
>>>> (cc'ing Jens)
>>>>
>>>> On Mon, Sep 17, 2012 at 09:22:28AM -0400, Steven Rostedt wrote:
>>>>> On Mon, 2012-09-17 at 19:33 +0800, Jianpeng Ma wrote:
>>>>>> Hi all:
>>>>>>  I used blktrace to trace some io.But i can't find requests merge. I 
>>>>>> searched the code and did't not find.
>>>>>>  Why? 
>>>>>>  
>>>>>
>>>>> No idea. I don't use blktrace much, but I Cc'd those that understand it
>>>>> better than I.
>>>
>>> Works for me:
>>>
>>> [...]
>>>
>>>
>>>  8,00   26 0.009147735   664  A  WS 315226143 + 8 <- (8,7) 
>>> 19406344
>>>  8,00   27 0.009148677   664  Q  WS 315226143 + 8 
>>> [btrfs-submit-1]
>>>  8,00   28 0.009152967   664  G  WS 315226143 + 8 
>>> [btrfs-submit-1]
>>>  8,00   29 0.009154242   664  P   N [btrfs-submit-1]
>>>  8,00   30 0.009155538   664  A  WS 315226151 + 8 <- (8,7) 
>>> 19406352
>>>  8,00   31 0.009155743   664  Q  WS 315226151 + 8 
>>> [btrfs-submit-1]
>>>  8,00   32 0.009157086   664  M  WS 315226151 + 8 
>>> [btrfs-submit-1]
>>>  8,00   33 0.009158716   664  I  WS 315226143 + 16 
>>> [btrfs-submit-1]
>>>
>>> That's from a quick trace of /dev/sda. I started blktrace, then did:
>>>
>>> $ dd if=/dev/zero of=foo bs=4k count=128 && sync
>>>
>>> to ensure that I knew merges would be happening. Output stats at the end:
>>>
>>> Total (sda):
>>> Reads Queued:   7,   44KiB  Writes Queued: 447, 
>>> 7692KiB
>>> Read Dispatches:7,   44KiB  Write Dispatches:  416, 
>>> 7692KiB
>>> Reads Requeued: 0   Writes Requeued: 0
>>> Reads Completed:7,   44KiB  Writes Completed:  435, 
>>> 5864KiB
>>> Read Merges:0,0KiB  Write Merges:   23,  
>>> 428KiB
>>> IO unplugs:78   Timer unplugs:   0
>>>
>>> -- 
>>> Jens Axboe
>>>
>> First, thanks for your time!
>> If I understand correctly, the merge in your example is a bio with a
>> request, not a request with a request. Yes or no?
>
>It is bio to request, correct. Request to request merges are relatively
>more rare.
>
>-- 
>Jens Axboe
>
Thanks very much, now I understand.

Jianpeng

Re: [PATCH 0/3] Fix problems about handling bio to plug when bio merged failed.

2012-09-18 Thread Jianpeng Ma
On 2012-08-10 19:44 Jianpeng Ma majianp...@gmail.com Wrote:
There are some problems about handling bio which merge to plug failed.
Patch1 will avoid unnecessary plug should_sort test,although it's not a bug.
Patch2 correct a bug when handle more devices,it leak some devices to trace 
plug-operation.

Because the patch2,so it's not necessary to sort when flush plug.Although 
patch2 has 
O(n*n) complexity,it's more than list_sort which has O(nlog(n)) complexity.But 
the plug 
list is unlikely too long,so i think patch3 can accept.


Jianpeng Ma (3):
  block: avoid unnecessary plug should_sort test.
  block: Fix not tracing all device plug-operation.
  block: Remove unnecessary requests sort.

 block/blk-core.c |   35 ++-
 1 file changed, 18 insertions(+), 17 deletions(-)

-- 
1.7.9.5
Hi axboe:
Sorry for asking you again. But I found a problem which it contained 
those code. So i asked how those patchset again.
If you discard those,i will send the patch using the old code. On the other 
hand,I will wait the patchest release and to continue.

The problem is about blk_plug. 
My workload is raid5  which had 16 disks. And used our filesystem to write used 
direct mode.
I used the blktrace to find those message:

  8,16   0 3570 1.083923979  2519  I   W 144323176 + 24 [md127_raid5]
  8,16   00 1.083926214 0  m   N cfq2519 insert_request
  8,16   0 3571 1.083926586  2519  I   W 144323072 + 104 [md127_raid5]
  8,16   00 1.083926952 0  m   N cfq2519 insert_request
  8,16   0 3572 1.083927180  2519  U   N [md127_raid5] 2
  8,16   00 1.083927870 0  m   N cfq2519 Not idling. st-count:1
  8,16   00 1.083928320 0  m   N cfq2519 dispatch_insert
  8,16   00 1.083928951 0  m   N cfq2519 dispatched a request
  8,16   00 1.083929443 0  m   N cfq2519 activate rq, drv=1
  8,16   0 3573 1.083929530  2519  D   W 144323176 + 24 [md127_raid5]
  8,16   00 1.083933883 0  m   N cfq2519 Not idling. st-count:1
  8,16   00 1.083934189 0  m   N cfq2519 dispatch_insert
  8,16   00 1.083934654 0  m   N cfq2519 dispatched a request
  8,16   00 1.083935014 0  m   N cfq2519 activate rq, drv=2
  8,16   0 3574 1.083935101  2519  D   W 144323072 + 104 [md127_raid5]
  8,16   0 3575 1.084196179 0  C   W 144323176 + 24 [0]
  8,16   00 1.084197979 0  m   N cfq2519 complete rqnoidle 0
  8,16   0 3576 1.084769073 0  C   W 144323072 + 104 [0]
  ..
  8,16   1 3596 1.091394357  2519  I   W 144322544 + 16 [md127_raid5]
  8,16   10 1.091396181 0  m   N cfq2519 insert_request
  8,16   1 3597 1.091396571  2519  I   W 144322520 + 24 [md127_raid5]
  8,16   10 1.091396934 0  m   N cfq2519 insert_request
  8,16   1 3598 1.091397165  2519  I   W 144322488 + 32 [md127_raid5]
  8,16   10 1.091397477 0  m   N cfq2519 insert_request
  8,16   1 3599 1.091397708  2519  I   W 144322432 + 56 [md127_raid5]
  8,16   10 1.091398023 0  m   N cfq2519 insert_request
  8,16   1 3600 1.091398284  2519  U   N [md127_raid5] 4
  8,16   10 1.091398986 0  m   N cfq2519 Not idling. st-count:1
  8,16   10 1.091399511 0  m   N cfq2519 dispatch_insert
  8,16   10 1.091400217 0  m   N cfq2519 dispatched a request
  8,16   10 1.091400688 0  m   N cfq2519 activate rq, drv=1
  8,16   1 3601 1.091400766  2519  D   W 144322544 + 16 [md127_raid5]
  8,16   10 1.091406151 0  m   N cfq2519 Not idling. st-count:1
  8,16   10 1.091406460 0  m   N cfq2519 dispatch_insert
  8,16   10 1.091406931 0  m   N cfq2519 dispatched a request
  8,16   10 1.091407291 0  m   N cfq2519 activate rq, drv=2
  8,16   1 3602 1.091407378  2519  D   W 144322520 + 24 [md127_raid5]
  8,16   10 1.091414006 0  m   N cfq2519 Not idling. st-count:1
  8,16   10 1.091414297 0  m   N cfq2519 dispatch_insert
  8,16   10 1.091414702 0  m   N cfq2519 dispatched a request
  8,16   10 1.091415047 0  m   N cfq2519 activate rq, drv=3
  8,16   1 3603 1.091415125  2519  D   W 144322488 + 32 [md127_raid5]
  8,16   10 1.091416469 0  m   N cfq2519 Not idling. st->count:1
  8,16   10 1.091416754 0  m   N cfq2519 dispatch_insert
  8,16   10 1.091417186 0  m   N cfq2519 dispatched a request
  8,16   10 1.091417535 0  m   N cfq2519 activate rq, drv=4
  8,16   1 3604 1.091417628  2519  D   W 144322432 + 56 [md127_raid5]
  8,16   1 3605 1.091857225  4393  C   W 144322544 + 16 [0]
  8,16   10 1.091858753 0  m   N cfq2519 complete rqnoidle 0
  8,16   1 3606
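
A quick arithmetic check (my own snippet, using the two requests from the
first trace above) that they were physically contiguous and so could in
principle have been merged into a single I/O:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
	/* the two requests dispatched separately in the trace above */
	uint64_t a_start = 144323072, a_len = 104;	/* W 144323072 + 104 */
	uint64_t b_start = 144323176;			/* W 144323176 + 24  */

	printf("a ends at %llu, b starts at %llu -> %s\n",
	       (unsigned long long)(a_start + a_len),
	       (unsigned long long)b_start,
	       a_start + a_len == b_start ? "contiguous" : "gap");
	return 0;
}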

Re: Re: Why blktrace didn't trace requests merge?

2012-09-17 Thread Jianpeng Ma
On 2012-09-18 02:27 Jens Axboe  Wrote:
>On 2012-09-17 19:55, Tejun Heo wrote:
>> (cc'ing Jens)
>> 
>> On Mon, Sep 17, 2012 at 09:22:28AM -0400, Steven Rostedt wrote:
>>> On Mon, 2012-09-17 at 19:33 +0800, Jianpeng Ma wrote:
>>>> Hi all:
>>>>	I used blktrace to trace some IO, but I can't find any request merges. I
>>>> searched the code and didn't find them.
>>>>	Why?
>>>>
>>>
>>> No idea. I don't use blktrace much, but I Cc'd those that understand it
>>> better than I.
>
>Works for me:
>
>[...]
>
>
>  8,00   26 0.009147735   664  A  WS 315226143 + 8 <- (8,7) 19406344
>  8,00   27 0.009148677   664  Q  WS 315226143 + 8 [btrfs-submit-1]
>  8,00   28 0.009152967   664  G  WS 315226143 + 8 [btrfs-submit-1]
>  8,00   29 0.009154242   664  P   N [btrfs-submit-1]
>  8,00   30 0.009155538   664  A  WS 315226151 + 8 <- (8,7) 19406352
>  8,00   31 0.009155743   664  Q  WS 315226151 + 8 [btrfs-submit-1]
>  8,00   32 0.009157086   664  M  WS 315226151 + 8 [btrfs-submit-1]
>  8,00   33 0.009158716   664  I  WS 315226143 + 16 [btrfs-submit-1]
>
>That's from a quick trace of /dev/sda. I started blktrace, then did:
>
>$ dd if=/dev/zero of=foo bs=4k count=128 && sync
>
>to ensure that I knew merges would be happening. Output stats at the end:
>
>Total (sda):
> Reads Queued:       7,       44KiB  Writes Queued:     447,     7692KiB
> Read Dispatches:    7,       44KiB  Write Dispatches:  416,     7692KiB
> Reads Requeued:     0               Writes Requeued:     0
> Reads Completed:    7,       44KiB  Writes Completed:  435,     5864KiB
> Read Merges:        0,        0KiB  Write Merges:       23,      428KiB
> IO unplugs:        78               Timer unplugs:       0
>
>-- 
>Jens Axboe
>
First, thanks for your time!
If I understand correctly, the merge in your example is a bio merging with a
request, not a request merging with a request.
Yes or no?

Thanks!
Jianpeng


Why blktrace didn't trace requests merge?

2012-09-17 Thread Jianpeng Ma
Hi all:
	I used blktrace to trace some IO, but I can't find any request merges. I
searched the code and didn't find them.
	Why?

Thanks!
Jianpeng


Re: Re: [PATCH 2/3] block: Fix not tracing all device plug-operation.

2012-09-14 Thread Jianpeng Ma
On 2012-08-10 21:09 Jens Axboe  Wrote:
>On 08/10/2012 01:46 PM, Jianpeng Ma wrote:
>> If a process issues requests to two or more devices, the plug operations of
>> some of those devices will not be traced.
>> 
>> Signed-off-by: Jianpeng Ma 
>> ---
>>  block/blk-core.c |   16 +++-
>>  1 file changed, 15 insertions(+), 1 deletion(-)
>> 
>> diff --git a/block/blk-core.c b/block/blk-core.c
>> index 7a3abc6..034f186 100644
>> --- a/block/blk-core.c
>> +++ b/block/blk-core.c
>> @@ -1521,11 +1521,25 @@ get_rq:
>>  struct request *__rq;
>>  
>>  __rq = list_entry_rq(plug->list.prev);
>> -if (__rq->q != q)
>> +if (__rq->q != q) {
>>  plug->should_sort = 1;
>> +trace_block_plug(q);
>> +}
>> +} else {
>> +struct request *__rq;
>> +list_for_each_entry_reverse(__rq, &plug->list,
>> +queuelist) {
>> +if (__rq->q == q) {
>> +list_add_tail(&req->queuelist,
>> +&__rq->queuelist);
>> +goto stat_acct;
>
>Did you verify this? It doesn't look right to me. You browse the list in
>reverse, which means __rq is the first one that has a matching q. Then
>you add the new req IN FRONT of that. You would want list_add() here
>instead, adding it as the last member of that q string, not in the
>middle.
>
>-- 
>Jens Axboe
>
Hi all:
What about these patches? Are they OK or not?
Thanks!
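
To make the list_add()/list_add_tail() distinction concrete, here is a small
userspace re-implementation of the two primitives (my own sketch; only the
insertion semantics are meant to match include/linux/list.h):

#include <stdio.h>

struct list_head { struct list_head *next, *prev; };

static void __list_add(struct list_head *n,
		       struct list_head *prev, struct list_head *next)
{
	next->prev = n;
	n->next = next;
	n->prev = prev;
	prev->next = n;
}

/* list_add(n, pos): insert n right AFTER pos */
static void list_add(struct list_head *n, struct list_head *pos)
{
	__list_add(n, pos, pos->next);
}

/* list_add_tail(n, pos): insert n right BEFORE pos */
static void list_add_tail(struct list_head *n, struct list_head *pos)
{
	__list_add(n, pos->prev, pos);
}

struct req { struct list_head node; const char *name; };

static void show(const char *tag, struct list_head *head)
{
	printf("%s:", tag);
	for (struct list_head *p = head->next; p != head; p = p->next)
		printf(" %s", ((struct req *)p)->name);
	printf("\n");
}

int main(void)
{
	struct list_head plug = { &plug, &plug };
	struct req a1 = { .name = "A1" }, a2 = { .name = "A2" },
		   b1 = { .name = "B1" }, new = { .name = "Anew" };

	/* existing plug list: A1 A2 B1; __rq (last request whose queue
	 * matches the new one) is A2 */
	list_add_tail(&a1.node, &plug);
	list_add_tail(&a2.node, &plug);
	list_add_tail(&b1.node, &plug);

	list_add(&new.node, &a2.node);		/* V1: after __rq */
	show("with list_add     ", &plug);	/* A1 A2 Anew B1 (in order) */

	/* unlink, then try the V0 way */
	new.node.prev->next = new.node.next;
	new.node.next->prev = new.node.prev;
	list_add_tail(&new.node, &a2.node);	/* V0: before __rq */
	show("with list_add_tail", &plug);	/* A1 Anew A2 B1 (middle!) */
	return 0;
}

With list_add() the new request becomes the last member of its queue's run,
which is what the V1 patch below switches to.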

Re: Re: About function __create_file in debugfs

2012-09-10 Thread Jianpeng Ma
On 2012-09-08 23:25 gregkh  Wrote:
>On Sat, Sep 08, 2012 at 05:41:05PM +0800, Jianpeng Ma wrote:
>> Hi:
>>	At present, I use blktrace to trace block IO, but I always hit errors,
>> with messages like:
>> >BLKTRACESETUP(2) /dev/sdc failed: 2/No such file or directory
>> >Thread 0 failed open /sys/kernel/debug/block/(null)/trace0: 2/No such file 
>> >or directory
>> >Thread 2 failed open /sys/kernel/debug/block/(null)/trace2: 2/No such file 
>> >or directory
>> >Thread 3 failed open /sys/kernel/debug/block/(null)/trace3: 2/No such file 
>> >or directory
>> 
>> >Thread 1 failed open /sys/kernel/debug/block/(null)/trace1: 2/No such file 
>> >or directory
>> >FAILED to start thread on CPU 0: 1/Operation not permitted
>> >FAILED to start thread on CPU 1: 1/Operation not permitted
>> >FAILED to start thread on CPU 2: 1/Operation not permitted
>> >FAILED to start thread on CPU 3: 1/Operation not permitted
>> 
>> But that isn't the important part. I added some debug messages in the kernel
>> and found the reason: the inode already existed.
>> But __create_file doesn't return the actual errno, so the blktrace tool
>> can't print a correct error message.
>> I think __create_file should return the real error (ERR_PTR(error)), not
>> NULL.
>
>Patches are always welcome :)
>
>greg k-h

Thanks, the patch is below.
debugfs_create_symlink/debugfs_create_file/debugfs_create_dir only return
NULL when an error is encountered. We should return the actual error
information instead.

Signed-off-by: Jianpeng Ma 
---
 fs/debugfs/inode.c |   10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/fs/debugfs/inode.c b/fs/debugfs/inode.c
index 4733eab..1e94350 100644
--- a/fs/debugfs/inode.c
+++ b/fs/debugfs/inode.c
@@ -302,8 +302,10 @@ struct dentry *__create_file(const char *name, umode_t mode,
 
error = simple_pin_fs(&debug_fs_type, &debugfs_mount,
  &debugfs_mount_count);
-   if (error)
+   if (error) {
+   dentry = ERR_PTR(error);
goto exit;
+   }
 
/* If the parent is not specified, we create it in the root.
 * We need the root dentry to do this, which is in the super 
@@ -337,7 +339,7 @@ struct dentry *__create_file(const char *name, umode_t mode,
mutex_unlock(&parent->d_inode->i_mutex);
 
if (error) {
-   dentry = NULL;
+   dentry = ERR_PTR(error);
simple_release_fs(&debugfs_mount, &debugfs_mount_count);
}
 exit:
@@ -442,10 +444,10 @@ struct dentry *debugfs_create_symlink(const char *name, struct dentry *parent,
 
link = kstrdup(target, GFP_KERNEL);
if (!link)
-   return NULL;
+   return ERR_PTR(-ENOMEM);
 
result = __create_file(name, S_IFLNK | S_IRWXUGO, parent, link, NULL);
-   if (!result)
+   if (IS_ERR_OR_NULL(result))
kfree(link);
return result;
 }
-- 
1.7.9.5


But I searched the kernel code and found at least 100 places that use these
functions this way. What about those call sites? Should I fix them as well?
Thanks!
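
For illustration, a hypothetical call site (the driver name "mydrv" and the
function are made up) showing what every caller has to do once these helpers
return ERR_PTR() instead of NULL:

#include <linux/debugfs.h>
#include <linux/err.h>

static int mydrv_debugfs_init(void)
{
	struct dentry *dir;

	dir = debugfs_create_dir("mydrv", NULL);
	if (IS_ERR(dir))		/* after such a patch */
		return PTR_ERR(dir);	/* the real errno, e.g. -EEXIST */

	/*
	 * Before the change, the only possible check was "if (!dir)",
	 * and the real cause of the failure was lost.
	 */
	return 0;
}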


About function __create_file in debugfs

2012-09-08 Thread Jianpeng Ma
Hi:
	At present, I use blktrace to trace block IO, but I always hit errors,
with messages like:
>BLKTRACESETUP(2) /dev/sdc failed: 2/No such file or directory
>Thread 0 failed open /sys/kernel/debug/block/(null)/trace0: 2/No such file or 
>directory
>Thread 2 failed open /sys/kernel/debug/block/(null)/trace2: 2/No such file or 
>directory
>Thread 3 failed open /sys/kernel/debug/block/(null)/trace3: 2/No such file or 
>directory

>Thread 1 failed open /sys/kernel/debug/block/(null)/trace1: 2/No such file or 
>directory
>FAILED to start thread on CPU 0: 1/Operation not permitted
>FAILED to start thread on CPU 1: 1/Operation not permitted
>FAILED to start thread on CPU 2: 1/Operation not permitted
>FAILED to start thread on CPU 3: 1/Operation not permitted

But that isn't the important part. I added some debug messages in the kernel
and found the reason: the inode already existed.
But __create_file doesn't return the actual errno, so the blktrace tool can't
print a correct error message.
I think __create_file should return the real error (ERR_PTR(error)), not
NULL.




About multiple queries to control using $DBGMT/dynamic_debug/control

2012-08-22 Thread Jianpeng Ma
Hi,
	I use $DBGMT/dynamic_debug/control to control debug printing. But when
I write multiple queries and at least one of them doesn't match, the write
still succeeds and no error is returned.
	So I believed my queries were correct, and only discovered the error
later through dmesg.
	And if the verbose parameter of dynamic_debug is zero, I can't find the
result in dmesg at all.
	So I think this is inconvenient: if a query doesn't match anything, the
write should return an error.

Thanks!


Re: Re: [PATCH] block: Don't use static to define "void *p" in show_partition_start().

2012-08-12 Thread Jianpeng Ma
On 2012-08-12 23:45 Michael Tokarev  Wrote:
>On 03.08.2012 12:41, Jens Axboe wrote:
>> On 08/03/2012 07:07 AM, majianpeng wrote:
>[]
>>> diff --git a/block/genhd.c b/block/genhd.c
>>> index cac7366..d839723 100644
>>> --- a/block/genhd.c
>>> +++ b/block/genhd.c
>>> @@ -835,7 +835,7 @@ static void disk_seqf_stop(struct seq_file *seqf, void *v)
>>>  
>>>  static void *show_partition_start(struct seq_file *seqf, loff_t *pos)
>>>  {
>>> -   static void *p;
>>> +   void *p;
>>>  
>>> p = disk_seqf_start(seqf, pos);
>>> if (!IS_ERR_OR_NULL(p) && !*pos)
>> 
>> Huh, that looks like a clear bug. I've applied it, thanks.
>
>It also looks like a -stable material, don't you think?
>
>Thanks,
>
>/mjt
>
Yes, all kernels before this patch have this problem and should apply it.
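
A toy userspace demonstration (mine, not the kernel code) of why the static
local was a bug: a static is a single slot shared by all concurrent readers of
/proc/partitions, so one reader's assignment clobbers the pointer another
reader is still using:

/* build: cc -pthread demo.c */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

/* models "static void *p" in show_partition_start() */
static void *seq_start_buggy(void *cookie)
{
	static void *p;		/* BUG: shared across all readers */

	p = cookie;		/* models p = disk_seqf_start(...) */
	usleep(100 * 1000);	/* window in which another reader runs */
	return p;		/* may no longer be our cookie */
}

static void *reader(void *cookie)
{
	void *got = seq_start_buggy(cookie);

	printf("reader %p got %p%s\n", cookie, got,
	       got == cookie ? "" : "  <- clobbered by the other reader");
	return NULL;
}

int main(void)
{
	pthread_t a, b;
	int ca, cb;

	pthread_create(&a, NULL, reader, &ca);
	usleep(10 * 1000);	/* let reader A assign p first */
	pthread_create(&b, NULL, reader, &cb);
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	return 0;
}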


[PATCH 2/3 V1] block: Fix not tracing all device plug-operation.

2012-08-10 Thread Jianpeng Ma
If a process issues requests to two or more devices, the plug operations of
some of those devices will not be traced.

V0-->V1:
Fix a bug when inserting a request into a plug list that already contains a
request for the same request_queue: it should use list_add, not
list_add_tail.

Signed-off-by: Jianpeng Ma 
Signed-off-by: Jens Axboe 
---
 block/blk-core.c |   16 +++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 7a3abc6..034f186 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1521,11 +1521,25 @@ get_rq:
struct request *__rq;
 
__rq = list_entry_rq(plug->list.prev);
-   if (__rq->q != q)
+   if (__rq->q != q) {
plug->should_sort = 1;
+   trace_block_plug(q);
+   }
+   } else {
+   struct request *__rq;
+   list_for_each_entry_reverse(__rq, &plug->list,
+   queuelist) {
+   if (__rq->q == q) {
+   list_add(&req->queuelist,
+   &__rq->queuelist);
+   goto stat_acct;
+   }
+   }
+   trace_block_plug(q);
}
}
list_add_tail(&req->queuelist, &plug->list);
+stat_acct:
drive_stat_acct(req, 1);
} else {
spin_lock_irq(q->queue_lock);
-- 
1.7.9.5

[RFC PATCH] fs/direct-io.c: Add REQ_NOIDLE for last bio .

2012-08-10 Thread Jianpeng Ma
After the last bio of a dio, no more bios will come, so set REQ_NOIDLE on it.

Signed-off-by: Jianpeng Ma 
---
 fs/direct-io.c |   15 ---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/fs/direct-io.c b/fs/direct-io.c
index 1faf4cb..7c6958f 100644
--- a/fs/direct-io.c
+++ b/fs/direct-io.c
@@ -127,6 +127,7 @@ struct dio {
int page_errors;/* errno from get_user_pages() */
int is_async;   /* is IO async ? */
int io_error;   /* IO error in completion path */
+   sector_t end_sector;/* the last sector for this dio */
unsigned long refcount; /* direct_io_worker() and bios */
struct bio *bio_list;   /* singly linked via bi_private */
struct task_struct *waiter; /* waiting task (NULL if none) */
@@ -369,21 +370,28 @@ static inline void dio_bio_submit(struct dio *dio, struct dio_submit *sdio)
 {
struct bio *bio = sdio->bio;
unsigned long flags;
-
+   int rw = dio->rw;
bio->bi_private = dio;
 
spin_lock_irqsave(&dio->bio_lock, flags);
dio->refcount++;
spin_unlock_irqrestore(&dio->bio_lock, flags);
 
+   /*
+    * If this bio is the last one of the dio, no more bios can arrive
+    * at the low level until this dio completes.
+    */
+   if (bio->bi_sector + bio_sectors(bio) >= dio->end_sector)
+   rw |= REQ_NOIDLE;
+
if (dio->is_async && dio->rw == READ)
bio_set_pages_dirty(bio);
 
if (sdio->submit_io)
-   sdio->submit_io(dio->rw, bio, dio->inode,
+   sdio->submit_io(rw, bio, dio->inode,
   sdio->logical_offset_in_bio);
else
-   submit_bio(dio->rw, bio);
+   submit_bio(rw, bio);
 
sdio->bio = NULL;
sdio->boundary = 0;
@@ -1147,6 +1155,7 @@ do_blockdev_direct_IO(int rw, struct kiocb *iocb, struct inode *inode,
 
dio->inode = inode;
dio->rw = rw;
+   dio->end_sector = end >> 9;
sdio.blkbits = blkbits;
sdio.blkfactor = inode->i_blkbits - blkbits;
sdio.block_in_file = offset >> blkbits;
-- 
1.7.9.5
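
A toy walk-through (my own snippet, with made-up sizes) of the last-bio test
above: a dio covering bytes [0, 1 MiB) has end_sector = end >> 9 = 2048, and
only the bio whose end reaches that sector would get REQ_NOIDLE:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
	uint64_t end = 1 << 20;		/* byte offset just past the dio */
	uint64_t end_sector = end >> 9;	/* 512-byte sectors: 2048 */

	/* two 512 KiB bios submitted for the dio */
	struct { uint64_t bi_sector; unsigned int sectors; } bios[] = {
		{ 0,    1024 },		/* covers sectors [0, 1024)    */
		{ 1024, 1024 },		/* covers sectors [1024, 2048) */
	};

	for (int i = 0; i < 2; i++) {
		int last = bios[i].bi_sector + bios[i].sectors >= end_sector;

		printf("bio %d: last=%d%s\n", i, last,
		       last ? " -> would get REQ_NOIDLE" : "");
	}
	return 0;
}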


[PATCH 3/3] block: Remove unnecessary requests sort.

2012-08-10 Thread Jianpeng Ma
Requests are now kept sorted as they are added to the plug list, so the sort
at flush time is unnecessary.

Signed-off-by: Jianpeng Ma 
---
 block/blk-core.c |   12 
 1 file changed, 12 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 034f186..9dbdef6 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -2879,13 +2879,6 @@ void blk_start_plug(struct blk_plug *plug)
 }
 EXPORT_SYMBOL(blk_start_plug);
 
-static int plug_rq_cmp(void *priv, struct list_head *a, struct list_head *b)
-{
-   struct request *rqa = container_of(a, struct request, queuelist);
-   struct request *rqb = container_of(b, struct request, queuelist);
-
-   return !(rqa->q <= rqb->q);
-}
 
 /*
  * If 'from_schedule' is true, then postpone the dispatch of requests
@@ -2980,11 +2973,6 @@ void blk_flush_plug_list(struct blk_plug *plug, bool from_schedule)
 
list_splice_init(&plug->list, &list);
 
-   if (plug->should_sort) {
-   list_sort(NULL, &list, plug_rq_cmp);
-   plug->should_sort = 0;
-   }
-
q = NULL;
depth = 0;
 
-- 
1.7.9.5


[PATCH 2/3] block: Fix not tracing all device plug-operation.

2012-08-10 Thread Jianpeng Ma
If a process issues requests to two or more devices, the plug operations of
some of those devices will not be traced.

Signed-off-by: Jianpeng Ma 
---
 block/blk-core.c |   16 +++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 7a3abc6..034f186 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1521,11 +1521,25 @@ get_rq:
struct request *__rq;
 
__rq = list_entry_rq(plug->list.prev);
-   if (__rq->q != q)
+   if (__rq->q != q) {
plug->should_sort = 1;
+   trace_block_plug(q);
+   }
+   } else {
+   struct request *__rq;
+   list_for_each_entry_reverse(__rq, &plug->list,
+   queuelist) {
+   if (__rq->q == q) {
+   list_add_tail(&req->queuelist,
+   &__rq->queuelist);
+   goto stat_acct;
+   }
+   }
+   trace_block_plug(q);
}
}
list_add_tail(&req->queuelist, &plug->list);
+stat_acct:
drive_stat_acct(req, 1);
} else {
spin_lock_irq(q->queue_lock);
-- 
1.7.9.5


[PATCH 1/3] block: avoid unnecessary plug should_sort test.

2012-08-10 Thread Jianpeng Ma
If request_count >= BLK_MAX_REQUEST_COUNT, blk_flush_plug_list() is executed,
which flushes all plugged requests, so there is no need to do the
plug->should_sort test.

Signed-off-by: Jianpeng Ma 
---
 block/blk-core.c |9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 4b4dbdf..7a3abc6 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1514,17 +1514,16 @@ get_rq:
if (list_empty(&plug->list))
trace_block_plug(q);
else {
-   if (!plug->should_sort) {
+   if (request_count >= BLK_MAX_REQUEST_COUNT) {
+   blk_flush_plug_list(plug, false);
+   trace_block_plug(q);
+   } else if (!plug->should_sort) {
struct request *__rq;
 
__rq = list_entry_rq(plug->list.prev);
if (__rq->q != q)
plug->should_sort = 1;
}
-   if (request_count >= BLK_MAX_REQUEST_COUNT) {
-   blk_flush_plug_list(plug, false);
-   trace_block_plug(q);
-   }
}
list_add_tail(&req->queuelist, &plug->list);
drive_stat_acct(req, 1);
-- 
1.7.9.5

[PATCH 0/3] Fix problems about handling bio to plug when bio merged failed.

2012-08-10 Thread Jianpeng Ma
There are some problems in how a bio is handled when merging it into the plug
list fails.
Patch 1 avoids an unnecessary plug should_sort test, although it's not a bug.
Patch 2 fixes a bug in handling more than one device: the plug operations of
some devices were never traced.

Because of patch 2, it's no longer necessary to sort when flushing the plug.
Although patch 2 has O(n*n) complexity, which is worse than list_sort's
O(n*log(n)), the plug list is unlikely to be long, so I think patch 3 is
acceptable.


Jianpeng Ma (3):
  block: avoid unnecessary plug should_sort test.
  block: Fix not tracing all device plug-operation.
  block: Remove unnecessary requests sort.

 block/blk-core.c |   35 ++-
 1 file changed, 18 insertions(+), 17 deletions(-)

-- 
1.7.9.5


Re: Re: [RFC PATCH] block:Fix some problems about handling plug in blk_queue_bio().

2012-08-07 Thread Jianpeng Ma
On 2012-08-08 11:06 Shaohua Li  Wrote:
>2012/8/8 Jianpeng Ma :
>> I think there are three problems in how blk_queue_bio() handles the plug:
>> 1: if request_count >= BLK_MAX_REQUEST_COUNT, avoid the unnecessary
>> plug->should_sort test.
>this makes sense, though not a big deal, nice to fix it.
Thanks
>
>> 2: the plug is only traced at two points, so some devices' plug operations
>> are missed.
>I didn't get the point, can you give more details?

>>if (plug) {
>>  /*
>>   * If this is the first request added after a plug, fire
>>   * of a plug trace. If others have been added before, check
>>   * if we have multiple devices in this plug. If so, make a
>>   * note to sort the list before dispatch.
>>   */
>>  if (list_empty(&plug->list))
>>  trace_block_plug(q);
>>  else {
>>  if (!plug->should_sort) {
>>  struct request *__rq;

>>  __rq = list_entry_rq(plug->list.prev);
>>  if (__rq->q != q)
>>  plug->should_sort = 1;
>>  }
>>  if (request_count >= BLK_MAX_REQUEST_COUNT) {
>>  blk_flush_plug_list(plug, false);
>>  trace_block_plug(q);
The code only calls trace_block_plug() at two points:
A: when list_empty(&plug->list) is true.
B: when request_count >= BLK_MAX_REQUEST_COUNT, which is effectively the same
as A because the plug list has just been flushed empty.
Suppose:
1: reqA for deviceA comes first; trace_block_plug is called because
list_empty(&plug->list) is true.
2: reqB for deviceB comes next; attempt_plug_merge fails because the list holds
no request for deviceB's request_queue, but trace_block_plug is not called.

Yet blk_flush_plug_list will call trace_block_unplug for every request_queue.
>
>> 3: blk_flush_plug_list() uses list_sort, which has O(n*log(n)) complexity;
>> sorting at insertion time costs only O(n) per insert.
>but now you iterate the list for every request, so it's O(n*n)?
>The plug list is unlikely to be long, so I didn't worry about the time
>spent on the list sort.
Sorry, it's my fault.
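
A toy simulation (my own sketch) of the branch structure quoted above for the
two-device case, showing that reqB's queue is plugged without ever being
traced:

#include <stdio.h>

#define MAXREQ 16

int main(void)
{
	int plug_list[MAXREQ], nreq = 0;
	int incoming[2] = { 'A', 'B' };	/* reqA->deviceA, reqB->deviceB */

	for (int i = 0; i < 2; i++) {
		int q = incoming[i];

		if (nreq == 0)
			printf("req for dev%c: trace_block_plug(dev%c)\n", q, q);
		else if (plug_list[nreq - 1] != q)
			printf("req for dev%c: should_sort = 1, no plug trace\n", q);
		plug_list[nreq++] = q;
	}
	return 0;
}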

[RFC PATCH] block:Fix some problems about handling plug in blk_queue_bio().

2012-08-07 Thread Jianpeng Ma
I think there are three problems in how blk_queue_bio() handles the plug:
1: if request_count >= BLK_MAX_REQUEST_COUNT, avoid the unnecessary
plug->should_sort test.
2: the plug is only traced at two points, so some devices' plug operations
are missed.
3: blk_flush_plug_list() uses list_sort, which has O(n*log(n)) complexity;
sorting at insertion time costs only O(n) per insert.

Signed-off-by: Jianpeng Ma 
---
 block/blk-core.c |   32 +++-
 1 file changed, 15 insertions(+), 17 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 4b4dbdf..e7759f8 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1514,20 +1514,31 @@ get_rq:
if (list_empty(&plug->list))
trace_block_plug(q);
else {
-   if (!plug->should_sort) {
+   if (request_count >= BLK_MAX_REQUEST_COUNT) {
+   blk_flush_plug_list(plug, false);
+   trace_block_plug(q);
+   } else  if (!plug->should_sort) {
struct request *__rq;
 
__rq = list_entry_rq(plug->list.prev);
if (__rq->q != q)
plug->should_sort = 1;
-   }
-   if (request_count >= BLK_MAX_REQUEST_COUNT) {
-   blk_flush_plug_list(plug, false);
+   } else  {
+   struct request *rq;
+
+   list_for_each_entry_reverse(rq, &plug->list, 
queuelist) {
+   if (rq->q == q) {
+   list_add(&req->queuelist, 
&rq->queuelist);
+   goto stat_acct;
+   }
+   }
trace_block_plug(q);
}
}
list_add_tail(&req->queuelist, &plug->list);
+stat_acct:
drive_stat_acct(req, 1);
+
} else {
spin_lock_irq(q->queue_lock);
add_acct_request(q, req, where);
@@ -2866,14 +2877,6 @@ void blk_start_plug(struct blk_plug *plug)
 }
 EXPORT_SYMBOL(blk_start_plug);
 
-static int plug_rq_cmp(void *priv, struct list_head *a, struct list_head *b)
-{
-   struct request *rqa = container_of(a, struct request, queuelist);
-   struct request *rqb = container_of(b, struct request, queuelist);
-
-   return !(rqa->q <= rqb->q);
-}
-
 /*
  * If 'from_schedule' is true, then postpone the dispatch of requests
  * until a safe kblockd context. We due this to avoid accidental big
@@ -2967,11 +2970,6 @@ void blk_flush_plug_list(struct blk_plug *plug, bool from_schedule)
 
list_splice_init(&plug->list, &list);
 
-   if (plug->should_sort) {
-   list_sort(NULL, &list, plug_rq_cmp);
-   plug->should_sort = 0;
-   }
-
q = NULL;
depth = 0;
 
-- 
1.7.9.5

