Re: [PATCH][RFC] nvdimm: pmem: always flush nvdimm for write request

Li,Rongqing Sun, 14 Apr 2019 20:27:05 -0700


> -----Original Message-----
> From: Elliott, Robert (Servers) [mailto:[email protected]]
> Sent: April 14, 2019 11:21
> To: Li,Rongqing <[email protected]>; Dan Williams
> <[email protected]>
> Cc: linux-nvdimm <[email protected]>
> Subject: RE: [PATCH][RFC] nvdimm: pmem: always flush nvdimm for write request
> 
> 
> 
> >> @@ -215,7 +216,7 @@ static blk_qc_t pmem_make_request(struct request_queue *q, struct bio *bio)
> >>    if (do_acct)
> >>            nd_iostat_end(bio, start);
> >>
> >> -  if (bio->bi_opf & REQ_FUA)
> >> +  if (bio->bi_opf & REQ_FUA || op_is_write(op))
> >>            nvdimm_flush(nd_region);
> ...
> >> Before:
> >> Jobs: 32 (f=32): [W(32)][14.2%][w=1884MiB/s][w=482k IOPS][eta 01m:43s]
> >> After:
> >> Jobs: 32 (f=32): [W(32)][8.3%][w=2378MiB/s][w=609k IOPS][eta 01m:50s]
> >>
> >> -RongQing
> 
> 
> Doing more work cannot be faster than doing less work, so something else
> must be happening here.
>


Dan Williams may know more.
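
To make the quoted hunk easier to discuss, here is a small user-space model of the flush condition before and after the patch. It is only a sketch: the REQ_FUA bit position and the width of the operation mask are placeholders chosen for illustration, not values taken from the kernel headers, and the helper names flush_before_patch/flush_after_patch are mine.

```python
# User-space model of the flush condition in pmem_make_request().
# REQ_FUA's bit position and REQ_OP_MASK's width are illustrative
# assumptions, not values copied from the kernel headers.
REQ_OP_READ = 0
REQ_OP_WRITE = 1
REQ_OP_MASK = 0xFF        # assumed: low bits of bi_opf carry the operation
REQ_FUA = 1 << 17         # hypothetical bit position


def op_is_write(op: int) -> bool:
    """Model of the kernel helper: odd-numbered operations are writes."""
    return bool(op & 1)


def flush_before_patch(bi_opf: int) -> bool:
    """Original condition: flush only when the bio carries REQ_FUA."""
    return bool(bi_opf & REQ_FUA)


def flush_after_patch(bi_opf: int) -> bool:
    """Patched condition: flush on REQ_FUA or on any write operation."""
    op = bi_opf & REQ_OP_MASK
    return bool(bi_opf & REQ_FUA) or op_is_write(op)


if __name__ == "__main__":
    cases = {
        "read": REQ_OP_READ,
        "write": REQ_OP_WRITE,
        "write+FUA": REQ_OP_WRITE | REQ_FUA,
    }
    for name, opf in cases.items():
        print(name, flush_before_patch(opf), flush_after_patch(opf))
```

In this model a plain write triggers nvdimm_flush() only after the patch, which is the extra work Robert's comment refers to.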


> Please post the full fio job file and how you invoke it (i.e., with numactl).
> 
The fio job file is below. We bind fio to a set of CPUs and a NUMA node:

numactl --membind=0 taskset -c 2-24 ./fio test_io_raw

[global]
numjobs=32
direct=1
filename=/dev/pmem0.1
iodepth=32
ioengine=libaio
group_reporting=1
bs=4K
time_based=1
 
[write1]
rw=randwrite
runtime=60
stonewall
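
As a rough reading aid for the job file above, the sketch below parses it and derives the upper bound on outstanding I/Os (numjobs x iodepth) and the block size in bytes. It assumes the job text is INI-compatible enough for Python's configparser and that "4K" means 4096 bytes (fio's default 1024 base); the summarize() helper is hypothetical and not part of fio.

```python
import configparser

# The job file from this mail, reproduced as a string.
JOB = """\
[global]
numjobs=32
direct=1
filename=/dev/pmem0.1
iodepth=32
ioengine=libaio
group_reporting=1
bs=4K
time_based=1

[write1]
rw=randwrite
runtime=60
stonewall
"""


def summarize(job_text: str) -> dict:
    """Derive queue-depth and block-size figures from the [global] section."""
    # allow_no_value lets bare keywords such as "stonewall" parse cleanly.
    cp = configparser.ConfigParser(allow_no_value=True, interpolation=None)
    cp.read_string(job_text)
    g = cp["global"]
    numjobs = int(g["numjobs"])
    iodepth = int(g["iodepth"])
    bs = g["bs"]
    # Assumption: a trailing K means KiB (1024 bytes).
    block_bytes = int(bs[:-1]) * 1024 if bs.upper().endswith("K") else int(bs)
    max_in_flight = numjobs * iodepth
    return {
        "max_in_flight": max_in_flight,
        "block_bytes": block_bytes,
        "bytes_in_flight": max_in_flight * block_bytes,
    }


if __name__ == "__main__":
    print(summarize(JOB))
```

The in-flight figure is only an upper bound; how many requests are actually outstanding depends on how quickly the pmem driver completes them.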

> These tools help show what is happening on the CPUs and memory channels:
>     perf top

 62.40%  [kernel]            [k] memcpy_flushcache
 21.17%  [kernel]            [k] fput
  6.12%  [kernel]            [k] apic_timer_interrupt
  0.89%  [kernel]            [k] rq_qos_done_bio
  0.66%  [kernel]            [k] bio_endio
  0.44%  [kernel]            [k] aio_complete_rw
  0.39%  [kernel]            [k] blkdev_bio_end_io
  0.31%  [kernel]            [k] entry_SYSCALL_64
  0.26%  [kernel]            [k] bio_disassociate_task
  0.23%  [kernel]            [k] read_tsc
  0.21%  fio                 [.] axmap_isset
  0.20%  [kernel]            [k] ktime_get_raw_ts64
  0.19%  [vdso]              [.] 0x7ffc475e2b30
  0.18%  [kernel]            [k] gup_pgd_range
  0.18%  [kernel]            [k] entry_SYSCALL_64_after_hwframe
  0.16%  [kernel]            [k] __audit_syscall_exit
  0.16%  [kernel]            [k] __x86_indirect_thunk_rax
  0.13%  [kernel]            [k] copy_user_enhanced_fast_string
  0.13%  [kernel]            [k] syscall_return_via_sysret
  0.12%  [kernel]            [k] preempt_count_add
  0.12%  [kernel]            [k] preempt_count_sub
  0.11%  [kernel]            [k] __x64_sys_clock_gettime
  0.11%  [kernel]            [k] tracer_hardirqs_off
  0.10%  [kernel]            [k] native_write_msr
  0.10%  [kernel]            [k] posix_get_monotonic_raw
  0.10%  fio                 [.] get_io_u

>     pcm.x

http://pasted.co/6fc93b42

>     pcm-memory.x -pmm
> 
http://pasted.co/d5c0c96b


If fio is not bound to a CPU set and NUMA node, performance is much lower, but
this optimization helps in both cases, sometimes giving about a 40%
improvement.
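
For reference, the relative gain implied by the two fio progress lines quoted earlier in this thread (1884 to 2378 MiB/s, 482k to 609k IOPS) can be worked out as below. Those lines are in-progress snapshots of the bound run rather than final fio summaries, so treat the result as indicative only; relative_gain() is just an illustrative helper.

```python
def relative_gain(before: float, after: float) -> float:
    """Percentage change from before to after."""
    return (after - before) / before * 100.0


# Figures copied from the quoted fio progress lines.
BW_BEFORE_MIB_S, BW_AFTER_MIB_S = 1884, 2378
IOPS_BEFORE_K, IOPS_AFTER_K = 482, 609

bw_gain = relative_gain(BW_BEFORE_MIB_S, BW_AFTER_MIB_S)
iops_gain = relative_gain(IOPS_BEFORE_K, IOPS_AFTER_K)

if __name__ == "__main__":
    print(f"bandwidth gain: {bw_gain:.1f}%")
    print(f"IOPS gain:      {iops_gain:.1f}%")
```

Both figures come out at roughly 26% for that particular pair of snapshots.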

-Li
_______________________________________________
Linux-nvdimm mailing list
[email protected]
https://lists.01.org/mailman/listinfo/linux-nvdimm
