> -----Original Message-----
> From: J. Bruce Fields [mailto:[email protected]]
> Sent: Wednesday, April 24, 2013 00:06
> To: Yan Burman
> Cc: Wendy Cheng; Atchley, Scott; Tom Tucker; [email protected];
> [email protected]; Or Gerlitz
> Subject: Re: NFS over RDMA benchmark
>
> On Thu, Apr 18, 2013 at 12:47:09PM +0000, Yan Burman wrote:
> >
> >
> > > -----Original Message-----
> > > From: Wendy Cheng [mailto:[email protected]]
> > > Sent: Wednesday, April 17, 2013 21:06
> > > To: Atchley, Scott
> > > Cc: Yan Burman; J. Bruce Fields; Tom Tucker;
> > > [email protected]; [email protected]
> > > Subject: Re: NFS over RDMA benchmark
> > >
> > > On Wed, Apr 17, 2013 at 10:32 AM, Atchley, Scott <[email protected]> wrote:
> > > > On Apr 17, 2013, at 1:15 PM, Wendy Cheng <[email protected]> wrote:
> > > >
> > > >> On Wed, Apr 17, 2013 at 7:36 AM, Yan Burman <[email protected]> wrote:
> > > >>> Hi.
> > > >>>
> > > >>> I've been trying to do some benchmarks for NFS over RDMA and I seem
> > > >>> to only get about half of the bandwidth that the HW can give me.
> > > >>> My setup consists of 2 servers, each with 16 cores, 32GB of memory,
> > > >>> and a Mellanox ConnectX3 QDR card over PCIe gen3.
> > > >>> These servers are connected to a QDR IB switch. The backing storage
> > > >>> on the server is tmpfs mounted with noatime.
> > > >>> I am running kernel 3.5.7.
> > > >>>
> > > >>> When running ib_send_bw, I get 4.3-4.5 GB/sec for block sizes 4-512K.
> > > >>> When I run fio over RDMA-mounted NFS, I get 260-2200 MB/sec for the
> > > >>> same block sizes (4-512K). Running over IPoIB-CM, I get 200-980 MB/sec.
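
For completeness, the exact mount command didn't make it into the thread; the client side of a setup like this is a standard NFS/RDMA mount, roughly along these lines (server name and export path below are placeholders):

    # placeholder names; 20049 is the conventional NFS/RDMA port
    # from the kernel's nfs-rdma documentation
    mount -t nfs -o rdma,port=20049 server:/export /mnt/nfs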
> > > >
> > > > Yan,
> > > >
> > > > Are you trying to optimize single-client performance or server
> > > > performance with multiple clients?
> > > >
> >
> > I am trying to get maximum performance from a single server - I used 2
> > processes in the fio test; more than 2 did not show any performance boost.
> > I tried running fio from 2 different PCs on 2 different files, but the sum
> > of the two is more or less the same as running from a single client PC.
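
For reference, the kind of two-process test described above can be reproduced with something like the following fio invocation (mount point, size, and I/O engine here are illustrative, not taken from the thread):

    # two concurrent sequential readers against the NFS/RDMA mount
    fio --name=nfs-rdma-read --directory=/mnt/nfs --rw=read \
        --bs=256k --size=4g --numjobs=2 --ioengine=libaio --direct=1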
> >
> > What I did see is that the server is sweating a lot more than the clients
> > and, more than that, it has one core (CPU5) at 100% in softirq tasklet
> > processing:
> > cat /proc/softirqs
>
> Would any profiling help figure out which code it's spending time in?
> (E.g., something as simple as "perf top" might have useful output.)
>
Running perf top on the CPU with the high tasklet count gives:

samples  pcnt  RIP               function                      DSO
_______  _____ ________________  ____________________________  _____________
2787.00  24.1% ffffffff81062a00  mutex_spin_on_owner           /root/vmlinux
 978.00   8.4% ffffffff810297f0  clflush_cache_range           /root/vmlinux
 445.00   3.8% ffffffff812ea440  __domain_mapping              /root/vmlinux
 441.00   3.8% 0000000000018c30  svc_recv                      /lib/modules/3.5.7-dbg/kernel/net/sunrpc/sunrpc.ko
 344.00   3.0% ffffffff813a1bc0  _raw_spin_lock_bh             /root/vmlinux
 333.00   2.9% ffffffff813a19e0  _raw_spin_lock_irqsave        /root/vmlinux
 288.00   2.5% ffffffff813a07d0  __schedule                    /root/vmlinux
 249.00   2.1% ffffffff811a87e0  rb_prev                       /root/vmlinux
 242.00   2.1% ffffffff813a19b0  _raw_spin_lock                /root/vmlinux
 184.00   1.6% 0000000000002e90  svc_rdma_sendto               /lib/modules/3.5.7-dbg/kernel/net/sunrpc/xprtrdma/svcrdma.ko
 177.00   1.5% ffffffff810ac820  get_page_from_freelist        /root/vmlinux
 174.00   1.5% ffffffff812e6da0  alloc_iova                    /root/vmlinux
 165.00   1.4% ffffffff810b1390  put_page                      /root/vmlinux
 148.00   1.3% 0000000000014760  sunrpc_cache_lookup           /lib/modules/3.5.7-dbg/kernel/net/sunrpc/sunrpc.ko
 128.00   1.1% 0000000000017f20  svc_xprt_enqueue              /lib/modules/3.5.7-dbg/kernel/net/sunrpc/sunrpc.ko
 126.00   1.1% ffffffff8139f820  __mutex_lock_slowpath         /root/vmlinux
 108.00   0.9% ffffffff811a81d0  rb_insert_color               /root/vmlinux
 107.00   0.9% 0000000000004690  svc_rdma_recvfrom             /lib/modules/3.5.7-dbg/kernel/net/sunrpc/xprtrdma/svcrdma.ko
 102.00   0.9% 0000000000002640  send_reply                    /lib/modules/3.5.7-dbg/kernel/net/sunrpc/xprtrdma/svcrdma.ko
  99.00   0.9% ffffffff810e6490  kmem_cache_alloc              /root/vmlinux
  96.00   0.8% ffffffff810e5840  __slab_alloc                  /root/vmlinux
  91.00   0.8% 0000000000006d30  mlx4_ib_post_send             /lib/modules/3.5.7-dbg/kernel/drivers/infiniband/hw/mlx4/mlx4_ib.ko
  88.00   0.8% 0000000000000dd0  svc_rdma_get_context          /lib/modules/3.5.7-dbg/kernel/net/sunrpc/xprtrdma/svcrdma.ko
  86.00   0.7% ffffffff813a1a10  _raw_spin_lock_irq            /root/vmlinux
  86.00   0.7% 0000000000001530  svc_rdma_send                 /lib/modules/3.5.7-dbg/kernel/net/sunrpc/xprtrdma/svcrdma.ko
  85.00   0.7% ffffffff81060a80  prepare_creds                 /root/vmlinux
  83.00   0.7% ffffffff810a5790  find_get_pages_contig         /root/vmlinux
  79.00   0.7% ffffffff810e4620  __slab_free                   /root/vmlinux
  79.00   0.7% ffffffff813a1a40  _raw_spin_unlock_irqrestore   /root/vmlinux
  77.00   0.7% ffffffff81065610  finish_task_switch            /root/vmlinux
  76.00   0.7% ffffffff812e9270  pfn_to_dma_pte                /root/vmlinux
  75.00   0.6% ffffffff810976d0  __call_rcu                    /root/vmlinux
  73.00   0.6% ffffffff811a2fa0  _atomic_dec_and_lock          /root/vmlinux
  73.00   0.6% 00000000000002e0  svc_rdma_has_wspace           /lib/modules/3.5.7-dbg/kernel/net/sunrpc/xprtrdma/svcrdma.ko
  67.00   0.6% ffffffff813a1a70  _raw_read_lock                /root/vmlinux
  65.00   0.6% 000000000000f590  svcauth_unix_set_client       /lib/modules/3.5.7-dbg/kernel/net/sunrpc/sunrpc.ko
  63.00   0.5% 00000000000180e0  svc_reserve                   /lib/modules/3.5.7-dbg/kernel/net/sunrpc/sunrpc.ko
  60.00   0.5% 00000000000064d0  stamp_send_wqe                /lib/modules/3.5.7-dbg/kernel/drivers/infiniband/hw/mlx4/mlx4_ib.ko
  57.00   0.5% ffffffff810ac110  free_hot_cold_page            /root/vmlinux
  57.00   0.5% ffffffff811ae540  memcpy                        /root/vmlinux
  56.00   0.5% ffffffff810ad1a0  __alloc_pages_nodemask        /root/vmlinux
  55.00   0.5% ffffffff81118200  splice_to_pipe                /root/vmlinux
  53.00   0.5% ffffffff810e3bc0  get_partial_node              /root/vmlinux
  49.00   0.4% ffffffff812eb840  __intel_map_single            /root/vmlinux
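
The profile points at lock contention (mutex_spin_on_owner at 24%, plus the _raw_spin_lock variants and __mutex_lock_slowpath) and at IOMMU DMA-mapping overhead (clflush_cache_range, __domain_mapping, alloc_iova, and __intel_map_single are all Intel IOMMU mapping work). To see who is contending on which mutex, a call-graph profile of just the hot CPU should help, e.g.:

    # sample CPU5 with call graphs for 30 seconds, then view the report
    perf record -g -C 5 -- sleep 30
    perf report --stdio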
> --b.
>
> >                   CPU0      CPU1      CPU2      CPU3      CPU4      CPU5      CPU6      CPU7
> >           HI:        0         0         0         0         0         0         0         0
> >        TIMER:   418767     46596     43515     44547     50099     34815     40634     40337
> >       NET_TX:    28719       309      1421      1294      1730      1243       832       937
> >       NET_RX:   612070        19        22        21         6       235         3         2
> >        BLOCK:     5941         0         0         0         0         0         0         0
> > BLOCK_IOPOLL:        0         0         0         0         0         0         0         0
> >      TASKLET:       28         1         1         1         1   1540653         1         1
> >        SCHED:   364965     26547     16807     18403     22919      8678     14358     14091
> >      HRTIMER:       13         0         1         1         0         0         0         0
> >          RCU:   945823    841546    715281    892762    823564     42663    863063    841622
> >
> >                   CPU8      CPU9     CPU10     CPU11     CPU12     CPU13     CPU14     CPU15
> >           HI:        0         0         0         0         0         0         0         0
> >        TIMER:    39551     93442     73733     42631     42509     41592     40351     61793
> >       NET_TX:       11        44        41        20        26        19        15        29
> >       NET_RX:        9         6        17        16        20        13        16        10
> >        BLOCK:      519       259      1238       272       253       174       215      2618
> > BLOCK_IOPOLL:        0         0         0         0         0         0         0         0
> >      TASKLET:       29         1         1         1         1         1         1         2
> >        SCHED:    16981     64903     47141     18517     19179     18036     17037     38261
> >      HRTIMER:        0         0         0         0         1         1         0         1
> >          RCU:   333577    389013    393501    239103    221524    258159    313426    234030
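
The TASKLET row shows the imbalance clearly: about 1.5M tasklet softirqs on CPU5 versus single digits everywhere else, consistent with all mlx4 completion processing being funneled to one core. Checking (and, if needed, spreading) the HCA's interrupt affinity would be something like the following, where the IRQ numbers are machine-specific:

    # find the mlx4 interrupt vectors and their per-CPU counts
    grep mlx4 /proc/interrupts
    # then pin a given IRQ N to a different core via a cpumask
    echo <cpumask> > /proc/irq/N/smp_affinity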