Re: [zfs-discuss] zfs receive slowness - lots of systime spent in genunix`list_next ?
Hi All,

Just a follow-up - it seems like whatever it was doing it eventually got done with, and the speed picked back up again. The send/recv finally finished -- I guess I could do with a little patience :)

Lachlan

On Mon, Dec 5, 2011 at 10:47 AM, Lachlan Mulcahy wrote:
> Hi All,
>
> We are currently doing a zfs send/recv with mbuffer to send incremental
> changes across, and it seems to be running quite slowly, with zfs receive
> the apparent bottleneck.
>
> The process itself seems to be using almost 100% of a single CPU in "sys"
> time.
>
> Wondering if anyone has any ideas if this is normal or if this is just
> going to run forever and never finish...
>
> Details - two machines connected via Gigabit Ethernet on the same LAN.
>
> Sending server:
>
> zfs send -i 20111201_1 data@20111205_1 | mbuffer -s 128k -m 1G -O tdp03r-int:9090
>
> Receiving server:
>
> mbuffer -s 128k -m 1G -I 9090 | zfs receive -vF tank/db/data
>
> mbuffer showing:
>
> in @ 256 KiB/s, out @ 256 KiB/s, 306 GiB total, buffer 100% full
>
> My debug:
>
> DTraceToolkit hotkernel reports:
>
> zfs`lzjb_decompress            10    0.0%
> unix`page_nextn                31    0.0%
> genunix`fsflush_do_pages       37    0.0%
> zfs`dbuf_free_range           183    0.1%
> genunix`list_next            5822    3.7%
> unix`mach_cpu_idle         150261   96.1%
>
> Top shows:
>
>   PID USERNAME NLWP PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND
> 22945 root        1  60    0   13M 3004K cpu/6  144:21  3.79% zfs
>   550 root       28  59    0   39M   22M sleep   10:19  0.06% fmd
>
> I'd say the 3.7% or so here is so low because we are seeing not per-CPU,
> but aggregate CPU usage. mpstat seems to show the real story.
> mpstat 1 shows output much like this each second:
>
> CPU minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl
> [per-CPU rows mangled in the archive; the relevant row shows CPU 10 at
> roughly 89% sys and 11% idl, while the other 23 CPUs sit at or near
> 100% idl]
>
> So I'm led to believe that zfs receive is spending almost 100% of a
> single CPU's time doing a lot of genunix`list_next ...
>
> Any ideas what is going on here?
>
> Best Regards,
> --
> Lachlan Mulcahy
> Senior DBA, Marin Software Inc.
> San Francisco, USA

--
Lachlan Mulcahy
Senior DBA, Marin Software Inc.
San Francisco, USA
AU Mobile: +61 458 448 721
US Mobile: +1 (415) 867 2839
Office   : +1 (415) 671 6080

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zfs receive slowness - lots of systime spent in genunix`list_next ?
Hi Bob,

On Mon, Dec 5, 2011 at 12:31 PM, Bob Friesenhahn <bfrie...@simple.dallas.tx.us> wrote:
> On Mon, 5 Dec 2011, Lachlan Mulcahy wrote:
>> Anything else you suggest I check for faults? (Though I'm sort of
>> doubting it is an issue, I'm happy to be thorough.)
>
> Try running
>
> fmdump -ef
>
> and see if new low-level fault events are coming in during the zfs
> receive.

Just a bunch of what I would guess is unrelated USB errors?

Dec 05 20:58:09.0246 ereport.io.usb.epse
Dec 05 20:58:50.9207 ereport.io.usb.epse
Dec 05 20:59:36.0242 ereport.io.usb.epse
Dec 05 20:59:39.0230 ereport.io.usb.epse
Dec 05 21:00:21.0223 ereport.io.usb.epse
Dec 05 21:01:06.0215 ereport.io.usb.epse
Dec 05 21:01:09.0314 ereport.io.usb.epse
Dec 05 21:01:50.9213 ereport.io.usb.epse
Dec 05 21:02:36.0299 ereport.io.usb.epse
Dec 05 21:02:39.0298 ereport.io.usb.epse

Regards,
--
Lachlan Mulcahy
Senior DBA, Marin Software Inc.
San Francisco, USA
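[Editor's note: the "unrelated USB errors" guess is supported by the timestamps above, which repeat on a fairly regular ~42-45 second cadence -- typical of a polled or flaky device rather than of I/O load from the receive. A quick sketch of checking that cadence; the helper name and the assumed year (2011) are mine, not from the thread:

```python
from datetime import datetime

# Timestamps copied from the fmdump -ef output above.
stamps = [
    "Dec 05 20:58:09.0246", "Dec 05 20:58:50.9207", "Dec 05 20:59:36.0242",
    "Dec 05 20:59:39.0230", "Dec 05 21:00:21.0223", "Dec 05 21:01:06.0215",
]

def gaps_seconds(stamps):
    """Seconds between consecutive fmdump ereport timestamps
    (fmdump omits the year; 2011 is assumed here)."""
    ts = [datetime.strptime("2011 " + s, "%Y %b %d %H:%M:%S.%f") for s in stamps]
    return [round((b - a).total_seconds(), 1) for a, b in zip(ts, ts[1:])]

print(gaps_seconds(stamps))  # -> [41.9, 45.1, 3.0, 42.0, 45.0]
```

The mostly-regular spacing is what points at a periodic hardware-level event rather than anything correlated with the transfer.]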
Re: [zfs-discuss] zfs receive slowness - lots of systime spent in genunix`list_next ?
On Mon, 5 Dec 2011, Lachlan Mulcahy wrote:
> Anything else you suggest I check for faults? (Though I'm sort of
> doubting it is an issue, I'm happy to be thorough.)

Try running

fmdump -ef

and see if new low-level fault events are coming in during the zfs receive.

Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
Re: [zfs-discuss] zfs receive slowness - lots of systime spent in genunix`list_next ?
Hi Bob,

On Mon, Dec 5, 2011 at 11:19 AM, Bob Friesenhahn <bfrie...@simple.dallas.tx.us> wrote:
> On Mon, 5 Dec 2011, Lachlan Mulcahy wrote:
>> genunix`list_next        5822    3.7%
>> unix`mach_cpu_idle     150261   96.1%
>
> Rather idle.

Actually this is a bit misleading, since it is averaged over all the cores in the system. This system has 24 cores, so the approximately 90% used on a single core in system time shows up as about 3.75%:

90 / 24 = 3.75

Similarly, 23 of the 24 cores are close to 100% idle.

> Top shows:
>
>   PID USERNAME NLWP PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND
> 22945 root        1  60    0   13M 3004K cpu/6  144:21  3.79% zfs
>   550 root       28  59    0   39M   22M sleep   10:19  0.06% fmd
>
> Having 'fmd' (fault monitor daemon) show up in top at all is rather ominous
> since it implies that faults are actively being reported. You might want
> to look into what is making it active.

Nothing in /var/adm/messages -- these are the last two days' entries:

Dec  4 18:46:20 mslvstdp03r sshd[20926]: [ID 800047 auth.crit] fatal: Read from socket failed: Connection reset by peer
Dec  5 01:40:07 mslvstdp03r sshd[21808]: [ID 800047 auth.crit] fatal: Read from socket failed: Connection reset by peer

Anything else you suggest I check for faults? (Though I'm sort of doubting it is an issue, I'm happy to be thorough.)

>> So I'm led to believe that zfs receive is spending almost 100% of a
>> single CPU's time doing a lot of genunix`list_next ...
>
> Or maybe it is only doing 3.7% of this. There seems to be a whole lot of
> nothing going on.

See the math above -- also, mpstat shows a single CPU at around 90% system time, and since top reports zfs as the only active process, the circumstantial evidence is fairly indicative.

>> Any ideas what is going on here?
>
> Definitely check if low-level faults are being reported to fmd.

Any logs other than /var/adm/messages I should check? (Apologies, I'm quite new to (Open)Solaris.)

Regards,
--
Lachlan Mulcahy
Senior DBA, Marin Software Inc.
San Francisco, USA
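[Editor's note: the aggregate-vs-per-core arithmetic above generalizes: a system-wide profiler like hotkernel reports percentages of total CPU capacity, so a thread saturating one core of an N-core box shows up as roughly 100/N percent. A minimal sketch; the function name is mine, not from the thread:

```python
def per_core_pct(aggregate_pct: float, ncores: int) -> float:
    """Convert a system-wide CPU percentage (as reported by an
    aggregate profiler such as hotkernel) into the approximate
    load on a single core, assuming the work is pinned to one core."""
    return aggregate_pct * ncores

# The thread's numbers: 3.7% aggregate on a 24-core box
print(per_core_pct(3.7, 24))  # ~88.8% of one core, matching mpstat's ~90% sys
```

This is why a single saturated CPU can hide behind an apparently "idle" aggregate profile.]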
Re: [zfs-discuss] zfs receive slowness - lots of systime spent in genunix`list_next ?
On 12/05/11 10:47, Lachlan Mulcahy wrote:
> zfs`lzjb_decompress            10    0.0%
> unix`page_nextn                31    0.0%
> genunix`fsflush_do_pages       37    0.0%
> zfs`dbuf_free_range           183    0.1%
> genunix`list_next            5822    3.7%
> unix`mach_cpu_idle         150261   96.1%

your best bet in a situation like this -- where there's a lot of cpu time spent in a generic routine -- is to use an alternate profiling method that shows complete stack traces rather than just the top function on the stack. often the names of functions two or three or four deep in the stack will point at what's really responsible.

something as simple as:

dtrace -n 'profile-1001 { @[stack()] = count(); }'

(let it run for a bit then interrupt it) should show who's calling list_next() so much.

- Bill
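[Editor's note: a common refinement of Bill's one-liner (an assumption on my part, not something from the thread) is to add a predicate so only samples taken while the receiving `zfs` process is on-CPU in the kernel are counted, and to truncate the output to the hottest stacks. Solaris/illumos only; requires DTrace:

```shell
# Sample kernel stacks at ~1001 Hz, but only while a process named
# "zfs" (the receive side) is on-CPU in kernel mode; in the profile
# provider, arg0 is the kernel PC, so arg0 != 0 means "in sys time".
dtrace -n '
profile-1001
/execname == "zfs" && arg0 != 0/
{
    @[stack()] = count();
}
END
{
    trunc(@, 10);   /* keep only the ten most frequent stacks */
}'
```

Interrupt it with Ctrl-C after ten seconds or so; the hottest stacks should show which caller is iterating lists via list_next().]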
Re: [zfs-discuss] zfs receive slowness - lots of systime spent in genunix`list_next ?
On Mon, 5 Dec 2011, Lachlan Mulcahy wrote:
> genunix`list_next        5822    3.7%
> unix`mach_cpu_idle     150261   96.1%

Rather idle.

> Top shows:
>
>   PID USERNAME NLWP PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND
> 22945 root        1  60    0   13M 3004K cpu/6  144:21  3.79% zfs
>   550 root       28  59    0   39M   22M sleep   10:19  0.06% fmd

Having 'fmd' (fault monitor daemon) show up in top at all is rather ominous since it implies that faults are actively being reported. You might want to look into what is making it active.

> So I'm led to believe that zfs receive is spending almost 100% of a
> single CPU's time doing a lot of genunix`list_next ...

Or maybe it is only doing 3.7% of this. There seems to be a whole lot of nothing going on.

> Any ideas what is going on here?

Definitely check if low-level faults are being reported to fmd.

Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/