Re: [zfs-discuss] Maximum zfs send/receive throughput
> Does this maybe ring a bell with someone?

Update: The cause of the problem was OpenSolaris bug 6826836, "Deadlock possible in dmu_object_reclaim()":
http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6826836

It could be fixed by upgrading the OpenSolaris 2009.06 system to 0.5.11-0.111.17 (via the non-free official support repository).

--
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Maximum zfs send/receive throughput
> I'm not very familiar with mdb. I've tried this:

Ah, this looks much better:

root 641 0.0 0.0 7660 2624 ? S Nov 08 2:16 /sbin/zfs receive -dF datapool/share/
(...)

# echo 0t641::pid2proc|::walk thread|::findstack -v | mdb -k
stack pointer for thread ff09236198e0: ff003d9b5670
[ ff003d9b5670 _resume_from_idle+0xf1() ]
  ff003d9b56a0 swtch+0x147()
  ff003d9b56d0 cv_wait+0x61(ff0a4fbd4228, ff0a4fbd40e8)
  ff003d9b5710 dmu_tx_wait+0x80(ff0948aa4600)
  ff003d9b5750 dmu_tx_assign+0x4b(ff0948aa4600, 1)
  ff003d9b57e0 dmu_free_long_range_impl+0x12a(ff0911456d60, ff0a4fbd4028, 0, , 0)
  ff003d9b5840 dmu_free_long_range+0x5b(ff0911456d60, 53e34, 0, )
  ff003d9b58d0 dmu_object_reclaim+0x112(ff0911456d60, 53e34, 13, 1e00, 11, 108)
  ff003d9b5930 restore_object+0xff(ff003d9b5950, ff0911456d60, ff003d9b59c0)
  ff003d9b5a90 dmu_recv_stream+0x48d(ff003d9b5be0, ff094d089440, ff003d9b5ad8)
  ff003d9b5c40 zfs_ioc_recv+0x2c0(ff092492b000)
  ff003d9b5cc0 zfsdev_ioctl+0x10b(b6, 5a1c, 8044e50, 13, ff0948b60e50, ff003d9b5de4)
  ff003d9b5d00 cdev_ioctl+0x45(b6, 5a1c, 8044e50, 13, ff0948b60e50, ff003d9b5de4)
  ff003d9b5d40 spec_ioctl+0x83(ff0921e54640, 5a1c, 8044e50, 13, ff0948b60e50, ff003d9b5de4, 0)
  ff003d9b5dc0 fop_ioctl+0x7b(ff0921e54640, 5a1c, 8044e50, 13, ff0948b60e50, ff003d9b5de4, 0)
  ff003d9b5ec0 ioctl+0x18e(3, 5a1c, 8044e50)
  ff003d9b5f10 sys_syscall32+0x101()

Does this maybe ring a bell with someone?
Re: [zfs-discuss] Maximum zfs send/receive throughput
Does anyone know the current state of bug #6975124? Has there been any progress since August?

I currently have an OpenSolaris 2009.06 snv_111b system (entire 0.5.11-0.111.14) which *repeatedly* gets stuck after a couple of minutes during a large (xxx GB) incremental zfs receive operation. The process does not crash; it simply keeps sleeping and there is no progress at all.

PID USERNAME NLWP PRI NICE  SIZE   RES STATE  TIME  CPU COMMAND
641 root        1  60    0 7660K 2624K sleep  2:16 0.00% zfs

Both truss and mdb are unable to show *any* activity or status of the zfs receive process:

# truss -p 641
*hangs*

I'm not very familiar with mdb. I've tried this:

# mdb -p 641
mdb: failed to initialize //lib/libc_db.so.1: libthread_db call failed unexpectedly
mdb: warning: debugger will only be able to examine raw LWPs
Loading modules: [ ld.so.1 libumem.so.1 libavl.so.1 libnvpair.so.1 ]
> ::stack
> ::stackregs
> ::status
debugging PID 641 (32-bit)
file: /sbin/zfs
threading model: raw lwps
status: process is running, debugger stop directive pending

I'm wondering if #6975124 could be the cause of my problem, too.
Re: [zfs-discuss] Maximum zfs send/receive throughput
> I'm wondering if #6975124 could be the cause of my problem, too.

There are several zfs send (and receive) related issues with 111b. You might seriously want to consider upgrading to a more recent OpenSolaris build (134) or to OpenIndiana.

Yours,
Markus Kovero
Re: [zfs-discuss] Maximum zfs send/receive throughput
I have been looking at why a zfs receive operation is terribly slow, and one observation that seems directly linked to the slowness is that at any one time one of the CPUs is pegged at 100% sys, while the other five (in my case) are relatively quiet. I haven't dug any deeper than that, but I was curious whether anyone else has observed the same behavior.
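The pattern described above can be confirmed with the stock Solaris observability tools. A sketch follows; the PID (641) and the 5-second interval are placeholders to adapt to your system:

```shell
# Per-CPU utilisation every 5 seconds: look for a single CPU
# whose sys column sits near 100% while the others stay idle.
mpstat 5

# Microstate accounting, one line per thread of the receive process,
# to see whether the time goes to SYS rather than USR or sleep.
prstat -mL -p 641 5
```

If one CPU shows ~100% sys in mpstat while prstat shows the zfs thread spending its time in SYS, the bottleneck is a single kernel thread rather than the disks or the network.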
Re: [zfs-discuss] Maximum zfs send/receive throughput
Just an update: I had a ticket open with Sun regarding this, and it looks like they have a CR for what I was seeing (6975124).
Re: [zfs-discuss] Maximum zfs send/receive throughput
Jim Barker wrote:
> Just an update, I had a ticket open with Sun regarding this and it looks like they have a CR for what I was seeing (6975124).

That would seem to describe a zfs receive which has stopped for 12 hours. You described yours as slow, which is not the term I personally would use for one which is stopped. However, you haven't given anything like enough detail about your situation and what's happening for me to make any worthwhile guesses.

--
Andrew Gabriel
Re: [zfs-discuss] Maximum zfs send/receive throughput
Andrew,

Correct. The reason I initially opened the case was that I could essentially hang a zfs receive operation, and any further zfs commands issued on the box would never come back. Just today one of my slow receives came to a screeching halt: where I previously saw one CPU spiked all the time, it is now exhibiting the same behavior as the hang (absolutely no activity, quiet as a mouse). I guess I didn't wait long enough for the slow process to finally hang. It is hung now and will stay that way until the end of time. I thought I had found a way to get around the freeze, but I guess I just delayed it a little longer.

I provided Oracle some explorer output and a crash dump to analyze, and that is the data they used to provide the information I passed on.

Jim Barker
Re: [zfs-discuss] Maximum zfs send/receive throughput
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Thomas Maier-Komor
>
> you can probably improve overall performance by using mbuffer [1] to stream the data over the network. At least some people have reported increased performance. mbuffer will buffer the datastream and disconnect zfs send operations from network latencies.
>
> Get it there:
> original source: http://www.maier-komor.de/mbuffer.html
> binary package: http://www.opencsw.org/packages/CSWmbuffer/

mbuffer is also available in opencsw / blastwave. IMHO, easier and faster and better than building things from source, most of the time.
[zfs-discuss] Maximum zfs send/receive throughput
It seems we are hitting a boundary with zfs send/receive over a network link (10Gb/s). We can see peak values of up to 150MB/s, but on average only about 40-50MB/s are replicated. This is far away from the bandwidth that a 10Gb link can offer.

Is it possible that ZFS is giving replication too low a priority, or throttling it too much?
Re: [zfs-discuss] Maximum zfs send/receive throughput
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Mika Borner
>
> It seems we are hitting a boundary with zfs send/receive over a network link (10Gb/s). We can see peak values of up to 150MB/s, but on average about 40-50MB/s are replicated. This is far away from the bandwidth that a 10Gb link can offer. Is it possible, that ZFS is giving replication a too low priority/throttling it too much?

I don't think this is called replication, so ... be careful about terminology.

zfs send can go as fast as your hardware is able to read. If you'd like to know how fast your hardware is, try this:

zfs send somefilesystem@somesnapshot | pv -i 30 > /dev/null

(You might want to install pv from opencsw or blastwave.) I think, in your case, you'll see something around 40-50MB/s.

I will also add this much: if you send the original snapshot of your complete filesystem, it'll probably go very fast (much faster than 40-50 MB/s), because all those blocks are essentially sequential blocks on disk. When you're sending incrementals, they are essentially more fragmented, so the total throughput is lower: the disks have to perform a greater percentage of random IO.

I have a very fast server, and my zfs send is about half as fast as yours. In both cases, it's enormously faster than some other backup tool, like tar or rsync or whatever.
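The measurement idea above (feed the stream into a rate meter, discard the data) can be sanity-checked without ZFS at all. The sketch below uses dd as a synthetic stand-in for zfs send and wc as the sink; neither is part of the original suggestion, they just show the technique:

```shell
# Push 64 MiB of zeros through a pipe and count the bytes that arrive.
# Substitute `zfs send pool/fs@snap` for the dd to measure a real stream,
# and wrap the pipeline in `time` to turn the byte count into a rate.
dd if=/dev/zero bs=1M count=64 2>/dev/null | wc -c
```

If the synthetic pipe is much faster than the zfs send pipeline, the bottleneck is the read side (disks, fragmentation), not the pipe or the sink.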
Re: [zfs-discuss] Maximum zfs send/receive throughput
On 25.06.2010 14:32, Mika Borner wrote:
> It seems we are hitting a boundary with zfs send/receive over a network link (10Gb/s). We can see peak values of up to 150MB/s, but on average about 40-50MB/s are replicated. This is far away from the bandwidth that a 10Gb link can offer. Is it possible, that ZFS is giving replication a too low priority/throttling it too much?

you can probably improve overall performance by using mbuffer [1] to stream the data over the network. At least some people have reported increased performance. mbuffer will buffer the datastream and disconnect zfs send operations from network latencies.

Get it there:
[1] original source: http://www.maier-komor.de/mbuffer.html
binary package: http://www.opencsw.org/packages/CSWmbuffer/

- Thomas
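For reference, an mbuffer-based pipeline typically looks like the following sketch. The pool name, snapshot names, host name, port, and buffer sizes are all assumptions to adapt to your setup; the flags used (-s block size, -m memory buffer size, -I listen on a port, -O connect to host:port) are standard mbuffer options:

```shell
# Receiving host: listen on TCP port 9090 with a 1 GiB in-memory buffer,
# so bursts from the sender are absorbed while zfs receive catches up.
mbuffer -s 128k -m 1G -I 9090 | zfs receive -dF datapool/share

# Sending host: feed the incremental stream into mbuffer, which keeps the
# network link busy even when zfs send stalls on fragmented disk reads.
zfs send -i datapool/share@snap1 datapool/share@snap2 \
  | mbuffer -s 128k -m 1G -O receiver.example.com:9090
```

The point of the buffer on both ends is to decouple the bursty producer (zfs send) from the bursty consumer (zfs receive), so that neither one's stalls idle the network.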