Hi,

I've recently set up an OpenSolaris machine running on x86_64 hardware.
This machine is now my 'network backup server', which runs rsync to
transfer files from remote servers to its gigabit iSCSI-connected
backup zpool.

All works well and ZFS really kicks ass, but I'm running into a
performance issue that I could use some help with.

When I start rsync'ing a large server to this backup machine, the
network stack 'freezes' from time to time. This seems to correlate with
the writes to the iSCSI pool. Ping times to the machine rise to above 4
seconds, but no packets are lost.

From a ping running on a different, unrelated server:

| 64 bytes from (172.16.0.30): icmp_seq=1 ttl=253 time=0.607 ms
| 64 bytes from (172.16.0.30): icmp_seq=2 ttl=253 time=0.498 ms
| 64 bytes from (172.16.0.30): icmp_seq=3 ttl=253 time=0.498 ms
| [ freeze here ]
| 64 bytes from (172.16.0.30): icmp_seq=4 ttl=253 time=3471 ms
| 64 bytes from (172.16.0.30): icmp_seq=5 ttl=253 time=2471 ms
| 64 bytes from (172.16.0.30): icmp_seq=6 ttl=253 time=1471 ms
| 64 bytes from (172.16.0.30): icmp_seq=7 ttl=253 time=471 ms
| 64 bytes from (172.16.0.30): icmp_seq=8 ttl=253 time=0.518 ms
| 64 bytes from (172.16.0.30): icmp_seq=9 ttl=253 time=0.498 ms
| [ freeze here ]
| 64 bytes from (172.16.0.30): icmp_seq=10 ttl=253 time=3938 ms
| 64 bytes from (172.16.0.30): icmp_seq=11 ttl=253 time=2931 ms
| 64 bytes from (172.16.0.30): icmp_seq=12 ttl=253 time=1929 ms
| 64 bytes from (172.16.0.30): icmp_seq=13 ttl=253 time=921 ms
| 64 bytes from (172.16.0.30): icmp_seq=14 ttl=253 time=0.517 ms
| [ .. etc .. same pattern .. ]
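
A rough sketch of how the ping pattern above can be read, assuming ping's
default interval of one probe per second (the helper names are made up for
illustration). Adding each reply's send time to its RTT gives the moment the
reply arrived. If the replies of one freeze all arrive at about the same
instant, they were held back and released in one burst rather than delayed
individually.

```python
import re

# Ping lines copied from the output above (first 14 replies).
SAMPLE = """\
64 bytes from (172.16.0.30): icmp_seq=1 ttl=253 time=0.607 ms
64 bytes from (172.16.0.30): icmp_seq=2 ttl=253 time=0.498 ms
64 bytes from (172.16.0.30): icmp_seq=3 ttl=253 time=0.498 ms
64 bytes from (172.16.0.30): icmp_seq=4 ttl=253 time=3471 ms
64 bytes from (172.16.0.30): icmp_seq=5 ttl=253 time=2471 ms
64 bytes from (172.16.0.30): icmp_seq=6 ttl=253 time=1471 ms
64 bytes from (172.16.0.30): icmp_seq=7 ttl=253 time=471 ms
64 bytes from (172.16.0.30): icmp_seq=8 ttl=253 time=0.518 ms
64 bytes from (172.16.0.30): icmp_seq=9 ttl=253 time=0.498 ms
64 bytes from (172.16.0.30): icmp_seq=10 ttl=253 time=3938 ms
64 bytes from (172.16.0.30): icmp_seq=11 ttl=253 time=2931 ms
64 bytes from (172.16.0.30): icmp_seq=12 ttl=253 time=1929 ms
64 bytes from (172.16.0.30): icmp_seq=13 ttl=253 time=921 ms
64 bytes from (172.16.0.30): icmp_seq=14 ttl=253 time=0.517 ms
""".splitlines()

PING_LINE = re.compile(r"icmp_seq=(\d+)\s.*?time=([\d.]+) ms")


def arrival_times(lines, interval=1.0):
    """Return (seq, rtt_s, arrival_s) for every reply line.

    Assumption: probe number `seq` was sent at seq * interval seconds.
    """
    replies = []
    for line in lines:
        match = PING_LINE.search(line)
        if match:
            seq = int(match.group(1))
            rtt = float(match.group(2)) / 1000.0  # ms -> s
            replies.append((seq, rtt, seq * interval + rtt))
    return replies


def stalls(replies, threshold=0.1):
    """Group consecutive replies whose RTT exceeds `threshold` seconds."""
    groups, current = [], []
    for reply in replies:
        if reply[1] > threshold:
            current.append(reply)
        elif current:
            groups.append(current)
            current = []
    if current:
        groups.append(current)
    return groups


for group in stalls(arrival_times(SAMPLE)):
    seqs = [seq for seq, _, _ in group]
    arrivals = [round(arrival, 3) for _, _, arrival in group]
    print("delayed seqs", seqs, "arrived at (s)", arrivals)
```

For the sample above, the arrival times within each freeze come out nearly
identical, which would fit the 'everything is held, then released at once'
picture.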

The 'zpool iostat backup 1' output shows writes to the pool at the
moment the connection 'freezes' for a few seconds:

|                capacity     operations    bandwidth
| pool         used  avail   read  write   read  write
| ----------  -----  -----  -----  -----  -----  -----
| backup      1.86T  1.52T      0    686      0  30.2M
| backup      1.86T  1.52T      0      0      0      0
| backup      1.86T  1.52T      0      0      0      0
| [freeze here]
| backup      1.86T  1.52T      0    653      0  30.2M
| backup      1.86T  1.52T      0     81      0   258K
| backup      1.86T  1.52T      0      0      0      0
| backup      1.86T  1.52T      0      0      0      0
| [freeze here]
| backup      1.86T  1.52T      0    801      0  32.9M
| backup      1.86T  1.52T      0     62      0   686K
| backup      1.86T  1.52T      0      0      0      0
| [freeze here]
| backup      1.86T  1.52T      0    691      0  30.2M
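
A small sketch for pulling the write bursts out of the 'zpool iostat'
samples above. It assumes the K/M/G/T suffixes are powers of 1024 and treats
anything above 1 MB/s as a burst; the threshold and the function names are
arbitrary choices for illustration.

```python
# 'zpool iostat backup 1' rows copied from the output above.
IOSTAT = """\
| pool         used  avail   read  write   read  write
| backup      1.86T  1.52T      0    686      0  30.2M
| backup      1.86T  1.52T      0      0      0      0
| backup      1.86T  1.52T      0      0      0      0
| [freeze here]
| backup      1.86T  1.52T      0    653      0  30.2M
| backup      1.86T  1.52T      0     81      0   258K
| backup      1.86T  1.52T      0      0      0      0
| backup      1.86T  1.52T      0      0      0      0
| [freeze here]
| backup      1.86T  1.52T      0    801      0  32.9M
| backup      1.86T  1.52T      0     62      0   686K
| backup      1.86T  1.52T      0      0      0      0
| [freeze here]
| backup      1.86T  1.52T      0    691      0  30.2M
""".splitlines()

# Assumption: suffixes are binary multiples.
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_size(text):
    """Convert a value such as '30.2M' or '0' to a number of bytes."""
    if text[-1] in UNITS:
        return float(text[:-1]) * UNITS[text[-1]]
    return float(text)


def write_bandwidth(lines, pool="backup"):
    """Return the write-bandwidth column (bytes/s) for each sample row."""
    samples = []
    for line in lines:
        fields = line.lstrip("| ").split()
        # Data rows have 7 columns and start with the pool name.
        if len(fields) == 7 and fields[0] == pool:
            samples.append(parse_size(fields[6]))
    return samples


def burst_indices(samples, threshold=1024**2):
    """Indices of the samples whose write bandwidth exceeds `threshold`."""
    return [i for i, bw in enumerate(samples) if bw > threshold]


samples = write_bandwidth(IOSTAT)
print("burst samples at indices:", burst_indices(samples))
```

Run against a longer capture, the spacing between the burst indices gives
the interval between write bursts in seconds, which can then be compared
with the times at which the ping stalls occur.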

I ruled out the NICs as the source of these problems by wget'ing
500mb.bin from my gigabit-connected FTP server with 8 concurrent
downloads. No freezes occurred during that test, which leads me to
believe it has to do with either ZFS or iSCSI.

I'm at a loss as to how to debug where it actually goes wrong.

Is there anyone willing to shed some light on debugging this issue?

Thanks in advance for any hints :)
-Sndr.
-- 
| A calendar's days are numbered. 
| 4096R/20CC6CD2 - 6D40 1A20 B9AA 87D4 84C7  FBD6 F3A9 9442 20CC 6CD2
