On 31.05.11 17:08, Mikolaj Golub wrote:
As I wrote privately, it would be nice to see both netstat and hast logs (from
both nodes) for the same rather long period, when several cases occurred. It
would be good to place them somewhere on the web so other guys could access them
too, as I will be offline for 7-10 days and will not be able to help you until
I am back.
The test has finished after running for almost three hours, and here is the
collected data:
(for the duration of test, on the secondary node)
systat -if
                    /0   /1   /2   /3   /4   /5   /6   /7   /8   /9   /10
     Load Average

      Interface           Traffic               Peak                Total
            lo0  in      0.000 KB/s          0.000 KB/s            1.126 KB
                 out     0.000 KB/s          0.000 KB/s            1.126 KB

            ix1  in      0.003 KB/s        230.590 MB/s          614.688 GB
                 out     0.054 KB/s          7.425 MB/s           19.910 GB

           igb0  in      0.025 KB/s          3.636 KB/s          566.897 KB
                 out     0.072 KB/s          4.296 KB/s            1.091 MB
The primary node is b1a, the secondary node is b1b.
kernel (built just after csup update):
FreeBSD b1a 8.2-STABLE FreeBSD 8.2-STABLE #1: Mon May 30 14:17:50 EEST
2011 root@b1a:/usr/obj/usr/src/sys/GENERIC amd64
from primary
messages: http://news.digsys.bg/~admin/hast/test31may/b1a-messages
netstat -in: http://news.digsys.bg/~admin/hast/test31may/b1a-netstat -in
netstat -s: http://news.digsys.bg/~admin/hast/test31may/b1a-netstat-s
from secondary
messages: http://news.digsys.bg/~admin/hast/test31may/b1b-messages
netstat -in: http://news.digsys.bg/~admin/hast/test31may/b1b-netstat -in
netstat -s: http://news.digsys.bg/~admin/hast/test31may/b1b-netstat-s
DK> One additional note: while playing with this setup, I tried to
DK> simulate local disk going away in the hope HAST will switch to using
DK> the remote disk. Instead of asking someone at the site to pull out the
DK> drive, I just issued on the primary
DK> hastctl role init data0
DK> which resulted in kernel panic. Unfortunately, there was no sufficient
DK> dump space for 48GB. I will re-run this again with more drives for the
DK> crash dump. Anything you want me to look for in particular? (kernels
DK> have no KDB compiled in yet)
Well, removing a physical disk (device /dev/gpt/data0 consumed by hastd
disappears) and switching a resource to init role (device /dev/hast/data0
consumed by the FS disappears) are two different things. Sure, you should not
normally change the resource role (destroy the hast device) before unmounting
(exporting) the FS.
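The ordering described above could be sketched roughly as follows. This is only an illustration, not a tested procedure: the resource name data0 comes from this thread, while the mount point /mnt/data0 and the helper names run and release_hast_resource are hypothetical. With DRY_RUN=1 (the default here) the commands are only printed, not executed.

```shell
#!/bin/sh
# Sketch: stop using /dev/hast/<resource> before destroying it.
# ASSUMPTION: the file system on the hast device is mounted at the
# given mount point (hypothetical path; for a ZFS pool the equivalent
# step would be "zpool export <pool>" instead of umount).

DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# release_hast_resource <resource> <mountpoint>
release_hast_resource() {
    resource="$1"
    mountpoint="$2"
    # 1. Unmount first, so nothing consumes /dev/hast/<resource>.
    run umount "$mountpoint" || return 1
    # 2. Only then switch the role to init, which removes the hast device.
    run hastctl role init "$resource"
}
```

Calling `release_hast_resource data0 /mnt/data0` in dry-run mode just prints the two commands in the order they would be issued; the role change is skipped if the unmount fails.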
Then how do I proceed with a failed drive? Or a flaky drive that is
still visible to the OS, that I want to remove from HAST and replace
with a different one? How do I ask HAST to switch I/O to the secondary?
Is there another way to get a drive out of HAST? In any case, even if this
is not an allowed operation, it should not panic.
I am now going to reboot and run the same tests without checksums.
Daniel