I am beating my head against a wall, and beginning to wonder whether this is my fault or the system's fault, because it seems to me like it SHOULD work.  I am very confident at this point that I got it all set up right...

I wonder if this is one of those areas that was unstable at the time of the latest open source release, and something I'm being bullied into paying Oracle for...  Is it?  I'm planning to try Solaris 11 today as a trial, to see if it solves the problem.  I'm not sure whether to hope for the best, or hope for it to fail too.

I have two identical systems, call them host1 and host2, sitting side by side, both running OpenIndiana 151a6.  The first interface on each is connected to the LAN, and the second interfaces are connected to each other via a crossover cable.  Each machine has 6 disks.  The first 2 disks (disk0 and disk1) are configured as an OS mirror.  The remaining 4 disks (disk2 - disk5) are shared via iSCSI over the crossover interface.

Each machine connects to its 4 local disks via iSCSI at 127.0.0.1, and to the 4 remote disks via iSCSI at the other machine's crossover IP address.  Thankfully, the device names are the same on both sides, so I can abbreviate and simplify things for myself by doing something like this:

export H1D2=c5t600144F00800273F3337506659260001d0
...
export H2D5=c5t600144F0080027238750506658910004d0

I create two pools.  Each pool has 2 local disks and 2 remote disks, arranged as two mirror vdevs, with each mirror pairing a local disk with its remote counterpart:
sudo zpool create HAstorage1 mirror $H1D2 $H2D2 mirror $H1D3 $H2D3
sudo zpool create HAstorage2 mirror $H1D4 $H2D4 mirror $H1D5 $H2D5
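
Just to confirm the layout, zpool status at this point shows each pool as two two-way mirrors, each pairing a host1 disk with its host2 twin:
zpool status HAstorage1    (mirror-0 = $H1D2 + $H2D2, mirror-1 = $H1D3 + $H2D3)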

I shut down the VMs, which are running without any problem on local storage.  I use zfs send to migrate them onto HAstorage1 and HAstorage2, tweak their config files, and import them back into VirtualBox.  Launch them.  Everything goes perfectly.
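
The migration itself was just plain snapshot send/receive into the new pools (dataset names here are made up):
sudo zfs snapshot rpool/vbox/vm1@migrate
sudo zfs send rpool/vbox/vm1@migrate | sudo zfs receive HAstorage1/vm1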

I leave the machines in production for a while.  Continually monitor /var/adm/messages and zpool status on both systems.  No problems.  I beat the hell out of everything, close and reopen VirtualBox, restart all the guest VMs, no problem.  Export the pool and import it on the other system.  Bring up the guest machines on the other system.  Everything is awesome.  Until I reboot one of the hosts.

It doesn't seem to matter whether I use "reboot -p" or "init 6"; it seems to happen *sometimes* in both situations.

The one thing that is consistent: during the reboot, both on the way down and on the way up, I'll see console messages being thrown about being unable to access a scsi device.  Sometimes the machine will actually *fail* to reboot, and I'll have to power-reset.

After the reboot, sometimes the pool appears perfectly healthy, and sometimes it appears degraded, where ... this is so odd ... sometimes the offline disk is the remote disk, and sometimes it is the local disk.  In either case, I zpool clear the offending device, the resilver takes place, and the pool looks healthy again.
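
The recovery each time is just this (using one of the device variables from above as the example):
zpool status HAstorage1
sudo zpool clear HAstorage1 $H2D2    (clear whichever device is flagged UNAVAIL or FAULTED)
zpool status HAstorage1    (resilver runs, then the pool is back to ONLINE)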

Then I launch a VM (a Linux VM).  The guest gets as far as grub, loads the kernel, and starts going through the startup scripts, and before the guest is fully up it starts choking, goes into readonly mode, and fails to boot.  I know it's going to fail before it fails, because it runs crazy dog slow.  Meanwhile, if I watch /var/adm/messages on the host, I again see scsi errors being thrown.  Oddly, zpool status still looks normal.  Generally, VirtualBox will choke so badly that I'll have to force-kill it...  Sometimes I can kill it gracefully.  Sometimes it's so bad I can't even do that, and I'm forced to power-reset.

I would normally suspect a hardware, ethernet, or local disk driver problem for these kinds of symptoms - but I feel that's pretty solidly eliminated by a few tests:

1- As long as I don't reboot a host, the guest VMs can stay in production the whole time.  All the traffic is going across the network, and the mirror simply works.  I can export the pool on one system, import it on the other, and launch the VMs on the other host, no problem (see the sketch after this list).  I can "zfs send" these filesystems across the interface and receive on the other side, no problem.  My ethernet error count stays zero according to netstat, on both sides.  So I really don't think it's an ethernet hardware or bad cable problem.

2- It's only during a reboot that I suddenly get scsi errors in the log.  To me, it seems this shouldn't happen.
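
For what it's worth, the export/import test in point 1 is nothing fancier than this, with the guest VMs shut down first:

(on host1)
sudo zpool export HAstorage1

(on host2)
sudo zpool import HAstorage1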

Surprisingly to me: during the minute when one system is down, the other system still shows nothing in /var/adm/messages, and still shows the pool as healthy.  At first I wondered if this meant I had screwed up the mapping and was actually using all local disks, but that's not the case.  It turns out the host that's still up is treating the failure as simple IO errors, increasing the error count for the unavailable devices.  Only after the count gets sufficiently high does the running system finally mark the offending device as offline / unavailable ("Error count too high").  When the other system comes back up, annoyingly, it doesn't automatically bring the unavailable device back online.  I have to "zpool clear" the offending device.  As soon as I clear one device, the other one automatically comes online too.
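
In case it matters, that's based on watching roughly this on the surviving host:
zpool status -xv    (per-device error counts climbing, then the device drops out)
fmadm faulty    (should list the fault raised against the vdev once the count gets too high)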

I would expect a little more plug-n-play-ish intelligence here.  When the 
remote iscsi device disappears, it seems to me, it should be handled a little 
more similarly to somebody yanking out the USB external disk.  Rather than 
retrying and increasing the error count, the running system should be alerted 
this disk is going offline, and handle that situation more gracefully ...  
Well, that's just my expectation anyway.  I only care right now because it's 
causing problems.

For a sanity check, here is how I created the iSCSI devices:
(Before any comments: I know now that I don't need to be so obsessive about keeping track of which local device maps to which iSCSI device name, but this should still be fine, and it's what I did originally.  I would probably do it differently next time, with only one target per host machine, simply letting the LUNs map themselves; a sketch of that follows.)
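
Roughly, that simpler layout would look something like this (sketch only, untested):

sudo itadm create-target
export DISKNUM=2
sudo sbdadm create-lu /dev/rdsk/c4t${DISKNUM}d0
export GUID=(whatever it said)
sudo stmfadm add-view $GUID
(With no -t, the LU is exposed through all targets by default; repeat the create-lu / add-view pair for disks 3 - 5.)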

(on both systems)
sudo pkg install pkg:/network/iscsi/target
sudo svcadm enable -s svc:/system/stmf
sudo svcadm enable -s svc:/network/iscsi/target
sudo iscsiadm modify discovery --static enable

(on host1)
export DISKNUM=2
sudo sbdadm create-lu /dev/rdsk/c4t${DISKNUM}d0
export GUID=(whatever it said)
sudo stmfadm create-tg disk${DISKNUM}
sudo stmfadm add-view -t disk${DISKNUM} $GUID
sudo itadm create-target
export TARGET=(whatever it said)
sudo stmfadm offline-target $TARGET
sudo stmfadm add-tg-member -g disk${DISKNUM} $TARGET
sudo stmfadm online-target $TARGET
sudo iscsiadm add static-config ${TARGET},127.0.0.1
sudo format -e
(Make a note of the new device name, and hit Ctrl-C.  Keep a record in a spreadsheet somewhere, e.g. "host1 disk2 = c5t600144F00800273F3337506659260001d0".)

(Now on the other host)
export TARGET=(whatever it said)
sudo stmfadm online-target $TARGET
sudo iscsiadm add static-config ${TARGET},192.168.7.7

Repeat all of the above for each disk and for each host.  In the end, I can see all 8 disks from both hosts, and I have a spreadsheet to remember which iSCSI device name maps to which physical device.
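
The "seeing all 8 disks" check is along these lines on each host:
sudo stmfadm list-lu    (the 4 LUs this host exports)
sudo iscsiadm list target -S    (the 8 targets this host sees, with their OS device names)
sudo format    (the 8 c5t600144F0...d0 devices show up; Ctrl-C out again)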


