On 10/1/12 05:17 PM, Edward Ned Harvey (lopser) wrote:
> What's different in your setup from mine? Why doesn't your stuff fail the
> way I'm seeing?
Solaris 10 vs OpenIndiana.
The iSCSI initiator is very fragile (in fact I get the impression the
whole SCSI sd layer is pretty fragile). It's not tolerant of the unexpected.
If we need to do maintenance on the iSCSI servers we zpool offline all
the affected disks on the Solaris 10 clients, then disable the
iscsi/target service on the iSCSI server. If they're not being used, it
seems you can reboot the iSCSI server without issue. Then when the
maintenance is done (e.g. OS update, drive replacements, hardware
changes, etc.), we enable the iscsi/target service and zpool online the
drives. They resilver and things carry on as before.
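
The ordering above can be sketched as a small Python helper that only builds
the command lists and does not run anything. This is a rough illustration,
not our actual tooling: the function name maintenance_plan, the pool name
"tank" and the device names are all made up for the example, and it assumes
the disks are taken offline on the clients with `zpool offline` and the
target service is toggled on the server with `svcadm`.

```python
from typing import List, Tuple

# A step is (where to run it, argv). "client" means a Solaris 10 initiator
# host, "server" means the iSCSI target host.
Step = Tuple[str, List[str]]


def maintenance_plan(pool: str, devices: List[str],
                     service: str = "iscsi/target") -> Tuple[List[Step], List[Step]]:
    """Return (before, after) command lists for a maintenance window.

    Hypothetical helper: pool and device names are supplied by the caller,
    nothing here is discovered from the system.
    """
    # Before maintenance: offline every affected disk on the clients first,
    # and only then stop the target service on the server.
    before: List[Step] = [("client", ["zpool", "offline", pool, d]) for d in devices]
    before.append(("server", ["svcadm", "disable", service]))

    # After maintenance: bring the target service back first, then online
    # the disks so they can resilver.
    after: List[Step] = [("server", ["svcadm", "enable", service])]
    after += [("client", ["zpool", "online", pool, d]) for d in devices]
    return before, after


if __name__ == "__main__":
    # "tank" and the cXtYdZ names are placeholders, not real devices.
    before, after = maintenance_plan("tank", ["c2t1d0", "c2t2d0"])
    for role, argv in before + after:
        print(f"[{role}] {' '.join(argv)}")
```

The point of keeping it as a plan rather than executing it is that the two
halves run on different hosts and hours apart, so the ordering is the only
thing worth encoding. Checking `zpool status` for resilver completion
afterwards is left out of the sketch.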
When we started using iSCSI, the initiator in Solaris 10 u6 was awful.
u7 improved it, and on u8 it was pretty good. I have no experience of
the OpenSolaris/OpenIndiana initiator.
But if the network goes away or a server goes away, things end up in a
pretty awful state and you often (but not always, depending on what
happened) have to reboot all the hosts.
It's one of many reasons why we wrote off centralised storage, and our
new cloud is entirely local-storage based, with SmartOS PXE booting on Dell
C6xxx series cloud servers, which have lots of 600GB 10k SAS drives.
These 3 blog posts sum up all my views on centralised storage:
http://joyent.com/blog/network-storage-in-the-cloud-delicious-but-deadly
http://joyent.com/blog/magical-block-store-when-abstractions-fail-us/
http://joyent.com/blog/on-cascading-failures-and-amazons-elastic-block-store/
The pain just isn't worth the gain.