On Jul 26, 2012, at 10:32 PM, Christian Balzer wrote:

> 1. Which bloody iSCSI stack?

I've been satisfied with LIO, though I can't say I've tested the others as 
extensively. I'm using the Debian kernel from squeeze-backports. I'm using an 
Ethernet backend, so I can't comment on anything more expensive or bleeding 
edge.

> 2. The Debian sid (bleeding edge) pacemaker seems
> to be either not quite up to date or nobody ever uses LIO, this warning
> every 10 seconds doesn't instill confidence either:
> ---
> Jul 27 10:52:41 borg00b iSCSILogicalUnit[27911]: WARNING: Configuration parameter
> "scsi_id" is not supported by the iSCSI implementation and will be ignored.
> ---

Are you using pacemaker from squeeze-backports?

The warning is benign, but you will find that the RAs provided with Pacemaker 
fail horribly with LIO for other reasons. LIO has a bug where it continues to 
reference the underlying device even after it has been freed, as long as there 
are still connections to that LU. With Pacemaker's RAs, the LUs are 
unconfigured before the target, and there is a small window in which LIO may 
receive a request and try to access the backing device of the LU you just 
unconfigured, which panics the kernel. I was able to hit it more often than not 
by forcing a LU to migrate while reading from it with dd; I bet that stopping 
the LU without stopping the target would trigger it every time.
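
For illustration, the reproduction looks roughly like this (the resource and 
device names are invented, and the second command is run while the dd is still 
going):

---
# On an initiator: keep the exported LU busy with reads (device name made up).
dd if=/dev/sdX of=/dev/null bs=1M &

# On the cluster: force the LU's resource to another node mid-read.
crm resource migrate p_iscsi_lun0 nodeB
---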

The workaround is to tear down the TPG first, which will close the iSCSI 
connections before tearing down the backing devices, thus avoiding the bug. 
Incidentally, LIO will also take care to clean up all LUNs, backing devices, 
and other stuff used by a target when you delete the target, so the stop 
procedure is quite easy.
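
In other words, the stop order boils down to something like this; I'm writing 
it as targetcli commands from memory (the configfs/lio_node equivalents follow 
the same order), and the IQN and backstore names are made up:

---
# 1. Delete the target first: LIO drops its TPGs, portals and LUN mappings,
#    closing the iSCSI connections while the backing device still exists.
targetcli /iscsi delete iqn.2012-07.com.example:demo

# 2. Only then release the backing device.
targetcli /backstores/iblock delete vm_demo
---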

Anyhow, you can't do things in this order with the heartbeat resource agents. I 
borrowed the relevant bits from them and adapted them to my own RA. References:

http://comments.gmane.org/gmane.linux.scsi.target.devel/1568?set_cite=hide
http://oss.clusterlabs.org/pipermail/pacemaker/2012-July/014754.html
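
The stop path in my adapted RA ends up shaped roughly like the sketch below; 
it's not the actual agent, and the helper and parameter names are placeholders 
for whatever management interface you call:

---
# Sketch of the stop ordering inside an OCF agent.  target_is_configured,
# delete_target and delete_backstore are placeholders, not real helpers.
target_stop() {
    # Stop must be idempotent: if the target is already gone, report success.
    target_is_configured "$OCF_RESKEY_iqn" || return $OCF_SUCCESS

    # Remove the target first: connections are closed and the TPG/LUN
    # mappings are dropped while the backing device is still present.
    delete_target "$OCF_RESKEY_iqn" || return $OCF_ERR_GENERIC

    # Only now free the backing device; nothing references it any more.
    delete_backstore "$OCF_RESKEY_device" || return $OCF_ERR_GENERIC

    return $OCF_SUCCESS
}
---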

> 3. Has anybody here deployed more than 10
> targets/LUNs? And done so w/o going crazy or running into issues mentioned
> in 2)? 
> How? Self made scripts/puppet?

I've played with about 20 targets, most with 2 LUs, in a testing environment, 
and I'm working on moving that to production now. I already had a description 
of all the VMs in Puppet, so I use it to generate the Pacemaker configuration 
as well: Puppet writes /etc/crm.conf, and whenever that file changes it loads 
it into a shadow CIB. Nagios checks for differences between that shadow and the 
live CIB, so I get notified when action is required. Then I double-check the 
change for sanity, run it through crm_simulate, and merge it. Notably, this 
also alerts me about things like forgetting that I put a node in standby, 
unmanaging a service for maintenance, or leaving a constraint from 
"crm resource migrate ..." in place.
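
Concretely, the review loop amounts to something like the sketch below; 
"puppet" is just what I name the shadow CIB, and the exact crm, crm_shadow and 
crm_simulate options may vary between Pacemaker versions, so double-check them:

---
crm cib new puppet                  # snapshot the live CIB as a shadow copy

# Load the Puppet-generated config into the shadow and show the delta
# against the live CIB (that delta is what Nagios keeps an eye on).
crm <<'EOF'
cib use puppet
configure load update /etc/crm.conf
cib diff
EOF

# Dry-run the resulting transition, then merge it once it looks sane.
crm_simulate --simulate --xml-file "$(CIB_shadow=puppet crm_shadow --file)"
crm cib commit puppet
---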

Of course, 1000 VMs is two orders of magnitude more than this. I really have no 
idea how Pacemaker and LIO scale to that size.

