Hello Justin,

We just finished implementing an AoE-based SAN that is replicated between geographically separated datacenters using DRBD proxy. Our use of DRBD proxy prevented us from using a primary-primary setup. Because of that, we chose to go with vblade: it runs a single process per LUN, which makes the transition from primary to secondary easier.
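To illustrate why one vblade process per LUN simplifies the primary-to-secondary transition, here is a minimal sketch of what a per-LUN handover might look like. It is an assumption-laden outline, not the tooling used on this SAN: the resource name `r11`, the device `/dev/drbd11`, the shelf/slot numbers, the interface `eth1`, and the PID are all hypothetical, and the script only prints the commands instead of running them. It relies on `drbdadm primary` / `drbdadm secondary` and on vblade's `-d` (direct I/O) option.

```python
from typing import List


def demote_commands(resource: str, vblade_pid: int) -> List[List[str]]:
    """Commands for the node giving up a LUN.

    The vblade process for this LUN is stopped first so nothing holds the
    DRBD device open, then the DRBD resource is demoted to secondary.
    """
    return [
        ["kill", str(vblade_pid)],
        ["drbdadm", "secondary", resource],
    ]


def promote_commands(resource: str, device: str, shelf: int, slot: int,
                     netif: str) -> List[List[str]]:
    """Commands for the node taking over a LUN.

    The DRBD resource is promoted to primary, then a single vblade process
    is started for that LUN with direct I/O (-d).
    """
    return [
        ["drbdadm", "primary", resource],
        ["vblade", "-d", str(shelf), str(slot), netif, device],
    ]


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    plan = demote_commands("r11", 12345) + promote_commands(
        "r11", "/dev/drbd11", 0, 11, "eth1")
    for cmd in plan:
        print(" ".join(cmd))
```

Because each LUN maps to exactly one process and one DRBD resource, a handover touches only that pair; other LUNs on the same node are left alone.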
We are 75% through the migration to the new AoE-based SAN. Our biggest headache so far has been one corner case. We have vblade (direct I/O) exporting DRBD devices. The disks backing the DRBD devices are LVM logical volumes. We've seen issues where an LVM pvmove will not complete and gets stuck. DRBD logs "block drbd11: Local backing block device frozen?" and then the vblade process gets stuck in the D state. We do not see the same issue if vblade is not running (i.e. the LUN is secondary). So we've been migrating resources between the datacenters quite a bit to allow us to do all the pvmoves needed to finish the migration off of our old iSCSI SAN.

On Wed, Sep 28, 2011 at 12:04:53PM +0200, Justin Albstmeijer wrote:
> Hi aoe users and developers.
>
> I'm testing a setup that consists of two storage servers, which
> replicate lvm volumes using drbd in master-master mode.
>
> Both storage servers export the same drbd/aoe block devices using qaoed
> or ggaoed.
>
> The aoe kernel module, on the client machines which mount the aoe block
> devices (aoe devices are exclusively mounted on a server), load balances
> between the two storage servers.
>
> All seems to work fine and performance is acceptable.
>
> The only worry I have is that qaoed or ggaoed might buffer the writes
> before committing them to drbd, causing inconsistency in the replication.
> This could be a problem in normal operation, but surely if one of the
> storage servers would power-off unexpectedly without committing all its
> writes to drbd.
>
> Am I right to worry about this?
>
> Should for this reason direct-io be enabled in the qaoed or ggaoed
> configuration? I have not tested the performance impact yet on this
> setup, but from other aoe tests I would expect a sharp decrease in
> performance.
>
> Should I consider not exporting the same drbd/aoe on each storage server
> or investigate if the aoe kernel module can work in fail-over mode to
> limit the possible impact of this non-committed/lost data still in the
> buffers?
>
> Any advice/feedback is welcome.
>
> Justin

-- 
James R. Leu
Software Architect
INOC
608.204.0203
608.663.4555 fax
j...@inoc.com
www.inoc.com
*** DELIVERING UPTIME ***
------------------------------------------------------------------------------ All the data continuously generated in your IT infrastructure contains a definitive record of customers, application performance, security threats, fraudulent activity and more. Splunk takes this data and makes sense of it. Business sense. IT sense. Common sense. http://p.sf.net/sfu/splunk-d2dcopy1
_______________________________________________ Aoetools-discuss mailing list Aoetools-discuss@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/aoetools-discuss