On 11 May 2011, Mike Christie wrote:
> On 05/10/2011 10:59 PM, Mark Nipper wrote:
> >     So to summarize, I posted a ridiculously long message on
> >the KVM list.  It can be seen at:
> >---
> >http://thread.gmane.org/gmane.comp.emulators.kvm.devel/71803
> From that mail I saw:
> Buffer I/O error on device dm-0, logical block ...
> end_request: I/O error, dev vda, sector ...
> Are you using dm-multipath over iscsi by any chance?
> Could you send the /var/log/messages for the system running the iscsi
> initiator?
> Also you said in that mail you are using RHEL 6 for the initiator
> system, correct?

        I was attempting to use dm-multipath at one point because,
the way the documentation is worded, I thought multipath was
necessary to get any kind of queueing when errors occurred.  But
I only thought that because setting replacement_timeout = -1
didn't seem to be working at all.
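
        For reference, a minimal sketch of that setting under its
full open-iscsi name, assuming the stock /etc/iscsi/iscsid.conf
location.  The <target-iqn> and <portal> values are placeholders,
not taken from my setup:

    # /etc/iscsi/iscsid.conf
    # A value below 0 keeps I/O queued until the session is
    # re-established instead of failing it back up the stack.
    node.session.timeo.replacement_timeout = -1

    # iscsid.conf only affects node records created afterwards;
    # an existing record has to be updated explicitly:
    iscsiadm -m node -T <target-iqn> -p <portal> -o update \
        -n node.session.timeo.replacement_timeout -v -1

The new value should only take effect for a given session after
the next login.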

        I replied to myself earlier today having resolved this
matter.  My ultimate problem was that I was using tgtd as my
target and it was coming up on the current active DRBD node
before the logical units were made available, as discussed
previously on this very list.  Changing the ordering of my
resource agents in my HA stack (making the shared virtual IP
come up last after everything else was established on the new
active node) helped to avoid that race condition altogether.
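
        A rough sketch of that kind of ordering in crm shell
syntax, assuming Pacemaker with the linbit DRBD agent and the
heartbeat iSCSITarget, iSCSILogicalUnit and IPaddr2 agents.  Every
resource name, the IQN, the DRBD resource, the device path and the
IP address below are hypothetical placeholders:

    # Illustrative only; not my actual configuration.
    primitive p_drbd ocf:linbit:drbd \
        params drbd_resource="r0" \
        op monitor interval="29s" role="Master" \
        op monitor interval="31s" role="Slave"
    ms ms_drbd p_drbd \
        meta master-max="1" clone-max="2" notify="true"
    primitive p_target ocf:heartbeat:iSCSITarget \
        params implementation="tgt" \
            iqn="iqn.2011-05.com.example:storage"
    primitive p_lun ocf:heartbeat:iSCSILogicalUnit \
        params implementation="tgt" \
            target_iqn="iqn.2011-05.com.example:storage" \
            lun="1" path="/dev/drbd0"
    primitive p_ip ocf:heartbeat:IPaddr2 \
        params ip="192.0.2.10" cidr_netmask="24"
    # Group members start in the order listed and stop in reverse,
    # so the virtual IP comes up last and goes away first.
    group g_iscsi p_target p_lun p_ip
    colocation c_iscsi_on_drbd inf: g_iscsi ms_drbd:Master
    order o_drbd_before_iscsi inf: ms_drbd:promote g_iscsi:start

With the IP last in the group, initiators cannot reach the portal
on the new node until the target and its logical unit are already
defined there.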

        My only concern now is that even this may not always be
sufficient, based on that same previous discussion, where the
original poster ended up relying on a 'killall -9 tgtd' to avoid
any sort of trouble.  I'm assuming the OCF RAs are handling all
of this correctly, but that's only an assumption at this point.

