On Mon, Mar 31, 2014 at 11:03 AM, Richard Pieri <[email protected]> wrote:

> Bill Ricker wrote:
>
>> I've seen a big-name commercial block-replication solution duplicate
>> trashed data to the cold spare ... wasn't pretty !
>>
>
> Another great example of how replication is not backup.


Exactly.

Extra copies of blocks in the local SAN or a remote SAN don't help if the
app, the block device driver, or the Multipath software mangles the bits
somehow before all the copying happens.

It was actual backups, restored to a non-replicated test system, that got
those users on-line again.
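
To make the replication-versus-backup point concrete, here is a toy sketch
in Python. Everything in it (the dicts, `replicate`, `take_backup`) is a
made-up stand-in for illustration, not how any real SAN product works:

```python
import copy

# Hypothetical toy model: one "volume" per dict, one block in it.
primary = {"block0": "good data"}
replica = {}
backups = []  # point-in-time copies, kept apart from the live volume


def replicate():
    """Block replication: the replica mirrors whatever the primary holds now."""
    replica.clear()
    replica.update(primary)


def take_backup():
    """Backup: freeze a copy of the primary as it is at this moment."""
    backups.append(copy.deepcopy(primary))


replicate()
take_backup()

# Something upstream (app, driver, multipath layer) mangles the bits...
primary["block0"] = "mangled"
# ...and replication faithfully copies the damage to the spare.
replicate()

# Only the earlier point-in-time copy still holds the good data.
restored = copy.deepcopy(backups[-1])
```

After the corruption, `replica` matches the damaged primary, while
`restored` still holds the data from before the damage.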

(FWIW, that was not at my last shop, but at a related firm running the same
application. *Our* copy of the app used transaction replication, not block
replication, and only for second-site disaster recovery. HA for ours was a
heartbeat-triggered restart on a second local node, pulling the vDisks over
multipath SAN. The SAN controller served as the third party to avoid split
brain: the second node could successfully request vDisk reassignment only
if the controller recognized that the primary was disconnected. We had an
extra redundancy option in the SAN too, which might have been more trouble
than it was worth.)
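
The third-party arbitration described above can be sketched roughly as
follows. This is a minimal Python illustration under my own assumptions:
`SanController`, `request_reassignment`, and `on_heartbeat_lost` are
hypothetical names, not any vendor's API.

```python
class SanController:
    """Hypothetical stand-in for the SAN controller acting as tie-breaker."""

    def __init__(self, owner):
        self.owner = owner        # node currently holding the vDisks
        self.connected = {owner}  # nodes the controller can currently see

    def node_connected(self, node):
        self.connected.add(node)

    def node_disconnected(self, node):
        self.connected.discard(node)

    def request_reassignment(self, requester):
        """Hand over the vDisks only if the current owner is gone from the SAN."""
        if requester not in self.connected:
            return False
        if self.owner != requester and self.owner in self.connected:
            # Primary is still attached: refuse, so two nodes never
            # write to the same vDisks (split brain).
            return False
        self.owner = requester
        return True


def on_heartbeat_lost(controller, standby):
    """The standby restarts the app only after the controller grants the disks."""
    if controller.request_reassignment(standby):
        return "restart app on " + standby
    return "stay passive"
```

If the heartbeat link drops while the primary is still attached to the SAN,
`on_heartbeat_lost` returns "stay passive"; the standby takes over only once
the controller has seen the primary disconnect.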

(Split-brain is why I've avoided remote auto-restart. If you need
distributed HA, you need to architect for hot-hot distributed
load-balancing -- not easily retrofitted to monolithic legacy apps!)

My two cents: I saw more failures caused by Multipath software interacting
with other software, exposing inadequately tested edge cases in the whole
stack, than I saw failures averted by Multipath.

-- 
Bill
@n1vux [email protected]
_______________________________________________
Discuss mailing list
[email protected]
http://lists.blu.org/mailman/listinfo/discuss
