> So, did you make experiments with sudden reboot of one of the nodes with
> simultaneous high load (inserting or updating a lot of records)?
 
Hi Alexey,

That's pretty much the only test we've tried, in several different ways. The
problem is that Firebird is just too reliable, so I don't have a mental model
of how to break it. We've been using it for 15 years and only ever had
problems with the generation of HDDs in the early 00s that reported a
successful write to the OS but cached it forever - specifically Maxtors. Apart
from that, we had a power supply blow once on a 10GB database, which corrupted
just a single record at the moment of death, and all that took to fix was a
careful extract of the data from that table either side of the bad record.

For testing DRBD we've tried pulling power during heavy activity, and then
repeated this with iptables dropping all traffic between the nodes, to
simulate to the secondary the total immediate failure of the primary in a more
test-friendly way. So far Firebird just shrugs a bit and gets back on with the
work on the secondary.

My next test, hopefully tomorrow, will be to turn Forced Writes off and kill
the link in the 5-second window between doing stuff and the OS deciding to do
anything with it, but I think I'm still on a hiding to nothing unless I can
get the packets to drop partway through the splurge of writing.

We are not worried about HA; we are just trying to get real-time replication
for persistence of data - and I've no idea how to kill it!
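
For anyone wanting to repeat the iptables test, here is a rough sketch of the
idea in Python. It is not the exact script we ran: the function names and the
peer address 192.0.2.2 are made up for illustration, and it assumes a plain
iptables setup where dropping INPUT from and OUTPUT to the peer is enough to
cut the DRBD link. It only prints the commands unless dry_run is turned off,
and running them for real needs root.

```python
import subprocess


def partition_rules(peer_ip, action="-I"):
    """Build iptables commands that drop all traffic to and from peer_ip.

    action is "-I" to insert the rules (start the partition) or "-D" to
    delete the same rules again (heal the partition).
    """
    if action not in ("-I", "-D"):
        raise ValueError("action must be '-I' or '-D'")
    return [
        # Drop everything arriving from the peer node.
        ["iptables", action, "INPUT", "-s", peer_ip, "-j", "DROP"],
        # Drop everything we would send to the peer node.
        ["iptables", action, "OUTPUT", "-d", peer_ip, "-j", "DROP"],
    ]


def apply_rules(rules, dry_run=True):
    """Print the commands, or run them when dry_run is False (needs root)."""
    for cmd in rules:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # 192.0.2.2 is a placeholder address for the other DRBD node.
    apply_rules(partition_rules("192.0.2.2"))
```

Calling partition_rules("192.0.2.2", "-D") gives the matching commands to
remove the rules once the test is over.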
 Ian
 
 
