Recently, I added a new server to our network, using the 3Ware RAID controller (the 9500S-4LP card) and 3x140G SATA drives ... overall, the system works, but I'm getting a very odd behaviour that I've never seen before ...

I have a process that runs an rsync from another server to 'duplicate' the VPSs ... a 'live backup' sort of thing ... this is running on all our servers, without incident, *except*, it appears, the SATA server ...

I had disabled it for a time, and just re-enabled it this morning, and somehow or another, it seems to be causing file system corruption ...

As most 'old timers' here know, we use UNIONFS on all our servers ... when the corruption occurs, it looks like the "directory structures" are being changed ... this one is hard to explain :( For example, /usr/local/cyrus/bin has a bunch of binaries in it ... the binaries are kept on the "lower layer", so the upper layer only has a /usr/local/cyrus/bin directory created/ghosted, but no copies of the binaries ... so, when you are in the VPS, and do an ls of that directory, you see:

# ls /usr/local/cyrus/bin
arbitron        cyr_expire      lmtpd           notifyd         smmapd
chk_cyrus       cyrdump         masssievec      pop3d           squatter
ctl_cyrusdb     deliver         master          pop3proxyd      timsieved
ctl_deliver     fud             mbexamine       quota           tls_prune
ctl_mboxlist    imapd           mbpath          reconstruct
cvt_cyrusdb     ipurge          mkimap          sievec

When the 'corruption' happens, those all disappear, almost as if someone did a 'rm -rf' of the directory within the VPS, and then a 'mkdir' ... except that, from what I've been able to tell, it happens randomly, on any of the VPSs, *and* only around the time that the rsync process is running ...

As if, somehow, the rsync is taxing the system and causing bad writes ... but I can't find anything anywhere to indicate a problem ...
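To make the symptom concrete, here is a minimal sketch of a check that compares what the lower layer holds against what is visible through the union mount. The function names and the two directory arguments are placeholders for illustration, not anything from our actual setup, and it assumes nothing in the directory has been deliberately deleted (whited out) in the upper layer, since that would also show up as "missing":

```python
import os
import time


def missing_from_union(lower_dir, union_dir):
    """Return names that exist in the lower layer but are not visible
    when listing the same directory through the union mount."""
    lower = set(os.listdir(lower_dir))
    visible = set(os.listdir(union_dir))
    return sorted(lower - visible)


def check_once(lower_dir, union_dir, now=None):
    """Return a timestamped log line if entries have vanished, else None.

    `now` is seconds since the epoch; None means the current time.
    """
    missing = missing_from_union(lower_dir, union_dir)
    if not missing:
        return None
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
    return (f"{stamp} {union_dir}: {len(missing)} lower-layer entries "
            f"not visible: {' '.join(missing)}")
```

Run from cron every minute or so against a directory like the cyrus bin one, the timestamps on any non-None result could be lined up against the rsync start and end times.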

To "fix" things, I umount the UNIONFS layer, and then do a 'find / cpio' to copy the "top layer" back over to fix the directory structure itself ...

The thing is, I don't even know *where* to begin debugging this issue, since there aren't any errors being reported anywhere ... but maybe someone out there has an idea?
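One possible starting point, offered only as a rough sketch: take a snapshot of the directory listings as seen from inside a VPS before the rsync starts and another after it finishes, then report any directory that had entries before and is empty afterwards. The names `snapshot` and `emptied_dirs` are made up for this example, and walking a whole VPS tree this way adds I/O load of its own, so it may be better pointed at a handful of directories:

```python
import os


def snapshot(root):
    """Map every directory under root to the sorted names it contains."""
    snap = {}
    for dirpath, dirnames, filenames in os.walk(root):
        snap[dirpath] = sorted(dirnames + filenames)
    return snap


def emptied_dirs(before, after):
    """Directories that had entries in `before` and still exist,
    but are empty, in `after`."""
    return sorted(
        d for d, names in before.items()
        if names and d in after and not after[d]
    )
```

A directory that shows up in `emptied_dirs` matches the "rm -rf, then mkdir" pattern described above; one that is absent from the second snapshot altogether would be a different failure.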

thanks ...

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email: [EMAIL PROTECTED]           Yahoo!: yscrappy              ICQ: 7615664
_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable