Disclaimer: These are simply observations based on my experience.

Rsync is quite capable of doing some checking on its own. With the
"--checksum" option, it compares a checksum of the local and remote
copies of each file instead of relying only on file size and
modification date. So you can find some assurance there that what is
local and what is remote are intact and identical. Rsync is a very
popular replication tool, used with great success by many large-scale
software mirrors around the world.
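
For anyone who wants a second check that is independent of rsync, the
same idea can be sketched in a few lines of Python. This is only a
minimal illustration, not rsync's actual algorithm: the function names
(file_digest, compare_trees) and the choice of SHA-256 are assumptions
made for the example, and it expects both trees to be reachable as
local paths (for instance over a network mount).

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_trees(source_root, mirror_root):
    """List files under source_root that are missing or differ in mirror_root.

    Returns (relative_path, reason) tuples; an empty list means every
    source file has a byte-identical copy in the mirror.
    """
    source_root = Path(source_root)
    mirror_root = Path(mirror_root)
    problems = []
    for source_file in sorted(source_root.rglob("*")):
        if not source_file.is_file():
            continue  # skip directories
        relative = source_file.relative_to(source_root)
        mirror_file = mirror_root / relative
        if not mirror_file.is_file():
            problems.append((relative.as_posix(), "missing"))
        elif file_digest(source_file) != file_digest(mirror_file):
            problems.append((relative.as_posix(), "checksum mismatch"))
    return problems
```

Calling compare_trees on the source assetstore and its mirrored copy
returns an empty list when every file matches. Note that the sketch
only walks the source side, so extra files in the mirror go unreported.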

In maintaining [EMAIL PROTECTED] for the last year and working with our System
Administrator to make our SAN solution more stable, he and I came to
the conclusion that corruption is occurring not so much within files
as in the inode table used by Linux to track the existing files,
especially in the case of a server crash. Linux deals with system
crashes/recovery by taking any files that were open at the time of the
crash and moving their inodes into a lost+found directory. When a disk
or network mount crashes, the same thing occurs: we are always able to
see the whole files intact and are primarily just restoring them to
their appropriate locations. When these crashes occur during a write,
it's important to understand that the system is almost always writing
to a buffered address area and only afterwards moving the inode
pointer to that new address. This means that while the new content can
be incomplete, what usually ends up in lost+found is the original
file.
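
The "write somewhere else, then switch the pointer" behaviour has a
well-known application-level analogue: write the new content to a
temporary file and rename it over the original only once it is
complete. The Python sketch below illustrates that pattern. It is not
a description of how any particular Linux filesystem works internally,
and the function name atomic_write is hypothetical.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace the file at path with data (bytes) in a single rename step.

    A crash before the rename leaves the original file untouched; a
    reader never sees a half-written file at path.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the target directory so the rename stays on
    # one filesystem. mkstemp creates it readable by the owner only.
    fd, temp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
            handle.flush()
            os.fsync(handle.fileno())  # push the new bytes to disk first
        os.replace(temp_path, path)  # switch the name to the new content
    except BaseException:
        # Clean up the partial temp file and leave the original alone.
        if os.path.exists(temp_path):
            os.unlink(temp_path)
        raise
```

The point of the design is the ordering: the complete new content is
on disk before the name is switched, so an interrupted write costs you
the update, not the original.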

I'd really like to see more System Admin feedback on these lists
concerning what is really "technically necessary" from an
administration standpoint in terms of data integrity, and what in the
"preservation area" simply duplicates fairly standard processes
already handled in the "Systems Administration" area.

-Mark

On Feb 17, 2007, at 8:03 PM, Hutchinson, Alvin wrote:

> We are running DSpace 1.3.2 on development and production servers.  
> We intend to submit content to the first server (behind the  
> firewall) and mirror the contents to the second server via database  
> dump/restore and assetstore copy via filesync (or rsync).
>
> My question: is there some way of ensuring data integrity or  
> validity in copying the assetstore files? I'm sure there are other  
> sites with a similar setup and it occurs to me that a simple file  
> copy is prone to loss or undetected corruption. No?
>
> Anyone have any best practices they'd like to share regarding  
> mirroring?
>
> Thanks in advance,
>
> ---------------------------------------------------------------------- 
> ---
> Take Surveys. Earn Cash. Influence the Future of IT
> Join SourceForge.net's Techsay panel and you'll get the chance to  
> share your
> opinions on IT & business topics through brief surveys-and earn cash
> http://www.techsay.com/default.php? 
> page=join.php&p=sourceforge&CID=DEVDEV
> _______________________________________________
> DSpace-tech mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/dspace-tech

~~~~~~~~~~~~~
Mark R. Diggory - DSpace Systems Manager
MIT Libraries, Systems and Technology Services
Massachusetts Institute of Technology
Office: E25-131
Phone: (617) 253-1096


