We're planning on using encryption at the filesystem level (whole-disk encryption) and, to be honest, I don't have a mechanism that can produce the changes I'm talking about. Neither does my boss, unfortunately ;) He came along one day and asked, "how do we know when data changed on disk without us doing it?" -- and no, I couldn't get a mechanism out of him then.
I've yet to go through LUCENE-737 (and the Nabble thread it refers to). I'd missed it; thanks for the pointer.

Maliciousness is certainly a possibility, but not a likely one. Because a lot of the data we store is sensitive, we've made sure that the system surrounding the data is secure and that nobody actually has access to the data itself (there's no root access on these boxes, the one user that can log in is jailed, and the network is "secure"). What's more, we hold four copies of the index on four separate boxes, two each in two geographically separated data centers, so whoever wanted to change the data would have to get into both centers and modify all four copies. Any hardware-level fault would also have to operate on all four copies, so that isn't likely either.

What's most likely is a software fault. My thought is to have a separate service running whose sole purpose is to "check data integrity", whatever that means, and which (hopefully) shares little code with our main service. Of course, we still have some third-party code to accommodate (Lucene included), and while it has been reliable so far, we can't rule out future problems.

I suppose the main implementation problem here is that comparing the four copies of the raw index data to each other would operate on a LOT of data. I was wondering if anyone has had success with an implementation that operates on individual documents, groups of documents, or some other, smaller unit of data.

Thanks again, and sorry for leaving the mechanism and encryption details out.

-j

--- Daniel Noll <[EMAIL PROTECTED]> wrote:

> On Friday 03 August 2007 16:03:22 Doron Cohen wrote:
> > What is the anticipated cause of corruption? Malicious?
> > Hardware fault? This somewhat reminds of discussions in
> > the list about encrypting the index. See LUCENE-737
> > and a discussion pointed by it. One of the opinions
> > there was that encryption should be handled at a lower
> > level (OS/FS). Wouldn't that hold here as well?
>
> That's actually a good point. These days we have filesystems like ZFS which
> check for corruption automatically. This should remove a lot of the extra
> digesting work people would otherwise need to do to ensure consistency.
>
> Daniel
>
> --
> Daniel Noll
> Nuix Pty Ltd
> Suite 79, 89 Jones St, Ultimo NSW 2007, Australia    Ph: +61 2 9280 0699
> Web: http://nuix.com/                                Fax: +61 2 9212 6902
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
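For what it's worth, here is a minimal sketch of the kind of check asked about in this thread: comparing replicas on groups of documents first and only then on individual documents, rather than on raw index bytes. Nothing here is a Lucene API. It assumes each replica can be exported as a mapping from a stable document id to that document's stored fields, and the names `doc_digest`, `bucket_digests` and `find_suspect_docs`, the bucket count, and the replica labels are all hypothetical.

```python
import hashlib
import json
from collections import Counter


def doc_digest(fields):
    """SHA-256 over a canonical JSON form of one document's stored fields."""
    canonical = json.dumps(fields, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def bucket_of(doc_id, num_buckets):
    """Assign a document to a bucket using a stable hash of its id."""
    h = hashlib.sha256(doc_id.encode("utf-8")).digest()
    return int.from_bytes(h[:4], "big") % num_buckets


def bucket_digests(replica, num_buckets=16):
    """Return {bucket: digest} summarising every document in one replica."""
    hashers = {}
    for doc_id in sorted(replica):  # sorted so every replica hashes in the same order
        bucket = bucket_of(doc_id, num_buckets)
        hasher = hashers.setdefault(bucket, hashlib.sha256())
        hasher.update(doc_id.encode("utf-8"))
        hasher.update(b"\0")
        hasher.update(doc_digest(replica[doc_id]).encode("ascii"))
    return {bucket: h.hexdigest() for bucket, h in hashers.items()}


def find_suspect_docs(replicas, num_buckets=16):
    """Return {doc_id: [replica names that disagree with the majority]}.

    Only the small bucket digests are compared at first; individual
    documents are re-hashed only inside buckets that do not match.
    A tie (e.g. 2 vs 2) has no real majority and needs a human to look.
    """
    summaries = {name: bucket_digests(r, num_buckets) for name, r in replicas.items()}
    all_buckets = set().union(*(s.keys() for s in summaries.values()))
    suspects = {}
    for bucket in all_buckets:
        if len({s.get(bucket) for s in summaries.values()}) == 1:
            continue  # every replica agrees on this bucket
        doc_ids = set()
        for r in replicas.values():
            doc_ids.update(d for d in r if bucket_of(d, num_buckets) == bucket)
        for doc_id in sorted(doc_ids):
            digests = {
                name: doc_digest(r[doc_id]) if doc_id in r else None
                for name, r in replicas.items()
            }
            majority, _ = Counter(digests.values()).most_common(1)[0]
            odd = sorted(name for name, d in digests.items() if d != majority)
            if odd:
                suspects[doc_id] = odd
    return suspects
```

With four replicas, each box would only need to ship its bucket digests (a few hundred bytes) to the checking service; the per-document pass runs only for buckets that disagree. This catches copies that diverge from each other, but not a software fault that corrupts all four copies identically.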