It has been reported previously that on rare occasions Metakit databases hosted on Windows shared drives can suffer data corruption (http://www.equi4.com/pipermail/metakit/2005-June/002050.html, also http://www.equi4.com/pipermail/metakit/2003-May/001130.html). I have been investigating this and finally managed to get the problem occurring reliably enough to track it down.

Earlier posts on the Metakit list suggested that some other thread might be affecting the file pointer. Metakit provides a good abstraction layer for handling file I/O, and it is possible to replace it. In our case we switched to Windows overlapped I/O, which lets us replace the separate seek and write with a single write that takes a structure holding the seek position. Doing this had no effect on our data corruption.

However, opening the file with different share modes did affect the issue. Metakit uses the C library open() call to open the target file and permits both reading and writing by other processes. On a network share it seems that we can avoid data corruption by opening the file with exclusive access only.

The problem is in the handling of memory mapped files on network shared drives. Metakit reads data by mapping the file read-only and then reading from the mapped memory. When it needs to write data, which only occurs when we call commit, the new data is written into the underlying file using standard C library I/O. The memory map is then discarded and the file remapped read-only as before. This always works fine on local drives. However, on a network shared drive, on some occasions when we remap the file we do not get the correct data. What we in fact end up with is a view of the correct size containing the previously mapped data, but with the tail of the file padded with zeros. If we halt the program at this point and view a hex dump of the file data, we find that the memory mapped data and the file data are not the same. From this point on Metakit is being fed garbage and it all goes downhill from there.

The current fix involves two steps. First, it seems that provided the file is opened for exclusive access the memory mapping problem does not occur. Therefore we have introduced a check using PathIsNetworkPath(), and if the file is remote we set exclusive sharing.
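
To make the write-then-remap cycle and the mismatch described above concrete, here is a minimal sketch in Python (not Metakit's own C++ code). The function names commit_and_remap and view_matches_file are invented for this illustration. It writes with ordinary file I/O, maps the file read-only again, and then compares the mapped bytes with a plain read of the file, which is roughly what comparing the view against a hex dump amounts to. On a healthy file system the two should always agree.

```python
import mmap
import os


def commit_and_remap(path, offset, payload):
    """Write payload with ordinary file I/O, then map the whole file read-only.

    Hypothetical helper: it imitates the commit/remap sequence described
    in the post and is not part of Metakit.
    """
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(payload)
        f.flush()
        os.fsync(f.fileno())
    # A fresh read-only view, taken only after the write has completed.
    # The mapping keeps its own duplicate of the file handle, so the
    # file object can be closed once the view exists.
    with open(path, "rb") as f:
        return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)


def view_matches_file(path, view):
    """Return True if the mapped bytes equal a plain read of the file."""
    with open(path, "rb") as f:
        return f.read() == view[:]
```

A check like view_matches_file() reads the whole file, so it is only useful as a diagnostic while hunting the problem, not as something to run on every commit.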
We did attempt to use LockFile() to take exclusive access to the file just during the commit phase, but for some reason this does not help.

Secondly, we can validate the memory mapped data by checking a pair of file marks written by Metakit at the beginning and the end of the data section. If the marks are invalid then we unmap and stop using memory mapped views. Metakit is able to operate without memory mapping, but with a performance hit. Either of these checks appears to be sufficient, but we are currently using both.

I must point out that this problem occurs fairly rarely. My test is unfortunately not suitable for posting to this list and I have not yet been able to create a simple, reliably failing sample.

Pat Thoyts.
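
For the second check (validating the marks and falling back to plain reads), a rough sketch in Python follows. Everything named here is an assumption made for illustration: HEAD_MARK and TAIL_MARK are placeholder byte strings and not Metakit's real on-disk marks, and CheckedReader and its open_view parameter are invented. It also assumes a non-empty file. The idea is only to show the shape of the logic: map, check both ends, and if either mark is wrong, drop the view and read through ordinary file I/O from then on.

```python
import mmap

# Placeholder marks for illustration only; NOT Metakit's real file format.
HEAD_MARK = b"HEAD"
TAIL_MARK = b"TAIL"


def marks_valid(buf):
    """Check that buf starts with HEAD_MARK and ends with TAIL_MARK."""
    n = len(buf)
    if n < len(HEAD_MARK) + len(TAIL_MARK):
        return False
    head = buf[0:len(HEAD_MARK)]
    tail = buf[n - len(TAIL_MARK):n]
    return head == HEAD_MARK and tail == TAIL_MARK


def map_read_only(path):
    """Map the whole (non-empty) file read-only."""
    with open(path, "rb") as f:
        return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)


class CheckedReader:
    """Hypothetical reader: uses a mapped view only while its marks check out."""

    def __init__(self, path, open_view=map_read_only):
        self.path = path
        self.view = open_view(path)
        if not marks_valid(self.view):
            # The view does not look like the file: unmap and stop using it.
            self.view.close()
            self.view = None

    @property
    def mapped(self):
        return self.view is not None

    def read(self, offset, size):
        if self.view is not None:
            return self.view[offset:offset + size]
        # Slower path: ordinary seek and read on the file itself.
        with open(self.path, "rb") as f:
            f.seek(offset)
            return f.read(size)

    def close(self):
        if self.view is not None:
            self.view.close()
            self.view = None
```

The open_view parameter exists only so that a bad mapping can be simulated on a machine where the real one never fails, for example by passing a function that returns a view whose tail is all zeros.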