It has been reported previously that on rare occasions Metakit databases
hosted on Windows shared drives can suffer data corruption (see
http://www.equi4.com/pipermail/metakit/2005-June/002050.html and
http://www.equi4.com/pipermail/metakit/2003-May/001130.html). I have
been investigating this and finally managed to get the problem occurring
reliably enough to track it down.

Earlier posts on the Metakit list suggested that some other thread might
be moving the file pointer. Metakit provides a good abstraction layer
for file I/O, and it is possible to replace it. In our case we switched
to Windows overlapped I/O, which lets us replace a seek followed by a
write with a single write that takes a structure holding the seek
position. Doing this had no effect on our data corruption. However,
opening the file with different share modes did affect the issue.
Metakit uses the C library open() call to open the target file and
permits both reading and writing by other processes. On a network share
it seems that we can avoid data corruption by opening the file with
exclusive access only.

The problem is in the handling of memory mapped files on network shared
drives. Metakit reads data by mapping the file read-only and then
reading data from the mapped memory. When it needs to write data - which
only occurs when we call commit - the new data is written into the
underlying file using standard C library I/O. The memory map is then
discarded and the file remapped read-only as before.

This always works fine on local drives. On a network shared drive,
however, on some occasions when we remap the file we do not get the
correct data. What we in fact end up with is a view of the correct size
that contains the previously mapped data, with the tail of the file
padded with zeros. If we halt the program at this point and view a hex
dump of the file data, we find that the memory mapped data and the file
data are not the same. From this point on Metakit is being fed garbage
and it all goes downhill from here.
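
As an illustration only, the comparison described above can be sketched
in portable C++. The function names first_mismatch and trailing_zeros
are hypothetical and are not part of Metakit; the sketch assumes the
mapped view and the bytes read back with ordinary file I/O have already
been copied into two buffers.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of Metakit.
// Returns the offset of the first byte at which the two buffers differ,
// or the length of the shorter buffer if their common prefix is identical.
std::size_t first_mismatch(const std::vector<unsigned char>& mapped,
                           const std::vector<unsigned char>& from_file) {
    const std::size_t n =
        mapped.size() < from_file.size() ? mapped.size() : from_file.size();
    for (std::size_t i = 0; i < n; ++i) {
        if (mapped[i] != from_file[i]) {
            return i;
        }
    }
    return n;
}

// Hypothetical helper, not part of Metakit.
// Counts the zero bytes at the end of a buffer. A long run of zeros at
// the end of the mapped view, where the file itself holds data, matches
// the symptom described in the text.
std::size_t trailing_zeros(const std::vector<unsigned char>& buf) {
    std::size_t count = 0;
    while (count < buf.size() && buf[buf.size() - 1 - count] == 0) {
        ++count;
    }
    return count;
}
```

A stale view would show a mismatch offset near the old end of the file
and a trailing zero run covering the newly committed region.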

The current fix involves two steps. First, it seems that provided the
file is opened for exclusive access, the memory mapping problem does not
occur. We have therefore introduced a check using PathIsNetworkPath(),
and if the file is remote we set exclusive sharing. We did attempt to
use LockFile() to take exclusive access to the file just during the
commit phase, but for some reason this does not help.
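
A minimal sketch of the first step follows. It is not the actual patch:
it assumes the file is opened with the Win32 CreateFile() call rather
than the C library open() that Metakit uses, and the names
share_mode_for and open_datafile are hypothetical. The Windows-specific
part is guarded so that the share-mode decision can be compiled and
checked anywhere.

```cpp
#include <cstdint>

#ifdef _WIN32
#include <windows.h>
#include <shlwapi.h>   // PathIsNetworkPathA; link with shlwapi.lib
#endif

// These values mirror the Win32 FILE_SHARE_READ and FILE_SHARE_WRITE bits.
constexpr std::uint32_t kShareRead  = 0x1;
constexpr std::uint32_t kShareWrite = 0x2;

// Hypothetical helper: pick the share mode for a datafile.
// Remote files get 0 (exclusive access); local files keep the permissive
// read/write sharing that the text says Metakit normally allows.
std::uint32_t share_mode_for(bool is_network_path) {
    return is_network_path ? 0u : (kShareRead | kShareWrite);
}

#ifdef _WIN32
static_assert(kShareRead == FILE_SHARE_READ, "share bit mismatch");
static_assert(kShareWrite == FILE_SHARE_WRITE, "share bit mismatch");

// Hypothetical helper: open an existing datafile, exclusively if remote.
// Returns INVALID_HANDLE_VALUE on failure, as CreateFileA does.
HANDLE open_datafile(const char* path) {
    const bool remote = PathIsNetworkPathA(path) != FALSE;
    return CreateFileA(path,
                       GENERIC_READ | GENERIC_WRITE,
                       share_mode_for(remote),
                       NULL,                   // default security
                       OPEN_EXISTING,          // assumption: file exists
                       FILE_ATTRIBUTE_NORMAL,
                       NULL);
}
#endif
```

With a share mode of 0, any other attempt to open the file fails with a
sharing violation for as long as the handle is held, which is the price
of this workaround.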

Secondly, we can validate the memory mapped data by checking a pair of
file marks written by Metakit at the beginning and the end of the data
section. If the marks are invalid then we unmap and stop using memory
mapped views. Metakit is able to operate without memory mapping, but
with a performance hit. Either of these measures appears to be
sufficient on its own, but we are currently using both.
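
The mark check can be sketched along these lines. This is an
illustration under assumptions, not the code we use: the Mark type and
marks_look_valid are hypothetical names, and the actual marker bytes and
their sizes are left as parameters rather than taken from the Metakit
file format.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical description of a marker: a pointer to the expected bytes
// and their count. The real Metakit marks are not reproduced here.
struct Mark {
    const unsigned char* bytes;
    std::size_t size;
};

// Hypothetical helper: returns true if the view starts with the head
// mark and ends with the tail mark. A view whose tail has been replaced
// by zeros fails the second comparison.
bool marks_look_valid(const unsigned char* view, std::size_t len,
                      const Mark& head, const Mark& tail) {
    if (view == nullptr || len < head.size + tail.size) {
        return false;  // too short to hold both marks
    }
    return std::memcmp(view, head.bytes, head.size) == 0 &&
           std::memcmp(view + len - tail.size, tail.bytes, tail.size) == 0;
}
```

If the check fails, the caller would drop the view and fall back to
ordinary file reads, as described above.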

I must point out that this problem occurs fairly rarely. My test is
unfortunately not suitable for posting to this list, and I have not yet
been able to create a simple, reliably failing sample.

Pat Thoyts.
