Author: stefan2
Date: Sun Aug 24 22:05:30 2014
New Revision: 1620208
URL: http://svn.apache.org/r1620208
Log:
On the revprop-caching-ng branch: Some cleanup and doc update.
No functional change.
* subversion/libsvn_fs_fs/revprops.c
(ATOMIC_REVPROP_GENERATION,
ATOMIC_REVPROP_TIMEOUT,
ATOMIC_REVPROP_NAMESPACE): Remove these obsolete defines.
(Revprop caching management): Document the new intended behavior.
Modified:
subversion/branches/revprop-caching-ng/subversion/libsvn_fs_fs/revprops.c
Modified:
subversion/branches/revprop-caching-ng/subversion/libsvn_fs_fs/revprops.c
URL:
http://svn.apache.org/viewvc/subversion/branches/revprop-caching-ng/subversion/libsvn_fs_fs/revprops.c?rev=1620208&r1=1620207&r2=1620208&view=diff
==============================================================================
--- subversion/branches/revprop-caching-ng/subversion/libsvn_fs_fs/revprops.c
(original)
+++ subversion/branches/revprop-caching-ng/subversion/libsvn_fs_fs/revprops.c
Sun Aug 24 22:05:30 2014
@@ -41,12 +41,6 @@
process got aborted and that we have re-read revprops. */
#define REVPROP_CHANGE_TIMEOUT (10 * 1000000)
-/* The following are names of atomics that will be used to communicate
- * revprop updates across all processes on this machine. */
-#define ATOMIC_REVPROP_GENERATION "rev-prop-generation"
-#define ATOMIC_REVPROP_TIMEOUT "rev-prop-timeout"
-#define ATOMIC_REVPROP_NAMESPACE "rev-prop-atomics"
-
svn_error_t *
svn_fs_fs__upgrade_pack_revprops(svn_fs_t *fs,
svn_fs_upgrade_notify_t notify_func,
@@ -143,57 +137,44 @@ svn_fs_fs__upgrade_cleanup_pack_revprops
/* Revprop caching management.
*
- * ### TODO ### Update!
- *
* Mechanism:
* ----------
*
* Revprop caching needs to be activated and will be deactivated for the
* respective FS instance if the necessary infrastructure could not be
- * initialized. In deactivated mode, there is almost no runtime overhead
- * associated with revprop caching. As long as no revprops are being read
- * or changed, revprop caching imposes no overhead.
+ * initialized. As long as no revprops are being read or changed, revprop
+ * caching imposes no overhead.
*
* When activated, we cache revprops using (revision, generation) pairs
* as keys with the generation being incremented upon every revprop change.
* Since the cache is process-local, the generation needs to be tracked
* for at least as long as the process lives but may be reset afterwards.
*
- * To track the revprop generation, we use two-layer approach. On the lower
- * level, we use named atomics to have a system-wide consistent value for
- * the current revprop generation. However, those named atomics will only
- * remain valid for as long as at least one process / thread in the system
- * accesses revprops in the respective repository. The underlying shared
- * memory gets cleaned up afterwards.
- *
- * On the second level, we will use a persistent file to track the latest
- * revprop generation. It will be written upon each revprop change but
- * only be read if we are the first process to initialize the named atomics
- * with that value.
- *
- * The overhead for the second and following accesses to revprops is
- * almost zero on most systems.
- *
- *
- * Tech aspects:
- * -------------
- *
- * A problem is that we need to provide a globally available file name to
- * back the SHM implementation on OSes that need it. We can only assume
- * write access to some file within the respective repositories. Because
- * a given server process may access thousands of repositories during its
- * lifetime, keeping the SHM data alive for all of them is also not an
- * option.
- *
- * So, we store the new revprop generation on disk as part of each
- * setrevprop call, i.e. this write will be serialized and the write order
- * be guaranteed by the repository write lock.
- *
- * The only racy situation occurs when the data is being read again by two
- * processes concurrently but in that situation, the first process to
- * finish that procedure is guaranteed to be the only one that initializes
- * the SHM data. Since even writers will first go through that
- * initialization phase, they will never operate on stale data.
+ * We track the revprop generation in a persistent, unbuffered file that
+ * we may keep open for the lifetime of the svn_fs_t. It is the OS'
+ * responsibility to provide us with the latest contents upon read. To
+ * detect incomplete updates due to non-atomic reads, we put a MD5 checksum
+ * next to the actual generation number and verify that it matches.
+ *
+ * Since we cannot guarantee that the OS will provide us with up-to-date
+ * data buffers for open files, we re-open and re-read the file before
+ * modifying it. This will prevent lost updates.
+ *
+ * A race condition exists between switching to the modified revprop data
+ * and bumping the generation number. In particular, the process may crash
+ * just after switching to the new revprop data and before bumping the
+ * generation. To be able to detect this scenario, we bump the generation
+ * twice per revprop change: once immediately before (creating an odd number)
+ * and once after the atomic switch (even generation).
+ *
+ * A writer holding the write lock can immediately assume a crashed writer
+ * in case of an odd generation or they would not have been able to acquire
+ * the lock. A reader detecting an odd generation will use that number and
+ * be forced to re-read any revprop data - usually getting the new revprops
+ * already. If the generation file modification timestamp is too old, the
+ * reader will assume a crashed writer, acquire the write lock and bump
+ * the generation if it is still odd. So, for about REVPROP_CHANGE_TIMEOUT
+ * after the crash, reader caches may be stale.
*/
/* Make sure the revprop_generation member in FS is set.