Saw the fixes -- looks good, thanks.
Daniel Shahaf wrote on Sun, Feb 12, 2012 at 04:42:08 +0200: > Stefan Fuhrmann wrote on Sun, Feb 12, 2012 at 03:06:31 +0100: > > On 09.02.2012 16:05, Daniel Shahaf wrote: > > >stef...@apache.org wrote on Wed, Feb 08, 2012 at 00:44:26 -0000: > > >>Author: stefan2 > > >>Date: Wed Feb 8 00:44:26 2012 > > >>New Revision: 1241718 > > >> > > >>URL: http://svn.apache.org/viewvc?rev=1241718&view=rev > > >>Log: > > >>Major improvement in delta window handling: Cache intermediate > > >>combined delta windows such that changes "close by" won't need > > >>to discover and read the whole chain again. > > >> > > >>For algorithms that traverse history linearly, this optimization > > >>gives delta combination an amortized constant runtime. > > >> > > >>For now, we only cache representations< 100kB. Support for larger > > >>reps can be added later. > > >> > > >>* subversion/libsvn_fs_fs/fs.h > > >> (fs_fs_data_t): add cache for combined windows > > >>* subversion/libsvn_fs_fs/caching.c > > >> (svn_fs_fs__initialize_caches): initialize that cache > > >> > > >>* subversion/libsvn_fs_fs/fs_fs.c > > >> (rep_state): add reference to new cache > > >> (create_rep_state_body): init that reference > > >> (rep_read_baton): add reference to cached base window > > >> (get_cached_combined_window, set_cached_combined_window): > > >> new utility functions > > >> (build_rep_list): terminate delta chain early if cached > > >> base window is available > > >> (rep_read_get_baton): adapt caller > > >> (get_combined_window): re-implement > > >> (get_contents): handle new special case; adapt to > > >> get_combined_window() signature changes > > >> > > >>Modified: > > >> subversion/trunk/subversion/libsvn_fs_fs/caching.c > > >> subversion/trunk/subversion/libsvn_fs_fs/fs.h > > >> subversion/trunk/subversion/libsvn_fs_fs/fs_fs.c > > >> > > >I haven't reviewed this, but a question: > > > > > >>+/* Read the WINDOW_P for the rep state RS from the current FSFS session's > > >>+ * cache. This will be a no-op and IS_CACHED will be set to FALSE if no > > >>+ * cache has been given. If a cache is available IS_CACHED will inform > > >>+ * the caller about the success of the lookup. Allocations (of the window > > >>+ * in particualar) will be made from POOL. > > >>+ */ > > >>+static svn_error_t * > > >>+get_cached_combined_window(svn_stringbuf_t **window_p, > > >>+ struct rep_state *rs, > > >>+ svn_boolean_t *is_cached, > > >>+ apr_pool_t *pool) > > >>+{ > > >>+ if (! rs->combined_cache) > > >>+ { > > >>+ /* txdelta window has not been enabled */ > > >>+ *is_cached = FALSE; > > >>+ } > > >>+ else > > >>+ { > > >>+ /* ask the cache for the desired txdelta window */ > > >>+ return svn_cache__get((void **)window_p, > > >>+ is_cached, > > >>+ rs->combined_cache, > > >>+ get_window_key(rs, rs->start, pool), > > >How does the cache key identify the particular combined window being > > >cached? > > > > Undeltified windows use the same key as their deltified > > representation; basically revision file + offset. The distinction > > between deltified and un-deltified is made by the cache > > instance prefix. > > What revision file and what offset, and how do they related to the > window object contained in the cache? > > (I'm going to guess that the key is the rev-file/offset of a rep that > generates the same fulltext as the cached window; but I shouldn't have > to guess.) > > > >get_window_key() may return "". If it returns "" when called as an > > >argument to svn_cache__set() and then also here, won't the cache return > > >a wrong result? > > > > > There is a comment in get_window_key() for this case. > > "" will only be returned if we can't get the name of the > > open APR file. This is virtually impossible. If it happens > > anyways, it will hit the deltified window cache first, we will > > report a repository corruption. > > > > In plain English: there is an unlikely, but not impossible, scenario > where the only thing between your new code and silent corruption > (specifically: incorrect retrieval of a fulltext) is the order of > lookups in two different caches. > > That sounds awfully brittle to me, and the sensitivity of the lookup > order is not documented anywhere. > > > But maybe we should change the cache API definition > > to support and reject NULL keys. get_window_key() could > > then return simply NULL and could do with fewer assumptions. > > > > What does "support and reject" mean? That trying to get(key=NULL) > always returns "not found" and trying to set(key=NULL) doesn't change > the cache's state? > > > -- Stefan^2. > >