stef...@apache.org wrote on Mon, Dec 26, 2011 at 23:37:25 -0000: > Author: stefan2 > Date: Mon Dec 26 23:37:25 2011 > New Revision: 1224836 > > URL: http://svn.apache.org/viewvc?rev=1224836&view=rev > Log: > Tune FSFS deltification stratey: Use linear deltification on the very top of > the > deltification history and skip-delta only for larger distances. Most of the > runtime > overhead is being counterbalanced by reduced I/O and masked by our membuffer > caching. > OTOH, we examine skip delta ranges linearly during commit. Therefore, runtime > becomes an issue when committing nodes with very deep change histories. > This patch simply limits the deltification history to some reasonable length. > > * subversion/libsvn_fs_fs/fs_fs.c > (SVN_FS_FS_MAX_LINEAR_DELTIFICATION, > SVN_FS_FS_MAX_DELTIFICATION_WALK): new tuning parameters > (choose_delta_base): implement the new deltification strategy > * notes/knobs > document the new defines > > Modified: > subversion/trunk/notes/knobs > subversion/trunk/subversion/libsvn_fs_fs/fs_fs.c > > Modified: subversion/trunk/notes/knobs > URL: > http://svn.apache.org/viewvc/subversion/trunk/notes/knobs?rev=1224836&r1=1224835&r2=1224836&view=diff > ============================================================================== > --- subversion/trunk/notes/knobs (original) > +++ subversion/trunk/notes/knobs Mon Dec 26 23:37:25 2011 > @@ -33,6 +33,8 @@ DEFAULT_HTTP_LIBRARY > MAX_SECS_TO_LINGER > SUFFIX_LINES_TO_KEEP > SVN_FS_FS_DEFAULT_MAX_FILES_PER_DIR > +SVN_FS_FS_MAX_LINEAR_DELTIFICATION > +SVN_FS_FS_MAX_DELTIFICATION_WALK > SVN_UNALIGNED_ACCESS_IS_OK > > 2.2 Features > @@ -125,7 +127,25 @@ SVN_I_LIKE_LATENCY_SO_IGNORE_HTTPV2 > Range: natural integers > Suggested: 1, 2, 3, 4, 5, 7, 11 > > -3.6 SVN_UNALIGNED_ACCESS_IS_OK > +3.6 SVN_FS_FS_MAX_LINEAR_DELTIFICATION > + > + Scope: libsvn_fs_fs > + Purpose: max length + 1 of the linear deltification history > + before skip-deltification kicks in > + Default: 16 > + Range: natural integers > + Suggested: 2, 4, 8, 16, 32, 64 > + > +3.7 SVN_FS_FS_MAX_DELTIFICATION_WALK > + > + Scope: libsvn_fs_fs > + Purpose: max skip deltification range. Change histories > + longer than that will be restarted with a fulltext. > + Default: 1023 > + Range: natural integers > + Suggested: 1, 2, 3, 4, 5, 7, 11 > +
The suggested values (and Range) for both of these are inconsistent with the comments at the definitions (which want a power of two, or a power of two minus one). > +3.8 SVN_UNALIGNED_ACCESS_IS_OK > > Scope: (global) > Purpose: enable data accesss optimizations. > > Modified: subversion/trunk/subversion/libsvn_fs_fs/fs_fs.c > URL: > http://svn.apache.org/viewvc/subversion/trunk/subversion/libsvn_fs_fs/fs_fs.c?rev=1224836&r1=1224835&r2=1224836&view=diff > ============================================================================== > --- subversion/trunk/subversion/libsvn_fs_fs/fs_fs.c (original) > +++ subversion/trunk/subversion/libsvn_fs_fs/fs_fs.c Mon Dec 26 23:37:25 2011 > @@ -78,6 +78,20 @@ > #define SVN_FS_FS_DEFAULT_MAX_FILES_PER_DIR 1000 > #endif > > +/* Begin deltification after a node history exceeded this this limit. > + Useful values are 4 to 64 with 16 being a good compromise between > + computational overhead and pository size savings. > + Should be a power of 2. > + Values < 2 will result in standard skip-delta behavior. */ > +#define SVN_FS_FS_MAX_LINEAR_DELTIFICATION 16 > + > +/* Finding a deltification base takes operations proportional to the > + number of changes being skipped. To prevent exploding runtime > + during commits, limit the deltification range to this value. > + Should be a power of 2 minus one. > + Values < 1 disable deltification. */ > +#define SVN_FS_FS_MAX_DELTIFICATION_WALK 1023 > + > /* Following are defines that specify the textual elements of the > native filesystem directories and revision files. */ > > @@ -5186,6 +5200,7 @@ choose_delta_base(representation_t **rep > apr_pool_t *pool) > { > int count; > + int walk; > node_revision_t *base; > > /* If we have no predecessors, then use the empty stream as a > @@ -5203,6 +5218,23 @@ choose_delta_base(representation_t **rep > count = noderev->predecessor_count; > count = count & (count - 1); > > + /* We use skip delta for limiting the number of delta operations > + along very long node histories. Close to HEAD however, we create > + a linear history to minimize delta size. */ > + walk = noderev->predecessor_count - count; > + if (walk < SVN_FS_FS_MAX_LINEAR_DELTIFICATION) > + count = noderev->predecessor_count - 1; > + So, before this change the number of delta steps of the Nth revision was equal the number of 1-bits in the binary representation of predecessor_count, with this change it's equal to (predecessor_count & 15) + (the number of 1-bits in the binary representation of (predecessor_count >> 4)), right? In other words, this change doesn't affect the high level structure of the deltas tree (except it makes each leaf a 15-deltas chain), but it trades disk space savings for more delta combinations in historical cats (eg, revision 16n+8 will need X+8 deltas, instead of X+1.) Are there scenarios in which this change is significant? What about people who store large binaries that change frequently in svn, should we be recommending them to set SVN_FS_FS_MAX_LINEAR_DELTIFICATION to a small number? Should we be making these fsfs.conf settings, instead of compile-time macros? > + /* Finding the delta base over a very long distance can become extremely > + expensive for very deep histories, possibly causing client timeouts etc. > + OTOH, this is a rare operation and its gains are minimal. Lets simply > + start deltification anew close every other 1000 changes or so. */ > + if (walk > SVN_FS_FS_MAX_DELTIFICATION_WALK) > + { > + *rep = NULL; > + return SVN_NO_ERROR; > + } > + > /* Walk back a number of predecessors equal to the difference > between count and the original predecessor count. (For example, > if noderev has ten predecessors and we want the eighth file rev, > >