On Fri, Sep 3, 2010 at 8:39 AM, Greg Stein <gst...@gmail.com> wrote: > It "should" already be faster. Obviously, that's not the case.
I just spent a little bit time with Shark and gdb. A cold run of 'svn st' against Subversion trunk checkouts for 1.6 yields 0.402 seconds and 1.7 is 0.919 seconds. Hot runs for 1.6 are about 0.055 seconds with 1.7 at 0.750 seconds. One striking difference in the perf profile between 1.6 & trunk is that we seem to do a larger amount of stat() calls in 1.7. >From looking at the traces and code, I *think* svn_wc__db_pdh_parse_local_abspath's call to svn_io_check_special_path may be in play here: /* ### at some point in the future, we may need to find a way to get ### rid of this stat() call. it is going to happen for EVERY call ### into wc_db which references a file. calls for directories could ### get an early-exit in the hash lookup just above. */ SVN_ERR(svn_io_check_special_path(local_abspath, &kind, &special /* unused */, scratch_pool)); I see this stat() call getting called approximately seven times *per* file in a WC - that can't be good. The patch below brings us down to one invocation of this particular svn_io_check_special_path call per status run. (If anyone would like the stacktraces, I can send 'em along. If we could somehow stop asking for this seven times per file, that'd be good too. *grin*) Shark says this patch saves us about 2.5% time-wise - not a whole lot - but, something like this is likely needed to relieve pressure on the I/O subsystem. (I haven't rigorously tested this patch to see if it breaks anything else - hope someone more familiar with libsvn_wc will see if this is useful.) I'll keep looking through the timing profiles to see what else I can uncover. -- justin --- Remember WC files we've seen before to save on stat() calls. * subversion/libsvn_wc/wc_db_private.h (svn_wc__db_t): Add a hash table to remember file entries. * subversion/libsvn_wc/wc_db_pdh.c (svn_wc__db_open): Create file_data hash. (svn_wc__db_pdh_parse_local_abspath): Use hash to save stat() lookups. Index: wc_db_pdh.c =================================================================== --- wc_db_pdh.c (revision 992534) +++ wc_db_pdh.c (working copy) @@ -216,6 +216,7 @@ svn_wc__db_open(svn_wc__db_t **db, (*db)->auto_upgrade = auto_upgrade; (*db)->enforce_empty_wq = enforce_empty_wq; (*db)->dir_data = apr_hash_make(result_pool); + (*db)->file_data = apr_hash_make(result_pool); (*db)->state_pool = result_pool; return SVN_NO_ERROR; @@ -424,10 +425,21 @@ svn_wc__db_pdh_parse_local_abspath(svn_wc__db_pdh_ return SVN_NO_ERROR; } - /* ### at some point in the future, we may need to find a way to get - ### rid of this stat() call. it is going to happen for EVERY call - ### into wc_db which references a file. calls for directories could - ### get an early-exit in the hash lookup just above. */ + /* Have we already successfully seen this file before? */ + *pdh = apr_hash_get(db->file_data, local_abspath, APR_HASH_KEY_STRING); + if (*pdh != NULL && (*pdh)->wcroot != NULL) { + const char *local_abspath_parent, *dir_relpath; + svn_dirent_split(&local_abspath_parent, &build_relpath, local_abspath, + scratch_pool); + + /* Stashed directory's local_relpath + basename. */ + dir_relpath = svn_wc__db_pdh_compute_relpath(*pdh, NULL); + *local_relpath = svn_relpath_join(dir_relpath, + build_relpath, + result_pool); + return SVN_NO_ERROR; + } + SVN_ERR(svn_io_check_special_path(local_abspath, &kind, &special /* unused */, scratch_pool)); if (kind != svn_node_dir) @@ -440,7 +452,8 @@ svn_wc__db_pdh_parse_local_abspath(svn_wc__db_pdh_ For both of these cases, strip the basename off of the path and move up one level. Keep record of what we strip, though, since we'll need it later to construct local_relpath. */ - svn_dirent_split(&local_abspath, &build_relpath, local_abspath, + const char *local_abspath_parent; + svn_dirent_split(&local_abspath_parent, &build_relpath, local_abspath, scratch_pool); /* ### if *pdh != NULL (from further above), then there is (quite @@ -448,7 +461,8 @@ svn_wc__db_pdh_parse_local_abspath(svn_wc__db_pdh_ ### clear it out? but what if there is an access baton? */ /* Is this directory in our hash? */ - *pdh = apr_hash_get(db->dir_data, local_abspath, APR_HASH_KEY_STRING); + *pdh = apr_hash_get(db->dir_data, local_abspath_parent, + APR_HASH_KEY_STRING); if (*pdh != NULL && (*pdh)->wcroot != NULL) { const char *dir_relpath; @@ -458,6 +472,8 @@ svn_wc__db_pdh_parse_local_abspath(svn_wc__db_pdh_ *local_relpath = svn_relpath_join(dir_relpath, build_relpath, result_pool); + apr_hash_set(db->file_data, local_abspath, APR_HASH_KEY_STRING, + *pdh); return SVN_NO_ERROR; } Index: wc_db_private.h =================================================================== --- wc_db_private.h (revision 992534) +++ wc_db_private.h (working copy) @@ -52,6 +52,10 @@ struct svn_wc__db_t { const char *local_abspath -> svn_wc__db_pdh_t *pdh */ apr_hash_t *dir_data; + /* Map a known working copy file to its parent data. + const char *local_abspath -> svn_wc__db_pdh_t *pdh */ + apr_hash_t *file_data; + /* As we grow the state of this DB, allocate that state here. */ apr_pool_t *state_pool; };