On Mon, Dec 6, 2010 at 4:28 PM, Hyrum K. Wright <hyrum_wri...@mail.utexas.edu> wrote: > On Mon, Dec 6, 2010 at 3:15 PM, Daniel Becroft <djcbecr...@gmail.com> wrote: >> On Mon, Dec 6, 2010 at 8:50 PM, Daniel Becroft <djcbecr...@gmail.com> wrote: >> >>> On Mon, Dec 6, 2010 at 7:13 PM, Daniel Shahaf >>> <d...@daniel.shahaf.name>wrote: >>> >>>> Instead of guessing which function causes the lstat() calls, could we >>>> have a tool tell? >>>> >>>> I've looked at 'ltrace -S', but it seems to me that will only tell us >>>> which public svn_wc_* API is responsible for the calls. Perhaps there >>>> is another tool that can help us follow the entire call chain down to >>>> the lstat() call? >>>> >>>> Daniel >>>> . o O (gdb with a breakpoint in lstat() and some fu to extract >>>> statistics about the stack trace?) >>>> >>> >>> Very true. >>> >>> Okay, after a crash course in using GDB, I've run some further testing. >>> Running with breakpoints set on the lstat() system call, it looks to be the >>> known issue that was originally pointed to. A partial stack trace is below: >>> >>> #0 0x00007ffff644ab45 in __lxstat (vers=<value optimised out>, >>> name=<value optimised out>, buf=0x7fffffffcbf0) >>> at ../sysdeps/unix/sysv/linux/wordsize-64/lxstat.c:38 >>> #1 0x00007ffff69269b6 in lstat (finfo=0x7fffffffccf0, >>> fname=0x6bd638 "/home/djcbecroft/dev/workingcopy/A/beta.txt", >>> wanted=33137, pool=0x6bd588) at /usr/include/sys/stat.h:464 >>> #2 apr_stat (finfo=0x7fffffffccf0, >>> fname=0x6bd638 "/home/djcbecroft/dev/workingcopy/A/beta.txt", >>> wanted=33137, pool=0x6bd588) at file_io/unix/filestat.c:292 >>> #3 0x00007ffff6fb1199 in io_check_path ( >>> path=0x6bd608 "/home/djcbecroft/dev/workingcopy/A/beta.txt", >>> resolve_symlinks=0, is_special_p=0x7fffffffced8, kind=0x7fffffffcedc, >>> pool=0x6bd588) at subversion/libsvn_subr/io.c:222 >>> #4 0x00007ffff6fb1346 in svn_io_check_special_path ( >>> path=0x6bd608 "/home/djcbecroft/dev/workingcopy/A/beta.txt", >>> kind=0x7fffffffcedc, is_special=0x7fffffffced8, pool=0x6bd588) >>> at subversion/libsvn_subr/io.c:283 >>> #5 0x00007ffff794513d in svn_wc__db_pdh_parse_local_abspath ( >>> pdh=0x7fffffffd038, local_relpath=0x7fffffffd030, db=0x64ad38, >>> local_abspath=0x6bd608 "/home/djcbecroft/dev/workingcopy/A/beta.txt", >>> smode=svn_sqlite__mode_readwrite, result_pool=0x6bd588, >>> scratch_pool=0x6bd588) at subversion/libsvn_wc/wc_db_pdh.c:382 >>> #6 0x00007ffff7935d58 in svn_wc__db_read_info (status=0x7fffffffd184, >>> kind=0x7fffffffd188, revision=0x0, repos_relpath=0x0, >>> repos_root_url=0x0, >>> repos_uuid=0x0, changed_rev=0x0, changed_date=0x0, changed_author=0x0, >>> last_mod_time=0x0, depth=0x0, checksum=0x0, translated_size=0x0, >>> target=0x0, changelist=0x0, original_repos_relpath=0x0, >>> original_root_url=0x0, original_uuid=0x0, original_revision=0x0, >>> props_mod=0x0, have_base=0x0, have_work=0x0, conflicted=0x0, lock=0x0, >>> db=0x64ad38, >>> local_abspath=0x6bd608 "/home/djcbecroft/dev/workingcopy/A/beta.txt", >>> result_pool=0x6bd588, scratch_pool=0x6bd588) >>> at subversion/libsvn_wc/wc_db.c:5261 >>> >>> However, I then put the breakpont in the svn_wc__db_read_info function and, >>> just for A/beta.txt, it gets called 7 times, when in terms leads to 7 >>> additional lstat() calls. Looking further the call stack appears as follows: >>> >>> #0 svn_wc__db_read_info (status=0x7fffffffd184, kind=0x7fffffffd188, >>> revision=0x0, repos_relpath=0x0, repos_root_url=0x0, repos_uuid=0x0, >>> changed_rev=0x0, changed_date=0x0, changed_author=0x0, >>> last_mod_time=0x0, >>> depth=0x0, checksum=0x0, translated_size=0x0, target=0x0, >>> changelist=0x0, >>> original_repos_relpath=0x0, original_root_url=0x0, original_uuid=0x0, >>> original_revision=0x0, props_mod=0x0, have_base=0x0, have_work=0x0, >>> conflicted=0x0, lock=0x0, db=0x64ad38, >>> local_abspath=0x6bd608 "/home/djcbecroft/dev/workingcopy/A/beta.txt", >>> result_pool=0x6bd588, scratch_pool=0x6bd588) >>> at subversion/libsvn_wc/wc_db.c:5259 >>> #1 0x00007ffff78fab96 in walker_helper (db=0x64ad38, >>> dir_abspath=0x6ba288 "/home/djcbecroft/dev/workingcopy/A", >>> show_hidden=1, >>> walk_callback=0x7ffff7ba9026 <get_mergeinfo_walk_cb>, >>> walk_baton=0x7fffffffd4e0, depth=svn_depth_infinity, >>> cancel_func=0x4119c3 <svn_cl__check_cancel>, cancel_baton=0x0, >>> scratch_pool=0x6ba208) at subversion/libsvn_wc/node.c:679 >>> #2 0x00007ffff78facb5 in walker_helper (db=0x64ad38, >>> dir_abspath=0x67d258 "/home/djcbecroft/dev/workingcopy", show_hidden=1, >>> >>> walk_callback=0x7ffff7ba9026 <get_mergeinfo_walk_cb>, >>> walk_baton=0x7fffffffd4e0, depth=svn_depth_infinity, >>> cancel_func=0x4119c3 <svn_cl__check_cancel>, cancel_baton=0x0, >>> scratch_pool=0x6a0dc8) at subversion/libsvn_wc/node.c:716 >>> #3 0x00007ffff78faf9a in svn_wc__internal_walk_children (db=0x64ad38, >>> local_abspath=0x67d258 "/home/djcbecroft/dev/workingcopy", >>> show_hidden=1, >>> walk_callback=0x7ffff7ba9026 <get_mergeinfo_walk_cb>, >>> walk_baton=0x7fffffffd4e0, walk_depth=svn_depth_infinity, >>> cancel_func=0x4119c3 <svn_cl__check_cancel>, cancel_baton=0x0, >>> scratch_pool=0x6a0dc8) at subversion/libsvn_wc/node.c:763 >>> #4 0x00007ffff78fb06d in svn_wc__node_walk_children (wc_ctx=0x64ad20, >>> local_abspath=0x67d258 "/home/djcbecroft/dev/workingcopy", >>> show_hidden=1, >>> walk_callback=0x7ffff7ba9026 <get_mergeinfo_walk_cb>, >>> walk_baton=0x7fffffffd4e0, walk_depth=svn_depth_infinity, >>> cancel_func=0x4119c3 <svn_cl__check_cancel>, cancel_baton=0x0, >>> scratch_pool=0x6a0dc8) at subversion/libsvn_wc/node.c:785 >>> >>> Using additional breakpoints, I have verified that >>> svn_wc__node_walk_children, svn_wc__internal_walk_children and walker_helper >>> only get called once. It seems to be an issue within the walker_helper >>> itself. >>> >>> I'll continue looking at it further - let me know if I'm barking up the >>> wrong tree. >>> >>> Cheers, >>> Daniel B. >>> ... who now has a much greater appreciation for those that debug in GDB on >>> a regular basis. >>> >> >> I think I've found the culprit. As we are walking the children >> (svn_wc__internal_walk_children, node.c), we call the walk_callback for each >> node. At the moment, this is the get_mergeinfo_walk_cb function in merge.c. >> Inside this, we call the following: >> >> svn_wc__node_is_status_present >> svn_wc__node_is_status_deleted >> svn_wc__node_is_status_absent >> svn_wc__path_switched >> svn_wc_read_kind >> svn_wc__node_get_depth >> >> Each of these will run a separate svn_wc__db_read_info call, and therefore a >> separate lstat(). There's also a couple of other calls that might call it >> indirectly. > > So it looks an improvement would be something like > svn_wc__get_info_for_merge_callback() which does it all in one go (or > some other less-verbosely-named function).
Picked the low-hanging fruit in r1042892. More work could be done, but I think the problems lie deeper, namely: * read_info() was originally envisioned as being cheap to call. Apparently that isn't the case anymore, part of which is no doubt caused by... * the call to svn_io_check_special_path() in vn_wc__db_pdh_parse_local_abspath(). There is a comment in that file that: /* ### at some point in the future, we may need to find a way to get ### rid of this stat() call. it is going to happen for EVERY call ### into wc_db which references a file. calls for directories could ### get an early-exit in the hash lookup just above. */ It's really unfortunate that every wc_db call on a file which fetches a pdh has to call stat(). I tried to implement an optimization that looks at the parent (which is much more likely to succeed, now that we have single-db), but it failed for externals (which have their own wc.db databases). We could work around that, but to do so, we'd need to know if we are looking at a file or a directory...and the way to do that is by calling stat(). Improvements, patches, suggestions encouraged. -Hyrum