I'll repost with more specifics once I have them, but for now, I'm just asking 
for advice on how to get better specifics.

There is some sort of problem, where sometimes, a commit or other operation 
which should take ~10sec instead requires ~15min.  It is reproducible, but it 
depends on the data being committed, and currently the data being committed is 
private, so I can't demonstrate the problem to the outside world.

I tried reproducing the problem using random data, but it didn't happen.  I 
tried introducing some structure to the random data, but it still didn't happen.

The data in question is a ~45MB data file.  I have several different versions of 
the same file, as generated by engineers who reported the problem.  In an 
attempt to better understand the data structure inside the files, I did a 
separate md5 of every 1M chunk of the file, and then diff'd the md5's and found 
that approx 1 in 20 of the 1M chunks match from version to version, so from 
version to version, some large sections of the file have changed, but it's not 
all changed.  Also, I didn't do any larger or smaller granularity than 1M 
chunks, so it's possible that even within a specific 1M section of the file, 
the data might be unchanged, or just reordered, or shifted or something like 
that ...  When I gzip the files, they compress to approx 20% of their original 
size, which means there's plenty of repeated patterns within the file, even 
within the 1M chunks that have changed from rev to rev.
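
A minimal Python sketch of that chunk comparison, assuming fixed 1M chunks compared position by position. The function names are illustrative, and this is not the exact script that produced the numbers above. Because the comparison is positional, data that is merely shifted by a few bytes shows up as "changed" in every chunk, which is the limitation mentioned above.

```python
import hashlib

CHUNK_SIZE = 1024 * 1024  # 1M chunks, as described above


def chunk_digests(path, chunk_size=CHUNK_SIZE):
    """Return the md5 hex digest of each consecutive fixed-size chunk of a file."""
    digests = []
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            digests.append(hashlib.md5(chunk).hexdigest())
    return digests


def matching_fraction(old_path, new_path, chunk_size=CHUNK_SIZE):
    """Fraction of chunk positions whose digests are identical in both files."""
    old = chunk_digests(old_path, chunk_size)
    new = chunk_digests(new_path, chunk_size)
    total = max(len(old), len(new))
    if total == 0:
        return 1.0  # two empty files are trivially identical
    same = sum(1 for a, b in zip(old, new) if a == b)
    return same / total
```

Running matching_fraction() with a smaller chunk_size would show whether data inside a "changed" 1M chunk is actually unchanged at a finer granularity.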

In order to reproduce the problem, I make a new repo, I do a checkout via 
file:///, I copy rev 1 of some file to the WC, I do an add and 
commit.  It completes in 11sec.  I then overwrite it with rev2, commit, 
overwrite with rev3, etc.  After around rev10 or so, suddenly the commit takes 
15 minutes instead of 10sec.  I destroy my repo and WC and start all over again. 
When it happens, I kill -KILL svn, do a svn cleanup, and attempt the commit 
again.  Once the problem situation is encountered, it doesn't go away until 
after a successfully completed commit.  As long as I interrupt my commit (and 
do a cleanup), even if I overwrite the file with various other new versions and 
attempt the commit, this particular rev is always stuck as a "15min" rev.
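
The reproduction steps above could be scripted roughly as follows, to record how long each commit takes. This is a sketch under assumptions: svn and svnadmin are on the PATH, the directory layout ("repo", "wc") and the file name "datafile" are made up for illustration, and the run parameter exists only so the commands can be swapped out for testing.

```python
import shutil
import subprocess
import time
from pathlib import Path


def time_commits(versions, workdir, name="datafile", run=subprocess.run):
    """Commit each version of a file in turn; return (rev, seconds) per commit."""
    workdir = Path(workdir).resolve()
    repo = workdir / "repo"   # hypothetical layout, not from the original setup
    wc = workdir / "wc"
    wc.mkdir(parents=True, exist_ok=True)  # svn can check out into an empty dir

    run(["svnadmin", "create", str(repo)], check=True)
    # as_uri() yields a file:/// URL, matching the checkout described above
    run(["svn", "checkout", "-q", repo.as_uri(), str(wc)], check=True)

    target = wc / name
    timings = []
    for rev, version in enumerate(versions, start=1):
        shutil.copyfile(version, target)  # overwrite the WC file with this version
        if rev == 1:
            run(["svn", "add", "-q", str(target)], check=True)
        start = time.monotonic()
        run(["svn", "commit", "-q", "-m", f"rev {rev}", str(wc)], check=True)
        timings.append((rev, time.monotonic() - start))
    return timings
```

Calling time_commits() with the list of saved versions, in order, and printing the result should show the jump from roughly 10 seconds to roughly 15 minutes at whichever rev triggers it.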

While it is stuck, svn is 100% cpu bound.  In order to get a better 
understanding of precisely what the problem is (and precisely what svn is doing 
during that time), I have taken the following strategy:

(This is where the question is.)  I am asking you guys if there's any debug 
mode for svn, or any better way to debug.

I went into subversion/svn, and I edited every single .c file.  I put a 
fprintf(stderr,"function name\n"); into every function, just to show me where 
svn is going after it's initiated.  There are a lot of files, and there are a 
lot of functions within those files.  The flow of the program is far from 
straightforward.  So far, I've put in a lot of effort, but I don't have any 
result.  It's bed time.  Tomorrow, unless somebody here offers me any better 
advice, I plan to continue sprinkling printf()'s into the svn source code, 
until I can find what functions or sections the process is spending all of its 
compute cycles in.

People have suggested this is going to be xdelta.  Probably it is.  But it's 
not yet proven.

Thanks for any tips...

