Hi Bert,

Many thanks for the quick response.

> I think you can find a lot of issues similar to your issue in our issue
> tracker.

Searching fail on my part - sorry for re-treading old ground.

> For WC-NG we move all the entries data in a single wc.db file in a .svn
> directory below the root of your working copy. This database is accessed via
> SQLite, so it doesn't need the chunked rewriting or anything of that. (It
> even has in-memory caching and transaction handling, so we don't have to do
> that in Subversion itself any more)

Sounds great. We have quite a deep, dense directory structure, so a full
update (or any walk over the whole working copy) involves accessing
hundreds of subdirectories. Merging is particularly painful. I imagine
this could help a great deal.

> > 2) subversion appears to generate a temporary file in .svn\prop-base\ for
> > every file that's being updated. It's generating filenames sequentially,
> > which means that when 5,800 files are being updated it ends up doing this:
> >
> > file_open tempfile.tmp? Already exists!
> > file_open tempfile.2.tmp? Already exists!
> > file_open tempfile.3.tmp? Already exists!
> > ...some time later
> > file_open tempfile.5800.tmp? Yes!
>
> Wow.
>
> Are you sure that this is in prop-base, not .svn/tmp?

Yes, definitely. Each of these files has an svn:mime-type property of
'application/octet-stream', so I guess that's what triggers it (the
property isn't changing between updates, however).

> For 1.7 we made the tempfilename generator better in guessing new names, but
> for property handling we won't be using files in 1.7. (Looking at these
> numbers and those that follow later in your mail, we might have to look in
> porting some of this back to 1.6).

I'd love to see this in 1.6, as it's biting us quite hard right now - to
the extent that we're seriously discussing moving this stuff out of
version control (which is terrifying). I'm sure we'll switch over to 1.7
as soon as we can, however.

> Properties will be moved in wc.db, to remove the file accesses completely.
> (We can update them with the node information in a single transaction;
> without additional file accesses)

Again, sounds great :)

> > Is there any inherent reason these files need to be generated
> > sequentially?
> > From reading the comments in 'svn_io_open_uniquely_named' it sounds like
> > these files are named sequentially for the benefit of people looking at
> > conflicts in their working directory. As these files are being generated
> > within the 'magic' .svn folder, is there any reason to number them
> > sequentially? Just calling rand() until there were no collisions would
> > probably give a huge increase in performance.
>
> In 1.7 we have a new api that uses a smarter algorithm, but we can't add
> public apis to 1.6 now.

It's a shame that the API would need to change to support this. I suppose
checking whether the tempfile is being generated under '.svn/prop-base'
and using an alternative strategy there is too gross? (I'm half joking)

> > In case it's relevant, I'm using the CollabNet build of subversion on
> > Windows 7 64bit. Here's 'svn --version':
> >
> > C:\dev\CW_br2>svn --version
>
> This issue is actually worse on Windows then on linux, because NTFS is a
> fully transactional filesystem with a more advanced locking handling. And
> for this it needs to do more to open a file. (Some tests I performed 1.5
> year ago indicated that NTFS is more than 100 times slower on handling
> extremely small files, then the EXT3 filesystem on Linux. While througput
> within a single file is not far apart).

Yeah - we're seeing the same issue on some of our Linux boxes. The problem
is still there, but it's not as severe.

Many thanks,
Paul
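
P.S. To make the sequential-versus-random naming question concrete, here is a rough Python sketch. It is not Subversion's code: `open_sequential`, `open_random` and `count_attempts` are hypothetical names made up for illustration, and it assumes every temp file stays in the directory while the next one is created. It only counts how many exclusive-create attempts each naming scheme needs.

```python
import os
import secrets
import tempfile


def _try_create(path):
    """Attempt an exclusive create; return True on success, False if it exists."""
    try:
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    os.close(fd)
    return True


def open_sequential(dirpath, base="tempfile", suffix=".tmp"):
    """Probe tempfile.tmp, tempfile.2.tmp, tempfile.3.tmp, ... in order.

    Returns (path, attempts). Every call restarts from the first name,
    mirroring the probing pattern described in the mail.
    """
    n = 1
    while True:
        name = f"{base}{suffix}" if n == 1 else f"{base}.{n}{suffix}"
        path = os.path.join(dirpath, name)
        if _try_create(path):
            return path, n
        n += 1


def open_random(dirpath, base="tempfile", suffix=".tmp"):
    """Pick a random name and retry only on the (rare) collision.

    Returns (path, attempts).
    """
    attempts = 0
    while True:
        attempts += 1
        name = f"{base}.{secrets.token_hex(8)}{suffix}"
        path = os.path.join(dirpath, name)
        if _try_create(path):
            return path, attempts


def count_attempts(opener, n_files):
    """Create n_files temp files in a scratch directory; return total attempts."""
    with tempfile.TemporaryDirectory() as scratch:
        return sum(opener(scratch)[1] for _ in range(n_files))


if __name__ == "__main__":
    n = 500
    print("sequential attempts:", count_attempts(open_sequential, n))
    print("random attempts:    ", count_attempts(open_random, n))
```

With n files left in place, the sequential scheme makes n(n+1)/2 create attempts in total, while random names should need about n. If all 5,800 temp files really do coexist during an update (an assumption, not something measured here), that would be roughly 16.8 million attempts.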