Re: Anomaly with the new code - Re: git-svn performance
Hin-Tak Leung ht...@users.sourceforge.net wrote:
> Eric Wong normalper...@yhbt.net wrote:
> > Which SVN version are you using? I'm cloning (currently on r373xx) https://svn.r-project.org/R using --stdlayout and unable to see memory growth of the git-svn Perl process beyond 40M (on a 32-bit system).
> >
> > git-svn hit 45M and took 11:44 to finish. My ping time to svn.r-project.org is around 150ms (I'm running this from a server in Fremont, California).
> >
> > I'll keep the repo around and periodically fetch to see how it runs.
>
> I'll apply the 10 patches against 2.1.0 and see then. As I wrote in my last reply, my 3rd clone took about 8 hours to finish, and the max resident size is about 700MB (according to GNU time).

The time command is not a good measurement since it includes child process memory use (which may be file-backed mmap for git repack or git cat-file --batch).

My measurements are just the RSS of the git-svn Perl process (from ps aux or VmRSS in /proc/$PID/status on Linux).

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
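For anyone wanting to repeat the per-process measurement described above, here is a minimal sketch in Python, not anything taken from git-svn itself. It reads the VmRSS line from /proc/<pid>/status on Linux. The function names and the sample status text are made up for illustration; only the VmRSS field and the /proc path come from the message.

```python
import re
from pathlib import Path


def parse_vmrss_kb(status_text):
    """Return the VmRSS value in kB from /proc/<pid>/status text, or None if absent."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def read_vmrss_kb(pid):
    """Read the current resident set size of a single process (Linux only)."""
    return parse_vmrss_kb(Path(f"/proc/{pid}/status").read_text())


# Hypothetical sample in the layout of /proc/<pid>/status; the numbers are invented.
SAMPLE_STATUS = "Name:\tperl\nVmPeak:\t   52000 kB\nVmRSS:\t   46080 kB\nThreads:\t1\n"
```

Calling read_vmrss_kb() with the PID of the git-svn Perl process every few seconds during a fetch would track that process alone, leaving out children such as git repack or git cat-file --batch.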
Re: Anomaly with the new code - Re: git-svn performance
Hin-Tak Leung ht...@users.sourceforge.net wrote:
> On Sat, Oct 25, 2014 00:34 BST Eric Wong wrote:
> > 0006 is insufficient and incompatible with older SVN. I pushed "git-svn: reload RA every log-window-size" (commit dfa72fdb96befbd790f623bb2909a347176753c2) instead which saves much more memory:
>
> it is fetching against the new clone taking twice as long and consuming twice as much memory.

Which SVN version are you using? I'm cloning (currently on r373xx) https://svn.r-project.org/R using --stdlayout and unable to see memory growth of the git-svn Perl process beyond 40M (on a 32-bit system).

I also tried http:// (not https), svn+ssh:// on my local (64-bit) system and did not see memory growth, either:

http://mid.gmane.org/20141027014033.ga4...@dcvr.yhbt.net

I'm using svn 1.6.17 on Debian stable in all cases.
Re: Anomaly with the new code - Re: git-svn performance
Eric Wong normalper...@yhbt.net wrote:
> Which SVN version are you using? I'm cloning (currently on r373xx) https://svn.r-project.org/R using --stdlayout and unable to see memory growth of the git-svn Perl process beyond 40M (on a 32-bit system).

git-svn hit 45M and took 11:44 to finish. My ping time to svn.r-project.org is around 150ms (I'm running this from a server in Fremont, California).

I'll keep the repo around and periodically fetch to see how it runs.
Re: Anomaly with the new code - Re: git-svn performance
On Mon, Oct 27, 2014 06:38 GMT Eric Wong wrote:
> Which SVN version are you using? I'm cloning (currently on r373xx) https://svn.r-project.org/R using --stdlayout and unable to see memory growth of the git-svn Perl process beyond 40M (on a 32-bit system).
>
> I also tried http:// (not https), svn+ssh:// on my local (64-bit) system and did not see memory growth, either:
>
> http://mid.gmane.org/20141027014033.ga4...@dcvr.yhbt.net
>
> I'm using svn 1.6.17 on Debian stable in all cases.

The memory consumption does seem to go up a good deal after r48xxx-ish (the total being about 67xxx-ish now), when there are a fair number of branches. Seeing as you seem to be able to make the memory consumption drop further, I'll rebuild git with dropping/adding those patches now.

I also just realised /usr/bin/time -v git svn fetch --all also includes the periodic auto garbage collection from git itself if fetching more than a number of commits, so it may not be accurate once git svn's memory consumption drops below a certain level. Is there any way of coping with that?

I made a 3rd clone yesterday - it took 8 hours 15 minutes, and

    Command being timed: git svn fetch --all
    User time (seconds): 6897.80
    System time (seconds): 18853.08
    Percent of CPU this job got: 86%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 8:14:00
    ...
    Maximum resident set size (kbytes): 675436

and fetching the next 8 commits:

$ /usr/bin/time -v git svn fetch --all
    M doc/NEWS.Rd
r66871 = 0a7f50fc04dee174229513a0d80fecfcd12975ca (refs/remotes/trunk)
...
    M doc/manual/R-exts.texi
r66879 = ede68f65df714c3ba283579d85105393c1eccc80 (refs/remotes/trunk)
Auto packing the repository in background for optimum performance.
See git help gc for manual housekeeping.
    Command being timed: git svn fetch --all
    User time (seconds): 856.82
    System time (seconds): 29.78
    Percent of CPU this job got: 98%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 15:03.39
    ...
    Maximum resident set size (kbytes): 791088

and quite similar against the 2nd clone, but against the first clone (which was created by fetching every few days over a few years):

    Command being timed: git svn fetch --all
    User time (seconds): 518.00
    System time (seconds): 28.62
    Percent of CPU this job got: 98%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 9:16.84
    ...
    Maximum resident set size (kbytes): 403160

So it seems the first clone is rather different from the recent ones. I haven't got round to comparing the branches yet - it is actually easier than I thought, since I only need to compare the branch HEADs. (I already mentioned that trunk is different, due to a blank vs 3-word commit message about 2 years ago - I reckon I might see similar issues in the other branches - I'll go and write a script to check that now).

All recent fetches were done with git 2.1.0 patched with the 6 patches I mentioned, on Fedora 20 x86_64.

BTW, I have been meaning to ask - are you the same Eric Wong who maintained some Chinese packages on Debian some years ago? :-)
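The branch-HEAD comparison mentioned above could look something like the following Python sketch. It is only an illustration under assumptions, not the script the poster wrote: the function names are invented, and it assumes both clones keep their git-svn refs under refs/remotes (as the refs/remotes/trunk lines in the fetch output suggest). It uses the standard git for-each-ref command.

```python
import subprocess


def remote_heads(repo_path):
    """Map each ref under refs/remotes to the commit ID it points at in one clone."""
    out = subprocess.run(
        ["git", "-C", repo_path, "for-each-ref",
         "--format=%(refname) %(objectname)", "refs/remotes"],
        check=True, capture_output=True, text=True,
    ).stdout
    # Ref names cannot contain spaces, so splitting on the first space is safe.
    return dict(line.split(" ", 1) for line in out.splitlines())


def differing_refs(heads_a, heads_b):
    """Return the sorted refs that are missing on one side or point at different commits."""
    return sorted(
        ref for ref in heads_a.keys() | heads_b.keys()
        if heads_a.get(ref) != heads_b.get(ref)
    )
```

A call such as differing_refs(remote_heads("R"), remote_heads("R-2")) would list the branches whose heads disagree between two clone directories. Since a commit ID depends on every ancestor, a differing head only says the histories diverge somewhere (for example at one altered commit message), not where.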
Re: Anomaly with the new code - Re: git-svn performance
On Mon, Oct 27, 2014 16:56 GMT Eric Wong wrote:
> Eric Wong normalper...@yhbt.net wrote:
> > Which SVN version are you using? I'm cloning (currently on r373xx) https://svn.r-project.org/R using --stdlayout and unable to see memory growth of the git-svn Perl process beyond 40M (on a 32-bit system).
>
> git-svn hit 45M and took 11:44 to finish. My ping time to svn.r-project.org is around 150ms (I'm running this from a server in Fremont, California).
>
> I'll keep the repo around and periodically fetch to see how it runs.

I'll apply the 10 patches against 2.1.0 and see then. As I wrote in my last reply, my 3rd clone took about 8 hours to finish, and the max resident size is about 700MB (according to GNU time).

AFAIK the hosting server is in northern Europe (Copenhagen?), I think, so it is supposed to be faster for me fetching from the UK.
Re: Anomaly with the new code - Re: git-svn performance
Hin-Tak Leung ht...@users.sourceforge.net wrote:
> On Sat, Oct 25, 2014 00:34 BST Eric Wong wrote:
> > Hin-Tak Leung ht...@users.sourceforge.net wrote:
> > > 0006-git-svn-clear-global-SVN-pool-between-get_log-invoca.patch
> >
> > 0006 is insufficient and incompatible with older SVN. I pushed "git-svn: reload RA every log-window-size" (commit dfa72fdb96befbd790f623bb2909a347176753c2) instead which saves much more memory:
>
> it is fetching against the new clone taking twice as long and consuming twice as much memory.

Ugh, I've only tested "git-svn: reload RA every log-window-size" with file:// repos so far, so it looks like I'll need to set up remote repos on my test system to test.
Re: Anomaly with the new code - Re: git-svn performance
On Sat, Oct 25, 2014 00:34 BST Eric Wong wrote:
> Hin-Tak Leung ht...@users.sourceforge.net wrote:
> > I keep tabs on a particular svn repository over many years and run git svn fetch --all every few days. So that's the old clone. Since this discussion started, I made a new one with git 2.1.0 patched with the first two patches below, a couple of weeks ago. And I ran 'git svn fetch --all' on both every few days since. I have added a few more patches, so the whole list is the 6 below against 2.1.0.
> >
> > The latest fetch is really strange - the fetch against the new clone took almost twice as long and uses almost twice as much memory, vs against the old. 17 min, 800 MB vs 10 min, 400 MB. Details below. Maybe this is a performance issue about how the clones were made?
>
> Memory usage seems to grow with the number of revisions fetched, see below. And higher memory means slower fork() on Linux systems.

but this is fetching the same number of revisions, and the same revisions, to keep the two clones in sync. So the issue is about how distant history is stored and used/searched, I think.

> > 0001-git-svn-only-look-at-the-new-parts-of-svn-mergeinfo.patch
> > 0002-git-svn-only-look-at-the-root-path-for-svn-mergeinfo.patch
> > 0003-git-svn-reduce-check_cherry_pick-cache-overhead.patch
> > 0004-git-svn-cache-only-mergeinfo-revisions.patch
> > 0006-git-svn-clear-global-SVN-pool-between-get_log-invoca.patch
>
> 0006 is insufficient and incompatible with older SVN. I pushed "git-svn: reload RA every log-window-size" (commit dfa72fdb96befbd790f623bb2909a347176753c2) instead which saves much more memory:

it is fetching against the new clone taking twice as long and consuming twice as much memory.

> http://mid.gmane.org/20141024225352.gb31...@dcvr.yhbt.net
>
> But there still seems to be some slow growth with many revisions which is not mergeinfo-related.
>
> > 0007-git-svn-remove-mergeinfo-rev-caching.patch
>
> I think it is also safe to remove the _rev_list memoization since it uses a lot of memory. The remaining caches should be tiny (but useful, I think).
Anomaly with the new code - Re: git-svn performance
I keep tabs on a particular svn repository over many years and run git svn fetch --all every few days. So that's the old clone. Since this discussion started, I made a new one with git 2.1.0 patched with the first two patches below, a couple of weeks ago. And I ran 'git svn fetch --all' on both every few days since. I have added a few more patches, so the whole list is the 6 below against 2.1.0.

The latest fetch is really strange - the fetch against the new clone took almost twice as long and uses almost twice as much memory, vs against the old. 17 min, 800 MB vs 10 min, 400 MB. Details below. Maybe this is a performance issue about how the clones were made?

0001-git-svn-only-look-at-the-new-parts-of-svn-mergeinfo.patch
0002-git-svn-only-look-at-the-root-path-for-svn-mergeinfo.patch
0003-git-svn-reduce-check_cherry_pick-cache-overhead.patch
0004-git-svn-cache-only-mergeinfo-revisions.patch
0006-git-svn-clear-global-SVN-pool-between-get_log-invoca.patch
0007-git-svn-remove-mergeinfo-rev-caching.patch

(I dropped #5 because it doesn't seem interesting?)

---
$ /usr/bin/time -v git svn fetch --all
...
    Command being timed: git svn fetch --all
    User time (seconds): 622.20
    System time (seconds): 12.52
    Percent of CPU this job got: 98%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 10:42.21
    Average shared text size (kbytes): 0
    Average unshared data size (kbytes): 0
    Average stack size (kbytes): 0
    Average total size (kbytes): 0
    Maximum resident set size (kbytes): 399588
    Average resident set size (kbytes): 0
    Major (requiring I/O) page faults: 320
    Minor (reclaiming a frame) page faults: 383987
    Voluntary context switches: 2088
    Involuntary context switches: 68304
    Swaps: 0
    File system inputs: 168288
    File system outputs: 148960
    Socket messages sent: 0
    Socket messages received: 0
    Signals delivered: 0
    Page size (bytes): 4096
    Exit status: 0
[Hin-Tak@localhost R]$ cd ../R-2/
[Hin-Tak@localhost R-2]$ /usr/bin/time -v git svn fetch --all
    M src/library/stats/R/hclust.R
    M src/library/stats/R/dendrogram.R
r66853 = 7c18b2e4084529d5912cf789c045f2eab7d4083c (refs/remotes/trunk)
    M doc/manual/R-exts.texi
r66854 = bc7b131e34eaf04859fede1ecedb796c0a33be02 (refs/remotes/trunk)
    M doc/manual/R-exts.texi
Checking svn:mergeinfo changes since r66844: 6 sources, 1 changed
W:svn cherry-pick ignored (/trunk:66824,66854) - missing 1084 commit(s) (eg 6453a2d844e27f2963ba87142028b023c50385ef)
r66855 = de5daf8db948732fa96c3d5b32077d8057e2a7e7 (refs/remotes/R-3-1-branch)
    M src/modules/internet/internet.c
r66856 = a1e9300c6dd49ec4c3dd11f861bca0dbe3ca65b4 (refs/remotes/trunk)
    M doc/manual/R-admin.texi
r66857 = eb5f3175e67a806482c39def71246f5d18bf8660 (refs/remotes/trunk)
    M doc/manual/R-admin.texi
Checking svn:mergeinfo changes since r66855: 6 sources, 1 changed
W:svn cherry-pick ignored (/trunk:66854,66857) - missing 1086 commit(s) (eg e8cc0c31ddeeea3f8fa1ad47105d09a2c19e1a98)
r66858 = 10c8013f103d57c8a717b738e2a51c8d397c88f0 (refs/remotes/R-3-1-branch)
    M VERSION
r66859 = 0f865f247da3191431bb17bcc3c307e8735dbd97 (refs/remotes/R-3-1-branch)
    Command being timed: git svn fetch --all
    User time (seconds): 1023.06
    System time (seconds): 15.30
    Percent of CPU this job got: 99%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 17:27.65
    Average shared text size (kbytes): 0
    Average unshared data size (kbytes): 0
    Average stack size (kbytes): 0
    Average total size (kbytes): 0
    Maximum resident set size (kbytes): 785332
    Average resident set size (kbytes): 0
    Major (requiring I/O) page faults: 884
    Minor (reclaiming a frame) page faults: 527668
    Voluntary context switches: 2792
    Involuntary context switches: 107718
    Swaps: 0
    File system inputs: 194704
    File system outputs: 170032
    Socket messages sent: 0
    Socket messages received: 0
    Signals delivered: 0
    Page size (bytes): 4096
    Exit status: 0
---
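To put numbers on "almost twice", the two reports above can be compared mechanically. The following is a rough Python sketch with invented function names. It pulls the elapsed time and peak RSS out of /usr/bin/time -v output using the field labels shown in the paste, and it is fed the two pairs of values quoted in this message.

```python
import re


def parse_elapsed_seconds(value):
    """Convert GNU time's 'h:mm:ss' or 'm:ss.ff' elapsed string to seconds."""
    seconds = 0.0
    for part in value.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def parse_time_report(text):
    """Return (elapsed seconds, peak RSS in kB) from `/usr/bin/time -v` output."""
    elapsed = re.search(r"Elapsed \(wall clock\) time \(h:mm:ss or m:ss\): (\S+)", text)
    rss = re.search(r"Maximum resident set size \(kbytes\): (\d+)", text)
    return parse_elapsed_seconds(elapsed.group(1)), int(rss.group(1))


# The two relevant lines from each report in this message (old clone, then new clone).
old_clone = ("Elapsed (wall clock) time (h:mm:ss or m:ss): 10:42.21\n"
             "Maximum resident set size (kbytes): 399588\n")
new_clone = ("Elapsed (wall clock) time (h:mm:ss or m:ss): 17:27.65\n"
             "Maximum resident set size (kbytes): 785332\n")

old_secs, old_rss = parse_time_report(old_clone)
new_secs, new_rss = parse_time_report(new_clone)
print(f"time ratio: {new_secs / old_secs:.2f}, RSS ratio: {new_rss / old_rss:.2f}")
```

On these figures the new clone comes out at roughly 1.6 times the wall-clock time and just under 2 times the peak RSS of the old one. The peak RSS here is whatever GNU time reports for the whole command, so it is not necessarily the git-svn Perl process alone.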
Re: Anomaly with the new code - Re: git-svn performance
Hin-Tak Leung ht...@users.sourceforge.net wrote:
> I keep tabs on a particular svn repository over many years and run git svn fetch --all every few days. So that's the old clone. Since this discussion started, I made a new one with git 2.1.0 patched with the first two patches below, a couple of weeks ago. And I ran 'git svn fetch --all' on both every few days since. I have added a few more patches, so the whole list is the 6 below against 2.1.0.
>
> The latest fetch is really strange - the fetch against the new clone took almost twice as long and uses almost twice as much memory, vs against the old. 17 min, 800 MB vs 10 min, 400 MB. Details below. Maybe this is a performance issue about how the clones were made?

Memory usage seems to grow with the number of revisions fetched, see below. And higher memory means slower fork() on Linux systems.

> 0001-git-svn-only-look-at-the-new-parts-of-svn-mergeinfo.patch
> 0002-git-svn-only-look-at-the-root-path-for-svn-mergeinfo.patch
> 0003-git-svn-reduce-check_cherry_pick-cache-overhead.patch
> 0004-git-svn-cache-only-mergeinfo-revisions.patch
> 0006-git-svn-clear-global-SVN-pool-between-get_log-invoca.patch

0006 is insufficient and incompatible with older SVN. I pushed "git-svn: reload RA every log-window-size" (commit dfa72fdb96befbd790f623bb2909a347176753c2) instead which saves much more memory:

http://mid.gmane.org/20141024225352.gb31...@dcvr.yhbt.net

But there still seems to be some slow growth with many revisions which is not mergeinfo-related.

> 0007-git-svn-remove-mergeinfo-rev-caching.patch

I think it is also safe to remove the _rev_list memoization since it uses a lot of memory. The remaining caches should be tiny (but useful, I think).