Re: hg archive (files) performance regression.
On 3/03/2018 1:32 PM, Matt Harbison wrote:
> AFAICT, the code prior to this vfs call didn't do atomic files, and didn't mention why the atomic aspect changed.

I guess it's just something that was overlooked during refactoring to use the vfs?

> It looks like archive doesn't check for an empty directory, but it still seems OK, so go for it.
> https://www.mercurial-scm.org/wiki/ContributingChanges

I'll look into it, but not being a Python developer (C# and Delphi usually) I'm not sure I'm set up to compile hg itself. If all it needs is a one-word change then I was kind of hoping one of the core contributors might look at it ;)

> I did some quick profiling archiving the hg repo, and it looks like a lot of time is spent calculating tag info. Do you see a significant improvement when archiving a tagged revision, and/or using '--config ui.archivemeta=False'? (Ironically, 4.2.2 gave me the worst performance, but it seems like it had something to do with evolve not being able to load?)

I haven't tested it with tags, but what we do is export a specific revision or tip. We also specify --config ui.archivemeta=False (although I didn't in my testing before). I just tried it, and it doesn't seem to make any real difference.

--
Regards
Vincent Parrett
CEO - VSoft Technologies Pty Ltd
https://www.finalbuilder.com
Blog: https://www.finalbuilder.com/resources/blogs
Automate your Software builds with FinalBuilder.
Open Source : https://github.com/VSoftTechnologies
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Re: hg archive (files) performance regression.
On Fri, 02 Mar 2018 02:13:18 -0500, Vincent Parrett wrote:
> On 2/03/2018 4:31 PM, Vincent Parrett wrote:
>> I'm not a python dev (mostly c# and delphi), so still getting my head around the hg code base, but I'm curious why the atomictemp=true is used in fileit.addfile? I get that it's in the vfs to work around file locking issues, but with the archive command with type files, it's likely that the archive is going to an empty target directory and this seems wasteful.
>
> So I just knocked up an extension (ciarchive) using the code from archival.py (hg-stable repo) - and in class fileit.addfile : changed
>
>     f = self.opener(name, "w", atomictemp=True)
>
> to
>
>     f = self.opener(name, "w", atomictemp=False)
>
>     hg.exe archive --time --subrepos --no-decode --quiet c:\temp\archive27
>     time: real 22.224 secs (user 6.203+0.000 sys 12.078+0.000)
>
>     hg.exe ciarchive --time --subrepos --no-decode --quiet c:\temp\archive28
>     time: real 17.316 secs (user 6.609+0.000 sys 7.453+0.000)
>
> The repo has the following files :
>
>     9438 File(s)  531,462,248 bytes
>     2039 Dir(s)
>
> That's a substantial performance increase (our customers have very large repos where this will make a large difference in build times). Of course I'd much rather not be maintaining an extension that uses the internal api of hg, any chance this change can be made in the archive command, or at least be made configurable (assuming this change is safe!)?

AFAICT, the code prior to this vfs call didn't do atomic files, and didn't mention why the atomic aspect changed. It looks like archive doesn't check for an empty directory, but it still seems OK, so go for it.

https://www.mercurial-scm.org/wiki/ContributingChanges

I did some quick profiling archiving the hg repo, and it looks like a lot of time is spent calculating tag info. Do you see a significant improvement when archiving a tagged revision, and/or using '--config ui.archivemeta=False'?

(Ironically, 4.2.2 gave me the worst performance, but it seems like it had something to do with evolve not being able to load?)
Re: hg archive (files) performance regression.
On 2/03/2018 4:31 PM, Vincent Parrett wrote:
> I'm not a python dev (mostly c# and delphi), so still getting my head around the hg code base, but I'm curious why the atomictemp=true is used in fileit.addfile? I get that it's in the vfs to work around file locking issues, but with the archive command with type files, it's likely that the archive is going to an empty target directory and this seems wasteful.

So I just knocked up an extension (ciarchive) using the code from archival.py (hg-stable repo). In the addfile method of class fileit, I changed

    f = self.opener(name, "w", atomictemp=True)

to

    f = self.opener(name, "w", atomictemp=False)

Timings with the stock command and with the extension:

    hg.exe archive --time --subrepos --no-decode --quiet c:\temp\archive27
    time: real 22.224 secs (user 6.203+0.000 sys 12.078+0.000)

    hg.exe ciarchive --time --subrepos --no-decode --quiet c:\temp\archive28
    time: real 17.316 secs (user 6.609+0.000 sys 7.453+0.000)

The repo has the following files:

    9438 File(s)  531,462,248 bytes
    2039 Dir(s)

That's a substantial performance increase (our customers have very large repos where this will make a large difference in build times). Of course, I'd much rather not be maintaining an extension that uses the internal API of hg. Is there any chance this change can be made in the archive command, or at least be made configurable (assuming this change is safe!)?

--
Regards
Vincent Parrett
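For reference, the two timings quoted above can be turned into relative figures with a few lines of Python. This is only arithmetic on the numbers from this message; the helper name `reduction_pct` is made up for illustration, and a single run per command is not a rigorous benchmark.

```python
def reduction_pct(before, after):
    """Percentage by which `after` is lower than `before` (negative = increase)."""
    return (before - after) / before * 100.0


# (real, user, sys) seconds as reported by `--time` in the message above.
stock = {"real": 22.224, "user": 6.203, "sys": 12.078}      # hg archive
patched = {"real": 17.316, "user": 6.609, "sys": 7.453}     # hg ciarchive

for key in ("real", "user", "sys"):
    print("%-4s %+.1f%%" % (key, reduction_pct(stock[key], patched[key])))
```

On these numbers most of the saving shows up in system time, which is at least consistent with fewer filesystem operations per archived file.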
hg archive (files) performance regression.
Somewhere between hg 4.2.2 and 4.5 the archive command (type files, to an empty folder) has gotten around 10-13% slower.

Testing on Windows 10 x64 (latest updates), between two SSD drives:

hg 4.5:

    "C:\Program Files\Mercurial\hg.exe" archive --time --subrepos --no-decode --quiet --profile c:\temp\archive9
    Total time: 20.218750 seconds
    time: real 23.745 secs (user 8.109+0.000 sys 12.688+0.000)

hg 4.2.2:

    "C:\Program Files\Mercurial-422\hg.exe" archive --time --subrepos --no-decode --quiet --profile c:\temp\archive10
    Total time: 15.984375 seconds
    time: real 20.678 secs (user 7.234+0.000 sys 9.297+0.000)

I've confirmed this with a few different repos. The example above has lots of large files; I also tested with others that have thousands of source files, and the slowdown is still around the 10-13% mark.

I'm not a Python dev (mostly C# and Delphi), so I'm still getting my head around the hg code base, but I'm curious why atomictemp=True is used in fileit.addfile? I get that it's in the vfs to work around file locking issues, but with the archive command with type files, it's likely that the archive is going to an empty target directory, and this seems wasteful.

Is there anything else that can be done to speed up the archive (to files) command?

--
Regards
Vincent Parrett
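To make the atomictemp question concrete, here is a minimal Python 3 sketch of the general write-to-temp-then-rename technique next to a plain direct write. It is not Mercurial's vfs code, and the function names `write_direct` and `write_atomic` are hypothetical; it only illustrates why an atomic write costs extra filesystem operations for every file, which matters little for a single file but adds up over thousands of archived files.

```python
import os
import tempfile


def write_direct(path, data):
    """Write data straight to the destination (one create, one write)."""
    with open(path, "wb") as f:
        f.write(data)


def write_atomic(path, data):
    """Write to a temp file in the same directory, then rename it into place.

    Readers never see a half-written file, at the cost of an extra
    temp-file creation and a rename for every file written.
    """
    dirname = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(prefix=".tmp-", dir=dirname)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        os.replace(tmp, path)  # the extra per-file rename
    except Exception:
        # Do not leave the temp file behind if the write or rename failed.
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```

When the target is a fresh, empty directory that nothing else is reading, the atomicity guarantee buys little, which is the point raised above.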