Re: hg archive (files) performance regression.

2018-03-02 Thread Vincent Parrett

On 3/03/2018 1:32 PM, Matt Harbison wrote:
AFAICT, the code prior to this vfs call didn't do atomic files, and 
didn't mention why the atomic aspect changed. 


I guess it's just something that was overlooked during refactoring to 
use the vfs?


It looks like archive doesn't check for an empty directory, but it 
still seems OK, so go for it.


https://www.mercurial-scm.org/wiki/ContributingChanges


I'll look into it, but not being a Python developer (C#, Delphi usually) 
I'm not sure I'm set up to compile hg itself. If all it needs is a one-word 
change then I was kind of hoping one of the core contributors might look at 
it ;)


I did some quick profiling archiving the hg repo, and it looks like a 
lot of time is spent calculating tag info.  Do you see a significant 
improvement when archiving a tagged revision, and/or using '--config 
ui.archivemeta=False'?  (Ironically, 4.2.2 gave me the worst 
performance, but it seems like it had something to do with evolve not 
being able to load?)


I haven't tested it with tags; what we do is export a specific 
revision or tip. We also specify --config ui.archivemeta=False 
(although I didn't in my earlier testing). I just tried it and it 
doesn't seem to make any real difference.
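
For anyone wanting to repeat this comparison, a small sketch of building the two command lines follows. It only uses the options already quoted in this thread; the function name archive_command and its parameters are made up for illustration, and nothing here runs hg by itself.

```python
def archive_command(dest, hg="hg", archivemeta=True):
    """Build an 'hg archive' command line like the ones timed in this thread.

    'dest' is the target directory. When archivemeta is False, the
    '--config ui.archivemeta=False' override discussed above is added.
    The helper itself is hypothetical; only the flags come from the thread.
    """
    cmd = [hg, "archive", "--time", "--subrepos", "--no-decode", "--quiet"]
    if not archivemeta:
        cmd += ["--config", "ui.archivemeta=False"]
    cmd.append(dest)
    return cmd


# Two variants to compare, each archiving into its own empty directory.
with_meta = archive_command(r"c:\temp\archive_meta")
without_meta = archive_command(r"c:\temp\archive_nometa", archivemeta=False)
print(" ".join(with_meta))
print(" ".join(without_meta))
```

Each list can be passed to subprocess.run() as-is; using a fresh target directory per run keeps the comparison fair.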


--
Regards

Vincent Parrett

CEO - VSoft Technologies Pty Ltd
https://www.finalbuilder.com
Blog: https://www.finalbuilder.com/resources/blogs
Automate your Software builds with FinalBuilder.
Open Source : https://github.com/VSoftTechnologies


___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: hg archive (files) performance regression.

2018-03-02 Thread Matt Harbison
On Fri, 02 Mar 2018 02:13:18 -0500, Vincent Parrett wrote:



On 2/03/2018 4:31 PM, Vincent Parrett wrote:
I'm not a python dev (mostly c# and delphi), so still getting my head  
around the hg code base, but I'm curious why atomictemp=True is  
used in fileit.addfile? I get that it's in the vfs to work around file  
locking issues, but with the archive command with type files, it's  
likely that the archive is going to an empty target directory and this  
seems wasteful.


So I just knocked up an extension (ciarchive) using the code from  
archival.py (hg-stable repo) - and in class fileit.addfile :


changed

 f = self.opener(name, "w", atomictemp=True)

to

 f = self.opener(name, "w", atomictemp=False)

hg.exe archive --time --subrepos --no-decode --quiet c:\temp\archive27
time: real 22.224 secs (user 6.203+0.000 sys 12.078+0.000)

hg.exe ciarchive --time --subrepos --no-decode --quiet c:\temp\archive28
time: real 17.316 secs (user 6.609+0.000 sys 7.453+0.000)

The repo has the following files :
 9438 File(s)    531,462,248 bytes
 2039 Dir(s)

That's a substantial performance increase (our customers have very large  
repos where this will make a large difference in build times).  Of  
course I'd much rather not be maintaining an extension that uses the  
internal api of hg, any chance this change can be made in the archive  
command, or at least be made configurable (assuming this change is  
safe!)?


AFAICT, the code prior to this vfs call didn't do atomic files, and didn't  
mention why the atomic aspect changed.  It looks like archive doesn't  
check for an empty directory, but it still seems OK, so go for it.


https://www.mercurial-scm.org/wiki/ContributingChanges

I did some quick profiling archiving the hg repo, and it looks like a lot  
of time is spent calculating tag info.  Do you see a significant  
improvement when archiving a tagged revision, and/or using '--config  
ui.archivemeta=False'?  (Ironically, 4.2.2 gave me the worst performance,  
but it seems like it had something to do with evolve not being able to  
load?)



Re: hg archive (files) performance regression.

2018-03-01 Thread Vincent Parrett

On 2/03/2018 4:31 PM, Vincent Parrett wrote:
I'm not a python dev (mostly c# and delphi), so still getting my head 
around the hg code base, but I'm curious why atomictemp=True is 
used in fileit.addfile? I get that it's in the vfs to work around file 
locking issues, but with the archive command with type files, it's 
likely that the archive is going to an empty target directory and this 
seems wasteful. 


So I just knocked up an extension (ciarchive) using the code from 
archival.py (hg-stable repo) - and in class fileit.addfile :


changed

    f = self.opener(name, "w", atomictemp=True)

to

    f = self.opener(name, "w", atomictemp=False)

hg.exe archive --time --subrepos --no-decode --quiet c:\temp\archive27
time: real 22.224 secs (user 6.203+0.000 sys 12.078+0.000)

hg.exe ciarchive --time --subrepos --no-decode --quiet c:\temp\archive28
time: real 17.316 secs (user 6.609+0.000 sys 7.453+0.000)

The repo has the following files :
    9438 File(s)    531,462,248 bytes
    2039 Dir(s)

That's a substantial performance increase (our customers have very large 
repos where this will make a large difference in build times).  Of 
course I'd much rather not be maintaining an extension that uses the 
internal api of hg, any chance this change can be made in the archive 
command, or at least be made configurable (assuming this change is safe!)?
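
To put numbers on "substantial", here is a quick calculation over the two timing lines above. It is only arithmetic on a single run of each command, so treat the result as indicative; the helper name percent_change is made up for this sketch.

```python
def percent_change(before, after):
    """Relative change from 'before' to 'after', in percent (negative = faster)."""
    return (after - before) / before * 100.0


# (stock 'hg archive', patched 'hg ciarchive') timings in seconds,
# copied from the two 'time:' lines reported above.
timings = {
    "real": (22.224, 17.316),
    "user": (6.203, 6.609),
    "sys": (12.078, 7.453),
}

for name, (stock, patched) in timings.items():
    print(f"{name}: {percent_change(stock, patched):+.1f}%")
```

On these figures the wall-clock time drops by roughly 22%, and almost all of the saving is in sys time (about 38% lower), which fits the idea that the extra cost is in file-system operations rather than in Python code.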


--
Regards

Vincent Parrett





hg archive (files) performance regression.

2018-03-01 Thread Vincent Parrett


Somewhere between hg 4.2.2 and 4.5 the archive (files, to empty folder) 
command has gotten around 10-13% slower.


Testing on Windows 10 x64 (latest updates), between two SSD drives:

hg 4.5 :
-
"C:\Program Files\Mercurial\hg.exe" archive --time --subrepos 
--no-decode --quiet --profile c:\temp\archive9

Total time: 20.218750 seconds
time: real 23.745 secs (user 8.109+0.000 sys 12.688+0.000)
-

hg 4.2.2
-

"C:\Program Files\Mercurial-422\hg.exe" archive --time --subrepos 
--no-decode --quiet --profile c:\temp\archive10

Total time: 15.984375 seconds
time: real 20.678 secs (user 7.234+0.000 sys 9.297+0.000)
-

I've confirmed this with a few different repos. The example above has 
lots of large files; I tested with others that have thousands of source 
files and the slowdown is still around the 10-13% mark.


I'm not a python dev (mostly c# and delphi), so still getting my head 
around the hg code base, but I'm curious why atomictemp=True is used 
in fileit.addfile? I get that it's in the vfs to work around file 
locking issues, but with the archive command with type files, it's 
likely that the archive is going to an empty target directory and this 
seems wasteful.
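
To make the question concrete, here is a minimal sketch of the general "write to a temporary file, then rename" pattern next to a plain direct write. This is not Mercurial's vfs code, just an illustration of why an atomic-style write does more file-system work per file; write_atomic and write_direct are hypothetical names.

```python
import os
import tempfile


def write_atomic(path, data):
    """Write 'data' to a temp file next to 'path', then rename it into place.

    Readers never see a half-written file, but every file costs an extra
    create plus a rename compared with writing directly.
    """
    dirname = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=dirname, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        os.replace(tmp, path)  # replaces any existing file at 'path'
    except BaseException:
        # Clean up the temp file if the write or rename failed.
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise


def write_direct(path, data):
    """Write 'data' straight to 'path'; cheaper, but not atomic."""
    with open(path, "wb") as f:
        f.write(data)
```

When the target is a fresh, empty directory that nothing else is reading, the protection the first variant provides matters much less, which is the case being asked about here.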


Is there anything else that can be done to speed up the archive (to 
files) command?


--

Regards

Vincent Parrett


