Re: [PATCH 2 of 2] changegroup: skip delta when the underlying revlog do not use them

2016-10-18 Thread Yuya Nishihara
On Fri, 14 Oct 2016 03:08:06 +0200, Pierre-Yves David wrote:
> # HG changeset patch
> # User Pierre-Yves David 
> # Date 1476401471 -7200
> #  Fri Oct 14 01:31:11 2016 +0200
> # Node ID 747c0a1084ef6251a987e469197bad7796756403
> # Parent  e19eb107706e7210c3b359d66f5274911b181566
> # EXP-Topic storedeltachains
> changegroup: skip delta when the underlying revlog do not use them

Seems fine per the discussion on this thread. Queued, thanks.
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH 2 of 2] changegroup: skip delta when the underlying revlog do not use them

2016-10-18 Thread Augie Fackler
On Sun, Oct 16, 2016 at 02:02:23PM +0200, Pierre-Yves David wrote:
>
>
> On 10/14/2016 09:01 AM, Gregory Szorc wrote:
> >Cool. I was going to author this patch when I got back home!
> >
> >This patch will result in CPU regression for old clients having to 
> >re-deltify. It would be nice to have numbers for that. I'm optimistic it is 
> >roughly the same as the server gains and it won't be significant enough to 
> >not take the patch. We also don't have a perf* command to measure 
> >changegroup application for a single component IIRC. So getting data isn't 
> >trivial :/
>
> As you just said, the redeltifying are probably around the same cost than
> the one we save with this patch (6 seconds for a full Mozilla clone). The
> extra cost seems reasonable to me in that case.

Queued these, thanks.

>
> Cheers,
>
> --
> Pierre-Yves David
> ___
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH 2 of 2] changegroup: skip delta when the underlying revlog do not use them

2016-10-16 Thread Pierre-Yves David



On 10/14/2016 09:01 AM, Gregory Szorc wrote:

Cool. I was going to author this patch when I got back home!

This patch will result in CPU regression for old clients having to re-deltify. 
It would be nice to have numbers for that. I'm optimistic it is roughly the 
same as the server gains and it won't be significant enough to not take the 
patch. We also don't have a perf* command to measure changegroup application 
for a single component IIRC. So getting data isn't trivial :/


As you just said, the redeltifying are probably around the same cost 
than the one we save with this patch (6 seconds for a full Mozilla 
clone). The extra cost seems reasonable to me in that case.


Cheers,

--
Pierre-Yves David
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH 2 of 2] changegroup: skip delta when the underlying revlog do not use them

2016-10-14 Thread Gregory Szorc
Cool. I was going to author this patch when I got back home!

This patch will result in CPU regression for old clients having to re-deltify. 
It would be nice to have numbers for that. I'm optimistic it is roughly the 
same as the server gains and it won't be significant enough to not take the 
patch. We also don't have a perf* command to measure changegroup application 
for a single component IIRC. So getting data isn't trivial :/

> On Oct 14, 2016, at 03:08, Pierre-Yves David  
> wrote:
> 
> # HG changeset patch
> # User Pierre-Yves David 
> # Date 1476401471 -7200
> #  Fri Oct 14 01:31:11 2016 +0200
> # Node ID 747c0a1084ef6251a987e469197bad7796756403
> # Parent  e19eb107706e7210c3b359d66f5274911b181566
> # EXP-Topic storedeltachains
> changegroup: skip delta when the underlying revlog do not use them
> 
> Revlog can now be configured to store full snapshot only. This is used on the
> changelog. However, the changegroup packing was still recomputing deltas to be
> sent over the wire.
> 
> We now just reuse the full snapshot directly in this case, skipping delta
> computation. This provides use with a large speed up(-30%):
> 
> # perfchangegroupchangelog on mercurial
> ! wall 2.010326 comb 2.02 user 2.00 sys 0.02 (best of 5)
> ! wall 1.382039 comb 1.38 user 1.37 sys 0.01 (best of 8)
> 
> # perfchangegroupchangelog on pypy
> ! wall 5.792589 comb 5.78 user 5.78 sys 0.00 (best of 3)
> ! wall 3.911158 comb 3.92 user 3.90 sys 0.02 (best of 3)
> 
> # perfchangegroupchangelog on mozilla central
> ! wall 20.683727 comb 20.68 user 20.63 sys 0.05 (best of 3)
> ! wall 14.190204 comb 14.19 user 14.15 sys 0.04 (best of 3)
> 
> Many tests have to be updated because of the change in bundle content. All
> theses update have been verified.  Because diffing changelog was not very
> valuable, the resulting bundle have similar size (often a bit smaller):
> 
> # full bundle of mozilla central
> with delta:1142740533B
> without delta: 1142173300B
> 
> So this is a win all over the board.
> 
> diff --git a/mercurial/changegroup.py b/mercurial/changegroup.py
> --- a/mercurial/changegroup.py
> +++ b/mercurial/changegroup.py
> @@ -818,18 +818,24 @@ class cg2packer(cg1packer):
> 
> def deltaparent(self, revlog, rev, p1, p2, prev):
> dp = revlog.deltaparent(rev)
> -# Avoid sending full revisions when delta parent is null. Pick
> -# prev in that case. It's tempting to pick p1 in this case, as p1
> -# will be smaller in the common case. However, computing a delta
> -# against p1 may require resolving the raw text of p1, which could
> -# be expensive. The revlog caches should have prev cached, meaning
> -# less CPU for changegroup generation. There is likely room to add
> -# a flag and/or config option to control this behavior.
> -#
> -# Pick prev when we can't be sure remote has the base revision.
> -if dp == nullrev or (dp != p1 and dp != p2 and dp != prev):
> +if dp == nullrev and revlog.storedeltachains:
> +# Avoid sending full revisions when delta parent is null. Pick 
> prev
> +# in that case. It's tempting to pick p1 in this case, as p1 will
> +# be smaller in the common case. However, computing a delta 
> against
> +# p1 may require resolving the raw text of p1, which could be
> +# expensive. The revlog caches should have prev cached, meaning
> +# less CPU for changegroup generation. There is likely room to 
> add
> +# a flag and/or config option to control this behavior.
> return prev
> -return dp
> +elif dp == nullrev:
> +# revlog is configure to use full snapshot for a reason,
> +# stick to full snapshot.
> +return nullrev
> +elif dp not in (p1, p2, prev):
> +# Pick prev when we can't be sure remote has the base revision.
> +return prev
> +else:
> +return dp
> 
> def builddeltaheader(self, node, p1n, p2n, basenode, linknode, flags):
> # Do nothing with flags, it is implicitly 0 in cg1 and cg2problem 
> diff --git a/tests/test-acl.t b/tests/test-acl.t
> --- a/tests/test-acl.t
> +++ b/tests/test-acl.t
> @@ -113,7 +113,7 @@ Extension disabled for lack of a hook
>   adding quux/file.py revisions
>   added 3 changesets with 3 changes to 3 files
>   updating the branch cache
> -  bundle2-input-part: total payload size 1606
> +  bundle2-input-part: total payload size 1553
>   bundle2-input-part: "pushkey" (params: 4 mandatory) supported
>   pushing key for "phases:911600dab2ae7a9baff75958b84fe606851ce955"
>   bundle2-input-bundle: 3 parts total
> @@ -178,7 +178,7 @@ Extension disabled for lack of acl.sourc
>   calling hook pretxnchangegroup.acl: hgext.acl.hook
>   acl: changes have source "push" - skipping
>   updating the branch cache
> -  bund

[PATCH 2 of 2] changegroup: skip delta when the underlying revlog do not use them

2016-10-13 Thread Pierre-Yves David
# HG changeset patch
# User Pierre-Yves David 
# Date 1476401471 -7200
#  Fri Oct 14 01:31:11 2016 +0200
# Node ID 747c0a1084ef6251a987e469197bad7796756403
# Parent  e19eb107706e7210c3b359d66f5274911b181566
# EXP-Topic storedeltachains
changegroup: skip delta when the underlying revlog do not use them

Revlog can now be configured to store full snapshot only. This is used on the
changelog. However, the changegroup packing was still recomputing deltas to be
sent over the wire.

We now just reuse the full snapshot directly in this case, skipping delta
computation. This provides use with a large speed up(-30%):

# perfchangegroupchangelog on mercurial
! wall 2.010326 comb 2.02 user 2.00 sys 0.02 (best of 5)
! wall 1.382039 comb 1.38 user 1.37 sys 0.01 (best of 8)

# perfchangegroupchangelog on pypy
! wall 5.792589 comb 5.78 user 5.78 sys 0.00 (best of 3)
! wall 3.911158 comb 3.92 user 3.90 sys 0.02 (best of 3)

# perfchangegroupchangelog on mozilla central
! wall 20.683727 comb 20.68 user 20.63 sys 0.05 (best of 3)
! wall 14.190204 comb 14.19 user 14.15 sys 0.04 (best of 3)

Many tests have to be updated because of the change in bundle content. All
theses update have been verified.  Because diffing changelog was not very
valuable, the resulting bundle have similar size (often a bit smaller):

# full bundle of mozilla central
with delta:1142740533B
without delta: 1142173300B

So this is a win all over the board.

diff --git a/mercurial/changegroup.py b/mercurial/changegroup.py
--- a/mercurial/changegroup.py
+++ b/mercurial/changegroup.py
@@ -818,18 +818,24 @@ class cg2packer(cg1packer):
 
 def deltaparent(self, revlog, rev, p1, p2, prev):
 dp = revlog.deltaparent(rev)
-# Avoid sending full revisions when delta parent is null. Pick
-# prev in that case. It's tempting to pick p1 in this case, as p1
-# will be smaller in the common case. However, computing a delta
-# against p1 may require resolving the raw text of p1, which could
-# be expensive. The revlog caches should have prev cached, meaning
-# less CPU for changegroup generation. There is likely room to add
-# a flag and/or config option to control this behavior.
-#
-# Pick prev when we can't be sure remote has the base revision.
-if dp == nullrev or (dp != p1 and dp != p2 and dp != prev):
+if dp == nullrev and revlog.storedeltachains:
+# Avoid sending full revisions when delta parent is null. Pick prev
+# in that case. It's tempting to pick p1 in this case, as p1 will
+# be smaller in the common case. However, computing a delta against
+# p1 may require resolving the raw text of p1, which could be
+# expensive. The revlog caches should have prev cached, meaning
+# less CPU for changegroup generation. There is likely room to add
+# a flag and/or config option to control this behavior.
 return prev
-return dp
+elif dp == nullrev:
+# revlog is configure to use full snapshot for a reason,
+# stick to full snapshot.
+return nullrev
+elif dp not in (p1, p2, prev):
+# Pick prev when we can't be sure remote has the base revision.
+return prev
+else:
+return dp
 
 def builddeltaheader(self, node, p1n, p2n, basenode, linknode, flags):
 # Do nothing with flags, it is implicitly 0 in cg1 and cg2
diff --git a/tests/test-acl.t b/tests/test-acl.t
--- a/tests/test-acl.t
+++ b/tests/test-acl.t
@@ -113,7 +113,7 @@ Extension disabled for lack of a hook
   adding quux/file.py revisions
   added 3 changesets with 3 changes to 3 files
   updating the branch cache
-  bundle2-input-part: total payload size 1606
+  bundle2-input-part: total payload size 1553
   bundle2-input-part: "pushkey" (params: 4 mandatory) supported
   pushing key for "phases:911600dab2ae7a9baff75958b84fe606851ce955"
   bundle2-input-bundle: 3 parts total
@@ -178,7 +178,7 @@ Extension disabled for lack of acl.sourc
   calling hook pretxnchangegroup.acl: hgext.acl.hook
   acl: changes have source "push" - skipping
   updating the branch cache
-  bundle2-input-part: total payload size 1606
+  bundle2-input-part: total payload size 1553
   bundle2-input-part: "pushkey" (params: 4 mandatory) supported
   pushing key for "phases:911600dab2ae7a9baff75958b84fe606851ce955"
   bundle2-input-bundle: 3 parts total
@@ -254,7 +254,7 @@ No [acl.allow]/[acl.deny]
   acl: branch access granted: "911600dab2ae" on branch "default"
   acl: path access granted: "911600dab2ae"
   updating the branch cache
-  bundle2-input-part: total payload size 1606
+  bundle2-input-part: total payload size 1553
   bundle2-input-part: "pushkey" (params: 4 mandatory) supported
   pushing key for "phases:911600dab2ae7a9baff75958b84fe606851ce955"
   bundle2-inp