D6430: rust-discovery: using from Python code

2019-08-14 Thread gracinet (Georges Racinet)
Closed by commit rHG4d20b1fe8a72: rust-discovery: using from Python code 
(authored by gracinet).
This revision was automatically updated to reflect the committed changes.
This revision was not accepted when it landed; it landed in state "Needs 
Review".

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D6430?vs=15487&id=16192

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6430/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6430

AFFECTED FILES
  mercurial/setdiscovery.py

CHANGE DETAILS

diff --git a/mercurial/setdiscovery.py b/mercurial/setdiscovery.py
--- a/mercurial/setdiscovery.py
+++ b/mercurial/setdiscovery.py
@@ -52,6 +52,7 @@
 )
 from . import (
 error,
+policy,
 util,
 )
 
@@ -269,6 +270,10 @@
 sample.update(takefrom[:more])
 return sample
 
+partialdiscovery = policy.importrust('discovery',
+ member='PartialDiscovery',
+ default=partialdiscovery)
+
 def findcommonheads(ui, local, remote,
 initialsamplesize=100,
 fullsamplesize=200,



To: gracinet, #hg-reviewers
Cc: durin42, pulkit, marmoute, martinvonz, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D6430: rust-discovery: using from Python code

2019-08-14 Thread durin42 (Augie Fackler)
durin42 added a comment.


  I'm not thrilled with this - it's a lot of code, and I think there are some 
simpler options that haven't been explored. For example, clients could use the 
logexchange infrastructure to prime the sampling process and do a lot better.
  
  That said, it seems "fine" and I'm okay to land it with the understanding 
that if it becomes a maintenance burden we'll almost certainly rip it out in 
favor of the pure-Python codepath. I'm queueing it and will send a follow-up 
for one of my review comments.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6430/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6430

To: gracinet, #hg-reviewers
Cc: durin42, pulkit, marmoute, martinvonz, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D6430: rust-discovery: using from Python code

2019-07-17 Thread pulkit (Pulkit Goyal)
pulkit added a comment.


  I tested this series on our repository and found the following speedup in a 
case when there are 6k heads and 15k csets missing:
  
  Before:
  
! wall 3.281062 comb 3.28 user 3.20 sys 0.08 (best of 25)
! wall 4.941540 comb 4.94 user 4.86 sys 0.08 (max of 25)
! wall 4.181741 comb 4.180800 user 4.103600 sys 0.077200 (avg of 25)
! wall 4.281518 comb 4.29 user 4.18 sys 0.11 (median of 25)
  
  After:
  
! wall 2.589386 comb 2.59 user 2.49 sys 0.10 (best of 25)
! wall 4.658467 comb 4.66 user 4.51 sys 0.15 (max of 25)
! wall 3.624166 comb 3.625200 user 3.544400 sys 0.080800 (avg of 25)
! wall 3.633663 comb 3.64 user 3.54 sys 0.10 (median of 25)

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6430/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6430

To: gracinet, #hg-reviewers
Cc: pulkit, marmoute, martinvonz, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D6430: rust-discovery: using from Python code

2019-07-04 Thread marmoute (Pierre-Yves David)
marmoute added a comment.


  In the couple of last version, we already saved minutes in real life use case 
simply by improving pure CPU processing time client side. Can you elaborate on 
what other types of evidences you need to convince you there exist CPU bound 
case for discovery?

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6430/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6430

To: gracinet, #hg-reviewers
Cc: marmoute, martinvonz, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D6430: rust-discovery: using from Python code

2019-07-04 Thread martinvonz (Martin von Zweigbergk)
martinvonz added a comment.


  In D6430#96225 , @marmoute wrote:
  
  > The ×2.5 speedup is the kind of things that motivate this series. Even if 
the mozilla-unified/mozilla-try is just and example it triggers the kind of 
pathological case we encounter in real life: large undecided set. We keep 
finding such pathological case from time to time, and will keep finding them. 
In addition there are case with legitimate large undedicated set (the mozilla 
example for one).
  > Having this faster code significantly reduce the impact of these 
pathological cases.
  
  I still don't see how those cases are CPU-bound.
  
  I won't object if someone else who believes in this queues the series, 
though. I'll even add accept stamps. But I want to see that some other reviewer 
believes in it first.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6430/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6430

To: gracinet, #hg-reviewers
Cc: marmoute, martinvonz, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D6430: rust-discovery: using from Python code

2019-07-04 Thread marmoute (Pierre-Yves David)
marmoute added a comment.


  The ×2.5 speedup is the kind of things that motivate this series. Even if the 
mozilla-unified/mozilla-try is just and example it triggers the kind of 
pathological case we encounter in real life: large undecided set. We keep 
finding such pathological case from time to time, and will keep finding them. 
In addition there are case with legitimate large undedicated set (the mozilla 
example for one).
  Having this faster code significantly reduce the impact of these pathological 
cases.
  
  In addition as Georges pointed out, having this logic in Rust is a good step 
toward having a fast path detecting no-op pull. Something else we are 
interrested in.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6430/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6430

To: gracinet, #hg-reviewers
Cc: marmoute, martinvonz, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D6430: rust-discovery: using from Python code

2019-06-30 Thread gracinet (Georges Racinet)
gracinet added a comment.


  Nobody has reacted yet to @martinvonz call for other opinions, can someone 
look at it?
  
  Let me restate that the series makes a good difference when the sampling part 
of the discovery process actually kicks in, and it does that roughly three 
times faster. In other cases, none of the new Rust code is actually executed, 
the main difference lies in data interchange at the Rust - Python interface. 
(recall that all performance comparisons are against a Rust-enhanced Mercurial, 
in particular with the Rust `MissingAncestors` object)
  
  The size of the undecided set can behave in a chaotic manner, especially in 
discoveries that involve continuous integration repositories with lots of 
merging and branching. The mozilla-unified / mozilla-try is but an example of 
that. Yes, we've seen discoveries go havoc at our customer's, with the size of 
the undecided set exploding. What the series does performance-wise is help keep 
these cases to an acceptable time.
  
  @martinvonz: I understand your concern about double maintenance. Let me 
reassure you that we are willing to maintain that, and it will cost us less 
than tracking down the perfect real-life example. About the 6.8%, that's  a 
case where the undecided set is small, it's there to demonstrate that despite 
the focus on the sampling is still a bit faster on other code paths.
  
  This series has also potential for :
  
  - further performance improvements if that's needed in the future; the 
obvious candidate would be a more aggressive integration with the 
`MissingAncestors` object, especially in `addcommons()`
  - pure Rust push/pull machinery, such as fastpaths doing the discovery 
entirely in Rust and bailing out if there's nothing to actually send or 
receive, smart status able to give discovery information in near realtime etc.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6430/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6430

To: gracinet, #hg-reviewers
Cc: martinvonz, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D6430: rust-discovery: using from Python code

2019-06-16 Thread martinvonz (Martin von Zweigbergk)
martinvonz added a comment.


  > This example is not a goal in itself,
  
  Hmm, I would actually consider that the strongest selling point of this 
series. The other case (the 6.8% improvement) doesn't seem enough to justify 
all this code.
  
  > but it showcases several different
  > areas in which the process can become slow, due to different factors, and
  > how this full Rust version can help.
  
  I don't see this as much of an argument for discovery in Rust. I only care 
about practically measurable differences. The only real improvement was that 
"pull into mozilla-try repo from mozilla-unified" case.
  
  I'm fine with taking this series if we think that that use case is important 
enough to justify the extra maintenance of having a second implementation of 
the discovery code. I wouldn't think that use case is very common, but I'm 
guessing that the fact that you spent time on this series means that you have 
customers with repos with similar characteristics.
  
  What do others think?

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6430/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6430

To: gracinet, #hg-reviewers
Cc: martinvonz, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D6430: rust-discovery: using from Python code

2019-06-13 Thread gracinet (Georges Racinet)
gracinet added a comment.


  Instability fixed, and I took the opportunity to rebase again to leverage 
`policy.importrust`, that's been queued yesterday (rHGf7385ed775a8 
)
  
  I added new performance measurements for the big pathological case 
(mozilla-try / mozilla-unified) `with respectsize=False`. Speedup, compared to 
parent commit is still about x2.5 for me, not exactly for the same reasons, 
that's interesting.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6430/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6430

To: gracinet, #hg-reviewers
Cc: martinvonz, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D6430: rust-discovery: using from Python code

2019-06-13 Thread gracinet (Georges Racinet)
gracinet edited the summary of this revision.
gracinet updated this revision to Diff 15487.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D6430?vs=15473&id=15487

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6430/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6430

AFFECTED FILES
  mercurial/setdiscovery.py

CHANGE DETAILS

diff --git a/mercurial/setdiscovery.py b/mercurial/setdiscovery.py
--- a/mercurial/setdiscovery.py
+++ b/mercurial/setdiscovery.py
@@ -52,6 +52,7 @@
 )
 from . import (
 error,
+policy,
 util,
 )
 
@@ -269,6 +270,10 @@
 sample.update(takefrom[:more])
 return sample
 
+partialdiscovery = policy.importrust('discovery',
+ member='PartialDiscovery',
+ default=partialdiscovery)
+
 def findcommonheads(ui, local, remote,
 initialsamplesize=100,
 fullsamplesize=200,



To: gracinet, #hg-reviewers
Cc: martinvonz, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D6430: rust-discovery: using from Python code

2019-06-12 Thread gracinet (Georges Racinet)
gracinet added a comment.


  Update of the whole series done but, ended up in an inconsistent state, and I 
have to go. Please don't act on it until I give the signal it's ready (sorry 
for inconvenience)

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6430/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6430

To: gracinet, #hg-reviewers
Cc: martinvonz, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D6430: rust-discovery: using from Python code

2019-06-12 Thread gracinet (Georges Racinet)
gracinet edited the summary of this revision.
gracinet updated this revision to Diff 15473.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D6430?vs=15230&id=15473

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6430/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6430

AFFECTED FILES
  mercurial/setdiscovery.py

CHANGE DETAILS

diff --git a/mercurial/setdiscovery.py b/mercurial/setdiscovery.py
--- a/mercurial/setdiscovery.py
+++ b/mercurial/setdiscovery.py
@@ -46,6 +46,11 @@
 import random
 
 from .i18n import _
+try:
+from .rustext import discovery as rustdisco
+except ImportError:
+rustdisco = None
+
 from .node import (
 nullid,
 nullrev,
@@ -390,6 +395,8 @@
 
 # full blown discovery
 
+if rustdisco is not None:
+partialdiscovery = rustdisco.PartialDiscovery
 randomize = ui.configbool('devel', 'discovery.randomize')
 disco = partialdiscovery(local, ownheads, remote.limitedarguments,
  randomize=randomize)



To: gracinet, #hg-reviewers
Cc: martinvonz, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D6430: rust-discovery: using from Python code

2019-06-03 Thread martinvonz (Martin von Zweigbergk)
martinvonz added a comment.


  I think this series needs to be updated for 
https://phab.mercurial-scm.org/D2647. I'm curious to see the timing numbers 
after that patch (I can get those myself once the series is updated if you're 
tired of running the perf commands).

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D6430

To: gracinet, #hg-reviewers
Cc: martinvonz, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D6430: rust-discovery: using from Python code

2019-05-22 Thread gracinet (Georges Racinet)
gracinet created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  As previously done in other topics, the Rust version is used if it's been
  built.
  
  The version fully in Rust of the partialdiscovery class has the performance
  advantage over the Python version (actually using the Rust MissingAncestor) if
  the undecided set is big enough. Otherwise no sampling occurs, and the
  discovery is reasonably fast anyway.
  
  Note: it's hard to predict the size of the initial undecided set, it can
  depend on the kind of topological changes between the local and remote graphs.
  The point of the Rust version is to make the bad cases acceptable.
  
  More specifically, the performance advantages are:
  
  - faster sampling, especially takefullsample()
  - much faster addmissings() in almost all cases (see commit message in 
grandparent of the present changeset)
  - no conversion cost of the undecided set at the interface between Rust and 
Python
  
  Measurements with big undecided sets
  
  
  For an extreme example, discovery between mozilla-try and mozilla-unified
  (over one million undecided revisions, same case as in 
https://phab.mercurial-scm.org/rHGdbd0fcca6dfca3b0b90ee7c1af30b0a0765e5ea5), we
  get roughly a x3 better performance:
  
  Growing sample size (5%): time goes down from 210 to 72 seconds.
  Constant sample size: time down from 1853 to 659 seconds.
  
  Measurements with small undecided sets
  --
  
  In cases the undecided set is small enough than no sampling occurs,
  the Rust version has a disadvantage at init if `targetheads` is really big
  (some time is lost in the translation to Rust data structures),
  and that is compensated by the faster `addmissings()`.
  
  On a private repository with over one million commits, we still get a minor
  improvement, of 6.8%:
  
Before ! wall 0.593585 comb 0.59 user 0.55 sys 0.04 (median of 
50)
After  ! wall 0.553035 comb 0.55 user 0.52 sys 0.03 (median of 
50)
  
  What's interesting in that case is the first addinfo() at 180ms for Rust and
  233ms for Python+C, mostly due to add_missings and the children cache
  computation being done in less than 0.2ms on the Rust side vs over 40ms on the
  Python side.
  
  The worst case we have on hand is with mozilla-try, prepared with
  discovery-helper.sh for 10 heads and depth 10, time goes up 2.2% on the 
median.
  In this case `targetheads` is really huge with 165842 server heads.
  
Before ! wall 0.823884 comb 0.81 user 0.79 sys 0.02 (median of 
50)
After  ! wall 0.842607 comb 0.84 user 0.80 sys 0.04 (median of 
50)
  
  If that would be considered a problem, more adjustments can be made, which are
  prematurate at this stage: cooking special variants of methods of the inner
  MissingAncestors object, retrieving local heads directly from Rust to avoid
  the cost of conversion. Effort would probably be better spent at this point
  improving the surroundings if needed.
  
  Here's another data point with a smaller repository, pypy, where performance
  is almost identical
  
  Before ! wall 0.015121 comb 0.03 user 0.02 sys 0.01 (median of 
186)
   After  ! wall 0.015009 comb 0.01 user 0.01 sys 0.00 (median of 
184)

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D6430

AFFECTED FILES
  mercurial/setdiscovery.py

CHANGE DETAILS

diff --git a/mercurial/setdiscovery.py b/mercurial/setdiscovery.py
--- a/mercurial/setdiscovery.py
+++ b/mercurial/setdiscovery.py
@@ -46,6 +46,11 @@
 import random
 
 from .i18n import _
+try:
+from .rustext import discovery as rustdisco
+except ImportError:
+rustdisco = None
+
 from .node import (
 nullid,
 nullrev,
@@ -388,8 +393,11 @@
 # full blown discovery
 
 randomize = ui.configbool('devel', 'discovery.randomize')
-disco = partialdiscovery(local, ownheads, randomize=randomize)
-
+if rustdisco is not None:
+disco = rustdisco.PartialDiscovery(local.changelog.index, ownheads,
+   randomize=randomize)
+else:
+disco = partialdiscovery(local, ownheads, randomize=randomize)
 # treat remote heads (and maybe own heads) as a first implicit sample
 # response
 disco.addcommons(knownsrvheads)



To: gracinet, #hg-reviewers
Cc: mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel