D1351: changegroup: use any node, not min(), in treemanifest's generatemanifests

2017-11-09 Thread spectral (Kyle Lippincott)
This revision was automatically updated to reflect the committed changes.
Closed by commit rHGd80380ba8e7d: changegroup: use any node, not min(), in 
treemanifest's generatemanifests (authored by spectral, committed by ).

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D1351?vs=3388&id=3389

REVISION DETAIL
  https://phab.mercurial-scm.org/D1351

AFFECTED FILES
  mercurial/changegroup.py

CHANGE DETAILS

diff --git a/mercurial/changegroup.py b/mercurial/changegroup.py
--- a/mercurial/changegroup.py
+++ b/mercurial/changegroup.py
@@ -692,7 +692,7 @@
 # Callback for the manifest, used to collect linkrevs for filelog
 # revisions.
 # Returns the linkrev node (collected in lookupcl).
-def makelookupmflinknode(dir):
+def makelookupmflinknode(dir, nodes):
 if fastpathlinkrev:
 assert not dir
 return mfs.__getitem__
@@ -713,7 +713,7 @@
 the client before you can trust the list of files and
 treemanifests to send.
 """
-clnode = tmfnodes[dir][x]
+clnode = nodes[x]
 mdata = mfl.get(dir, x).readfast(shallow=True)
 for p, n, fl in mdata.iterentries():
 if fl == 't': # subdirectory manifest
@@ -733,15 +733,13 @@
 
 size = 0
 while tmfnodes:
-dir = min(tmfnodes)
-nodes = tmfnodes[dir]
+dir, nodes = tmfnodes.popitem()
 prunednodes = self.prune(dirlog(dir), nodes, commonrevs)
 if not dir or prunednodes:
 for x in self._packmanifests(dir, prunednodes,
- makelookupmflinknode(dir)):
+ makelookupmflinknode(dir, nodes)):
 size += len(x)
 yield x
-del tmfnodes[dir]
 self._verbosenote(_('%8.i (manifests)\n') % size)
 yield self._manifestsdone()
 



To: spectral, #hg-reviewers, indygreg
Cc: indygreg, martinvonz, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D1351: changegroup: use any node, not min(), in treemanifest's generatemanifests

2017-11-09 Thread spectral (Kyle Lippincott)
spectral marked 4 inline comments as done.
spectral added a comment.


  run-tests.py found no issues with this version, my manual testing (using 
PYTHONHASHSEED to adjust the ordering) also encountered no issues

INLINE COMMENTS

> indygreg wrote in changegroup.py:738
> Can we use ``dict.popitem()`` instead? That will pop a random key-value pair. 
> I just don't know if the key needs to remain in the dict until later in the 
> function...

profiling seems to indicate this is a little faster as well as being a bit 
cleaner, so thanks for making me check :)

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D1351

To: spectral, #hg-reviewers
Cc: indygreg, martinvonz, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D1351: changegroup: use any node, not min(), in treemanifest's generatemanifests

2017-11-09 Thread spectral (Kyle Lippincott)
spectral updated this revision to Diff 3388.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D1351?vs=3358&id=3388

REVISION DETAIL
  https://phab.mercurial-scm.org/D1351

AFFECTED FILES
  mercurial/changegroup.py

CHANGE DETAILS

diff --git a/mercurial/changegroup.py b/mercurial/changegroup.py
--- a/mercurial/changegroup.py
+++ b/mercurial/changegroup.py
@@ -692,7 +692,7 @@
 # Callback for the manifest, used to collect linkrevs for filelog
 # revisions.
 # Returns the linkrev node (collected in lookupcl).
-def makelookupmflinknode(dir):
+def makelookupmflinknode(dir, nodes):
 if fastpathlinkrev:
 assert not dir
 return mfs.__getitem__
@@ -713,7 +713,7 @@
 the client before you can trust the list of files and
 treemanifests to send.
 """
-clnode = tmfnodes[dir][x]
+clnode = nodes[x]
 mdata = mfl.get(dir, x).readfast(shallow=True)
 for p, n, fl in mdata.iterentries():
 if fl == 't': # subdirectory manifest
@@ -733,15 +733,13 @@
 
 size = 0
 while tmfnodes:
-dir = min(tmfnodes)
-nodes = tmfnodes[dir]
+dir, nodes = tmfnodes.popitem()
 prunednodes = self.prune(dirlog(dir), nodes, commonrevs)
 if not dir or prunednodes:
 for x in self._packmanifests(dir, prunednodes,
- makelookupmflinknode(dir)):
+ makelookupmflinknode(dir, nodes)):
 size += len(x)
 yield x
-del tmfnodes[dir]
 self._verbosenote(_('%8.i (manifests)\n') % size)
 yield self._manifestsdone()
 



To: spectral, #hg-reviewers
Cc: indygreg, martinvonz, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D1351: changegroup: use any node, not min(), in treemanifest's generatemanifests

2017-11-09 Thread martinvonz (Martin von Zweigbergk)
martinvonz added inline comments.

INLINE COMMENTS

> spectral wrote in changegroup.py:738
> I think that it needs to remain, makelookupmflinknode(dir) relies on it 
> (L716).  I haven't attempted to popitem and pass that to makelookupmflinknode 
> instead, let me try that out now...

Oh, yuck. Can perhaps pass "nodes" into makelookupmflinknode() along with 
"dir"? It will still update it on line 721, but at least one of those unclear 
sideeffects go away.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D1351

To: spectral, #hg-reviewers
Cc: indygreg, martinvonz, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D1351: changegroup: use any node, not min(), in treemanifest's generatemanifests

2017-11-09 Thread spectral (Kyle Lippincott)
spectral added inline comments.

INLINE COMMENTS

> indygreg wrote in changegroup.py:738
> Can we use ``dict.popitem()`` instead? That will pop a random key-value pair. 
> I just don't know if the key needs to remain in the dict until later in the 
> function...

I think that it needs to remain, makelookupmflinknode(dir) relies on it (L716). 
 I haven't attempted to popitem and pass that to makelookupmflinknode instead, 
let me try that out now...

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D1351

To: spectral, #hg-reviewers
Cc: indygreg, martinvonz, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D1351: changegroup: use any node, not min(), in treemanifest's generatemanifests

2017-11-09 Thread martinvonz (Martin von Zweigbergk)
martinvonz added a comment.


  In https://phab.mercurial-scm.org/D1351#22473, @indygreg wrote:
  
  > Should I be concerned about the lack of test fallout? This new behavior is 
non-deterministic. Do we not have testing for this code or is existing testing 
not low-level enough to uncover behavior changes resulting from this change?
  
  
  I'm not worried about it. It only makes a difference for treemanifests, and 
the change is just about the order in changegroup and (therefore) the order in 
which we write revlogs. We don't have much testing of treemanifests and we 
rarely check the exact format in a changegroup. We do print out a debug message 
on line 482 that could be used to see a difference before and after this patch, 
but I just checked that test-treemanifest.t doesn't pass --verbose. Still, I 
wouldn't mind if some Facebooker tried to run their treemanifest tests (in 
hgexperimental) against a version with this patch applied.

INLINE COMMENTS

> indygreg wrote in changegroup.py:738
> Can we use ``dict.popitem()`` instead? That will pop a random key-value pair. 
> I just don't know if the key needs to remain in the dict until later in the 
> function...

Good idea. I'm pretty sure that should be safe (but perhaps tests will prove me 
wrong).

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D1351

To: spectral, #hg-reviewers
Cc: indygreg, martinvonz, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D1351: changegroup: use any node, not min(), in treemanifest's generatemanifests

2017-11-09 Thread indygreg (Gregory Szorc)
indygreg added a comment.


  Should I be concerned about the lack of test fallout? This new behavior is 
non-deterministic. Do we not have testing for this code or is existing testing 
not low-level enough to uncover behavior changes resulting from this change?

INLINE COMMENTS

> changegroup.py:738
> +# element.
> +dir = next(iter(tmfnodes))
>  nodes = tmfnodes[dir]

Can we use ``dict.popitem()`` instead? That will pop a random key-value pair. I 
just don't know if the key needs to remain in the dict until later in the 
function...

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D1351

To: spectral, #hg-reviewers
Cc: indygreg, martinvonz, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D1351: changegroup: use any node, not min(), in treemanifest's generatemanifests

2017-11-08 Thread martinvonz (Martin von Zweigbergk)
martinvonz added a comment.


  As the author of that line of code, this looks good to me. I'll let someone 
else queue it since I was involved in the internal discussion and I don't want 
to feel like I'm queuing my own code (even though I wasn't involved in the 
actual coding).

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D1351

To: spectral, #hg-reviewers
Cc: martinvonz, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D1351: changegroup: use any node, not min(), in treemanifest's generatemanifests

2017-11-08 Thread spectral (Kyle Lippincott)
spectral created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  This is fixing quadratic behavior, which is probably not noticeable in the
  common case, but if a very large directory gets added here, it can get pretty
  bad. This was noticed because we had some pushes that spent >25s in 
changegroup
  generation calling min() here, according to profiling.
  
  The original reasoning for min() being used in 
https://phab.mercurial-scm.org/rHG829d369fc5a89f4c290013271c6e5dff2aea63de was 
that, at that
  point in the series, we were adding almost everything to tmfnodes during the
  first iteration through the loop , so we needed to avoid sending child
  directories before parents. Later changes made it so that the child 
directories
  were added only when we visited the parent directory (not all of them on the
  first iteration), so this is no longer necessary - there won't be any child
  directories in tmfnodes before the parents have been sent.
  
  This does mean that the manifests are now exchanged unordered, whereas
  previously we would essentially do [a, b, b/c, b/c/d, e], we now can send a, 
b,
  and e in any order; b/c must still follow b, and b/c/d must still follow b/c.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D1351

AFFECTED FILES
  mercurial/changegroup.py

CHANGE DETAILS

diff --git a/mercurial/changegroup.py b/mercurial/changegroup.py
--- a/mercurial/changegroup.py
+++ b/mercurial/changegroup.py
@@ -733,7 +733,9 @@
 
 size = 0
 while tmfnodes:
-dir = min(tmfnodes)
+# Pick some element from tmfnodes, this is not necessarily the 
'min'
+# element.
+dir = next(iter(tmfnodes))
 nodes = tmfnodes[dir]
 prunednodes = self.prune(dirlog(dir), nodes, commonrevs)
 if not dir or prunednodes:



To: spectral, #hg-reviewers
Cc: mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel