Re: Ideas and mentors for GSoC 2018.

2018-03-08 Thread Rishabh Madan
On Wed, Jan 17, 2018 at 1:01 AM, Pulkit Goyal <7895pul...@gmail.com> wrote:

> Hey everyone,
>
> We are taking part in Google Summer of Code from past two years under
> Python Software Foundation and we wish to continue the same this year.
>
> If you have any idea or WIP, which you think can be a good student
> project for this summers, please share.
>
> Also, if you have sometime available during the summers and interested
> to mentor a student in the program, it will be great. Please let me
> know.
>

If it's not too late, I'd like to mentor for the project on improving
commit graph in hgweb.
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D2728: rebase: also include commit of collapsed commits in single transaction

2018-03-08 Thread martinvonz (Martin von Zweigbergk)
martinvonz planned changes to this revision.
martinvonz added a comment.


  Please don't queue yet. I'm going to add tests.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2728

To: martinvonz, #hg-reviewers, durham, quark
Cc: mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D2720: debugcommands: introduce actions to perform deterministic reads

2018-03-08 Thread mharbison72 (Matt Harbison)
mharbison72 added a comment.


  This patch seems to deadlock on Windows when running 
test-ssh-proto-unbundle.t.  There are two python processes idling:
  
  c:\Python27\python.exe c:/Users/Matt/projects/hg/hg --verbose debugwireproto 
--localssh --noreadstderr
  CWD: 
C:\Users\Matt\AppData\Local\Temp\hgtests.1mtivg\child1\test-ssh-proto-unbundle.t\no-bundle1\
  
  and
  
  c:\Python27\python.exe c:/Users/Matt/projects/hg/hg -R 
C:\Users\Matt\AppData\Local\Temp\hgtests.1mtivg\child1\test-ssh-proto-unbundle.t\no-bundle1
 debugserve --sshstdio
  CWD: 
C:\Users\Matt\AppData\Local\Temp\hgtests.1mtivg\child1\test-ssh-proto-unbundle.t\no-bundle1\
  
  Past experience says that something needs to be flushed- stdout (and maybe 
stderr?) are full buffered on Windows IIRC.  The following failure with this 
patch might also point to that:
  
--- c:/Users/Matt/projects/hg/tests/test-ssh-proto.t
+++ c:/Users/Matt/projects/hg/tests/test-ssh-proto.t.err
@@ -100,12 +100,12 @@
   $ hg debugserve --sshstdio --logiofd 1 << EOF
   > hello
   > EOF
-  o> write(4) -> 4:
-  o> 384\n
-  o> write(384) -> 384:
-  o> capabilities: lookup branchmap pushkey known getbundle 
unbundlehash batch changegroupsubset streamreqs=generaldelta,revlogv1 
$USUAL_BUNDLE2_CAPS_SERVER$unbundle=HG10GZ,HG10BZ,HG10UN\n
   384
   capabilities: lookup branchmap pushkey known getbundle unbundlehash 
batch changegroupsubset streamreqs=generaldelta,revlogv1 
$USUAL_BUNDLE2_CAPS_SERVER$ unbundle=HG10GZ,HG10BZ,HG10UN
+  o> write(4) -> 4:
+  o> 384\n
+  o> write(384) -> 384:
+  o> capabilities: lookup branchmap pushkey known getbundle 
unbundlehash batch changegroupsubset streamreqs=generaldelta,revlogv1 
$USUAL_BUNDLE2_CAPS_SERVER$ unbundle=HG10GZ,HG10BZ,HG10UN\n
   o> flush() -> None

   $ hg debugserve --sshstdio --logiofile $TESTTMP/io << EOF

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2720

To: indygreg, #hg-reviewers
Cc: mharbison72, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D2748: hgweb: remove wsgirequest.read()

2018-03-08 Thread indygreg (Gregory Szorc)
indygreg created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  This was just a proxy to self.inp.read(). This method serves little
  value. Let's nuke it.
  
  Callers in the wire protocol server have been updated accordingly.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2748

AFFECTED FILES
  mercurial/hgweb/request.py
  mercurial/wireprotoserver.py

CHANGE DETAILS

diff --git a/mercurial/wireprotoserver.py b/mercurial/wireprotoserver.py
--- a/mercurial/wireprotoserver.py
+++ b/mercurial/wireprotoserver.py
@@ -83,7 +83,7 @@
 postlen = int(self._req.headers.get(b'X-HgArgs-Post', 0))
 if postlen:
 args.update(urlreq.parseqs(
-self._wsgireq.read(postlen), keep_blank_values=True))
+self._wsgireq.inp.read(postlen), keep_blank_values=True))
 return args
 
 argvalue = decodevaluefromheaders(self._req, b'X-HgArg')
@@ -97,7 +97,7 @@
 # If httppostargs is used, we need to read Content-Length
 # minus the amount that was consumed by args.
 length -= int(self._req.headers.get(b'X-HgArgs-Post', 0))
-for s in util.filechunkiter(self._wsgireq, limit=length):
+for s in util.filechunkiter(self._wsgireq.inp, limit=length):
 fp.write(s)
 
 @contextlib.contextmanager
diff --git a/mercurial/hgweb/request.py b/mercurial/hgweb/request.py
--- a/mercurial/hgweb/request.py
+++ b/mercurial/hgweb/request.py
@@ -249,9 +249,6 @@
 def __iter__(self):
 return iter([])
 
-def read(self, count=-1):
-return self.inp.read(count)
-
 def drain(self):
 '''need to read all data from request, httplib is half-duplex'''
 length = int(self.env.get('CONTENT_LENGTH') or 0)



To: indygreg, #hg-reviewers
Cc: mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D2749: hgweb: remove wsgirequest.__iter__

2018-03-08 Thread indygreg (Gregory Szorc)
indygreg created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  This was added in 
https://phab.mercurial-scm.org/rHGd0db3462d568908e3f9789076f006e7502dd2ab2 in 
2006. I can't find a justification
  for this method in PEP 0333. The tests all pass without this method.
  So let's nuke it.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2749

AFFECTED FILES
  mercurial/hgweb/request.py

CHANGE DETAILS

diff --git a/mercurial/hgweb/request.py b/mercurial/hgweb/request.py
--- a/mercurial/hgweb/request.py
+++ b/mercurial/hgweb/request.py
@@ -246,9 +246,6 @@
 self.server_write = None
 self.headers = []
 
-def __iter__(self):
-return iter([])
-
 def drain(self):
 '''need to read all data from request, httplib is half-duplex'''
 length = int(self.env.get('CONTENT_LENGTH') or 0)



To: indygreg, #hg-reviewers
Cc: mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D2746: wireprotoserver: remove unused argument from _handlehttperror()

2018-03-08 Thread indygreg (Gregory Szorc)
indygreg created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2746

AFFECTED FILES
  mercurial/wireprotoserver.py

CHANGE DETAILS

diff --git a/mercurial/wireprotoserver.py b/mercurial/wireprotoserver.py
--- a/mercurial/wireprotoserver.py
+++ b/mercurial/wireprotoserver.py
@@ -192,7 +192,7 @@
 if req.dispatchpath:
 res = _handlehttperror(
 hgwebcommon.ErrorResponse(hgwebcommon.HTTP_NOT_FOUND), wsgireq,
-req, cmd)
+req)
 
 return True, res
 
@@ -206,7 +206,7 @@
 try:
 res = _callhttp(repo, wsgireq, req, proto, cmd)
 except hgwebcommon.ErrorResponse as e:
-res = _handlehttperror(e, wsgireq, req, cmd)
+res = _handlehttperror(e, wsgireq, req)
 
 return True, res
 
@@ -313,7 +313,7 @@
 return []
 raise error.ProgrammingError('hgweb.protocol internal failure', rsp)
 
-def _handlehttperror(e, wsgireq, req, cmd):
+def _handlehttperror(e, wsgireq, req):
 """Called when an ErrorResponse is raised during HTTP request 
processing."""
 
 # Clients using Python's httplib are stateful: the HTTP client



To: indygreg, #hg-reviewers
Cc: mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D2747: hgweb: remove unused methods on wsgirequest

2018-03-08 Thread indygreg (Gregory Szorc)
indygreg created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  writelines() isn't used in our code base.
  
  close() was a no-op. It is an optional method per PEP 0333.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2747

AFFECTED FILES
  mercurial/hgweb/request.py

CHANGE DETAILS

diff --git a/mercurial/hgweb/request.py b/mercurial/hgweb/request.py
--- a/mercurial/hgweb/request.py
+++ b/mercurial/hgweb/request.py
@@ -306,16 +306,9 @@
 if inst[0] != errno.ECONNRESET:
 raise
 
-def writelines(self, lines):
-for line in lines:
-self.write(line)
-
 def flush(self):
 return None
 
-def close(self):
-return None
-
 def wsgiapplication(app_maker):
 '''For compatibility with old CGI scripts. A plain hgweb() or hgwebdir()
 can and should now be used as a WSGI application.'''



To: indygreg, #hg-reviewers
Cc: mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D2057: rust implementation of hg status

2018-03-08 Thread Ivzhh (Sheng Mao)
Ivzhh added a comment.


  Hi everyone,
  
  Thank you for your encouragements and comments! I will follow up with all 
comments and update the code soon.
  
  @indygreg It is a great idea to test on Mozilla repo, actually I found 
several things interesting:
  
  1. I found a bug in my code (shame on me): because I did not use byte 
literal, and I made a typo. This triggers problem in Mozilla unified repo
  2. A regexp pattern in hgignore in Mozilla unified repo is not supported by 
rust's regex crate, a.k.a. "(?!)". I choose to ignore these unsupported 
patterns.
  3. My version is slower in this repo: 70s (hg) and 90s (mine). CodeXL reveals 
that the mpatch::collect() function uses 63% of the running time. I think I 
need to optimize it somehow.
  
  I totally agree with @kevincox that I did not sort well on 
char/u8/str/String/Path/PathBuf. The first bug is caused by this. I need to 
improve them.
  
  Thank you everyone!

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2057

To: Ivzhh, #hg-reviewers, kevincox
Cc: glandium, krbullock, indygreg, durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D2743: wireprotoserver: access headers through parsed request

2018-03-08 Thread indygreg (Gregory Szorc)
indygreg created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  Now that we can access headers via the parsed request object, let's
  do that.
  
  Since the new object uses bytes, hyphens, and is case-insensitive, a
  bit of code around normalizing values has been removed. I think
  the new code is much more intuitive because it more closely matches
  what is going out over the wire.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2743

AFFECTED FILES
  mercurial/wireprotoserver.py

CHANGE DETAILS

diff --git a/mercurial/wireprotoserver.py b/mercurial/wireprotoserver.py
--- a/mercurial/wireprotoserver.py
+++ b/mercurial/wireprotoserver.py
@@ -36,26 +36,26 @@
 SSHV1 = wireprototypes.SSHV1
 SSHV2 = wireprototypes.SSHV2
 
-def decodevaluefromheaders(wsgireq, headerprefix):
+def decodevaluefromheaders(req, headerprefix):
 """Decode a long value from multiple HTTP request headers.
 
 Returns the value as a bytes, not a str.
 """
 chunks = []
 i = 1
-prefix = headerprefix.upper().replace(r'-', r'_')
 while True:
-v = wsgireq.env.get(r'HTTP_%s_%d' % (prefix, i))
+v = req.headers.get(b'%s-%d' % (headerprefix, i))
 if v is None:
 break
 chunks.append(pycompat.bytesurl(v))
 i += 1
 
 return ''.join(chunks)
 
 class httpv1protocolhandler(wireprototypes.baseprotocolhandler):
-def __init__(self, wsgireq, ui, checkperm):
+def __init__(self, wsgireq, req, ui, checkperm):
 self._wsgireq = wsgireq
+self._req = req
 self._ui = ui
 self._checkperm = checkperm
 
@@ -80,24 +80,24 @@
 
 def _args(self):
 args = util.rapply(pycompat.bytesurl, self._wsgireq.form.copy())
-postlen = int(self._wsgireq.env.get(r'HTTP_X_HGARGS_POST', 0))
+postlen = int(self._req.headers.get(b'X-HgArgs-Post', 0))
 if postlen:
 args.update(urlreq.parseqs(
 self._wsgireq.read(postlen), keep_blank_values=True))
 return args
 
-argvalue = decodevaluefromheaders(self._wsgireq, r'X-HgArg')
+argvalue = decodevaluefromheaders(self._req, b'X-HgArg')
 args.update(urlreq.parseqs(argvalue, keep_blank_values=True))
 return args
 
 def forwardpayload(self, fp):
-if r'HTTP_CONTENT_LENGTH' in self._wsgireq.env:
-length = int(self._wsgireq.env[r'HTTP_CONTENT_LENGTH'])
+if b'Content-Length' in self._req.headers:
+length = int(self._req.headers[b'Content-Length'])
 else:
 length = int(self._wsgireq.env[r'CONTENT_LENGTH'])
 # If httppostargs is used, we need to read Content-Length
 # minus the amount that was consumed by args.
-length -= int(self._wsgireq.env.get(r'HTTP_X_HGARGS_POST', 0))
+length -= int(self._req.headers.get(b'X-HgArgs-Post', 0))
 for s in util.filechunkiter(self._wsgireq, limit=length):
 fp.write(s)
 
@@ -193,32 +193,32 @@
 if req.dispatchpath:
 res = _handlehttperror(
 hgwebcommon.ErrorResponse(hgwebcommon.HTTP_NOT_FOUND), wsgireq,
-cmd)
+req, cmd)
 
 return True, res
 
-proto = httpv1protocolhandler(wsgireq, repo.ui,
+proto = httpv1protocolhandler(wsgireq, req, repo.ui,
   lambda perm: checkperm(rctx, wsgireq, perm))
 
 # The permissions checker should be the only thing that can raise an
 # ErrorResponse. It is kind of a layer violation to catch an hgweb
 # exception here. So consider refactoring into a exception type that
 # is associated with the wire protocol.
 try:
-res = _callhttp(repo, wsgireq, proto, cmd)
+res = _callhttp(repo, wsgireq, req, proto, cmd)
 except hgwebcommon.ErrorResponse as e:
-res = _handlehttperror(e, wsgireq, cmd)
+res = _handlehttperror(e, wsgireq, req, cmd)
 
 return True, res
 
-def _httpresponsetype(ui, wsgireq, prefer_uncompressed):
+def _httpresponsetype(ui, req, prefer_uncompressed):
 """Determine the appropriate response type and compression settings.
 
 Returns a tuple of (mediatype, compengine, engineopts).
 """
 # Determine the response media type and compression engine based
 # on the request parameters.
-protocaps = decodevaluefromheaders(wsgireq, r'X-HgProto').split(' ')
+protocaps = decodevaluefromheaders(req, 'X-HgProto').split(' ')
 
 if '0.2' in protocaps:
 # All clients are expected to support uncompressed data.
@@ -251,7 +251,7 @@
 opts = {'level': ui.configint('server', 'zliblevel')}
 return HGTYPE, util.compengines['zlib'], opts
 
-def _callhttp(repo, wsgireq, proto, cmd):
+def _callhttp(repo, wsgireq, req, proto, cmd):
 def genversion2(gen, engine, engineopts):
 # application/mercurial-0.2 always sends a payload header
 # identifying the 

D2745: hgweb: store and use request method on parsed request

2018-03-08 Thread indygreg (Gregory Szorc)
indygreg created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  PEP 0333 says that REQUEST_METHOD is always defined.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2745

AFFECTED FILES
  mercurial/hgweb/request.py
  mercurial/wireprotoserver.py

CHANGE DETAILS

diff --git a/mercurial/wireprotoserver.py b/mercurial/wireprotoserver.py
--- a/mercurial/wireprotoserver.py
+++ b/mercurial/wireprotoserver.py
@@ -324,7 +324,7 @@
 # the HTTP response. In other words, it helps prevent deadlocks
 # on clients using httplib.
 
-if (wsgireq.env[r'REQUEST_METHOD'] == r'POST' and
+if (req.method == 'POST' and
 # But not if Expect: 100-continue is being used.
 (req.headers.get('Expect', '').lower() != '100-continue')):
 wsgireq.drain()
diff --git a/mercurial/hgweb/request.py b/mercurial/hgweb/request.py
--- a/mercurial/hgweb/request.py
+++ b/mercurial/hgweb/request.py
@@ -63,6 +63,8 @@
 class parsedrequest(object):
 """Represents a parsed WSGI request / static HTTP request parameters."""
 
+# Request method.
+method = attr.ib()
 # Full URL for this request.
 url = attr.ib()
 # URL without any path components. Just ://.
@@ -207,7 +209,8 @@
 if 'CONTENT_LENGTH' in env and 'HTTP_CONTENT_LENGTH' not in env:
 headers['Content-Length'] = env['CONTENT_LENGTH']
 
-return parsedrequest(url=fullurl, baseurl=baseurl,
+return parsedrequest(method=env['REQUEST_METHOD'],
+ url=fullurl, baseurl=baseurl,
  advertisedurl=advertisedfullurl,
  advertisedbaseurl=advertisedbaseurl,
  apppath=apppath,



To: indygreg, #hg-reviewers
Cc: mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D2740: wireprotoserver: move all wire protocol handling logic out of hgweb

2018-03-08 Thread indygreg (Gregory Szorc)
indygreg created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  Previous patches from several days ago worked to isolate processing
  of HTTP wire protocol requests to wireprotoserver. We still had a
  little logic in hgweb. If feels like the right time to finish the
  job.
  
  This commit moves WSGI request servicing from hgweb to wireprotoserver.
  The ugly dict holding the parsed request is no more. I think the new
  code is cleaner.
  
  As part of this, we now process wire protocol requests before the
  block to obtain the "query" variable. This makes it clear that this
  wonky "query" variable is not used by the wire protocol.
  
  The wonkiest part about this code is the HTTP 404. I'm actually not
  sure what all is going on here. It looks like the code is trying to
  prevent URL with path components that specify a command from not
  working. That part I grok. What I don't grok is why we need to send
  a 404. I would think it would be OK to no-op and let another handler
  try to service the request. But if we do this, we get some subrepo
  test failures. So it looks like something is expecting the HTTP 404
  and reacting to it in a specific way. It /might/ be possible to
  change the behavior here. But it isn't something I'm comfortable
  doing because I don't understand the problem space.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2740

AFFECTED FILES
  mercurial/hgweb/hgweb_mod.py
  mercurial/wireprotoserver.py

CHANGE DETAILS

diff --git a/mercurial/wireprotoserver.py b/mercurial/wireprotoserver.py
--- a/mercurial/wireprotoserver.py
+++ b/mercurial/wireprotoserver.py
@@ -150,24 +150,29 @@
 def iscmd(cmd):
 return cmd in wireproto.commands
 
-def parsehttprequest(rctx, wsgireq, req, checkperm):
-"""Parse the HTTP request for a wire protocol request.
+def handlewsgirequest(rctx, wsgireq, req, checkperm):
+"""Possibly process a wire protocol request.
 
-If the current request appears to be a wire protocol request, this
-function returns a dict with details about that request, including
-an ``abstractprotocolserver`` instance suitable for handling the
-request. Otherwise, ``None`` is returned.
+If the current request is a wire protocol request, the request is
+processed by this function.
 
 ``wsgireq`` is a ``wsgirequest`` instance.
 ``req`` is a ``parsedrequest`` instance.
+
+Returns a 2-tuple of (bool, response) where the 1st element indicates
+whether the request was handled and the 2nd element is a return
+value for a WSGI application (often a generator of bytes).
 """
+# Avoid cycle involving hg module.
+from .hgweb import common as hgwebcommon
+
 repo = rctx.repo
 
 # HTTP version 1 wire protocol requests are denoted by a "cmd" query
 # string parameter. If it isn't present, this isn't a wire protocol
 # request.
 if 'cmd' not in req.querystringdict:
-return None
+return False, None
 
 cmd = req.querystringdict['cmd'][0]
 
@@ -179,17 +184,32 @@
 # known wire protocol commands and it is less confusing for machine
 # clients.
 if not iscmd(cmd):
-return None
+return False, None
+
+# The "cmd" query string argument is only valid on the root path of the
+# repo. e.g. ``/?cmd=foo``, ``/repo?cmd=foo``. URL paths within the repo
+# like ``/blah?cmd=foo`` are not allowed. So don't recognize the request
+# in this case. We send an HTTP 404 for backwards compatibility reasons.
+if req.dispatchpath:
+res = _handlehttperror(
+hgwebcommon.ErrorResponse(hgwebcommon.HTTP_NOT_FOUND), wsgireq,
+cmd)
+
+return True, res
 
 proto = httpv1protocolhandler(wsgireq, repo.ui,
   lambda perm: checkperm(rctx, wsgireq, perm))
 
-return {
-'cmd': cmd,
-'proto': proto,
-'dispatch': lambda: _callhttp(repo, wsgireq, proto, cmd),
-'handleerror': lambda ex: _handlehttperror(ex, wsgireq, cmd),
-}
+# The permissions checker should be the only thing that can raise an
+# ErrorResponse. It is kind of a layer violation to catch an hgweb
+# exception here. So consider refactoring into a exception type that
+# is associated with the wire protocol.
+try:
+res = _callhttp(repo, wsgireq, proto, cmd)
+except hgwebcommon.ErrorResponse as e:
+res = _handlehttperror(e, wsgireq, cmd)
+
+return True, res
 
 def _httpresponsetype(ui, wsgireq, prefer_uncompressed):
 """Determine the appropriate response type and compression settings.
diff --git a/mercurial/hgweb/hgweb_mod.py b/mercurial/hgweb/hgweb_mod.py
--- a/mercurial/hgweb/hgweb_mod.py
+++ b/mercurial/hgweb/hgweb_mod.py
@@ -318,25 +318,16 @@
if h[0] != 'Content-Security-Policy']
 

D2744: hgweb: handle CONTENT_LENGTH

2018-03-08 Thread indygreg (Gregory Szorc)
indygreg created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  PEP-0333 says CONTENT_LENGTH may be set. I /think/ WSGI servers are
  allowed to invent this key even if the client didn't send it.
  
  We had code in wireprotoserver looking for this key. So let's
  just automagically convert this key to an HTTP request header
  when parsing the request.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2744

AFFECTED FILES
  mercurial/hgweb/request.py
  mercurial/wireprotoserver.py

CHANGE DETAILS

diff --git a/mercurial/wireprotoserver.py b/mercurial/wireprotoserver.py
--- a/mercurial/wireprotoserver.py
+++ b/mercurial/wireprotoserver.py
@@ -91,10 +91,9 @@
 return args
 
 def forwardpayload(self, fp):
-if b'Content-Length' in self._req.headers:
-length = int(self._req.headers[b'Content-Length'])
-else:
-length = int(self._wsgireq.env[r'CONTENT_LENGTH'])
+# TODO Content-Length may not always be defined.
+length = int(self._req.headers[b'Content-Length'])
+
 # If httppostargs is used, we need to read Content-Length
 # minus the amount that was consumed by args.
 length -= int(self._req.headers.get(b'X-HgArgs-Post', 0))
diff --git a/mercurial/hgweb/request.py b/mercurial/hgweb/request.py
--- a/mercurial/hgweb/request.py
+++ b/mercurial/hgweb/request.py
@@ -200,6 +200,13 @@
 
 headers = wsgiheaders.Headers(headers)
 
+# This is kind of a lie because the HTTP header wasn't explicitly
+# sent. But for all intents and purposes it should be OK to lie about
+# this, since a consumer will either either value to determine how many
+# bytes are available to read.
+if 'CONTENT_LENGTH' in env and 'HTTP_CONTENT_LENGTH' not in env:
+headers['Content-Length'] = env['CONTENT_LENGTH']
+
 return parsedrequest(url=fullurl, baseurl=baseurl,
  advertisedurl=advertisedfullurl,
  advertisedbaseurl=advertisedbaseurl,



To: indygreg, #hg-reviewers
Cc: mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D2738: hgweb: only recognize wire protocol commands from query string (BC)

2018-03-08 Thread indygreg (Gregory Szorc)
indygreg created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  Previously, we attempted to parse the wire protocol command from
  `req.form`. Data could have come from the query string or POST
  form data.
  
  The wire protocol states that the command must be declared in the
  query string. And AFAICT all Mercurial releases from at least 1.0
  send the command in the query string.
  
  So let's actual require this behavior.
  
  This is technically BC. But I'm not sure how anyone in the wild
  would encounter this. POST has historically been used for sending
  bundle data. So there's no opportunity to encode arguments there.
  And the experimental HTTP POST args also takes over the body. So
  the only way someone would be impacted by this is if they wrote
  a custom client that both used POST for everything and sent arguments
  via the HTTP body. I don't believe such a client exists.
  
  .. bc::
  
The HTTP wire protocol server no longer accepts the ``cmd``
argument to control which command to run via HTTP POST bodies.
The ``cmd`` argument must be specified on the URL query string.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2738

AFFECTED FILES
  mercurial/hgweb/hgweb_mod.py
  mercurial/wireprotoserver.py

CHANGE DETAILS

diff --git a/mercurial/wireprotoserver.py b/mercurial/wireprotoserver.py
--- a/mercurial/wireprotoserver.py
+++ b/mercurial/wireprotoserver.py
@@ -150,25 +150,26 @@
 def iscmd(cmd):
 return cmd in wireproto.commands
 
-def parsehttprequest(rctx, wsgireq, query, checkperm):
+def parsehttprequest(rctx, wsgireq, req, checkperm):
 """Parse the HTTP request for a wire protocol request.
 
 If the current request appears to be a wire protocol request, this
 function returns a dict with details about that request, including
 an ``abstractprotocolserver`` instance suitable for handling the
 request. Otherwise, ``None`` is returned.
 
 ``wsgireq`` is a ``wsgirequest`` instance.
+``req`` is a ``parsedrequest`` instance.
 """
 repo = rctx.repo
 
 # HTTP version 1 wire protocol requests are denoted by a "cmd" query
 # string parameter. If it isn't present, this isn't a wire protocol
 # request.
-if 'cmd' not in wsgireq.form:
+if 'cmd' not in req.querystringdict:
 return None
 
-cmd = wsgireq.form['cmd'][0]
+cmd = req.querystringdict['cmd'][0]
 
 # The "cmd" request parameter is used by both the wire protocol and hgweb.
 # While not all wire protocol commands are available for all transports,
diff --git a/mercurial/hgweb/hgweb_mod.py b/mercurial/hgweb/hgweb_mod.py
--- a/mercurial/hgweb/hgweb_mod.py
+++ b/mercurial/hgweb/hgweb_mod.py
@@ -330,7 +330,7 @@
 
 # Route it to a wire protocol handler if it looks like a wire protocol
 # request.
-protohandler = wireprotoserver.parsehttprequest(rctx, wsgireq, query,
+protohandler = wireprotoserver.parsehttprequest(rctx, wsgireq, req,
 self.check_perm)
 
 if protohandler:



To: indygreg, #hg-reviewers
Cc: mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D2742: hgweb: parse and store HTTP request headers

2018-03-08 Thread indygreg (Gregory Szorc)
indygreg created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  WSGI transmits HTTP request headers as HTTP_* environment variables.
  We teach our parser about these and hook up a dict-like data
  structure that supports case insensitive header manipulation.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2742

AFFECTED FILES
  mercurial/hgweb/hgweb_mod.py
  mercurial/hgweb/request.py

CHANGE DETAILS

diff --git a/mercurial/hgweb/request.py b/mercurial/hgweb/request.py
--- a/mercurial/hgweb/request.py
+++ b/mercurial/hgweb/request.py
@@ -11,6 +11,7 @@
 import cgi
 import errno
 import socket
+import wsgiref.headers as wsgiheaders
 #import wsgiref.validate
 
 from .common import (
@@ -85,6 +86,9 @@
 querystringlist = attr.ib()
 # Dict of query string arguments. Values are lists with at least 1 item.
 querystringdict = attr.ib()
+# wsgiref.headers.Headers instance. Operates like a dict with case
+# insensitive keys.
+headers = attr.ib()
 
 def parserequestfromenv(env):
 """Parse URL components from environment variables.
@@ -186,15 +190,26 @@
 else:
 querystringdict[k] = [v]
 
+# HTTP_* keys contain HTTP request headers. The Headers structure should
+# perform case normalization for us. We just rewrite underscore to dash
+# so keys match what likely went over the wire.
+headers = []
+for k, v in env.iteritems():
+if k.startswith('HTTP_'):
+headers.append((k[len('HTTP_'):].replace('_', '-'), v))
+
+headers = wsgiheaders.Headers(headers)
+
 return parsedrequest(url=fullurl, baseurl=baseurl,
  advertisedurl=advertisedfullurl,
  advertisedbaseurl=advertisedbaseurl,
  apppath=apppath,
  dispatchparts=dispatchparts, 
dispatchpath=dispatchpath,
  havepathinfo='PATH_INFO' in env,
  querystring=querystring,
  querystringlist=querystringlist,
- querystringdict=querystringdict)
+ querystringdict=querystringdict,
+ headers=headers)
 
 class wsgirequest(object):
 """Higher-level API for a WSGI request.
diff --git a/mercurial/hgweb/hgweb_mod.py b/mercurial/hgweb/hgweb_mod.py
--- a/mercurial/hgweb/hgweb_mod.py
+++ b/mercurial/hgweb/hgweb_mod.py
@@ -351,7 +351,7 @@
 if args:
 wsgireq.form['file'] = args
 
-ua = wsgireq.env.get('HTTP_USER_AGENT', '')
+ua = req.headers.get('User-Agent', '')
 if cmd == 'rev' and 'mercurial' in ua:
 wsgireq.form['style'] = ['raw']
 



To: indygreg, #hg-reviewers
Cc: mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D2741: wireprotoserver: remove broken optimization for non-httplib client

2018-03-08 Thread indygreg (Gregory Szorc)
indygreg created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  There was an experimental non-httplib client in core for several
  years. It was removed a week or so ago.
  
  We kept the optimization for this client in the server code. I'm
  not sure if that was intended or not. But it doesn't matter: the
  code was wrong.
  
  Because the code was accessing a WSGI environment dict, it needed to
  access the HTTP_X_HGHTTP2 key to actually read the HTTP header. So
  the code deleted by this commit wasn't actually doing anything
  meaningful. Doh.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2741

AFFECTED FILES
  mercurial/wireprotoserver.py

CHANGE DETAILS

diff --git a/mercurial/wireprotoserver.py b/mercurial/wireprotoserver.py
--- a/mercurial/wireprotoserver.py
+++ b/mercurial/wireprotoserver.py
@@ -328,10 +328,7 @@
 if (wsgireq.env[r'REQUEST_METHOD'] == r'POST' and
 # But not if Expect: 100-continue is being used.
 (wsgireq.env.get('HTTP_EXPECT',
- '').lower() != '100-continue') or
-# Or the non-httplib HTTP library is being advertised by
-# the client.
-wsgireq.env.get('X-HgHttp2', '')):
+ '').lower() != '100-continue')):
 wsgireq.drain()
 else:
 wsgireq.headers.append((r'Connection', r'Close'))



To: indygreg, #hg-reviewers
Cc: mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D2736: hgweb: use the parsed application path directly

2018-03-08 Thread indygreg (Gregory Szorc)
indygreg created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  Previously, we assigned a custom system string with a trailing slash
  to wsgirequest.url.
  
  The addition of the trailing slash felt arbitrary and seems to go
  against how things typically work in WSGI.
  
  We also want our URLs to be bytes, not system strings.
  
  And, assigning a custom attribute to wsgirequest felt wrong.
  
  This commit fixes all those things by removing the trailing
  slash from the app path, changing consumers to use that variable
  and to use it without a trailing slash, and removing the custom
  attribute from wsgirequest.
  
  We preserve the trailing slash on {url}. Also, makebreadcrumb
  strips the trailing slash. So no change to it was needed.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2736

AFFECTED FILES
  mercurial/hgweb/hgweb_mod.py
  mercurial/hgweb/request.py

CHANGE DETAILS

diff --git a/mercurial/hgweb/request.py b/mercurial/hgweb/request.py
--- a/mercurial/hgweb/request.py
+++ b/mercurial/hgweb/request.py
@@ -146,14 +146,13 @@
 # root. We also exclude its path components from PATH_INFO when resolving
 # the dispatch path.
 
-# TODO the use of trailing slashes in apppath is arguably wrong. We need it
-# to appease low-level parts of hgweb_mod for now.
 apppath = env['SCRIPT_NAME']
-if not apppath.endswith('/'):
-apppath += '/'
 
 if env.get('REPO_NAME'):
-apppath += env.get('REPO_NAME') + '/'
+if not apppath.endswith('/'):
+apppath += '/'
+
+apppath += env.get('REPO_NAME')
 
 if 'PATH_INFO' in env:
 dispatchparts = env['PATH_INFO'].strip('/').split('/')
diff --git a/mercurial/hgweb/hgweb_mod.py b/mercurial/hgweb/hgweb_mod.py
--- a/mercurial/hgweb/hgweb_mod.py
+++ b/mercurial/hgweb/hgweb_mod.py
@@ -148,7 +148,7 @@
 logourl = self.config('web', 'logourl')
 logoimg = self.config('web', 'logoimg')
 staticurl = (self.config('web', 'staticurl')
- or pycompat.sysbytes(wsgireq.url) + 'static/')
+ or req.apppath + '/static/')
 if not staticurl.endswith('/'):
 staticurl += '/'
 
@@ -170,24 +170,24 @@
 if not self.reponame:
 self.reponame = (self.config('web', 'name', '')
  or wsgireq.env.get('REPO_NAME')
- or wsgireq.url.strip(r'/') or self.repo.root)
+ or req.apppath or self.repo.root)
 
 def websubfilter(text):
 return templatefilters.websub(text, self.websubtable)
 
 # create the templater
 # TODO: export all keywords: defaults = templatekw.keywords.copy()
 defaults = {
-'url': pycompat.sysbytes(wsgireq.url),
+'url': req.apppath + '/',
 'logourl': logourl,
 'logoimg': logoimg,
 'staticurl': staticurl,
 'urlbase': req.advertisedbaseurl,
 'repo': self.reponame,
 'encoding': encoding.encoding,
 'motd': motd,
 'sessionvars': sessionvars,
-'pathdef': makebreadcrumb(pycompat.sysbytes(wsgireq.url)),
+'pathdef': makebreadcrumb(req.apppath),
 'style': style,
 'nonce': self.nonce,
 }
@@ -318,8 +318,6 @@
if h[0] != 'Content-Security-Policy']
 wsgireq.headers.append(('Content-Security-Policy', rctx.csp))
 
-wsgireq.url = pycompat.sysstr(req.apppath)
-
 if r'PATH_INFO' in wsgireq.env:
 parts = wsgireq.env[r'PATH_INFO'].strip(r'/').split(r'/')
 repo_parts = wsgireq.env.get(r'REPO_NAME', r'').split(r'/')



To: indygreg, #hg-reviewers
Cc: mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D2739: hgweb: use parsed request to construct query parameters

2018-03-08 Thread indygreg (Gregory Szorc)
indygreg created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  The way hgweb routes requests is kind of bonkers. If PATH_INFO is
  set, we take the URL path after the repository. Otherwise, we take
  the first part of the query string before "&" and the part before
  ";" in that.
  
  We then kinda/sorta treat this as a path and route based on that.
  
  This commit ports that code to use the parsed request object. This
  required a new attribute on the parsed request to indicate whether
  there is any PATH_INFO.
  
  The new code still feels a bit convoluted for my liking. But we'll
  need to rewrite more of the code before a better solution becomes
  apparant. This code feels strictly better since we're no longer
  doing low-level WSGI manipulation during routing.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2739

AFFECTED FILES
  mercurial/hgweb/hgweb_mod.py
  mercurial/hgweb/request.py

CHANGE DETAILS

diff --git a/mercurial/hgweb/request.py b/mercurial/hgweb/request.py
--- a/mercurial/hgweb/request.py
+++ b/mercurial/hgweb/request.py
@@ -76,6 +76,9 @@
 dispatchparts = attr.ib()
 # URL path component (no query string) used for dispatch.
 dispatchpath = attr.ib()
+# Whether there is a path component to this request. This can be true
+# when ``dispatchpath`` is empty due to REPO_NAME muckery.
+havepathinfo = attr.ib()
 # Raw query string (part after "?" in URL).
 querystring = attr.ib()
 # List of 2-tuples of query string arguments.
@@ -188,6 +191,7 @@
  advertisedbaseurl=advertisedbaseurl,
  apppath=apppath,
  dispatchparts=dispatchparts, 
dispatchpath=dispatchpath,
+ havepathinfo='PATH_INFO' in env,
  querystring=querystring,
  querystringlist=querystringlist,
  querystringdict=querystringdict)
diff --git a/mercurial/hgweb/hgweb_mod.py b/mercurial/hgweb/hgweb_mod.py
--- a/mercurial/hgweb/hgweb_mod.py
+++ b/mercurial/hgweb/hgweb_mod.py
@@ -318,15 +318,10 @@
if h[0] != 'Content-Security-Policy']
 wsgireq.headers.append(('Content-Security-Policy', rctx.csp))
 
-if r'PATH_INFO' in wsgireq.env:
-parts = wsgireq.env[r'PATH_INFO'].strip(r'/').split(r'/')
-repo_parts = wsgireq.env.get(r'REPO_NAME', r'').split(r'/')
-if parts[:len(repo_parts)] == repo_parts:
-parts = parts[len(repo_parts):]
-query = r'/'.join(parts)
+if req.havepathinfo:
+query = req.dispatchpath
 else:
-query = wsgireq.env[r'QUERY_STRING'].partition(r'&')[0]
-query = query.partition(r';')[0]
+query = req.querystring.partition('&')[0].partition(';')[0]
 
 # Route it to a wire protocol handler if it looks like a wire protocol
 # request.
@@ -344,7 +339,7 @@
 
 # translate user-visible url structure to internal structure
 
-args = query.split(r'/', 2)
+args = query.split('/', 2)
 if 'cmd' not in wsgireq.form and args and args[0]:
 cmd = args.pop(0)
 style = cmd.rfind('-')



To: indygreg, #hg-reviewers
Cc: mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D2730: hgweb: ensure all wsgi environment values are str

2018-03-08 Thread indygreg (Gregory Szorc)
indygreg created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  Previously, we had a few entries that were bytes on Python 3.
  PEP-0333 states that all entries must be the native str type
  (bytes on Python 2, str on Python 3).
  
  This required a number of changes to hgweb_mod to unbreak
  things on Python 3. I suspect there still may be some regressions.
  
  I'm going to introduce a data structure that represents a parsed
  WSGI request in upcoming commits. This will hold bytes and will
  allow us to stop using raw literals throughout the WSGI code.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2730

AFFECTED FILES
  mercurial/hgweb/hgweb_mod.py
  mercurial/hgweb/server.py

CHANGE DETAILS

diff --git a/mercurial/hgweb/server.py b/mercurial/hgweb/server.py
--- a/mercurial/hgweb/server.py
+++ b/mercurial/hgweb/server.py
@@ -124,8 +124,8 @@
 env[r'SERVER_NAME'] = self.server.server_name
 env[r'SERVER_PORT'] = str(self.server.server_port)
 env[r'REQUEST_URI'] = self.path
-env[r'SCRIPT_NAME'] = self.server.prefix
-env[r'PATH_INFO'] = path[len(self.server.prefix):]
+env[r'SCRIPT_NAME'] = pycompat.sysstr(self.server.prefix)
+env[r'PATH_INFO'] = pycompat.sysstr(path[len(self.server.prefix):])
 env[r'REMOTE_HOST'] = self.client_address[0]
 env[r'REMOTE_ADDR'] = self.client_address[0]
 if query:
@@ -154,7 +154,7 @@
 env[hkey] = hval
 env[r'SERVER_PROTOCOL'] = self.request_version
 env[r'wsgi.version'] = (1, 0)
-env[r'wsgi.url_scheme'] = self.url_scheme
+env[r'wsgi.url_scheme'] = pycompat.sysstr(self.url_scheme)
 if env.get(r'HTTP_EXPECT', '').lower() == '100-continue':
 self.rfile = common.continuereader(self.rfile, self.wfile.write)
 
diff --git a/mercurial/hgweb/hgweb_mod.py b/mercurial/hgweb/hgweb_mod.py
--- a/mercurial/hgweb/hgweb_mod.py
+++ b/mercurial/hgweb/hgweb_mod.py
@@ -159,7 +159,8 @@
 urlbase = r'%s://%s%s' % (proto, req.env[r'SERVER_NAME'], port)
 logourl = self.config('web', 'logourl')
 logoimg = self.config('web', 'logoimg')
-staticurl = self.config('web', 'staticurl') or req.url + 'static/'
+staticurl = (self.config('web', 'staticurl')
+ or pycompat.sysbytes(req.url) + 'static/')
 if not staticurl.endswith('/'):
 staticurl += '/'
 
@@ -182,24 +183,24 @@
 if not self.reponame:
 self.reponame = (self.config('web', 'name', '')
  or req.env.get('REPO_NAME')
- or req.url.strip('/') or self.repo.root)
+ or req.url.strip(r'/') or self.repo.root)
 
 def websubfilter(text):
 return templatefilters.websub(text, self.websubtable)
 
 # create the templater
 # TODO: export all keywords: defaults = templatekw.keywords.copy()
 defaults = {
-'url': req.url,
+'url': pycompat.sysbytes(req.url),
 'logourl': logourl,
 'logoimg': logoimg,
 'staticurl': staticurl,
 'urlbase': urlbase,
 'repo': self.reponame,
 'encoding': encoding.encoding,
 'motd': motd,
 'sessionvars': sessionvars,
-'pathdef': makebreadcrumb(req.url),
+'pathdef': makebreadcrumb(pycompat.sysbytes(req.url)),
 'style': style,
 'nonce': self.nonce,
 }
@@ -333,17 +334,17 @@
 # use SCRIPT_NAME, PATH_INFO and QUERY_STRING as well as our REPO_NAME
 
 req.url = req.env[r'SCRIPT_NAME']
-if not req.url.endswith('/'):
-req.url += '/'
+if not req.url.endswith(r'/'):
+req.url += r'/'
 if req.env.get('REPO_NAME'):
 req.url += req.env[r'REPO_NAME'] + r'/'
 
 if r'PATH_INFO' in req.env:
-parts = req.env[r'PATH_INFO'].strip('/').split('/')
+parts = req.env[r'PATH_INFO'].strip(r'/').split(r'/')
 repo_parts = req.env.get(r'REPO_NAME', r'').split(r'/')
 if parts[:len(repo_parts)] == repo_parts:
 parts = parts[len(repo_parts):]
-query = '/'.join(parts)
+query = r'/'.join(parts)
 else:
 query = req.env[r'QUERY_STRING'].partition(r'&')[0]
 query = query.partition(r';')[0]
@@ -364,7 +365,7 @@
 
 # translate user-visible url structure to internal structure
 
-args = query.split('/', 2)
+args = query.split(r'/', 2)
 if 'cmd' not in req.form and args and args[0]:
 cmd = args.pop(0)
 style = cmd.rfind('-')



To: indygreg, #hg-reviewers
Cc: mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org

D2735: hgweb: use computed base URL from parsed request

2018-03-08 Thread indygreg (Gregory Szorc)
indygreg created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  Let's not reinvent URL construction in a function that runs the
  templater.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2735

AFFECTED FILES
  mercurial/hgweb/hgweb_mod.py

CHANGE DETAILS

diff --git a/mercurial/hgweb/hgweb_mod.py b/mercurial/hgweb/hgweb_mod.py
--- a/mercurial/hgweb/hgweb_mod.py
+++ b/mercurial/hgweb/hgweb_mod.py
@@ -142,21 +142,9 @@
 if typ in allowed or self.configbool('web', 'allow%s' % typ):
 yield {'type': typ, 'extension': spec[2], 'node': nodeid}
 
-def templater(self, wsgireq):
+def templater(self, wsgireq, req):
 # determine scheme, port and server name
 # this is needed to create absolute urls
-
-proto = wsgireq.env.get('wsgi.url_scheme')
-if proto == 'https':
-proto = 'https'
-default_port = '443'
-else:
-proto = 'http'
-default_port = '80'
-
-port = wsgireq.env[r'SERVER_PORT']
-port = port != default_port and (r':' + port) or r''
-urlbase = r'%s://%s%s' % (proto, wsgireq.env[r'SERVER_NAME'], port)
 logourl = self.config('web', 'logourl')
 logoimg = self.config('web', 'logoimg')
 staticurl = (self.config('web', 'staticurl')
@@ -194,7 +182,7 @@
 'logourl': logourl,
 'logoimg': logoimg,
 'staticurl': staticurl,
-'urlbase': urlbase,
+'urlbase': req.advertisedbaseurl,
 'repo': self.reponame,
 'encoding': encoding.encoding,
 'motd': motd,
@@ -396,7 +384,7 @@
 # process the web interface request
 
 try:
-tmpl = rctx.templater(wsgireq)
+tmpl = rctx.templater(wsgireq, req)
 ctype = tmpl('mimetype', encoding=encoding.encoding)
 ctype = templater.stringify(ctype)
 



To: indygreg, #hg-reviewers
Cc: mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D2737: hgweb: teach WSGI parser about query strings

2018-03-08 Thread indygreg (Gregory Szorc)
indygreg created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  Currently, req.form uses cgi.parse() to populate form data. Depending
  on the request, form data can come from POST multipart/form-data,
  application/x-www-form-urlencoded, or the URL query string.
  
  Putting all these things into one data structure makes it difficult
  to reason about how exactly parameters got to the request. It can
  lead to wonkiness such as pulling parameters from both the URL and
  POST data.
  
  This commit teaches our WSGI request parser about argument data
  in query strings. We populate fields containing the query string
  data and only the query string data so it can't be confused with
  POST data.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2737

AFFECTED FILES
  mercurial/hgweb/request.py
  mercurial/urllibcompat.py

CHANGE DETAILS

diff --git a/mercurial/urllibcompat.py b/mercurial/urllibcompat.py
--- a/mercurial/urllibcompat.py
+++ b/mercurial/urllibcompat.py
@@ -48,6 +48,7 @@
 "urlunparse",
 ))
 urlreq._registeralias(urllib.parse, "parse_qs", "parseqs")
+urlreq._registeralias(urllib.parse, "parse_qsl", "parseqsl")
 urlreq._registeralias(urllib.parse, "unquote_to_bytes", "unquote")
 import urllib.request
 urlreq._registeraliases(urllib.request, (
@@ -159,6 +160,7 @@
 "urlunparse",
 ))
 urlreq._registeralias(urlparse, "parse_qs", "parseqs")
+urlreq._registeralias(urlparse, "parse_qsl", "parseqsl")
 urlerr._registeraliases(urllib2, (
 "HTTPError",
 "URLError",
diff --git a/mercurial/hgweb/request.py b/mercurial/hgweb/request.py
--- a/mercurial/hgweb/request.py
+++ b/mercurial/hgweb/request.py
@@ -78,6 +78,10 @@
 dispatchpath = attr.ib()
 # Raw query string (part after "?" in URL).
 querystring = attr.ib()
+# List of 2-tuples of query string arguments.
+querystringlist = attr.ib()
+# Dict of query string arguments. Values are lists with at least 1 item.
+querystringdict = attr.ib()
 
 def parserequestfromenv(env):
 """Parse URL components from environment variables.
@@ -168,12 +172,25 @@
 
 querystring = env.get('QUERY_STRING', '')
 
+# We store as a list so we have ordering information. We also store as
+# a dict to facilitate fast lookup.
+querystringlist = util.urlreq.parseqsl(querystring, keep_blank_values=True)
+
+querystringdict = {}
+for k, v in querystringlist:
+if k in querystringdict:
+querystringdict[k].append(v)
+else:
+querystringdict[k] = [v]
+
 return parsedrequest(url=fullurl, baseurl=baseurl,
  advertisedurl=advertisedfullurl,
  advertisedbaseurl=advertisedbaseurl,
  apppath=apppath,
  dispatchparts=dispatchparts, 
dispatchpath=dispatchpath,
- querystring=querystring)
+ querystring=querystring,
+ querystringlist=querystringlist,
+ querystringdict=querystringdict)
 
 class wsgirequest(object):
 """Higher-level API for a WSGI request.



To: indygreg, #hg-reviewers
Cc: mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D2733: hgweb: always use "?" when writing session vars

2018-03-08 Thread indygreg (Gregory Szorc)
indygreg created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  This code resolves a string to insert in URLs as part of a
  query string. Essentially, it resolves the {sessionvars}
  template keyword, which is used by hgweb templates to build
  a URL as a string.
  
  The whole approach here feels wrong because there's no way of
  knowing when this code runs how the final URL will look. There
  could be additional URL fragments added before this template
  keyword that add a query string component.
  
  Furthermore, I don't think there's *any* for req.url to have
  a query string. That's because the code that populates this
  variable only takes SCRIPT_NAME and REPO_NAME into account. The
  "?" character it is searching for would only be added if some
  code attempted to add QUERY_STRING to the URL. Hacking the code
  up to raise if "?" is present in the URL yields a clean test
  suite run. I'm not sure if we broke this code or if it has
  always been broken.
  
  Anyway, this commit removes support for emitting "&" as the
  first character in {sessionvars} and makes it always emit "?",
  which is what it was always doing before AFAICT.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2733

AFFECTED FILES
  mercurial/hgweb/hgweb_mod.py
  mercurial/hgweb/hgwebdir_mod.py

CHANGE DETAILS

diff --git a/mercurial/hgweb/hgwebdir_mod.py b/mercurial/hgweb/hgwebdir_mod.py
--- a/mercurial/hgweb/hgwebdir_mod.py
+++ b/mercurial/hgweb/hgwebdir_mod.py
@@ -509,8 +509,7 @@
 if style == styles[0]:
 vars['style'] = style
 
-start = r'&' if url[-1] == r'?' else r'?'
-sessionvars = webutil.sessionvars(vars, start)
+sessionvars = webutil.sessionvars(vars, r'?')
 logourl = config('web', 'logourl')
 logoimg = config('web', 'logoimg')
 staticurl = config('web', 'staticurl') or url + 'static/'
diff --git a/mercurial/hgweb/hgweb_mod.py b/mercurial/hgweb/hgweb_mod.py
--- a/mercurial/hgweb/hgweb_mod.py
+++ b/mercurial/hgweb/hgweb_mod.py
@@ -177,8 +177,7 @@
 if style == styles[0]:
 vars['style'] = style
 
-start = '&' if wsgireq.url[-1] == r'?' else '?'
-sessionvars = webutil.sessionvars(vars, start)
+sessionvars = webutil.sessionvars(vars, '?')
 
 if not self.reponame:
 self.reponame = (self.config('web', 'name', '')



To: indygreg, #hg-reviewers
Cc: mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D2732: hgweb: rename req to wsgireq

2018-03-08 Thread indygreg (Gregory Szorc)
indygreg created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  We will soon introduce a parsed WSGI request object so we don't
  have to concern ourselves with low-level WSGI matters. Prepare
  for multiple request objects by renaming the existing one so it
  is clear it deals with WSGI.
  
  We also remove a symbol import to avoid even more naming confusion.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2732

AFFECTED FILES
  mercurial/hgweb/hgweb_mod.py
  mercurial/hgweb/hgwebdir_mod.py
  mercurial/wireprotoserver.py

CHANGE DETAILS

diff --git a/mercurial/wireprotoserver.py b/mercurial/wireprotoserver.py
--- a/mercurial/wireprotoserver.py
+++ b/mercurial/wireprotoserver.py
@@ -36,26 +36,26 @@
 SSHV1 = wireprototypes.SSHV1
 SSHV2 = wireprototypes.SSHV2
 
-def decodevaluefromheaders(req, headerprefix):
+def decodevaluefromheaders(wsgireq, headerprefix):
 """Decode a long value from multiple HTTP request headers.
 
 Returns the value as a bytes, not a str.
 """
 chunks = []
 i = 1
 prefix = headerprefix.upper().replace(r'-', r'_')
 while True:
-v = req.env.get(r'HTTP_%s_%d' % (prefix, i))
+v = wsgireq.env.get(r'HTTP_%s_%d' % (prefix, i))
 if v is None:
 break
 chunks.append(pycompat.bytesurl(v))
 i += 1
 
 return ''.join(chunks)
 
 class httpv1protocolhandler(wireprototypes.baseprotocolhandler):
-def __init__(self, req, ui, checkperm):
-self._req = req
+def __init__(self, wsgireq, ui, checkperm):
+self._wsgireq = wsgireq
 self._ui = ui
 self._checkperm = checkperm
 
@@ -79,26 +79,26 @@
 return [data[k] for k in keys]
 
 def _args(self):
-args = util.rapply(pycompat.bytesurl, self._req.form.copy())
-postlen = int(self._req.env.get(r'HTTP_X_HGARGS_POST', 0))
+args = util.rapply(pycompat.bytesurl, self._wsgireq.form.copy())
+postlen = int(self._wsgireq.env.get(r'HTTP_X_HGARGS_POST', 0))
 if postlen:
 args.update(urlreq.parseqs(
-self._req.read(postlen), keep_blank_values=True))
+self._wsgireq.read(postlen), keep_blank_values=True))
 return args
 
-argvalue = decodevaluefromheaders(self._req, r'X-HgArg')
+argvalue = decodevaluefromheaders(self._wsgireq, r'X-HgArg')
 args.update(urlreq.parseqs(argvalue, keep_blank_values=True))
 return args
 
 def forwardpayload(self, fp):
-if r'HTTP_CONTENT_LENGTH' in self._req.env:
-length = int(self._req.env[r'HTTP_CONTENT_LENGTH'])
+if r'HTTP_CONTENT_LENGTH' in self._wsgireq.env:
+length = int(self._wsgireq.env[r'HTTP_CONTENT_LENGTH'])
 else:
-length = int(self._req.env[r'CONTENT_LENGTH'])
+length = int(self._wsgireq.env[r'CONTENT_LENGTH'])
 # If httppostargs is used, we need to read Content-Length
 # minus the amount that was consumed by args.
-length -= int(self._req.env.get(r'HTTP_X_HGARGS_POST', 0))
-for s in util.filechunkiter(self._req, limit=length):
+length -= int(self._wsgireq.env.get(r'HTTP_X_HGARGS_POST', 0))
+for s in util.filechunkiter(self._wsgireq, limit=length):
 fp.write(s)
 
 @contextlib.contextmanager
@@ -118,9 +118,9 @@
 
 def client(self):
 return 'remote:%s:%s:%s' % (
-self._req.env.get('wsgi.url_scheme') or 'http',
-urlreq.quote(self._req.env.get('REMOTE_HOST', '')),
-urlreq.quote(self._req.env.get('REMOTE_USER', '')))
+self._wsgireq.env.get('wsgi.url_scheme') or 'http',
+urlreq.quote(self._wsgireq.env.get('REMOTE_HOST', '')),
+urlreq.quote(self._wsgireq.env.get('REMOTE_USER', '')))
 
 def addcapabilities(self, repo, caps):
 caps.append('httpheader=%d' %
@@ -150,25 +150,25 @@
 def iscmd(cmd):
 return cmd in wireproto.commands
 
-def parsehttprequest(rctx, req, query, checkperm):
+def parsehttprequest(rctx, wsgireq, query, checkperm):
 """Parse the HTTP request for a wire protocol request.
 
 If the current request appears to be a wire protocol request, this
 function returns a dict with details about that request, including
 an ``abstractprotocolserver`` instance suitable for handling the
 request. Otherwise, ``None`` is returned.
 
-``req`` is a ``wsgirequest`` instance.
+``wsgireq`` is a ``wsgirequest`` instance.
 """
 repo = rctx.repo
 
 # HTTP version 1 wire protocol requests are denoted by a "cmd" query
 # string parameter. If it isn't present, this isn't a wire protocol
 # request.
-if 'cmd' not in req.form:
+if 'cmd' not in wsgireq.form:
 return None
 
-cmd = req.form['cmd'][0]
+cmd = wsgireq.form['cmd'][0]
 
 # The "cmd" request parameter is used by both the wire 

D2734: hgweb: parse WSGI request into a data structure

2018-03-08 Thread indygreg (Gregory Szorc)
indygreg created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  Currently, our WSGI applications (hgweb_mod and hgwebdir_mod) process
  the raw WSGI request instance themselves. This means they have to
  talk in terms of system strings. And they need to know details
  about what's in the WSGI request. And in the case of hgweb_mod, it
  is doing some very funky things with URL parsing to impact
  dispatching. The code is difficult to read and maintain.
  
  This commit introduces parsing of the WSGI request into a higher-level
  and easier-to-reason-about data structure.
  
  To prove it works, we hook it up to hgweb_mod and use it for populating
  the relative URL on the request instance.
  
  We hold off on using it in more places because the logic in hgweb_mod
  is crazy and I don't want to involve those changes with review of
  the parsing code.
  
  The URL construction code has variations that use the HTTP: Host header
  (the canonical WSGI way of reconstructing the URL) and with the use
  of SERVER_NAME. We need to differentiate because hgweb is currently
  using SERVER_NAME for URL construction.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2734

AFFECTED FILES
  mercurial/hgweb/hgweb_mod.py
  mercurial/hgweb/hgwebdir_mod.py
  mercurial/hgweb/request.py

CHANGE DETAILS

diff --git a/mercurial/hgweb/request.py b/mercurial/hgweb/request.py
--- a/mercurial/hgweb/request.py
+++ b/mercurial/hgweb/request.py
@@ -11,13 +11,17 @@
 import cgi
 import errno
 import socket
+#import wsgiref.validate
 
 from .common import (
 ErrorResponse,
 HTTP_NOT_MODIFIED,
 statusmessage,
 )
 
+from ..thirdparty import (
+attr,
+)
 from .. import (
 pycompat,
 util,
@@ -54,6 +58,124 @@
 pycompat.bytesurl(i.strip()) for i in v]
 return bytesform
 
+@attr.s(frozen=True)
+class parsedrequest(object):
+"""Represents a parsed WSGI request / static HTTP request parameters."""
+
+# Full URL for this request.
+url = attr.ib()
+# URL without any path components. Just ://.
+baseurl = attr.ib()
+# Advertised URL. Like ``url`` and ``baseurl`` but uses SERVER_NAME instead
+# of HTTP: Host header for hostname. This is likely what clients used.
+advertisedurl = attr.ib()
+advertisedbaseurl = attr.ib()
+# WSGI application path.
+apppath = attr.ib()
+# List of path parts to be used for dispatch.
+dispatchparts = attr.ib()
+# URL path component (no query string) used for dispatch.
+dispatchpath = attr.ib()
+# Raw query string (part after "?" in URL).
+querystring = attr.ib()
+
+def parserequestfromenv(env):
+"""Parse URL components from environment variables.
+
+WSGI defines request attributes via environment variables. This function
+parses the environment variables into a data structure.
+"""
+# PEP-0333 defines the WSGI spec and is a useful reference for this code.
+
+# We first validate that the incoming object conforms with the WSGI spec.
+# We only want to be dealing with spec-conforming WSGI implementations.
+# TODO enable this once we fix internal violations.
+#wsgiref.validate.check_environ(env)
+
+# PEP-0333 states that environment keys and values are native strings
+# (bytes on Python 2 and str on Python 3). The code points for the Unicode
+# strings on Python 3 must be between \0-\000FF. We deal with bytes
+# in Mercurial, so mass convert string keys and values to bytes.
+if pycompat.ispy3:
+env = {k.encode('latin-1'): v for k, v in env.iteritems()}
+env = {k: v.encode('latin-1') if isinstance(v, str) else v
+   for k, v in env.iteritems()}
+
+# https://www.python.org/dev/peps/pep-0333/#environ-variables defines
+# the environment variables.
+# https://www.python.org/dev/peps/pep-0333/#url-reconstruction defines
+# how URLs are reconstructed.
+fullurl = env['wsgi.url_scheme'] + '://'
+advertisedfullurl = fullurl
+
+def addport(s):
+if env['wsgi.url_scheme'] == 'https':
+if env['SERVER_PORT'] != '443':
+s += ':' + env['SERVER_PORT']
+else:
+if env['SERVER_PORT'] != '80':
+s += ':' + env['SERVER_PORT']
+
+return s
+
+if env.get('HTTP_HOST'):
+fullurl += env['HTTP_HOST']
+else:
+fullurl += env['SERVER_NAME']
+addport(fullurl)
+
+advertisedfullurl += env['SERVER_NAME']
+advertisedfullurl = addport(advertisedfullurl)
+
+baseurl = fullurl
+advertisedbaseurl = advertisedfullurl
+
+fullurl += util.urlreq.quote(env.get('SCRIPT_NAME', ''))
+advertisedfullurl += util.urlreq.quote(env.get('SCRIPT_NAME', ''))
+fullurl += util.urlreq.quote(env.get('PATH_INFO', ''))
+advertisedfullurl += util.urlreq.quote(env.get('PATH_INFO', ''))
+
+if env.get('QUERY_STRING'):
+fullurl 

D2731: hgweb: validate WSGI environment dict

2018-03-08 Thread indygreg (Gregory Szorc)
indygreg created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  The wsgiref.validate module contains useful functions for validating
  that various WSGI data structures are proper.
  
  This commit adds validation of the environment dict to our built-in
  HTTP server, which turns an HTTP request into an environment dict.
  
  The check discovered that we weren't always setting QUERY_STRING,
  which would cause the cgi module to fall back to sys.argv. So we
  change things to always set QUERY_STRING.
  
  The check passes on Python 2 and 3.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2731

AFFECTED FILES
  mercurial/hgweb/server.py

CHANGE DETAILS

diff --git a/mercurial/hgweb/server.py b/mercurial/hgweb/server.py
--- a/mercurial/hgweb/server.py
+++ b/mercurial/hgweb/server.py
@@ -13,6 +13,7 @@
 import socket
 import sys
 import traceback
+import wsgiref.validate
 
 from ..i18n import _
 
@@ -128,8 +129,7 @@
 env[r'PATH_INFO'] = pycompat.sysstr(path[len(self.server.prefix):])
 env[r'REMOTE_HOST'] = self.client_address[0]
 env[r'REMOTE_ADDR'] = self.client_address[0]
-if query:
-env[r'QUERY_STRING'] = query
+env[r'QUERY_STRING'] = query or r''
 
 if pycompat.ispy3:
 if self.headers.get_content_type() is None:
@@ -166,6 +166,8 @@
   socketserver.ForkingMixIn)
 env[r'wsgi.run_once'] = 0
 
+wsgiref.validate.check_environ(env)
+
 self.saved_status = None
 self.saved_headers = []
 self.length = None



To: indygreg, #hg-reviewers
Cc: mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D2729: copyfile: preserve stat info (mtime, etc.) when doing copies/renames

2018-03-08 Thread spectral (Kyle Lippincott)
spectral created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2729

AFFECTED FILES
  hgext/largefiles/overrides.py
  mercurial/cmdutil.py
  tests/test-rename.t

CHANGE DETAILS

diff --git a/tests/test-rename.t b/tests/test-rename.t
--- a/tests/test-rename.t
+++ b/tests/test-rename.t
@@ -657,3 +657,15 @@
   [255]
   $ hg status -C
 
+check that stat information such as mtime is preserved - it's unclear whether
+the `touch` and `stat` commands are portable, so we mimic them using python.
+Not all platforms support precision of even one-second granularity, so we allow
+a rather generous fudge factor here; 1234567890 is 2009, and the primary thing
+we care about is that it's not the machine's current time; hopefully it's 
really
+unlikely for a machine to have such a broken clock that this test fails. :)
+
+  $ mkdir mtime
+  $ python -c 'import os; p="mtime/f"; t=1234567890; open(p, "w").close(); 
os.utime(p, (t, t))'
+  $ hg ci -qAm 'add mtime dir'
+  $ hg mv -q mtime mtime2
+  $ python -c 'import os, sys; p="mtime2/f"; sys.exit(int(not 
(os.stat(p).st_mtime < 1234567999)))'
diff --git a/mercurial/cmdutil.py b/mercurial/cmdutil.py
--- a/mercurial/cmdutil.py
+++ b/mercurial/cmdutil.py
@@ -1186,7 +1186,7 @@
 os.rename(src, tmp)
 os.rename(tmp, target)
 else:
-util.copyfile(src, target)
+util.copyfile(src, target, copystat=True)
 srcexists = True
 except IOError as inst:
 if inst.errno == errno.ENOENT:
diff --git a/hgext/largefiles/overrides.py b/hgext/largefiles/overrides.py
--- a/hgext/largefiles/overrides.py
+++ b/hgext/largefiles/overrides.py
@@ -667,15 +667,15 @@
 try:
 origcopyfile = util.copyfile
 copiedfiles = []
-def overridecopyfile(src, dest):
+def overridecopyfile(src, dest, *args, **kwargs):
 if (lfutil.shortname in src and
 dest.startswith(repo.wjoin(lfutil.shortname))):
 destlfile = dest.replace(lfutil.shortname, '')
 if not opts['force'] and os.path.exists(destlfile):
 raise IOError('',
 _('destination largefile already exists'))
 copiedfiles.append((src, dest))
-origcopyfile(src, dest)
+origcopyfile(src, dest, *args, **kwargs)
 
 util.copyfile = overridecopyfile
 result += orig(ui, repo, listpats, opts, rename)



To: spectral, #hg-reviewers
Cc: mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D2728: rebase: also include commit of collapsed commits in single transaction

2018-03-08 Thread martinvonz (Martin von Zweigbergk)
martinvonz created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  When rebase.singletransaction is set, we still used to create a second
  transaction when committing with --collapse. It's simpler to create a
  single transaction. I don't see a reason to create that second
  transaction, and neither do the test cases.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2728

AFFECTED FILES
  hgext/rebase.py
  tests/test-rebase-transaction.t

CHANGE DETAILS

diff --git a/tests/test-rebase-transaction.t b/tests/test-rebase-transaction.t
--- a/tests/test-rebase-transaction.t
+++ b/tests/test-rebase-transaction.t
@@ -32,7 +32,6 @@
 - the end.
   $ hg rebase --debug -b D -d Z | grep 'status stored'
   rebase status stored
-  rebase status stored
   $ hg tglog
   o  5: D
   |
diff --git a/hgext/rebase.py b/hgext/rebase.py
--- a/hgext/rebase.py
+++ b/hgext/rebase.py
@@ -573,16 +573,12 @@
 keepbranches=self.keepbranchesf,
 date=self.date, wctx=self.wctx)
 else:
-dsguard = None
-if ui.configbool('rebase', 'singletransaction'):
-dsguard = dirstateguard.dirstateguard(repo, 'rebase')
-with util.acceptintervention(dsguard):
-newnode = concludenode(repo, revtoreuse, p1, self.external,
-commitmsg=commitmsg,
-extrafn=_makeextrafn(self.extrafns),
-editor=editor,
-keepbranches=self.keepbranchesf,
-date=self.date)
+newnode = concludenode(repo, revtoreuse, p1, self.external,
+commitmsg=commitmsg,
+extrafn=_makeextrafn(self.extrafns),
+editor=editor,
+keepbranches=self.keepbranchesf,
+date=self.date)
 if newnode is not None:
 newrev = repo[newnode].rev()
 for oldrev in self.state:
@@ -864,8 +860,7 @@
 dsguard = dirstateguard.dirstateguard(repo, 'rebase')
 with util.acceptintervention(dsguard):
 rbsrt._performrebase(tr)
-
-rbsrt._finishrebase()
+rbsrt._finishrebase()
 
 def _definedestmap(ui, repo, rbsrt, destf=None, srcf=None, basef=None,
revf=None, destspace=None):



To: martinvonz, #hg-reviewers
Cc: mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


mercurial@36765: 5 new changesets

2018-03-08 Thread Mercurial Commits
5 new changesets in mercurial:

https://www.mercurial-scm.org/repo/hg/rev/09f320067591
changeset:   36761:09f320067591
user:Jun Wu 
date:Sun Mar 04 00:07:04 2018 -0800
summary: xdiff: remove whitespace related feature

https://www.mercurial-scm.org/repo/hg/rev/b5bb0f99064d
changeset:   36762:b5bb0f99064d
user:Jun Wu 
date:Sun Mar 04 00:17:49 2018 -0800
summary: xdiff: remove unused structure, functions, and constants

https://www.mercurial-scm.org/repo/hg/rev/90f8fe72446c
changeset:   36763:90f8fe72446c
user:Jun Wu 
date:Tue Mar 06 18:41:08 2018 -0800
summary: xdiff: remove xemit related logic

https://www.mercurial-scm.org/repo/hg/rev/3cf40112efb7
changeset:   36764:3cf40112efb7
user:Jun Wu 
date:Tue Mar 06 18:51:11 2018 -0800
summary: xdiff: remove xmerge related logic

https://www.mercurial-scm.org/repo/hg/rev/04d64163039a
changeset:   36765:04d64163039a
bookmark:@
tag: tip
user:Jun Wu 
date:Tue Mar 06 19:31:17 2018 -0800
summary: fuzz: fix xdiff build

-- 
Repository URL: https://www.mercurial-scm.org/repo/hg
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D2703: phabricator: follow-up phab auth improvements with backwards compat mode

2018-03-08 Thread durin42 (Augie Fackler)
durin42 marked an inline comment as done.
durin42 added inline comments.

INLINE COMMENTS

> mharbison72 wrote in phabricator.py:107
> This wants a `repo.ui`

Ouch. Amended the change for that, thanks.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2703

To: durin42, #hg-reviewers, indygreg
Cc: mharbison72, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D2703: phabricator: follow-up phab auth improvements with backwards compat mode

2018-03-08 Thread mharbison72 (Matt Harbison)
mharbison72 added inline comments.

INLINE COMMENTS

> phabricator.py:107
> +"""
> +token = ui.config('phabricator', 'token')
> +if token:

This wants a `repo.ui`

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2703

To: durin42, #hg-reviewers, indygreg
Cc: mharbison72, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D2727: bookmarks: test for exchanging long bookmark names (issue5165)

2018-03-08 Thread durin42 (Augie Fackler)
durin42 created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  As far as I can tell a test for a long bookmark name never actually
  got added when this was fixed. Let's document that a 300 byte bookmark
  is something we're supporting (this patch started out life over a year
  ago as a way for me to validate the problem, and I recently found it.)
  
  I think the nonzero exits from the push operations are only because no
  new changesets get exchanged. Please correct me if I'm wrong on that. :)

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2727

AFFECTED FILES
  tests/test-bookmarks-pushpull.t

CHANGE DETAILS

diff --git a/tests/test-bookmarks-pushpull.t b/tests/test-bookmarks-pushpull.t
--- a/tests/test-bookmarks-pushpull.t
+++ b/tests/test-bookmarks-pushpull.t
@@ -1030,6 +1030,34 @@
   no changes found
   [1]
 
+Pushing a really long bookmark should work fine (issue5165)
+===
+
+#if b2-binary
+  >>> open('longname', 'w').write('wat' * 100)
+  $ hg book `cat longname`
+  $ hg push -B `cat longname` ../unchanged-b
+  pushing to ../unchanged-b
+  searching for changes
+  no changes found
+  exporting bookmark (wat){100} (re)
+  [1]
+  $ hg -R ../unchanged-b book --delete `cat longname`
+
+Test again but forcing bundle2 exchange to make sure that doesn't regress.
+
+  $ hg push -B `cat longname` ../unchanged-b --config 
devel.legacy.exchange=bundle1
+  pushing to ../unchanged-b
+  searching for changes
+  no changes found
+  exporting bookmark (wat){100} (re)
+  [1]
+  $ hg -R ../unchanged-b book --delete `cat longname`
+  $ hg book --delete `cat longname`
+  $ hg co @
+  0 files updated, 0 files merged, 0 files removed, 0 files unresolved
+  (activating bookmark @)
+#endif
 
 Check hook preventing push (issue4455)
 ==



To: durin42, #hg-reviewers
Cc: mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D2676: tests: stop over-specifying tempfile name

2018-03-08 Thread durin42 (Augie Fackler)
durin42 updated this revision to Diff 6732.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D2676?vs=6633=6732

REVISION DETAIL
  https://phab.mercurial-scm.org/D2676

AFFECTED FILES
  contrib/python3-whitelist
  tests/test-merge-tools.t

CHANGE DETAILS

diff --git a/tests/test-merge-tools.t b/tests/test-merge-tools.t
--- a/tests/test-merge-tools.t
+++ b/tests/test-merge-tools.t
@@ -1558,7 +1558,7 @@
   $ hg update -q -C 2
   $ hg merge -y -r tip --tool echo --config merge-tools.echo.args='$base 
$local $other $output'
   merging f and f.txt to f.txt
-  */f~base.?? $TESTTMP/f.txt.orig */f~other.??.txt $TESTTMP/f.txt 
(glob)
+  */f~base.* $TESTTMP/f.txt.orig */f~other.*.txt $TESTTMP/f.txt (glob)
   0 files updated, 1 files merged, 0 files removed, 0 files unresolved
   (branch merge, don't forget to commit)
 
diff --git a/contrib/python3-whitelist b/contrib/python3-whitelist
--- a/contrib/python3-whitelist
+++ b/contrib/python3-whitelist
@@ -208,6 +208,7 @@
 test-merge-revert2.t
 test-merge-subrepos.t
 test-merge-symlinks.t
+test-merge-tools.t
 test-merge-types.t
 test-merge1.t
 test-merge10.t



To: durin42, pulkit, #hg-reviewers
Cc: indygreg, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D2676: tests: stop over-specifying tempfile name

2018-03-08 Thread durin42 (Augie Fackler)
durin42 added a comment.


  In https://phab.mercurial-scm.org/D2676#43853, @indygreg wrote:
  
  > I'm -0 on this. I'd prefer to change the temp file code to be consistent 
between Python versions. Preferably standardizing on the Python 3 version.
  
  
  It looks like these are coming from filemerge._maketempfiles, which is using 
tempfile.mkstemp(). I don't see a way to influence the number of wildcard 
characters there.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2676

To: durin42, pulkit, #hg-reviewers
Cc: indygreg, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D2703: phabricator: follow-up phab auth improvements with backwards compat mode

2018-03-08 Thread durin42 (Augie Fackler)
This revision was automatically updated to reflect the committed changes.
Closed by commit rHGd289ea66a356: phabricator: follow-up phab auth improvements 
with backwards compat mode (authored by durin42, committed by ).

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D2703?vs=6677=6731

REVISION DETAIL
  https://phab.mercurial-scm.org/D2703

AFFECTED FILES
  contrib/phabricator.py

CHANGE DETAILS

diff --git a/contrib/phabricator.py b/contrib/phabricator.py
--- a/contrib/phabricator.py
+++ b/contrib/phabricator.py
@@ -99,6 +99,17 @@
 process('', params)
 return util.urlreq.urlencode(flatparams)
 
+def readlegacytoken(repo):
+"""Transitional support for old phabricator tokens.
+
+Remove before the 4.6 release.
+"""
+token = ui.config('phabricator', 'token')
+if token:
+repo.ui.warn(_('phabricator.token is deprecated - please '
+   'migrate to the phabricator.auth section.\n'))
+return token
+
 def readurltoken(repo):
 """return conduit url, token and make sure they exist
 
@@ -128,8 +139,10 @@
 break
 
 if not token:
-raise error.Abort(_('Can\'t find conduit token associated to %s')
-  % (url,))
+token = readlegacytoken(repo)
+if not token:
+raise error.Abort(_('Can\'t find conduit token associated to %s')
+  % (url,))
 
 return url, token
 



To: durin42, #hg-reviewers, indygreg
Cc: mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D1919: phabricator: specify API tokens per host, rather than per repo

2018-03-08 Thread tom.prince (Tom Prince)
This revision was automatically updated to reflect the committed changes.
Closed by commit rHG4397909f82d3: phabricator: specify API tokens per host, 
rather than per repo (authored by tom.prince, committed by ).

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D1919?vs=4960=6730

REVISION DETAIL
  https://phab.mercurial-scm.org/D1919

AFFECTED FILES
  contrib/phabricator.py

CHANGE DETAILS

diff --git a/contrib/phabricator.py b/contrib/phabricator.py
--- a/contrib/phabricator.py
+++ b/contrib/phabricator.py
@@ -33,6 +33,11 @@
 # if you need to specify advanced options that is not easily supported by
 # the internal library.
 curlcmd = curl --connect-timeout 2 --retry 3 --silent
+
+[phabricator.auth]
+example.url = https://phab.example.com/
+# API token. Get it from https://$HOST/conduit/login/
+example.token = cli-
 """
 
 from __future__ import absolute_import
@@ -100,14 +105,33 @@
 Currently read from [phabricator] config section. In the future, it might
 make sense to read from .arcconfig and .arcrc as well.
 """
-values = []
-section = 'phabricator'
-for name in ['url', 'token']:
-value = repo.ui.config(section, name)
-if not value:
-raise error.Abort(_('config %s.%s is required') % (section, name))
-values.append(value)
-return values
+url = repo.ui.config('phabricator', 'url')
+if not url:
+raise error.Abort(_('config %s.%s is required')
+  % ('phabricator', 'url'))
+
+groups = {}
+for key, val in repo.ui.configitems('phabricator.auth'):
+if '.' not in key:
+repo.ui.warn(_("ignoring invalid [phabricator.auth] key '%s'\n")
+ % key)
+continue
+group, setting = key.rsplit('.', 1)
+groups.setdefault(group, {})[setting] = val
+
+token = None
+for group, auth in groups.iteritems():
+if url != auth.get('url'):
+continue
+token = auth.get('token')
+if token:
+break
+
+if not token:
+raise error.Abort(_('Can\'t find conduit token associated to %s')
+  % (url,))
+
+return url, token
 
 def callconduit(repo, name, params):
 """call Conduit API, params is a dict. return json.loads result, or None"""



To: tom.prince, #hg-reviewers, durin42
Cc: quark, indygreg, durin42, pulkit, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D2057: rust implementation of hg status

2018-03-08 Thread kevincox (Kevin Cox)
kevincox added a comment.


  Rust has platform independent types `PathBuf` 
 and `` 
 for paths and 
`OsString`  and 
``  for 
strings (owned and references respectively. They do have os-specific extensions 
but as long as you don't use them it should be cross platform. That being said, 
if you are serializing and deserializing them you may need to write some 
platform dependant code.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2057

To: Ivzhh, #hg-reviewers, kevincox
Cc: glandium, krbullock, indygreg, durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH 1 of 4] py3: read/write plain lock file in binary mode

2018-03-08 Thread Pulkit Goyal
Queued the series. Many thanks!

On Tue, Mar 6, 2018 at 6:24 PM, Yuya Nishihara  wrote:
> # HG changeset patch
> # User Yuya Nishihara 
> # Date 1520205676 18000
> #  Sun Mar 04 18:21:16 2018 -0500
> # Node ID faea088aebdbda2e32e5339cbb6df51283fe2f56
> # Parent  4c71a26a4009d88590c9ae3d64a5912fd556d82e
> py3: read/write plain lock file in binary mode
>
> A lock file shouldn't contain '\n', so this isn't a BC.
>
> diff --git a/contrib/python3-whitelist b/contrib/python3-whitelist
> --- a/contrib/python3-whitelist
> +++ b/contrib/python3-whitelist
> @@ -373,6 +373,7 @@ test-subrepo-deep-nested-change.t
>  test-subrepo.t
>  test-symlinks.t
>  test-tag.t
> +test-tags.t
>  test-treemanifest.t
>  test-unamend.t
>  test-uncommit.t
> diff --git a/mercurial/util.py b/mercurial/util.py
> --- a/mercurial/util.py
> +++ b/mercurial/util.py
> @@ -1689,7 +1689,8 @@ def makelock(info, pathname):
>  except AttributeError: # no symlink in os
>  pass
>
> -ld = os.open(pathname, os.O_CREAT | os.O_WRONLY | os.O_EXCL)
> +flags = os.O_CREAT | os.O_WRONLY | os.O_EXCL | getattr(os, 'O_BINARY', 0)
> +ld = os.open(pathname, flags)
>  os.write(ld, info)
>  os.close(ld)
>
> @@ -1701,7 +1702,7 @@ def readlock(pathname):
>  raise
>  except AttributeError: # no symlink in os
>  pass
> -fp = posixfile(pathname)
> +fp = posixfile(pathname, 'rb')
>  r = fp.read()
>  fp.close()
>  return r
> ___
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH 1 of 2 V2] hgweb: add a hook for processing LFS Batch API requests

2018-03-08 Thread Gregory Szorc
On Thu, Mar 8, 2018 at 5:06 AM, Matt Harbison  wrote:

>
> On Mar 8, 2018, at 1:17 AM, Gregory Szorc  wrote:
>
> On Wed, Mar 7, 2018 at 8:49 PM, Matt Harbison 
> wrote:
>
>> # HG changeset patch
>> # User Matt Harbison 
>> # Date 1519274700 18000
>> #  Wed Feb 21 23:45:00 2018 -0500
>> # Node ID 89382cb20bb19e089513b2ce69ef8acfa1f523fd
>> # Parent  6dab3bdb1f00932a80ffa07f80ba240c3a4b48df
>> hgweb: add a hook for processing LFS Batch API requests
>>
>> There really isn't a clean way to give LFS a crack at intercepting the
>> requests
>> without hardcoding some LFS knowledge in the core.  The rationale for
>> this URI
>> is that the spec for the Batch API[1] defines the URL as the LFS server
>> url +
>> '/objects/batch'.  The default git URLs are:
>>
>> Git remote: https://git-server.com/foo/bar
>> LFS server: https://git-server.com/foo/bar.git/info/lfs
>> Batch API: https://git-server.com/foo/bar.git/info/lfs/objects/batch
>>
>> '.git/' seems like it's not something a user would normally track.  If we
>> adhere
>> to how git defines the URLs, then the hg-git extension should be able to
>> talk to
>> a git based server without any additional work.
>>
>> I'm not sure if checking the User-Agent is a good idea, but this needs a
>> specialized client, and it seems like everyone else is doing it
>> (3d48ae1aaa5e,
>> e7bb5fc4570c).  We can always back this off if it becomes a nuisance.  A
>> web
>> browser will see "400: bad method", the same as it would before this
>> change.
>>
>> I'm not sure if None or 'pull' is the proper permission check, but the
>> only
>> difference is whether or not `hgweb.allowpull` is checked.  Since nothing
>> of
>> particular interest is transferred here, and the next phase handles the
>> read or
>> write, treating this like web interface request seems fine.
>>
>> [1] https://github.com/git-lfs/git-lfs/blob/master/docs/api/batch.md
>>
>> diff --git a/mercurial/hgweb/hgweb_mod.py b/mercurial/hgweb/hgweb_mod.py
>> --- a/mercurial/hgweb/hgweb_mod.py
>> +++ b/mercurial/hgweb/hgweb_mod.py
>> @@ -90,6 +90,12 @@
>>  urlel = os.path.dirname(urlel)
>>  return reversed(breadcrumb)
>>
>> +def _processlfsbatchreq(repo, req):
>> +"""A hook for the LFS extension to wrap that handles requests to the
>> Batch
>> +API, and returns the appropriate JSON response.
>> +"""
>> +raise ErrorResponse(HTTP_NOT_FOUND)
>> +
>>  class requestcontext(object):
>>  """Holds state/context for an individual request.
>>
>> @@ -376,6 +382,21 @@
>>  except ErrorResponse as inst:
>>  return protohandler['handleerror'](inst)
>>
>> +# Route LFS Batch API requests to the appropriate handler
>> +
>> +if req.env.get(r'HTTP_USER_AGENT', r'').startswith(r'git-lfs/'):
>>
>
> I don't like filtering by the user-agent. It is recommended to not do
> this. Unless it will cause problems or tons of code complexity to properly
> lock out actual Git clients who may get confused by this, my vote is to
> remove it and just rely on path filtering.
>
>
> Ok, I can ditch it.  It was more about locking out web browsers and other
> things not POSTing a JSON request, without changing the current behavior
> they see.  But I guess there are other ways to remotely figure out the
> minimum server version.
>
> +try:
>> +path = req.env.get(r'PATH_INFO')
>> +if path == '/.git/info/lfs/objects/batch':
>>
>
> Relying on PATH_INFO is super annoying. But our code kind of sucks :/
>
> My only suggestion would be to define `parts` in the `else` block of the
> `if r'PATH_INFO' in req.env:` above and check `parts == ['.git', 'info',
> 'lfs', 'objects', 'batch']`. That's still a big ugly though.
>
>
> OoC, what’s the point of this?  I saw it in the code above this, but it
> seems like string handling is easier than list-of-string handling.  This
> looks like an easy enough change though.
>

It turns out my HTTP protocol work has spurred me to author a proper fix
for this madness. Expect some patches from me today to completely overhaul
this URL code.


>
> Whatever happens, my guess is I will eventually write some patches to
> clean the URL parsing code up.
>
>
>> +self.check_perm(rctx, req, None)
>>
>
> Shouldn't this be `pull` instead of `None`? Otherwise, LFS won't honor
> web.* to restrict data access.
>
>
> I went back and forth, but landed on this because it doesn't seem like a
> pull, and still checks web.deny_read and web.allow_read.  It looks like
> None just doesn’t check web.allowpull.  This would be checked prior to the
> actual file transfer in the next step.
>
> Is there a concept of a write-only repo?  Like a try server where you can
> only push stuff, but not pull all of the anonymous heads.  If not, then I
> think I can change it.
>

No there isn't. I'm not sure how that would be implemented, 

D2057: rust implementation of hg status

2018-03-08 Thread indygreg (Gregory Szorc)
indygreg added a comment.


  In https://phab.mercurial-scm.org/D2057#43989, @durin42 wrote:
  
  > In https://phab.mercurial-scm.org/D2057#43988, @kevincox wrote:
  >
  > > In https://phab.mercurial-scm.org/D2057#43987, @durin42 wrote:
  > >
  > > > Mercurial tries to be principled about always treating filenames as 
bytes. AIUI https://www.mercurial-scm.org/wiki/WindowsUTF8Plan is still the 
plan of record there?
  > >
  > >
  > > Reading that page it seems to claim that filenames should be utf8, not 
bytes. If utf8, this is what the code does, but if it is bytes that definitely 
won't work.
  >
  >
  > IIRC it's bytes everyplace except Windows, where we pretend utf8 is real?
  >
  > We may have to make adjustments to this plan on macOS with APFS, but I'm 
not sure about that yet.
  
  
  I think we want to express a path as a dedicated type which has different 
underlying storage depending on the platform (bytes on Linux, UTF-16 on 
Windows). All filesystem operations should take a `Path` instance to operate 
on. This is the only way to cleanly round trip filenames between the OS, the 
application, and back to the OS. That leaves us with the hard problem of 
normalizing Mercurial's storage representation of paths (bytes) with the 
operating system's. But this world is strictly better than today, where we lose 
path data from the OS because we use POSIX APIs.
  
  FWIW, Python 3 rewrote the I/O layer to use Win32 APIs everywhere. Combined 
with the `pathlib` types, I'm pretty sure Python 3 can round trip paths on 
Windows. I also think Rust's path type(s) have OS-dependent functionality.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2057

To: Ivzhh, #hg-reviewers, kevincox
Cc: glandium, krbullock, indygreg, durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D2057: rust implementation of hg status

2018-03-08 Thread durin42 (Augie Fackler)
durin42 added a comment.


  In https://phab.mercurial-scm.org/D2057#43988, @kevincox wrote:
  
  > In https://phab.mercurial-scm.org/D2057#43987, @durin42 wrote:
  >
  > > Mercurial tries to be principled about always treating filenames as 
bytes. AIUI https://www.mercurial-scm.org/wiki/WindowsUTF8Plan is still the 
plan of record there?
  >
  >
  > Reading that page it seems to claim that filenames should be utf8, not 
bytes. If utf8, this is what the code does, but if it is bytes that definitely 
won't work.
  
  
  IIRC it's bytes everyplace except Windows, where we pretend utf8 is real?
  
  We may have to make adjustments to this plan on macOS with APFS, but I'm not 
sure about that yet.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2057

To: Ivzhh, #hg-reviewers, kevincox
Cc: glandium, krbullock, indygreg, durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D2057: rust implementation of hg status

2018-03-08 Thread kevincox (Kevin Cox)
kevincox added a comment.


  In https://phab.mercurial-scm.org/D2057#43987, @durin42 wrote:
  
  > Mercurial tries to be principled about always treating filenames as bytes. 
AIUI https://www.mercurial-scm.org/wiki/WindowsUTF8Plan is still the plan of 
record there?
  
  
  Reading that page it seems to claim that filenames should be utf8, not bytes. 
If utf8, this is what the code does, but if it is bytes that definitely won't 
work.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2057

To: Ivzhh, #hg-reviewers, kevincox
Cc: glandium, krbullock, indygreg, durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D2057: rust implementation of hg status

2018-03-08 Thread durin42 (Augie Fackler)
durin42 added a comment.


  In https://phab.mercurial-scm.org/D2057#43892, @kevincox wrote:
  
  > On a higher level, all of these code appears to be treating file names as 
strings. This isn't really true and will disallow some valid file names. Maybe 
we should stick with bytes throughout. Of course this makes windows filepaths 
difficult because they are actually (utf16) strings.
  
  
  Mercurial tries to be principled about always treating filenames as bytes. 
AIUI https://www.mercurial-scm.org/wiki/WindowsUTF8Plan is still the plan of 
record there?

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2057

To: Ivzhh, #hg-reviewers, kevincox
Cc: glandium, krbullock, indygreg, durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D2057: rust implementation of hg status

2018-03-08 Thread kevincox (Kevin Cox)
kevincox requested changes to this revision.
kevincox added a comment.
This revision now requires changes to proceed.


  I will try to finish the review later, but it might be quicker if you 
incorporate some of the changes first since a lot of them are repeated many 
times. Overall it looks good, there are a couple of things that i would 
highlight to make the code easier to read.
  
  - Prefer more descriptive variable names.
  - If you can, avoid "pointer" arithmetic. Cursors and slices are nice and 
have great convenience methods.
  - Group your flow control and filtering more.
  - Try to keep your types straight. In rust using the right types helps catch 
errors. So be aware of `char` vs `u8` vs `String` vs `Vec` vs `Vec`.
  
  On a higher level, all of these code appears to be treating file names as 
strings. This isn't really true and will disallow some valid file names. Maybe 
we should stick with bytes throughout. Of course this makes windows filepaths 
difficult because they are actually (utf16) strings.

INLINE COMMENTS

> indygreg wrote in build.rs:1
> I see this file was copied. There's nothing wrong with that. But does this 
> mean we will need a custom build.rs for each Rust package doing Python? If 
> that's the case, then I would prefer to isolate all our rust-cpython code to 
> a single package, if possible. I'm guessing that could be challenging due to 
> crossing create boundaries. I'm sure there are placed where we don't want to 
> expose symbols outside the crate.
> 
> I'm curious how others feel about this.

If this is going to be reused I would move it into it's own crate. It seems 
like everything here could be boiled down to a single function call in main.

> base85.rs:21
> +});
> +}
> +

I prefer something like this: 
https://play.rust-lang.org/?gist=5ca18a5b95335600e911b8f9310ea5c7=stable

I doubt lazy_static is too slow. Otherwise we can stay like this until const 
functions get implemented.

Either way note:

- I changed the type of B85CHARS to an array as opposed to an array ref.
- The loop condition is much nicer.

> base85.rs:23
> +
> +pub fn b85encode(py: Python, text: , pad: i32) -> PyResult {
> +let text = text.as_bytes();

Would it be possible to separate the decode from the python objects. I'm 
thinking two helper functions.

  fn b85_required_len(text: ) -> usize
  fn b85_encode(text: , pad: i32, out:  [u8]) -> Result<()>

> base85.rs:23
> +
> +pub fn b85encode(py: Python, text: , pad: i32) -> PyResult {
> +let text = text.as_bytes();

`` can only hold valid utf8 data? Does it make more sense to have `&[u8]` 
here for a list of bytes?

> base85.rs:23
> +
> +pub fn b85encode(py: Python, text: , pad: i32) -> PyResult {
> +let text = text.as_bytes();

IIUC pad is only ever checked `== 0`. Can it be made into a bool?

> base85.rs:45
> +
> +let mut ptext = [..];
> +let mut len = { ptext.len() };

`ptext` isn't very descriptive.

> base85.rs:46
> +let mut ptext = [..];
> +let mut len = { ptext.len() };
> +let mut dst_off: usize = 0;

Why the braces here?

> base85.rs:47
> +let mut len = { ptext.len() };
> +let mut dst_off: usize = 0;
> +

I suspect this type annotation isn't required.

> base85.rs:47
> +let mut len = { ptext.len() };
> +let mut dst_off: usize = 0;
> +

It might be best to use a `std::io::Cursor` 
 and let it drack 
`dst_off` for your.

> base85.rs:52
> +break;
> +}
> +

while !ptext.is_empty()

> base85.rs:54
> +
> +let mut acc: u32 = 0;
> +

I would prefer the name `chunk` or even `accum` is a lot mode obvious to me 
than `acc`.

> base85.rs:56
> +
> +for i in [24, 16, 8, 0].iter() {
> +let ch = ptext[0] as u32;

for i in &[24, 16, 8, 0]

> base85.rs:58
> +let ch = ptext[0] as u32;
> +acc |= ch << i;
> +

I would just combine these into one line as the name `ch` isn't adding much.

> base85.rs:63
> +
> +if len == 0 {
> +break;

Tracking len manually is a smell. Why not drop it and use `ptest.is_empty()`.

> base85.rs:91
> +pub fn b85decode(py: Python, text: ) -> PyResult {
> +let b85dec = unsafe { B85DEC };
> +

This is probably worth a comment that this is safe because D85DEC is required 
to be initialized before this function is called.

> base85.rs:152
> +
> +acc *= 85;
> +

Let rust do the overflow checking.

  acc = acc.checked_mul(85)
  .ok_or_else(|| {
  PyErr::new::(
  py,
  format!("bad base85 character at position {}", i))
  })?;

> changelog.rs:24
> +
> +assert_eq!(content[NodeId::hex_len()], '\n' as u8);
> +

Passing a message as a third argument is really useful.

> changelog.rs:31
> +
> +content.drain(..NodeId::hex_len());
> +

If you aren't using the value I would prefer `truncate(NodeId::hex_len())`

> changelog.rs:33
> +
> +let msg 

D2409: graft: add no-commit mode (issue5631)

2018-03-08 Thread khanchi97 (Sushil khanchi)
khanchi97 added a comment.


  pulkit: Can you please review it? I have made the requested changes.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2409

To: khanchi97, #hg-reviewers
Cc: pulkit, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH 1 of 2 V2] hgweb: add a hook for processing LFS Batch API requests

2018-03-08 Thread Matt Harbison

> On Mar 8, 2018, at 1:17 AM, Gregory Szorc  wrote:
> 
>> On Wed, Mar 7, 2018 at 8:49 PM, Matt Harbison  wrote:
>> # HG changeset patch
>> # User Matt Harbison 
>> # Date 1519274700 18000
>> #  Wed Feb 21 23:45:00 2018 -0500
>> # Node ID 89382cb20bb19e089513b2ce69ef8acfa1f523fd
>> # Parent  6dab3bdb1f00932a80ffa07f80ba240c3a4b48df
>> hgweb: add a hook for processing LFS Batch API requests
>> 
>> There really isn't a clean way to give LFS a crack at intercepting the 
>> requests
>> without hardcoding some LFS knowledge in the core.  The rationale for this 
>> URI
>> is that the spec for the Batch API[1] defines the URL as the LFS server url +
>> '/objects/batch'.  The default git URLs are:
>> 
>> Git remote: https://git-server.com/foo/bar
>> LFS server: https://git-server.com/foo/bar.git/info/lfs
>> Batch API: https://git-server.com/foo/bar.git/info/lfs/objects/batch
>> 
>> '.git/' seems like it's not something a user would normally track.  If we 
>> adhere
>> to how git defines the URLs, then the hg-git extension should be able to 
>> talk to
>> a git based server without any additional work.
>> 
>> I'm not sure if checking the User-Agent is a good idea, but this needs a
>> specialized client, and it seems like everyone else is doing it 
>> (3d48ae1aaa5e,
>> e7bb5fc4570c).  We can always back this off if it becomes a nuisance.  A web
>> browser will see "400: bad method", the same as it would before this change.
>> 
>> I'm not sure if None or 'pull' is the proper permission check, but the only
>> difference is whether or not `hgweb.allowpull` is checked.  Since nothing of
>> particular interest is transferred here, and the next phase handles the read 
>> or
>> write, treating this like web interface request seems fine.
>> 
>> [1] https://github.com/git-lfs/git-lfs/blob/master/docs/api/batch.md
>> 
>> diff --git a/mercurial/hgweb/hgweb_mod.py b/mercurial/hgweb/hgweb_mod.py
>> --- a/mercurial/hgweb/hgweb_mod.py
>> +++ b/mercurial/hgweb/hgweb_mod.py
>> @@ -90,6 +90,12 @@
>>  urlel = os.path.dirname(urlel)
>>  return reversed(breadcrumb)
>> 
>> +def _processlfsbatchreq(repo, req):
>> +"""A hook for the LFS extension to wrap that handles requests to the 
>> Batch
>> +API, and returns the appropriate JSON response.
>> +"""
>> +raise ErrorResponse(HTTP_NOT_FOUND)
>> +
>>  class requestcontext(object):
>>  """Holds state/context for an individual request.
>> 
>> @@ -376,6 +382,21 @@
>>  except ErrorResponse as inst:
>>  return protohandler['handleerror'](inst)
>> 
>> +# Route LFS Batch API requests to the appropriate handler
>> +
>> +if req.env.get(r'HTTP_USER_AGENT', r'').startswith(r'git-lfs/'):
> 
> I don't like filtering by the user-agent. It is recommended to not do this. 
> Unless it will cause problems or tons of code complexity to properly lock out 
> actual Git clients who may get confused by this, my vote is to remove it and 
> just rely on path filtering.
>  
>> +try:
>> +path = req.env.get(r'PATH_INFO')
>> +if path == '/.git/info/lfs/objects/batch':
> 
> Relying on PATH_INFO is super annoying. But our code kind of sucks :/
> 
> My only suggestion would be to define `parts` in the `else` block of the `if 
> r'PATH_INFO' in req.env:` above and check `parts == ['.git', 'info', 'lfs', 
> 'objects', 'batch']`. That's still a big ugly though.

Also, what’s the str.startswith() equivalent for this list, to work with the 
next patch?

> Whatever happens, my guess is I will eventually write some patches to clean 
> the URL parsing code up.
>  
>> +self.check_perm(rctx, req, None)
> 
> Shouldn't this be `pull` instead of `None`? Otherwise, LFS won't honor web.* 
> to restrict data access.
>  
>> +return _processlfsbatchreq(rctx.repo, req)
>> +else:
>> +raise ErrorResponse(HTTP_NOT_FOUND)
>> +except ErrorResponse as inst:
>> +req.respond(inst, 'text/plain; charset=utf-8')
>> +# No body, since only lfs clients are allowed here
>> +return ['']
>> +
>>  # translate user-visible url structure to internal structure
>> 
>>  args = query.split('/', 2)
>> ___
>> Mercurial-devel mailing list
>> Mercurial-devel@mercurial-scm.org
>> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
> 
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D2722: url: add HTTP handler that uses a proxied socket

2018-03-08 Thread Michaelexics (MichaelexicsVP)
Michaelexics added a comment.


  Romantic Piano by Andrew_Studio | AudioJungle http://workle.website/4d 
  https://audiojungle.net/item/cinematic-trailer/20815683?ref=MomentumOfMelody 
  https://audiojungle.net/user/momentumofmelody/portfolio?ref=MomentumOfMelody 
https://audiojungle.net/user/ie_sound/portfolio?ref=IE_Sound 
https://audiojungle.net/user/elijah_studio/portfolio?ref=Elijah_Studio 
  https://audiojungle.net/user/commercial_music/portfolio?ref=Commercial_Music

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2722

To: indygreg, #hg-reviewers
Cc: Michaelexics, Jeffreydoomb, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH] Fix for Bug #5807

2018-03-08 Thread Yuya Nishihara
On Thu, 8 Mar 2018 13:42:42 +0100, Sascha Nemecek wrote:
> On 08/03/18 13:16, Yuya Nishihara wrote:
> > On Wed, 7 Mar 2018 22:48:01 +0100, Sascha Nemecek wrote:
> >> Am 2018-03-02 um 05:14 schrieb Yuya Nishihara:
> >>> On Thu, 1 Mar 2018 11:06:59 +0100, Sascha Nemecek wrote:
>  # HG changeset patch
>  # User Sascha Nemecek 
>  # Date 1519831479 -3600
>  #  Wed Feb 28 16:24:39 2018 +0100
>  # Node ID 42ddf4ee4f91d76f19ca0c3efc4c8e4c1c6fa96c
>  # Parent  1bd132a021dd00f96604e33a8fb5306d37e56007
>  Don't close 'fp' (= 'ui.fout') stream to prevent 'ValueError: I/O
>  operation on closed file' (Bug #5807).
> 
>  Regression of changeset 30261:6bed17ba00a1
>  (https://www.mercurial-scm.org/repo/hg/rev/6bed17ba00a1)
> 
>  diff -r 1bd132a021dd -r 42ddf4ee4f91 hgext/convert/subversion.py
>  --- a/hgext/convert/subversion.pyWed Feb 21 14:36:42 2018 +0530
>  +++ b/hgext/convert/subversion.pyWed Feb 28 16:24:39 2018 +0100
>  @@ -149,7 +149,7 @@
> pickle.dump(str(inst), fp, protocol)
> else:
> pickle.dump(None, fp, protocol)
>  -fp.close()
>  +fp.flush()
> # With large history, cleanup process goes crazy and suddenly
> # consumes *huge* amount of memory. The output file being closed,
> # there is no need for clean termination.
> >>>
> >>> I don't think fp.close() was the source of the problem. Here the process
> >>> _exit()s so no cleanup would be run.
> >>
> >> Of course, the problem might lie deeper. Empirically tested, the
> >> fp.close() triggered the reported error output. From my observation, the
> >> fp.close() affected stdout of the convert function and also pdb.
> >> Therefore I came to the conclusion that the fp.close() / ui.fout.close()
> >> was the culprit.
> > 
> > If this patch magically fixes the issue, that's fine. It's also harmless
> > to do fflush() instead of fclose() here. I'm just saying that the process
> > exits immediately after the fp.close() with no cleanup, so I have no idea
> > why that matters.
> > 
> >-fp.close()
> >+fp.flush()
> > os._exit(0)
> 
> My guess/interpretation is, that this happens in a child process. When 
> closing ui.fout, the parent and all other process run into closed doors.

Nah, file descriptors should be duplicated to child processes. And _exit()
closes all of them in the parent process anyway.
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D2696: cleanup: use stat_result[stat.ST_MTIME] instead of stat_result.st_mtime

2018-03-08 Thread durin42 (Augie Fackler)
This revision was automatically updated to reflect the committed changes.
Closed by commit rHGffa3026d4196: cleanup: use stat_result[stat.ST_MTIME] 
instead of stat_result.st_mtime (authored by durin42, committed by ).

CHANGED PRIOR TO COMMIT
  https://phab.mercurial-scm.org/D2696?vs=6679=6728#toc

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D2696?vs=6679=6728

REVISION DETAIL
  https://phab.mercurial-scm.org/D2696

AFFECTED FILES
  hgext/extdiff.py
  hgext/shelve.py
  mercurial/chgserver.py
  mercurial/context.py
  mercurial/debugcommands.py
  mercurial/dirstate.py
  mercurial/hg.py
  mercurial/hgweb/common.py
  mercurial/posix.py
  mercurial/util.py
  tests/svn-safe-append.py
  tests/test-atomictempfile.py
  tests/test-context.py
  tests/test-filecache.py

CHANGE DETAILS

diff --git a/tests/test-filecache.py b/tests/test-filecache.py
--- a/tests/test-filecache.py
+++ b/tests/test-filecache.py
@@ -1,5 +1,6 @@
 from __future__ import absolute_import, print_function
 import os
+import stat
 import subprocess
 import sys
 
@@ -200,7 +201,7 @@
 fp.close()
 
 oldstat = os.stat(filename)
-if oldstat.st_ctime != oldstat.st_mtime:
+if oldstat[stat.ST_CTIME] != oldstat[stat.ST_MTIME]:
 # subsequent changing never causes ambiguity
 continue
 
@@ -219,16 +220,17 @@
 fp.write('BAR')
 
 newstat = os.stat(filename)
-if oldstat.st_ctime != newstat.st_ctime:
+if oldstat[stat.ST_CTIME] != newstat[stat.ST_CTIME]:
 # timestamp ambiguity was naturally avoided while repetition
 continue
 
 # st_mtime should be advanced "repetition * 2" times, because
 # all changes occurred at same time (in sec)
-expected = (oldstat.st_mtime + repetition * 2) & 0x7fff
-if newstat.st_mtime != expected:
-print("'newstat.st_mtime %s is not %s (as %s + %s * 2)" %
-  (newstat.st_mtime, expected, oldstat.st_mtime, repetition))
+expected = (oldstat[stat.ST_MTIME] + repetition * 2) & 0x7fff
+if newstat[stat.ST_MTIME] != expected:
+print("'newstat[stat.ST_MTIME] %s is not %s (as %s + %s * 2)" %
+  (newstat[stat.ST_MTIME], expected,
+   oldstat[stat.ST_MTIME], repetition))
 
 # no more examination is needed regardless of result
 break
diff --git a/tests/test-context.py b/tests/test-context.py
--- a/tests/test-context.py
+++ b/tests/test-context.py
@@ -1,5 +1,6 @@
 from __future__ import absolute_import, print_function
 import os
+import stat
 from mercurial.node import hex
 from mercurial import (
 context,
@@ -170,7 +171,8 @@
 # touch 00manifest.i mtime so storecache could expire.
 # repo.__dict__['manifestlog'] is deleted by transaction releasefn.
 st = repo.svfs.stat('00manifest.i')
-repo.svfs.utime('00manifest.i', (st.st_mtime + 1, st.st_mtime + 1))
+repo.svfs.utime('00manifest.i',
+(st[stat.ST_MTIME] + 1, st[stat.ST_MTIME] + 1))
 
 # read the file just committed
 try:
diff --git a/tests/test-atomictempfile.py b/tests/test-atomictempfile.py
--- a/tests/test-atomictempfile.py
+++ b/tests/test-atomictempfile.py
@@ -3,6 +3,7 @@
 import glob
 import os
 import shutil
+import stat
 import tempfile
 import unittest
 
@@ -66,7 +67,7 @@
 for i in xrange(5):
 atomicwrite(False)
 oldstat = os.stat(self._filename)
-if oldstat.st_ctime != oldstat.st_mtime:
+if oldstat[stat.ST_CTIME] != oldstat[stat.ST_MTIME]:
 # subsequent changing never causes ambiguity
 continue
 
@@ -77,14 +78,14 @@
 for j in xrange(repetition):
 atomicwrite(True)
 newstat = os.stat(self._filename)
-if oldstat.st_ctime != newstat.st_ctime:
+if oldstat[stat.ST_CTIME] != newstat[stat.ST_CTIME]:
 # timestamp ambiguity was naturally avoided while repetition
 continue
 
 # st_mtime should be advanced "repetition" times, because
 # all atomicwrite() occurred at same time (in sec)
-self.assertTrue(newstat.st_mtime ==
-((oldstat.st_mtime + repetition) & 0x7fff))
+oldtime = (oldstat[stat.ST_MTIME] + repetition) & 0x7fff
+self.assertTrue(newstat[stat.ST_MTIME] == oldtime)
 # no more examination is needed, if assumption above is true
 break
 else:
diff --git a/tests/svn-safe-append.py b/tests/svn-safe-append.py
--- a/tests/svn-safe-append.py
+++ b/tests/svn-safe-append.py
@@ -6,23 +6,23 @@
 Without this svn will not detect workspace changes."""
 
 import os
+import stat
 import sys
 
 text = sys.argv[1]
 fname = sys.argv[2]
 
 f = open(fname, "ab")
 try:
-before = os.fstat(f.fileno()).st_mtime
+before = 

D2697: util: stop calling os.stat_float_times()

2018-03-08 Thread durin42 (Augie Fackler)
This revision was automatically updated to reflect the committed changes.
Closed by commit rHG86ba6e3eba4e: util: stop calling os.stat_float_times() 
(authored by durin42, committed by ).

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D2697?vs=6657=6729

REVISION DETAIL
  https://phab.mercurial-scm.org/D2697

AFFECTED FILES
  mercurial/util.py

CHANGE DETAILS

diff --git a/mercurial/util.py b/mercurial/util.py
--- a/mercurial/util.py
+++ b/mercurial/util.py
@@ -176,11 +176,6 @@
 
 _notset = object()
 
-# disable Python's problematic floating point timestamps (issue4836)
-# (Python hypocritically says you shouldn't change this behavior in
-# libraries, and sure enough Mercurial is not a library.)
-os.stat_float_times(False)
-
 def safehasattr(thing, attr):
 return getattr(thing, attr, _notset) is not _notset
 



To: durin42, #hg-reviewers, yuja
Cc: mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH 1 of 2 V2] hgweb: add a hook for processing LFS Batch API requests

2018-03-08 Thread Matt Harbison

> On Mar 8, 2018, at 1:17 AM, Gregory Szorc  wrote:
> 
>> On Wed, Mar 7, 2018 at 8:49 PM, Matt Harbison  wrote:
>> # HG changeset patch
>> # User Matt Harbison 
>> # Date 1519274700 18000
>> #  Wed Feb 21 23:45:00 2018 -0500
>> # Node ID 89382cb20bb19e089513b2ce69ef8acfa1f523fd
>> # Parent  6dab3bdb1f00932a80ffa07f80ba240c3a4b48df
>> hgweb: add a hook for processing LFS Batch API requests
>> 
>> There really isn't a clean way to give LFS a crack at intercepting the 
>> requests
>> without hardcoding some LFS knowledge in the core.  The rationale for this 
>> URI
>> is that the spec for the Batch API[1] defines the URL as the LFS server url +
>> '/objects/batch'.  The default git URLs are:
>> 
>> Git remote: https://git-server.com/foo/bar
>> LFS server: https://git-server.com/foo/bar.git/info/lfs
>> Batch API: https://git-server.com/foo/bar.git/info/lfs/objects/batch
>> 
>> '.git/' seems like it's not something a user would normally track.  If we 
>> adhere
>> to how git defines the URLs, then the hg-git extension should be able to 
>> talk to
>> a git based server without any additional work.
>> 
>> I'm not sure if checking the User-Agent is a good idea, but this needs a
>> specialized client, and it seems like everyone else is doing it 
>> (3d48ae1aaa5e,
>> e7bb5fc4570c).  We can always back this off if it becomes a nuisance.  A web
>> browser will see "400: bad method", the same as it would before this change.
>> 
>> I'm not sure if None or 'pull' is the proper permission check, but the only
>> difference is whether or not `hgweb.allowpull` is checked.  Since nothing of
>> particular interest is transferred here, and the next phase handles the read 
>> or
>> write, treating this like web interface request seems fine.
>> 
>> [1] https://github.com/git-lfs/git-lfs/blob/master/docs/api/batch.md
>> 
>> diff --git a/mercurial/hgweb/hgweb_mod.py b/mercurial/hgweb/hgweb_mod.py
>> --- a/mercurial/hgweb/hgweb_mod.py
>> +++ b/mercurial/hgweb/hgweb_mod.py
>> @@ -90,6 +90,12 @@
>>  urlel = os.path.dirname(urlel)
>>  return reversed(breadcrumb)
>> 
>> +def _processlfsbatchreq(repo, req):
>> +"""A hook for the LFS extension to wrap that handles requests to the 
>> Batch
>> +API, and returns the appropriate JSON response.
>> +"""
>> +raise ErrorResponse(HTTP_NOT_FOUND)
>> +
>>  class requestcontext(object):
>>  """Holds state/context for an individual request.
>> 
>> @@ -376,6 +382,21 @@
>>  except ErrorResponse as inst:
>>  return protohandler['handleerror'](inst)
>> 
>> +# Route LFS Batch API requests to the appropriate handler
>> +
>> +if req.env.get(r'HTTP_USER_AGENT', r'').startswith(r'git-lfs/'):
> 
> I don't like filtering by the user-agent. It is recommended to not do this. 
> Unless it will cause problems or tons of code complexity to properly lock out 
> actual Git clients who may get confused by this, my vote is to remove it and 
> just rely on path filtering.

Ok, I can ditch it.  It was more about locking out web browsers and other 
things not POSTing a JSON request, without changing the current behavior they 
see.  But I guess there are other ways to remotely figure out the minimum 
server version.

>> +try:
>> +path = req.env.get(r'PATH_INFO')
>> +if path == '/.git/info/lfs/objects/batch':
> 
> Relying on PATH_INFO is super annoying. But our code kind of sucks :/
> 
> My only suggestion would be to define `parts` in the `else` block of the `if 
> r'PATH_INFO' in req.env:` above and check `parts == ['.git', 'info', 'lfs', 
> 'objects', 'batch']`. That's still a big ugly though.

OoC, what’s the point of this?  I saw it in the code above this, but it seems 
like string handling is easier than list-of-string handling.  This looks like 
an easy enough change though.

> Whatever happens, my guess is I will eventually write some patches to clean 
> the URL parsing code up.
>  
>> +self.check_perm(rctx, req, None)
> 
> Shouldn't this be `pull` instead of `None`? Otherwise, LFS won't honor web.* 
> to restrict data access.

I went back and forth, but landed on this because it doesn't seem like a pull, 
and still checks web.deny_read and web.allow_read.  It looks like None just 
doesn’t check web.allowpull.  This would be checked prior to the actual file 
transfer in the next step.

Is there a concept of a write-only repo?  Like a try server where you can only 
push stuff, but not pull all of the anonymous heads.  If not, then I think I 
can change it.

>> +return _processlfsbatchreq(rctx.repo, req)
>> +else:
>> +raise ErrorResponse(HTTP_NOT_FOUND)
>> +except ErrorResponse as inst:
>> +req.respond(inst, 'text/plain; charset=utf-8')
>> +# No body, since only lfs clients are allowed here
>> 

D2695: osutil: implement minimal __getitem__ compatibility on our custom listdir type

2018-03-08 Thread durin42 (Augie Fackler)
This revision was automatically updated to reflect the committed changes.
Closed by commit rHGf3c314020beb: osutil: implement minimal __getitem__ 
compatibility on our custom listdir type (authored by durin42, committed by ).

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D2695?vs=6678=6727

REVISION DETAIL
  https://phab.mercurial-scm.org/D2695

AFFECTED FILES
  mercurial/cext/osutil.c
  mercurial/policy.py

CHANGE DETAILS

diff --git a/mercurial/policy.py b/mercurial/policy.py
--- a/mercurial/policy.py
+++ b/mercurial/policy.py
@@ -69,7 +69,7 @@
 (r'cext', r'bdiff'): 3,
 (r'cext', r'diffhelpers'): 1,
 (r'cext', r'mpatch'): 1,
-(r'cext', r'osutil'): 3,
+(r'cext', r'osutil'): 4,
 (r'cext', r'parsers'): 4,
 }
 
diff --git a/mercurial/cext/osutil.c b/mercurial/cext/osutil.c
--- a/mercurial/cext/osutil.c
+++ b/mercurial/cext/osutil.c
@@ -121,6 +121,27 @@
o->ob_type->tp_free(o);
 }
 
+static PyObject *listdir_stat_getitem(PyObject *self, PyObject *key)
+{
+   long index = PyLong_AsLong(key);
+   if (index == -1 && PyErr_Occurred()) {
+   return NULL;
+   }
+   if (index != 8) {
+   PyErr_Format(PyExc_IndexError, "osutil.stat objects only "
+  "support stat.ST_MTIME in "
+  "__getitem__");
+   return NULL;
+   }
+   return listdir_stat_st_mtime(self, NULL);
+}
+
+static PyMappingMethods listdir_stat_type_mapping_methods = {
+   (lenfunc)NULL, /* mp_length */
+   (binaryfunc)listdir_stat_getitem,   /* mp_subscript */
+   (objobjargproc)NULL,/* mp_ass_subscript */
+};
+
 static PyTypeObject listdir_stat_type = {
PyVarObject_HEAD_INIT(NULL, 0) /* header */
"osutil.stat", /*tp_name*/
@@ -134,7 +155,7 @@
0, /*tp_repr*/
0, /*tp_as_number*/
0, /*tp_as_sequence*/
-   0, /*tp_as_mapping*/
+   _stat_type_mapping_methods, /*tp_as_mapping*/
0, /*tp_hash */
0, /*tp_call*/
0, /*tp_str*/
@@ -1352,7 +1373,7 @@
{NULL, NULL}
 };
 
-static const int version = 3;
+static const int version = 4;
 
 #ifdef IS_PY3K
 static struct PyModuleDef osutil_module = {



To: durin42, #hg-reviewers, indygreg, yuja
Cc: mharbison72, indygreg, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D2688: hgweb: adapt to socket._fileobject changes in Python 3

2018-03-08 Thread durin42 (Augie Fackler)
This revision was automatically updated to reflect the committed changes.
Closed by commit rHGbf9a04d78084: hgweb: adapt to socket._fileobject changes in 
Python 3 (authored by durin42, committed by ).

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D2688?vs=6673=6726

REVISION DETAIL
  https://phab.mercurial-scm.org/D2688

AFFECTED FILES
  mercurial/hgweb/server.py

CHANGE DETAILS

diff --git a/mercurial/hgweb/server.py b/mercurial/hgweb/server.py
--- a/mercurial/hgweb/server.py
+++ b/mercurial/hgweb/server.py
@@ -257,8 +257,8 @@
 
 def setup(self):
 self.connection = self.request
-self.rfile = socket._fileobject(self.request, "rb", self.rbufsize)
-self.wfile = socket._fileobject(self.request, "wb", self.wbufsize)
+self.rfile = self.request.makefile(r"rb", self.rbufsize)
+self.wfile = self.request.makefile(r"wb", self.wbufsize)
 
 try:
 import threading



To: durin42, #hg-reviewers, yuja
Cc: yuja, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D2696: cleanup: use stat_result[stat.ST_MTIME] instead of stat_result.st_mtime

2018-03-08 Thread yuja (Yuya Nishihara)
yuja added a comment.


  Queued, thanks.
  
  Maybe we'll need to update cffi code as a follow-up.

INLINE COMMENTS

> chgserver.py:552
>  return (stat.st_ino == self._socketstat.st_ino and
> -stat.st_mtime == self._socketstat.st_mtime)
> +stat.st_mtime == self._socketstat[stat.ST_MTIME])
>  except OSError:

Fixed name conflicts and st_mtime oversight.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2696

To: durin42, #hg-reviewers, yuja
Cc: indygreg, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH] Fix for Bug #5807

2018-03-08 Thread Sascha Nemecek

On 08/03/18 13:16, Yuya Nishihara wrote:

On Wed, 7 Mar 2018 22:48:01 +0100, Sascha Nemecek wrote:

Am 2018-03-02 um 05:14 schrieb Yuya Nishihara:

On Thu, 1 Mar 2018 11:06:59 +0100, Sascha Nemecek wrote:

# HG changeset patch
# User Sascha Nemecek 
# Date 1519831479 -3600
#  Wed Feb 28 16:24:39 2018 +0100
# Node ID 42ddf4ee4f91d76f19ca0c3efc4c8e4c1c6fa96c
# Parent  1bd132a021dd00f96604e33a8fb5306d37e56007
Don't close 'fp' (= 'ui.fout') stream to prevent 'ValueError: I/O
operation on closed file' (Bug #5807).

Regression of changeset 30261:6bed17ba00a1
(https://www.mercurial-scm.org/repo/hg/rev/6bed17ba00a1)

diff -r 1bd132a021dd -r 42ddf4ee4f91 hgext/convert/subversion.py
--- a/hgext/convert/subversion.py   Wed Feb 21 14:36:42 2018 +0530
+++ b/hgext/convert/subversion.py   Wed Feb 28 16:24:39 2018 +0100
@@ -149,7 +149,7 @@
   pickle.dump(str(inst), fp, protocol)
   else:
   pickle.dump(None, fp, protocol)
-fp.close()
+fp.flush()
   # With large history, cleanup process goes crazy and suddenly
   # consumes *huge* amount of memory. The output file being closed,
   # there is no need for clean termination.


I don't think fp.close() was the source of the problem. Here the process
_exit()s so no cleanup would be run.


Of course, the problem might lie deeper. Empirically tested, the
fp.close() triggered the reported error output. From my observation, the
fp.close() affected stdout of the convert function and also pdb.
Therefore I came to the conclusion that the fp.close() / ui.fout.close()
was the culprit.


If this patch magically fixes the issue, that's fine. It's also harmless
to do fflush() instead of fclose() here. I'm just saying that the process
exits immediately after the fp.close() with no cleanup, so I have no idea
why that matters.

   -fp.close()
   +fp.flush()
os._exit(0)


My guess/interpretation is, that this happens in a child process. When 
closing ui.fout, the parent and all other process run into closed doors.



Best regards,
Sascha
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH] xdiff: fix trivial build warnings on Windows

2018-03-08 Thread Matt Harbison

> On Mar 8, 2018, at 7:33 AM, Yuya Nishihara  wrote:
> 
>> On Tue, 6 Mar 2018 19:12:26 -0800, Jun Wu wrote:
>> Yeah, xdiff needs a migration from using "long", "int"s to "size_t" etc.
>> The git community has chosen to disallow diff >1GB files because of the
>> overflow concern [1].
>> 
>> [1]: 
>> https://github.com/git/git/commit/dcd1742e56ebb944c4ff62346da4548e1e3be675
> 
> So, should we queue this now or leave warnings to denote things that should
> be cleaned up?

I think the subsequent work that Jun did fixed a bunch of the warnings.  The 
only one that stood out last time I looked was the signed/unsigned comparison.  
This can be dropped if Jun is still working on it.  I didn’t realize he was.

___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH 2 of 2 V2] hgweb: add a hook for processing LFS file transfer requests

2018-03-08 Thread Matt Harbison

> On Mar 8, 2018, at 1:22 AM, Gregory Szorc  wrote:
> 
>> On Wed, Mar 7, 2018 at 8:49 PM, Matt Harbison  wrote:
>> # HG changeset patch
>> # User Matt Harbison 
>> # Date 1519275362 18000
>> #  Wed Feb 21 23:56:02 2018 -0500
>> # Node ID 5544688dec388a7c8a988c074aab659f059f549f
>> # Parent  89382cb20bb19e089513b2ce69ef8acfa1f523fd
>> hgweb: add a hook for processing LFS file transfer requests 
>> 
>> As part of this, the PUT request needs to be handled to upload files.  Unlike
>> the requests to the Batch API, this URI is internally controlled, and 
>> provided
>> to the client in the Batch API.  So without any interoperability concerns, 
>> the
>> URI starts with '/.hg', and reflects where the files are actually stored.
>> 
>> The permission check is deferred to the processing function, because this
>> request is used for both uploads and downloads.
>> 
>> diff --git a/mercurial/hgweb/hgweb_mod.py b/mercurial/hgweb/hgweb_mod.py
>> --- a/mercurial/hgweb/hgweb_mod.py
>> +++ b/mercurial/hgweb/hgweb_mod.py
>> @@ -96,6 +96,15 @@
>>  """
>>  raise ErrorResponse(HTTP_NOT_FOUND)
>> 
>> +def _processlfstransfer(repo, req):
>> +"""A hook for the LFS extension to wrap that handles file transfer 
>> requests.
>> +
>> +The caller MUST call ``req.checkperm()`` with 'push' or 'pull' after it
>> +determines whether this is an upload or a download, prior to accessing 
>> any
>> +repository data.
>> +"""
>> +raise ErrorResponse(HTTP_NOT_FOUND)
>> +
>>  class requestcontext(object):
>>  """Holds state/context for an individual request.
>> 
>> @@ -382,14 +391,20 @@
>>  except ErrorResponse as inst:
>>  return protohandler['handleerror'](inst)
>> 
>> -# Route LFS Batch API requests to the appropriate handler
>> +# Route LFS Batch API and transfer requests to the appropriate 
>> handler
>> 
>>  if req.env.get(r'HTTP_USER_AGENT', r'').startswith(r'git-lfs/'):
>>  try:
>> -path = req.env.get(r'PATH_INFO')
>> +path = req.env.get(r'PATH_INFO', '')
>>  if path == '/.git/info/lfs/objects/batch':
>>  self.check_perm(rctx, req, None)
>>  return _processlfsbatchreq(rctx.repo, req)
>> +elif path.startswith('/.hg/store/lfs/objects'):
> 
> This scares me a bit because we're leaking internal storage paths into the 
> URI space. Must we do this? What's relying on this behavior?

I don’t think anything relies on this.  It’s just a matter of picking a URI 
that can never be a tracked item, path to a subrepo, path in a webconf file, or 
anything in the wire protocol domain.  Starting with '.hg' seemed safest from 
that POV.

I’m open to suggestions.  We could keep the .git prefix, but that seems weird. 
I only did that in the first patch for compatibility with git servers, and auto 
detecting the URL.

>> +# NB: This function is responsible for doing the 
>> appropriate
>> +# permission checks after determining if this is an 
>> upload
>> +# or a download.
>> +req.checkperm = lambda op: self.check_perm(rctx, req, 
>> op)
>> +return _processlfstransfer(rctx.repo, req)
>>  else:
>>  raise ErrorResponse(HTTP_NOT_FOUND)
>>  except ErrorResponse as inst:
>> diff --git a/mercurial/hgweb/server.py b/mercurial/hgweb/server.py
>> --- a/mercurial/hgweb/server.py
>> +++ b/mercurial/hgweb/server.py
>> @@ -111,6 +111,9 @@
>>  self.log_error(r"Exception happened during processing "
>> r"request '%s':%s%s", self.path, newline, tb)
>> 
>> +def do_PUT(self):
>> +self.do_POST()
>> +
>>  def do_GET(self):
>>  self.do_POST()
>> 
>> ___
>> Mercurial-devel mailing list
>> Mercurial-devel@mercurial-scm.org
>> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
> 
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH] xdiff: fix trivial build warnings on Windows

2018-03-08 Thread Yuya Nishihara
On Tue, 6 Mar 2018 19:12:26 -0800, Jun Wu wrote:
> Yeah, xdiff needs a migration from using "long", "int"s to "size_t" etc.
> The git community has chosen to disallow diff >1GB files because of the
> overflow concern [1].
> 
> [1]: 
> https://github.com/git/git/commit/dcd1742e56ebb944c4ff62346da4548e1e3be675

So, should we queue this now or leave warnings to denote things that should
be cleaned up?
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH] Fix for Bug #5807

2018-03-08 Thread Yuya Nishihara
On Wed, 7 Mar 2018 22:48:01 +0100, Sascha Nemecek wrote:
> Am 2018-03-02 um 05:14 schrieb Yuya Nishihara:
> > On Thu, 1 Mar 2018 11:06:59 +0100, Sascha Nemecek wrote:
> >> # HG changeset patch
> >> # User Sascha Nemecek 
> >> # Date 1519831479 -3600
> >> #  Wed Feb 28 16:24:39 2018 +0100
> >> # Node ID 42ddf4ee4f91d76f19ca0c3efc4c8e4c1c6fa96c
> >> # Parent  1bd132a021dd00f96604e33a8fb5306d37e56007
> >> Don't close 'fp' (= 'ui.fout') stream to prevent 'ValueError: I/O 
> >> operation on closed file' (Bug #5807).
> >>
> >> Regression of changeset 30261:6bed17ba00a1 
> >> (https://www.mercurial-scm.org/repo/hg/rev/6bed17ba00a1)
> >>
> >> diff -r 1bd132a021dd -r 42ddf4ee4f91 hgext/convert/subversion.py
> >> --- a/hgext/convert/subversion.py  Wed Feb 21 14:36:42 2018 +0530
> >> +++ b/hgext/convert/subversion.py  Wed Feb 28 16:24:39 2018 +0100
> >> @@ -149,7 +149,7 @@
> >>   pickle.dump(str(inst), fp, protocol)
> >>   else:
> >>   pickle.dump(None, fp, protocol)
> >> -fp.close()
> >> +fp.flush()
> >>   # With large history, cleanup process goes crazy and suddenly
> >>   # consumes *huge* amount of memory. The output file being closed,
> >>   # there is no need for clean termination.
> > 
> > I don't think fp.close() was the source of the problem. Here the process
> > _exit()s so no cleanup would be run.
> 
> Of course, the problem might lie deeper. Empirically tested, the
> fp.close() triggered the reported error output. From my observation, the
> fp.close() affected stdout of the convert function and also pdb.
> Therefore I came to the conclusion that the fp.close() / ui.fout.close()
> was the culprit.

If this patch magically fixes the issue, that's fine. It's also harmless
to do fflush() instead of fclose() here. I'm just saying that the process
exits immediately after the fp.close() with no cleanup, so I have no idea
why that matters.

  -fp.close()
  +fp.flush()
   os._exit(0)
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D2722: url: add HTTP handler that uses a proxied socket

2018-03-08 Thread Jeffreydoomb (JeffreydoombYW)
Jeffreydoomb added a comment.


  Цветочницы http://workle.website/5-

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2722

To: indygreg, #hg-reviewers
Cc: Jeffreydoomb, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel