Re: [PATCH] forget: add --dry-run mode

2018-03-09 Thread Anton Shestakov
On Sat, 10 Mar 2018 15:42:04 +0800
Anton Shestakov  wrote:

> On Sat, 10 Mar 2018 12:46:41 +0530
> Sushil khanchi  wrote:
> 
> > # HG changeset patch
> > # User Sushil khanchi 
> > # Date 1520665399 -19800
> > #  Sat Mar 10 12:33:19 2018 +0530
> > # Node ID 430c6b5123ee72d3a209882495302e43b26cc988
> > # Parent  4c71a26a4009d88590c9ae3d64a5912fd556d82e
> > forget: add --dry-run mode
> > 
> > diff -r 4c71a26a4009 -r 430c6b5123ee mercurial/cmdutil.py
> > --- a/mercurial/cmdutil.py  Sun Mar 04 21:16:36 2018 -0500
> > +++ b/mercurial/cmdutil.py  Sat Mar 10 12:33:19 2018 +0530
> > @@ -1996,7 +1996,7 @@
> >  for subpath in ctx.substate:
> >  ctx.sub(subpath).addwebdirpath(serverpath, webconf)
> >  
> > -def forget(ui, repo, match, prefix, explicitonly):
> > +def forget(ui, repo, match, prefix, explicitonly, **opts):
> >  join = lambda f: os.path.join(prefix, f)
> >  bad = []
> >  badfn = lambda x, y: bad.append(x) or match.bad(x, y)
> > @@ -2039,9 +2039,10 @@
> >  if ui.verbose or not match.exact(f):
> >  ui.status(_('removing %s\n') % match.rel(f))
> >  
> > -rejected = wctx.forget(forget, prefix)
> > -bad.extend(f for f in rejected if f in match.files())
> > -forgot.extend(f for f in forget if f not in rejected)
> > +if not opts.get('dry_run'):
> 
> You want to add r to the string, similar to other code in cmdutil that
> handles opts.

Looks like only add() is doing this so far:
https://www.mercurial-scm.org/repo/hg/rev/a77e61b45384
But it still looks like you want r'' here.
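
For context, here is a minimal Python 3 sketch of why the prefix matters. It assumes, as I understand the py3 port, that unprefixed string literals in Mercurial source end up as bytes, while the keys of **opts are always native str. The forget() below is a hypothetical stand-in, not the cmdutil function.

```python
# Hypothetical stand-in for a function that receives **opts;
# this is not cmdutil.forget().
def forget(**opts):
    return opts

opts = forget(dry_run=True)

# Keyword-argument names are native str on Python 3, so a str key matches...
native_lookup = opts.get('dry_run')
# ...while a bytes key (what an unprefixed literal is assumed to become) misses.
bytes_lookup = opts.get(b'dry_run')

print(native_lookup, bytes_lookup)
```

With a bytes key the lookup silently returns None, so --dry-run would appear to be unset rather than raise an error.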
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


[PATCH] ui: remove any combinations of CR|LF from prompt response

2018-03-09 Thread Yuya Nishihara
# HG changeset patch
# User Yuya Nishihara 
# Date 1520664609 -32400
#  Sat Mar 10 15:50:09 2018 +0900
# Node ID c38b2b364df79a9defc3520f19207ce47abcc7d8
# Parent  9ddc9aa26801bac571bd3413a8aed900c2d2efb8
ui: remove any combinations of CR|LF from prompt response

On Windows, we have to accept both CR+LF and LF. Instead of doing stricter
parsing, this patch simply strips any trailing CRs and LFs from user input,
since the input must be readable text.

diff --git a/mercurial/ui.py b/mercurial/ui.py
--- a/mercurial/ui.py
+++ b/mercurial/ui.py
@@ -1296,8 +1296,7 @@ class ui(object):
 line = self.fin.readline()
 if not line:
 raise EOFError
-if line.endswith(pycompat.oslinesep):
-line = line[:-len(pycompat.oslinesep)]
+line = line.rstrip(pycompat.oslinesep)
 
 return line
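
The difference between the two approaches can be sketched in isolation. The helper names strip_exact() and strip_any() are made up for this illustration, and the separator is passed in explicitly rather than read from pycompat.oslinesep, whose value depends on the platform.

```python
def strip_exact(line, sep):
    """Old behaviour: drop one trailing separator, only if it matches exactly."""
    if line.endswith(sep):
        line = line[:-len(sep)]
    return line

def strip_any(line, chars):
    """New behaviour: drop every trailing byte that appears in chars."""
    return line.rstrip(chars)

crlf = b'\r\n'

# LF-terminated input while CR+LF is expected: the exact match fails.
old_result = strip_exact(b'u\n', crlf)
new_result = strip_any(b'u\n', crlf)

# rstrip() treats its argument as a set of bytes, not as a suffix.
mixed = strip_any(b'u\r\r\n', crlf)

# If the set only contains LF, a trailing CR is left in place.
lf_only = strip_any(b'u\r', b'\n')
```

Because rstrip() works on a set of bytes, what gets removed depends entirely on which bytes are in the argument.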
 


Re: [PATCH 1 of 8] hgk: stop using util.bytesinput() to read a single line from stdin

2018-03-09 Thread Yuya Nishihara
On Fri, 9 Mar 2018 09:25:10 -0800, Gregory Szorc wrote:
> If you have this stdio stuff paged into your brain, you may want to look at
> hook.py and what it is doing with stdio. Essentially, it is doing dup() and
> dup2() to temporarily redirect stdout to stderr such that the wire protocol
> can intercept output and forward it to the wire protocol client (or the
> CLI).

and so that hook output is never interleaved into the wire protocol channel.

> See hook.redirect(True) in the wire protocol server code and follow
> the trail from there.
> 
> In the case of shell hooks, I believe those processes inherit the parent's
> file descriptors. Which after hook.py mucks with the file descriptors, is
> actually the stderr stream.

I think these dup()s are no longer needed for shell hooks, since subprocess
output is sent to ui.fout, which is actually ui.ferr in an ssh session. They're
still valid for in-process hooks, though.

> I question the appropriateness of the approach. I think it would be better
> to e.g. send ui.ferr to shell hooks and to temporarily muck with
> sys.stdout/sys.stderr when running Python hooks. But this has implications
> and I haven't thought it through. I'd *really* like to see us not have to
> do the dup()/dup2() dance in the hooks because that is mucking with global
> state and can make it hard to debug server processes.

Yeah. I doubt the current code would work well in threaded hgweb.

I think we can do a trick similar to commandserver, which basically nullifies
the global sys.stdin/stdout to protect IPC channels from being corrupted by
third-party extensions, and use ui.fin/fout throughout.

https://www.mercurial-scm.org/repo/hg/file/4.5.2/mercurial/commandserver.py#l334

Regarding the wire protocol and hooks, this means:

 - do dup()/dup2() business globally, not locally in hook.py
 - isolate wire-protocol streams from ui.fin/fout
   (ssh-wire: dup()ed stdin/stdout, ui.fin -> stdin -> null, stdout -> null,
ui.fout -> ui.ferr -> stderr)
 - in-process hooks must use ui instead of print() because print() output will
   be sent to /dev/null
 - shell hook output is redirected to stderr as long as ui.fout points to
   stderr (the stderr will be dup()ed to stdout after fork)
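
The dup()/dup2() mechanics behind this list can be sketched with plain pipes. This is only an illustration under assumptions: the two pipes stand in for the protocol channel and for stderr, and redirect()/restore() are hypothetical helpers, not functions from hook.py or commandserver.py.

```python
import os

def redirect(src_fd, dst_fd):
    """Point dst_fd at whatever src_fd refers to; return a saved copy of dst_fd."""
    saved = os.dup(dst_fd)
    os.dup2(src_fd, dst_fd)
    return saved

def restore(saved, dst_fd):
    """Undo redirect(): point dst_fd back at its original target."""
    os.dup2(saved, dst_fd)
    os.close(saved)

# Stand-ins: "out" plays the wire-protocol channel, "err" plays stderr.
out_r, out_w = os.pipe()
err_r, err_w = os.pipe()

saved = redirect(err_w, out_w)
os.write(out_w, b'hook output\n')      # lands in the "stderr" pipe
restore(saved, out_w)
os.write(out_w, b'protocol data\n')    # lands in the protocol pipe again

hook_seen = os.read(err_r, 100)
proto_seen = os.read(out_r, 100)
```

While the redirect is active, anything written to the channel's descriptor ends up on the stderr stand-in, which is why the state is global to the process and awkward in threaded servers.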


Re: [PATCH 6 of 8] ui: do not use rawinput() when we have to replace sys.stdin/stdout

2018-03-09 Thread Yuya Nishihara
On Fri, 09 Mar 2018 23:52:16 -0500, Matt Harbison wrote:
> On Fri, 09 Mar 2018 07:35:39 -0500, Yuya Nishihara  wrote:
> 
> > # HG changeset patch
> > # User Yuya Nishihara 
> > # Date 1520325533 21600
> > #  Tue Mar 06 02:38:53 2018 -0600
> > # Node ID ad7ff97565b261d82952acc9f941e5dd99f11374
> > # Parent  63a13b91e1ab4d9fa0a713935be58794b9cadab5
> > ui: do not use rawinput() when we have to replace sys.stdin/stdout
> 
> Windows really doesn't like this[1].  The simplest example out of that  
> might be:
> 
> --- c:/Users/Matt/projects/hg/tests/test-merge-tools.t
> +++ c:/Users/Matt/projects/hg/tests/test-merge-tools.t.err
> @@ -556,6 +556,9 @@
> > u
> > EOF
> keep (l)ocal [working copy], take (o)ther [merge rev], or leave (u)nresolved for f? u
> +
> +  unrecognized response
> +  keep (l)ocal [working copy], take (o)ther [merge rev], or leave (u)nresolved for f?

Good catch. It appears that the new code is too strict about line endings.


Re: [PATCH] forget: add --dry-run mode

2018-03-09 Thread Anton Shestakov
On Sat, 10 Mar 2018 12:46:41 +0530
Sushil khanchi  wrote:

> # HG changeset patch
> # User Sushil khanchi 
> # Date 1520665399 -19800
> #  Sat Mar 10 12:33:19 2018 +0530
> # Node ID 430c6b5123ee72d3a209882495302e43b26cc988
> # Parent  4c71a26a4009d88590c9ae3d64a5912fd556d82e
> forget: add --dry-run mode
> 
> diff -r 4c71a26a4009 -r 430c6b5123ee mercurial/cmdutil.py
> --- a/mercurial/cmdutil.py  Sun Mar 04 21:16:36 2018 -0500
> +++ b/mercurial/cmdutil.py  Sat Mar 10 12:33:19 2018 +0530
> @@ -1996,7 +1996,7 @@
>  for subpath in ctx.substate:
>  ctx.sub(subpath).addwebdirpath(serverpath, webconf)
>  
> -def forget(ui, repo, match, prefix, explicitonly):
> +def forget(ui, repo, match, prefix, explicitonly, **opts):
>  join = lambda f: os.path.join(prefix, f)
>  bad = []
>  badfn = lambda x, y: bad.append(x) or match.bad(x, y)
> @@ -2039,9 +2039,10 @@
>  if ui.verbose or not match.exact(f):
>  ui.status(_('removing %s\n') % match.rel(f))
>  
> -rejected = wctx.forget(forget, prefix)
> -bad.extend(f for f in rejected if f in match.files())
> -forgot.extend(f for f in forget if f not in rejected)
> +if not opts.get('dry_run'):

You want to add r to the string, similar to other code in cmdutil that
handles opts. It's a python3-compatibility thing.

(Maybe https://www.mercurial-scm.org/wiki/Python3 needs to be expanded
to mention it explicitly)

> +rejected = wctx.forget(forget, prefix)
> +bad.extend(f for f in rejected if f in match.files())
> +forgot.extend(f for f in forget if f not in rejected)

Does this mean that `bad` is not .extend()ed with --dry-run? It's
checked in commands.forget() to determine exit code. Do we want to have
the same exit code with or without --dry-run? I think we do.

>  return bad, forgot
>  
>  def files(ui, ctx, m, fm, fmt, subrepos):
> diff -r 4c71a26a4009 -r 430c6b5123ee mercurial/commands.py
> --- a/mercurial/commands.py   Sun Mar 04 21:16:36 2018 -0500
> +++ b/mercurial/commands.py   Sat Mar 10 12:33:19 2018 +0530
> @@ -2036,7 +2036,10 @@
>  with ui.formatter('files', opts) as fm:
>  return cmdutil.files(ui, ctx, m, fm, fmt, opts.get('subrepos'))
>  
> -@command('^forget', walkopts, _('[OPTION]... FILE...'), inferrepo=True)
> +@command(
> +'^forget',
> +[('', 'dry-run', None, _('only print output'))]
> ++ walkopts, _('[OPTION]... FILE...'), inferrepo=True)

Having _(...) and inferrepo on separate lines would be more consistent
with other commands.

>  def forget(ui, repo, *pats, **opts):
>  """forget the specified files on the next commit
>  
> @@ -2071,7 +2074,7 @@
>  raise error.Abort(_('no files specified'))
>  
>  m = scmutil.match(repo[None], pats, opts)
> -rejected = cmdutil.forget(ui, repo, m, prefix="", explicitonly=False)[0]
> +rejected = cmdutil.forget(ui, repo, m, prefix="", explicitonly=False, **opts)[0]
>  return rejected and 1 or 0
>  
>  @command(



D2588: commit: adds multiline commit message support(issue5616)

2018-03-09 Thread pulkit (Pulkit Goyal)
pulkit added a comment.


  In https://phab.mercurial-scm.org/D2588#44645, @sangeet259 wrote:
  
  > The current code just overwrites the message each time with the newer one. What can be done to avoid losing the earlier message values?
  
  
  You should look at how we handle multiple `--rev` flags. That will help you in this case.
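
  As a rough illustration of the accumulate-instead-of-overwrite idea, here is a sketch using the standard library's argparse rather than Mercurial's own option parser. Joining the values with a blank line is an assumption made for the example, not something decided in this review.

```python
import argparse

# Stand-in parser; Mercurial's real option handling is not used here.
parser = argparse.ArgumentParser(prog='commit-sketch')
# action='append' collects every occurrence instead of keeping only the last.
parser.add_argument('-m', '--message', action='append', default=[])

args = parser.parse_args(['-m', 'first line', '-m', 'more detail'])

# Assumed policy for this sketch: separate the pieces with a blank line.
combined = '\n\n'.join(args.message)
print(combined)
```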

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2588

To: sangeet259, #hg-reviewers
Cc: durin42, tom.prince, yuja, pulkit, jeffpc, mercurial-devel


[PATCH] forget: add --dry-run mode

2018-03-09 Thread Sushil khanchi
# HG changeset patch
# User Sushil khanchi 
# Date 1520665399 -19800
#  Sat Mar 10 12:33:19 2018 +0530
# Node ID 430c6b5123ee72d3a209882495302e43b26cc988
# Parent  4c71a26a4009d88590c9ae3d64a5912fd556d82e
forget: add --dry-run mode

diff -r 4c71a26a4009 -r 430c6b5123ee mercurial/cmdutil.py
--- a/mercurial/cmdutil.py  Sun Mar 04 21:16:36 2018 -0500
+++ b/mercurial/cmdutil.py  Sat Mar 10 12:33:19 2018 +0530
@@ -1996,7 +1996,7 @@
 for subpath in ctx.substate:
 ctx.sub(subpath).addwebdirpath(serverpath, webconf)
 
-def forget(ui, repo, match, prefix, explicitonly):
+def forget(ui, repo, match, prefix, explicitonly, **opts):
 join = lambda f: os.path.join(prefix, f)
 bad = []
 badfn = lambda x, y: bad.append(x) or match.bad(x, y)
@@ -2039,9 +2039,10 @@
 if ui.verbose or not match.exact(f):
 ui.status(_('removing %s\n') % match.rel(f))
 
-rejected = wctx.forget(forget, prefix)
-bad.extend(f for f in rejected if f in match.files())
-forgot.extend(f for f in forget if f not in rejected)
+if not opts.get('dry_run'):
+rejected = wctx.forget(forget, prefix)
+bad.extend(f for f in rejected if f in match.files())
+forgot.extend(f for f in forget if f not in rejected)
 return bad, forgot
 
 def files(ui, ctx, m, fm, fmt, subrepos):
diff -r 4c71a26a4009 -r 430c6b5123ee mercurial/commands.py
--- a/mercurial/commands.py Sun Mar 04 21:16:36 2018 -0500
+++ b/mercurial/commands.py Sat Mar 10 12:33:19 2018 +0530
@@ -2036,7 +2036,10 @@
 with ui.formatter('files', opts) as fm:
 return cmdutil.files(ui, ctx, m, fm, fmt, opts.get('subrepos'))
 
-@command('^forget', walkopts, _('[OPTION]... FILE...'), inferrepo=True)
+@command(
+'^forget',
+[('', 'dry-run', None, _('only print output'))]
++ walkopts, _('[OPTION]... FILE...'), inferrepo=True)
 def forget(ui, repo, *pats, **opts):
 """forget the specified files on the next commit
 
@@ -2071,7 +2074,7 @@
 raise error.Abort(_('no files specified'))
 
 m = scmutil.match(repo[None], pats, opts)
-rejected = cmdutil.forget(ui, repo, m, prefix="", explicitonly=False)[0]
+rejected = cmdutil.forget(ui, repo, m, prefix="", explicitonly=False, **opts)[0]
 return rejected and 1 or 0
 
 @command(


D2774: hgweb: remove support for retrieving parameters from POST form data

2018-03-09 Thread indygreg (Gregory Szorc)
indygreg created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  Previously, we called out to cgi.parse(), which for POST requests
  parsed multipart/form-data and application/x-www-form-urlencoded
  Content-Type requests for form data, combined it with query string
  parameters, and returned a union of the values.
  
  As far as I know, nothing in Mercurial actually uses this mechanism
  to submit data to the HTTP server. The wire protocol has its own
  mechanism for passing parameters. And the web interface only does
  GET requests. Removing support for parsing POST data doesn't break
  any tests.
  
  Another reason to not like this feature is that cgi.parse() may
  modify the QUERY_STRING environment variable as a side-effect.
  In addition, it merges both POST data and the query string into
  one data structure. This prevents consumers from knowing whether
  a variable came from the query string or POST data. That can matter
  for some operations.
  
  I suspect we use cgi.parse() because back when this code was
  initially implemented, it was the function that was readily
  available. In other words, I don't think there was a conscious
  choice to support POST data: we just got it because cgi.parse()
  supported it.
  
  Since nothing uses the feature and it is untested, let's remove
  support for parsing POST form data. We can add it back in easily
  enough if we need it in the future.
  
  .. bc::
  
Hgweb no longer reads form data in POST requests from
multipart/form-data and application/x-www-form-urlencoded
requests. Arguments should be specified as URL path components
or in the query string in the URL instead.
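
  To illustrate the behavior described above, here is a minimal sketch that reads parameters from the query string only and never touches the request body. It uses the standard library's parse_qs(). The helper name parse_query() and the sample environ are made up for the example and do not reflect how parserequestfromenv() is implemented.

```python
from urllib.parse import parse_qs

def parse_query(environ):
    """Return query string parameters as a dict of lists.

    Only QUERY_STRING is consulted; any POST body is ignored.
    """
    qs = environ.get('QUERY_STRING', '')
    # keep_blank_values mirrors the keep_blank_values=1 seen in the old call.
    return parse_qs(qs, keep_blank_values=True)

# Hypothetical WSGI environ for a POST request that also has a query string.
environ = {
    'REQUEST_METHOD': 'POST',
    'QUERY_STRING': 'cmd=capabilities&style=',
    'CONTENT_TYPE': 'application/x-www-form-urlencoded',
}

params = parse_query(environ)
print(params)
```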

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2774

AFFECTED FILES
  mercurial/hgweb/request.py

CHANGE DETAILS

diff --git a/mercurial/hgweb/request.py b/mercurial/hgweb/request.py
--- a/mercurial/hgweb/request.py
+++ b/mercurial/hgweb/request.py
@@ -8,7 +8,6 @@
 
 from __future__ import absolute_import
 
-import cgi
 import errno
 import socket
 import wsgiref.headers as wsgiheaders
@@ -258,15 +257,12 @@
 self.multiprocess = wsgienv[r'wsgi.multiprocess']
 self.run_once = wsgienv[r'wsgi.run_once']
 self.env = wsgienv
-self.form = normalize(cgi.parse(inp,
-self.env,
-keep_blank_values=1))
+self.req = parserequestfromenv(wsgienv, inp)
+self.form = normalize(self.req.querystringdict)
 self._start_response = start_response
 self.server_write = None
 self.headers = []
 
-self.req = parserequestfromenv(wsgienv, inp)
-
 def respond(self, status, type, filename=None, body=None):
 if not isinstance(type, str):
 type = pycompat.sysstr(type)



To: indygreg, #hg-reviewers
Cc: mercurial-devel


D2775: hgweb: create dedicated type for WSGI responses

2018-03-09 Thread indygreg (Gregory Szorc)
indygreg created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  We have refactored the request side of WSGI processing into a dedicated
  type. Now let's do the same thing for the response side.
  
  We invent a ``wsgiresponse`` type. It takes an instance of a
  request (for consultation) and the WSGI application's "start_response"
  handler.
  
  The type basically allows setting the HTTP status line, response
  headers, and the response body.
  
  The WSGI application calls sendresponse() to start sending output.
  Output is emitted as a generator to be fed through the WSGI application.
  According to PEP-, this is the preferred way for output to be
  transmitted. (Our legacy ``wsgirequest`` exposed a write() to send
  data. We do not wish to support this API because it isn't recommended
  by PEP-.)
  
  The wire protocol code has been ported to use the new API.
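
  The shape of such a response type can be sketched as follows. This is a simplified illustration, not the class added by this revision: the name sketchresponse is hypothetical, headers are kept in a plain dict, and only the attributes mentioned in the summary (status, headers, a bytes body, sendresponse()) are modeled.

```python
class sketchresponse(object):
    """Hypothetical, simplified stand-in for a WSGI response object."""

    def __init__(self, req, startresponse):
        self._req = req
        self._startresponse = startresponse
        self.status = None
        self.headers = {}
        self._body = None

    def setbodybytes(self, b):
        """Use a fixed bytes value as the response body."""
        self._body = b
        self.headers['Content-Length'] = '%d' % len(b)

    def sendresponse(self):
        """Generator that starts the response, then yields the body."""
        self._startresponse(self.status, list(self.headers.items()))
        if self._body:
            yield self._body

def app(environ, start_response):
    res = sketchresponse(environ, start_response)
    res.status = '200 OK'
    res.headers['Content-Type'] = 'text/plain'
    res.setbodybytes(b'hello\n')
    # The WSGI application returns the generator instead of calling write().
    return res.sendresponse()

started = []

def start_response(status, headers):
    started.append((status, headers))

body = b''.join(app({}, start_response))
```

  Since the body is produced by a generator, nothing is sent until the server iterates the return value of the application.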

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2775

AFFECTED FILES
  mercurial/hgweb/hgweb_mod.py
  mercurial/hgweb/request.py
  mercurial/wireprotoserver.py

CHANGE DETAILS

diff --git a/mercurial/wireprotoserver.py b/mercurial/wireprotoserver.py
--- a/mercurial/wireprotoserver.py
+++ b/mercurial/wireprotoserver.py
@@ -149,18 +149,18 @@
 def iscmd(cmd):
 return cmd in wireproto.commands
 
-def handlewsgirequest(rctx, wsgireq, req, checkperm):
+def handlewsgirequest(rctx, wsgireq, req, res, checkperm):
 """Possibly process a wire protocol request.
 
 If the current request is a wire protocol request, the request is
 processed by this function.
 
 ``wsgireq`` is a ``wsgirequest`` instance.
 ``req`` is a ``parsedrequest`` instance.
+``res`` is a ``wsgiresponse`` instance.
 
-Returns a 2-tuple of (bool, response) where the 1st element indicates
-whether the request was handled and the 2nd element is a return
-value for a WSGI application (often a generator of bytes).
+Returns a bool indicating if the request was serviced. If set, the caller
+should stop processing the request, as a response has already been issued.
 """
 # Avoid cycle involving hg module.
 from .hgweb import common as hgwebcommon
@@ -171,7 +171,7 @@
 # string parameter. If it isn't present, this isn't a wire protocol
 # request.
 if 'cmd' not in req.querystringdict:
-return False, None
+return False
 
 cmd = req.querystringdict['cmd'][0]
 
@@ -183,18 +183,19 @@
 # known wire protocol commands and it is less confusing for machine
 # clients.
 if not iscmd(cmd):
-return False, None
+return False
 
 # The "cmd" query string argument is only valid on the root path of the
 # repo. e.g. ``/?cmd=foo``, ``/repo?cmd=foo``. URL paths within the repo
 # like ``/blah?cmd=foo`` are not allowed. So don't recognize the request
 # in this case. We send an HTTP 404 for backwards compatibility reasons.
 if req.dispatchpath:
-res = _handlehttperror(
-hgwebcommon.ErrorResponse(hgwebcommon.HTTP_NOT_FOUND), wsgireq,
-req)
-
-return True, res
+res.status = hgwebcommon.statusmessage(404)
+res.headers['Content-Type'] = HGTYPE
+# TODO This is not a good response to issue for this request. This
+# is mostly for BC for now.
+res.setbodybytes('0\n%s\n' % b'commands not available at this URL')
+return True
 
 proto = httpv1protocolhandler(wsgireq, req, repo.ui,
   lambda perm: checkperm(rctx, wsgireq, perm))
@@ -204,11 +205,16 @@
 # exception here. So consider refactoring into a exception type that
 # is associated with the wire protocol.
 try:
-res = _callhttp(repo, wsgireq, req, proto, cmd)
+_callhttp(repo, wsgireq, req, res, proto, cmd)
 except hgwebcommon.ErrorResponse as e:
-res = _handlehttperror(e, wsgireq, req)
+for k, v in e.headers:
+res.headers[k] = v
+res.status = hgwebcommon.statusmessage(e.code, pycompat.bytestr(e))
+# TODO This response body assumes the failed command was
+# "unbundle." That assumption is not always valid.
+res.setbodybytes('0\n%s\n' % pycompat.bytestr(e))
 
-return True, res
+return True
 
 def _httpresponsetype(ui, req, prefer_uncompressed):
 """Determine the appropriate response type and compression settings.
@@ -250,7 +256,10 @@
 opts = {'level': ui.configint('server', 'zliblevel')}
 return HGTYPE, util.compengines['zlib'], opts
 
-def _callhttp(repo, wsgireq, req, proto, cmd):
+def _callhttp(repo, wsgireq, req, res, proto, cmd):
+# Avoid cycle involving hg module.
+from .hgweb import common as hgwebcommon
+
 def genversion2(gen, engine, engineopts):
 # application/mercurial-0.2 always sends a payload header
 # identifying the compression engine.
@@ -262,26 +271,35 @@
   

D2773: hgweb: remove support for short query string based aliases (BC)

2018-03-09 Thread indygreg (Gregory Szorc)
indygreg updated this revision to Diff 6820.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D2773?vs=6818&id=6820

REVISION DETAIL
  https://phab.mercurial-scm.org/D2773

AFFECTED FILES
  mercurial/hgweb/request.py
  tests/test-hgweb-raw.t

CHANGE DETAILS

diff --git a/tests/test-hgweb-raw.t b/tests/test-hgweb-raw.t
--- a/tests/test-hgweb-raw.t
+++ b/tests/test-hgweb-raw.t
@@ -17,7 +17,7 @@
   $ hg serve -p $HGPORT -A access.log -E error.log -d --pid-file=hg.pid
 
   $ cat hg.pid >> $DAEMON_PIDS
-  $ (get-with-headers.py localhost:$HGPORT '?f=bf0ff59095c9;file=sub/some%20text%25.txt;style=raw' content-type content-length content-disposition) >getoutput.txt
+  $ (get-with-headers.py localhost:$HGPORT 'raw-file/bf0ff59095c9/sub/some%20text%25.txt' content-type content-length content-disposition) >getoutput.txt
 
   $ killdaemons.py hg.pid
 
@@ -32,14 +32,14 @@
   It is very boring to read, but computers don't
   care about things like that.
   $ cat access.log error.log
-  $LOCALIP - - [*] "GET /?f=bf0ff59095c9;file=sub/some%20text%25.txt;style=raw HTTP/1.1" 200 - (glob)
+  $LOCALIP - - [$LOGDATE$] "GET /raw-file/bf0ff59095c9/sub/some%20text%25.txt HTTP/1.1" 200 - (glob)
 
   $ rm access.log error.log
   $ hg serve -p $HGPORT -A access.log -E error.log -d --pid-file=hg.pid \
   > --config web.guessmime=True
 
   $ cat hg.pid >> $DAEMON_PIDS
-  $ (get-with-headers.py localhost:$HGPORT '?f=bf0ff59095c9;file=sub/some%20text%25.txt;style=raw' content-type content-length content-disposition) >getoutput.txt
+  $ (get-with-headers.py localhost:$HGPORT 'raw-file/bf0ff59095c9/sub/some%20text%25.txt' content-type content-length content-disposition) >getoutput.txt
   $ killdaemons.py hg.pid
 
   $ cat getoutput.txt
@@ -53,6 +53,6 @@
   It is very boring to read, but computers don't
   care about things like that.
   $ cat access.log error.log
-  $LOCALIP - - [*] "GET /?f=bf0ff59095c9;file=sub/some%20text%25.txt;style=raw HTTP/1.1" 200 - (glob)
+  $LOCALIP - - [$LOGDATE$] "GET /raw-file/bf0ff59095c9/sub/some%20text%25.txt HTTP/1.1" 200 - (glob)
 
   $ cd ..
diff --git a/mercurial/hgweb/request.py b/mercurial/hgweb/request.py
--- a/mercurial/hgweb/request.py
+++ b/mercurial/hgweb/request.py
@@ -27,37 +27,6 @@
 util,
 )
 
-shortcuts = {
-'cl': [('cmd', ['changelog']), ('rev', None)],
-'sl': [('cmd', ['shortlog']), ('rev', None)],
-'cs': [('cmd', ['changeset']), ('node', None)],
-'f': [('cmd', ['file']), ('filenode', None)],
-'fl': [('cmd', ['filelog']), ('filenode', None)],
-'fd': [('cmd', ['filediff']), ('node', None)],
-'fa': [('cmd', ['annotate']), ('filenode', None)],
-'mf': [('cmd', ['manifest']), ('manifest', None)],
-'ca': [('cmd', ['archive']), ('node', None)],
-'tags': [('cmd', ['tags'])],
-'tip': [('cmd', ['changeset']), ('node', ['tip'])],
-'static': [('cmd', ['static']), ('file', None)]
-}
-
-def normalize(form):
-# first expand the shortcuts
-for k in shortcuts:
-if k in form:
-for name, value in shortcuts[k]:
-if value is None:
-value = form[k]
-form[name] = value
-del form[k]
-# And strip the values
-bytesform = {}
-for k, v in form.iteritems():
-bytesform[pycompat.bytesurl(k)] = [
-pycompat.bytesurl(i.strip()) for i in v]
-return bytesform
-
 @attr.s(frozen=True)
 class parsedrequest(object):
 """Represents a parsed WSGI request.
@@ -258,7 +227,7 @@
 self.run_once = wsgienv[r'wsgi.run_once']
 self.env = wsgienv
 self.req = parserequestfromenv(wsgienv, inp)
-self.form = normalize(self.req.querystringdict)
+self.form = self.req.querystringdict
 self._start_response = start_response
 self.server_write = None
 self.headers = []



To: indygreg, #hg-reviewers
Cc: mercurial-devel


D2776: hgweb: use a multidict for holding query string parameters

2018-03-09 Thread indygreg (Gregory Szorc)
indygreg created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  My intention with refactoring the WSGI code was to make it easier
  to read. I initially wanted to vendor and use WebOb, because it seems
  to be a pretty reasonable abstraction layer for WSGI. However, it isn't
  using relative imports and I didn't want to deal with the hassle of
  patching it. But that doesn't mean we can't use good ideas from WebOb.
  
  WebOb has a "multidict" data structure for holding parsed query string
  and POST form data. It quacks like a dict but allows you to store
  multiple values for each key. It offers mechanisms to return just one
  value, all values, or return 1 value asserting that only 1 value is
  set. I quite like its API.
  
  This commit implements a read-only "multidict" in the spirit of
  WebOb's multidict.
  
  We replace the query string attributes of our parsed request with
  an instance of it.
  
  For the record, I'm not a huge fan of the method to convert instances
  to a dict of lists. But this is needed to support the wsgirequest.form
  API.
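
  To make the lookup semantics concrete, a trimmed, runnable sketch follows. It mirrors only add(), __getitem__(), getall(), getone() and asdictoflists() from the patch in this revision. The class name multidictsketch and the sample parameters are made up for illustration.

```python
class multidictsketch(object):
    """Trimmed illustration of a multi-value mapping (not the full patch)."""

    def __init__(self):
        # (key, value) 2-tuples in insertion order.
        self._items = []

    def add(self, key, value):
        """Add a value without replacing existing ones."""
        self._items.append((key, value))

    def __getitem__(self, key):
        """Return the last value set for a key."""
        for k, v in reversed(self._items):
            if k == key:
                return v
        raise KeyError(key)

    def getall(self, key):
        return [v for k, v in self._items if k == key]

    def getone(self, key):
        """Return the only value; KeyError if missing or ambiguous."""
        vals = self.getall(key)
        if not vals:
            raise KeyError(key)
        if len(vals) > 1:
            raise KeyError('multiple values for %r' % key)
        return vals[0]

    def asdictoflists(self):
        d = {}
        for k, v in self._items:
            d.setdefault(k, []).append(v)
        return d

params = multidictsketch()
params.add('cmd', 'log')
params.add('rev', '1')
params.add('rev', '2')

last = params['rev']              # last value wins for plain indexing
allrevs = params.getall('rev')    # every value, in order
try:
    params.getone('rev')
    ambiguous = False
except KeyError:
    ambiguous = True              # more than one value is an error here
aslists = params.asdictoflists()
```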

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2776

AFFECTED FILES
  mercurial/hgweb/request.py
  mercurial/wireprotoserver.py

CHANGE DETAILS

diff --git a/mercurial/wireprotoserver.py b/mercurial/wireprotoserver.py
--- a/mercurial/wireprotoserver.py
+++ b/mercurial/wireprotoserver.py
@@ -170,10 +170,10 @@
 # HTTP version 1 wire protocol requests are denoted by a "cmd" query
 # string parameter. If it isn't present, this isn't a wire protocol
 # request.
-if 'cmd' not in req.querystringdict:
+if 'cmd' not in req.qsparams:
 return False
 
-cmd = req.querystringdict['cmd'][0]
+cmd = req.qsparams['cmd']
 
 # The "cmd" request parameter is used by both the wire protocol and hgweb.
 # While not all wire protocol commands are available for all transports,
diff --git a/mercurial/hgweb/request.py b/mercurial/hgweb/request.py
--- a/mercurial/hgweb/request.py
+++ b/mercurial/hgweb/request.py
@@ -28,6 +28,84 @@
 util,
 )
 
+class multidict(object):
+    """A dict like object that can store multiple values for a key.
+
+    Used to store parsed request parameters.
+
+    This is inspired by WebOb's class of the same name.
+    """
+    def __init__(self):
+        # Stores (key, value) 2-tuples. This isn't the most efficient. But we
+        # don't rely on parameters that much, so it shouldn't be a perf issue.
+        # we can always add dict for fast lookups.
+        self._items = []
+
+    def __getitem__(self, key):
+        """Returns the last set value for a key."""
+        for k, v in reversed(self._items):
+            if k == key:
+                return v
+
+        raise KeyError(key)
+
+    def __setitem__(self, key, value):
+        """Replace a values for a key with a new value."""
+        try:
+            del self[key]
+        except KeyError:
+            pass
+
+        self._items.append((key, value))
+
+    def __delitem__(self, key):
+        """Delete all values for a key."""
+        oldlen = len(self._items)
+
+        self._items[:] = [(k, v) for k, v in self._items if k != key]
+
+        if oldlen == len(self._items):
+            raise KeyError(key)
+
+    def __contains__(self, key):
+        return any(k == key for k, v in self._items)
+
+    def __len__(self):
+        return len(self._items)
+
+    def add(self, key, value):
+        """Add a new value for a key. Does not replace existing values."""
+        self._items.append((key, value))
+
+    def getall(self, key):
+        """Obtains all values for a key."""
+        return [v for k, v in self._items if k == key]
+
+    def getone(self, key):
+        """Obtain a single value for a key.
+
+        Raises KeyError if key not defined or it has multiple values set.
+        """
+        vals = self.getall(key)
+
+        if not vals:
+            raise KeyError(key)
+
+        if len(vals) > 1:
+            raise KeyError('multiple values for %r' % key)
+
+        return vals[0]
+
+    def asdictoflists(self):
+        d = {}
+        for k, v in self._items:
+            if k in d:
+                d[k].append(v)
+            else:
+                d[k] = [v]
+
+        return d
+
 @attr.s(frozen=True)
 class parsedrequest(object):
 """Represents a parsed WSGI request.
@@ -56,10 +134,8 @@
 havepathinfo = attr.ib()
 # Raw query string (part after "?" in URL).
 querystring = attr.ib()
-# List of 2-tuples of query string arguments.
-querystringlist = attr.ib()
-# Dict of query string arguments. Values are lists with at least 1 item.
-querystringdict = attr.ib()
+# multidict of query string parameters.
+qsparams = attr.ib()
 # wsgiref.headers.Headers instance. Operates like a dict with case
 # insensitive keys.
 headers = attr.ib()
@@ -157,14 +233,9 @@
 
 # We store as a list 

Re: [PATCH 6 of 8] ui: do not use rawinput() when we have to replace sys.stdin/stdout

2018-03-09 Thread Matt Harbison

On Fri, 09 Mar 2018 07:35:39 -0500, Yuya Nishihara  wrote:


# HG changeset patch
# User Yuya Nishihara 
# Date 1520325533 21600
#  Tue Mar 06 02:38:53 2018 -0600
# Node ID ad7ff97565b261d82952acc9f941e5dd99f11374
# Parent  63a13b91e1ab4d9fa0a713935be58794b9cadab5
ui: do not use rawinput() when we have to replace sys.stdin/stdout


Windows really doesn't like this[1].  The simplest example out of that  
might be:


--- c:/Users/Matt/projects/hg/tests/test-merge-tools.t
+++ c:/Users/Matt/projects/hg/tests/test-merge-tools.t.err
@@ -556,6 +556,9 @@
   > u
   > EOF
   keep (l)ocal [working copy], take (o)ther [merge rev], or leave (u)nresolved for f? u

+
+  unrecognized response
+  keep (l)ocal [working copy], take (o)ther [merge rev], or leave (u)nresolved for f?

   0 files updated, 0 files merged, 0 files removed, 1 files unresolved
   use 'hg resolve' to retry unresolved file merges or 'hg merge --abort' to abandon

   [1]

(I bisected it back to this commit.  The fact that the testbot didn't pick  
up on it when first run is an example of changes not affecting the results  
until the subsequent run.)


[1] https://buildbot.mercurial-scm.org/builders/Win7%20x86_64%20hg%20tests/builds/542/steps/run-tests.py%20%28python%202.7.13%29/logs/stdio



D2588: commit: adds multiline commit message support(issue5616)

2018-03-09 Thread sangeet259 (Sangeet Kumar Mishra)
sangeet259 added a comment.


  The current code just overwrites the message each time with the newer one. What can be done to avoid losing the earlier message values?

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2588

To: sangeet259, #hg-reviewers
Cc: durin42, tom.prince, yuja, pulkit, jeffpc, mercurial-devel


D2772: hgweb: parse and store POST form data

2018-03-09 Thread indygreg (Gregory Szorc)
indygreg abandoned this revision.
indygreg added a subscriber: durin42.
indygreg added a comment.


  @durin42 thinks we don't need this feature. So I'll submit a patch to remove 
it instead.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2772

To: indygreg, #hg-reviewers
Cc: durin42, mercurial-devel


[PATCH 3 of 3] bdiff: convert more longs to int64_t

2018-03-09 Thread Matt Harbison
# HG changeset patch
# User Matt Harbison 
# Date 1520650747 18000
#  Fri Mar 09 21:59:07 2018 -0500
# Node ID 09be2aeb8f5a364fab574ed3cf00bddbb7b9728a
# Parent  1f313a913f4356f272ef275061d5d169d9c1690e
bdiff: convert more longs to int64_t

MSVC previously flagged these where the function is stored in a pointer:

bdiff.c(284) : warning C4028: formal parameter 1 different from declaration
bdiff.c(284) : warning C4028: formal parameter 2 different from declaration
bdiff.c(284) : warning C4028: formal parameter 3 different from declaration
bdiff.c(284) : warning C4028: formal parameter 4 different from declaration

diff --git a/mercurial/cext/bdiff.c b/mercurial/cext/bdiff.c
--- a/mercurial/cext/bdiff.c
+++ b/mercurial/cext/bdiff.c
@@ -257,7 +257,8 @@
return NULL;
 }
 
-static int hunk_consumer(long a1, long a2, long b1, long b2, void *priv)
+static int hunk_consumer(int64_t a1, int64_t a2, int64_t b1, int64_t b2,
+ void *priv)
 {
PyObject *rl = (PyObject *)priv;
PyObject *m = Py_BuildValue("", a1, a2, b1, b2);


[PATCH 2 of 3] xdiff: silence a 32-bit shift warning on Windows

2018-03-09 Thread Matt Harbison
# HG changeset patch
# User Matt Harbison 
# Date 1520649753 18000
#  Fri Mar 09 21:42:33 2018 -0500
# Node ID 1f313a913f4356f272ef275061d5d169d9c1690e
# Parent  d3b978ff5c3fc50b33b3ca8f6c371df23d46404b
xdiff: silence a 32-bit shift warning on Windows

It's probably harmless, but:

warning C4334: '<<' : result of 32-bit shift implicitly converted to 64 bits
(was 64-bit shift intended?)

Adding a 'ULL' suffix to 1 also works, but I doubt that's portable.

diff --git a/mercurial/thirdparty/xdiff/xprepare.c 
b/mercurial/thirdparty/xdiff/xprepare.c
--- a/mercurial/thirdparty/xdiff/xprepare.c
+++ b/mercurial/thirdparty/xdiff/xprepare.c
@@ -71,7 +71,7 @@
cf->flags = flags;
 
cf->hbits = xdl_hashbits(size);
-   cf->hsize = 1 << cf->hbits;
+   cf->hsize = ((uint64_t)1) << cf->hbits;
 
if (xdl_cha_init(&cf->ncha, sizeof(xdlclass_t), size / 4 + 1) < 0) {
 
@@ -263,7 +263,7 @@
 
{
hbits = xdl_hashbits(narec);
-   hsize = 1 << hbits;
+   hsize = ((uint64_t)1) << hbits;
if (!(rhash = (xrecord_t **) xdl_malloc(hsize * 
sizeof(xrecord_t *))))
goto abort;
memset(rhash, 0, hsize * sizeof(xrecord_t *));


D2773: hgweb: remove support for short query string based aliases (BC)

2018-03-09 Thread indygreg (Gregory Szorc)
indygreg updated this revision to Diff 6818.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D2773?vs=6817&id=6818

REVISION DETAIL
  https://phab.mercurial-scm.org/D2773

AFFECTED FILES
  mercurial/hgweb/request.py
  tests/test-hgweb-raw.t

CHANGE DETAILS

diff --git a/tests/test-hgweb-raw.t b/tests/test-hgweb-raw.t
--- a/tests/test-hgweb-raw.t
+++ b/tests/test-hgweb-raw.t
@@ -17,7 +17,7 @@
   $ hg serve -p $HGPORT -A access.log -E error.log -d --pid-file=hg.pid
 
   $ cat hg.pid >> $DAEMON_PIDS
-  $ (get-with-headers.py localhost:$HGPORT 
'?f=bf0ff59095c9;file=sub/some%20text%25.txt;style=raw' content-type 
content-length content-disposition) >getoutput.txt
+  $ (get-with-headers.py localhost:$HGPORT 
'raw-file/bf0ff59095c9/sub/some%20text%25.txt' content-type content-length 
content-disposition) >getoutput.txt
 
   $ killdaemons.py hg.pid
 
@@ -32,14 +32,14 @@
   It is very boring to read, but computers don't
   care about things like that.
   $ cat access.log error.log
-  $LOCALIP - - [*] "GET /?f=bf0ff59095c9;file=sub/some%20text%25.txt;style=raw 
HTTP/1.1" 200 - (glob)
+  $LOCALIP - - [$LOGDATE$] "GET /raw-file/bf0ff59095c9/sub/some%20text%25.txt 
HTTP/1.1" 200 - (glob)
 
   $ rm access.log error.log
   $ hg serve -p $HGPORT -A access.log -E error.log -d --pid-file=hg.pid \
   > --config web.guessmime=True
 
   $ cat hg.pid >> $DAEMON_PIDS
-  $ (get-with-headers.py localhost:$HGPORT 
'?f=bf0ff59095c9;file=sub/some%20text%25.txt;style=raw' content-type 
content-length content-disposition) >getoutput.txt
+  $ (get-with-headers.py localhost:$HGPORT 
'raw-file/bf0ff59095c9/sub/some%20text%25.txt' content-type content-length 
content-disposition) >getoutput.txt
   $ killdaemons.py hg.pid
 
   $ cat getoutput.txt
@@ -53,6 +53,6 @@
   It is very boring to read, but computers don't
   care about things like that.
   $ cat access.log error.log
-  $LOCALIP - - [*] "GET /?f=bf0ff59095c9;file=sub/some%20text%25.txt;style=raw 
HTTP/1.1" 200 - (glob)
+  $LOCALIP - - [$LOGDATE$] "GET /raw-file/bf0ff59095c9/sub/some%20text%25.txt 
HTTP/1.1" 200 - (glob)
 
   $ cd ..
diff --git a/mercurial/hgweb/request.py b/mercurial/hgweb/request.py
--- a/mercurial/hgweb/request.py
+++ b/mercurial/hgweb/request.py
@@ -28,37 +28,6 @@
 util,
 )
 
-shortcuts = {
-'cl': [('cmd', ['changelog']), ('rev', None)],
-'sl': [('cmd', ['shortlog']), ('rev', None)],
-'cs': [('cmd', ['changeset']), ('node', None)],
-'f': [('cmd', ['file']), ('filenode', None)],
-'fl': [('cmd', ['filelog']), ('filenode', None)],
-'fd': [('cmd', ['filediff']), ('node', None)],
-'fa': [('cmd', ['annotate']), ('filenode', None)],
-'mf': [('cmd', ['manifest']), ('manifest', None)],
-'ca': [('cmd', ['archive']), ('node', None)],
-'tags': [('cmd', ['tags'])],
-'tip': [('cmd', ['changeset']), ('node', ['tip'])],
-'static': [('cmd', ['static']), ('file', None)]
-}
-
-def normalize(form):
-# first expand the shortcuts
-for k in shortcuts:
-if k in form:
-for name, value in shortcuts[k]:
-if value is None:
-value = form[k]
-form[name] = value
-del form[k]
-# And strip the values
-bytesform = {}
-for k, v in form.iteritems():
-bytesform[pycompat.bytesurl(k)] = [
-pycompat.bytesurl(i.strip()) for i in v]
-return bytesform
-
 @attr.s(frozen=True)
 class parsedrequest(object):
 """Represents a parsed WSGI request.
@@ -311,7 +280,7 @@
 self.run_once = wsgienv[r'wsgi.run_once']
 self.env = wsgienv
 self.req = parserequestfromenv(wsgienv, inp)
-self.form = normalize(self.req.params)
+self.form = self.req.params
 self._start_response = start_response
 self.server_write = None
 self.headers = []



To: indygreg, #hg-reviewers
Cc: mercurial-devel


D2767: hgweb: document continuereader

2018-03-09 Thread indygreg (Gregory Szorc)
indygreg created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2767

AFFECTED FILES
  mercurial/hgweb/common.py

CHANGE DETAILS

diff --git a/mercurial/hgweb/common.py b/mercurial/hgweb/common.py
--- a/mercurial/hgweb/common.py
+++ b/mercurial/hgweb/common.py
@@ -101,6 +101,13 @@
 self.headers = headers
 
 class continuereader(object):
+"""File object wrapper to handle HTTP 100-continue.
+
+This is used by servers so they automatically handle Expect: 100-continue
+request headers. On first read of the request body, the 100 Continue
+response is sent. This should trigger the client into actually sending
+the request body.
+"""
 def __init__(self, f, write):
 self.f = f
 self._write = write



To: indygreg, #hg-reviewers
Cc: mercurial-devel
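
For readers who have not seen the class, the behavior the new docstring
describes can be sketched as follows. This is a minimal illustration, not the
code in mercurial/hgweb/common.py: the name ContinueReader, the _continued
flag and the exact bytes written are assumptions made for the example.

```python
import io


class ContinueReader:
    """Hypothetical sketch: send "100 Continue" before the first body read."""

    def __init__(self, fh, write):
        self._fh = fh
        self._write = write
        self._continued = False

    def read(self, amt=-1):
        # Only the first read triggers the interim response; a client that
        # sent "Expect: 100-continue" then starts sending the request body.
        if not self._continued:
            self._continued = True
            self._write(b'HTTP/1.1 100 Continue\r\n\r\n')
        return self._fh.read(amt)


sent = []
reader = ContinueReader(io.BytesIO(b'body'), sent.append)
first = reader.read(2)
rest = reader.read()
```

After both reads, `sent` holds the interim response exactly once, however
many times read() was called.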


D2769: hgweb: refactor the request draining code

2018-03-09 Thread indygreg (Gregory Szorc)
indygreg created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  The previous code for draining was only invoked in a few places in
  the wire protocol. Behavior wasn't consistent. Furthermore, it was
  difficult to reason about.
  
  With us converting the input stream to a capped reader, it is now
  safe to always drain the input stream when its size is known because
  we can never overrun the input and read into the next HTTP request.
  The only question is "should we?"
  
  This commit changes the draining code so every request is examined.
  Draining now kicks in for a few requests where it wouldn't before.
  But I think the code is sufficiently restricted so the behavior is
  safe. Possibly the most dangerous part of this code is the issuing
  of Connection: close for POST and PUT requests that don't have a
  Content-Length. I don't think there are any such uses in our WSGI
  application, so this should be safe.
  
  In the near future, I plan to significantly refactor the WSGI
  response handling. I anticipate this code evolving a bit. So any
  minor regressions around draining or connection closing behavior
  might be fixed as a result of that work.
  
  All tests pass with this change. That scares me a bit because it
  means we are lacking low-level tests for the HTTP protocol.
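
  A rough sketch of the decision described above, for illustration only.
  The diff below is truncated in this archive, so the function names
  (decide_drain, drain), the length_known flag and the handling of the
  Expect: 100-continue branch are assumptions rather than the patch's code.

```python
import io


def decide_drain(method, length_known, expect=''):
    """Return (drain, close) for a request that is about to get a response."""
    if method not in ('POST', 'PUT'):
        return False, False
    # Without a known end of stream any read may run into the next
    # request, so ask for the connection to be closed instead.
    if not length_known:
        return False, True
    # Assumption: a client that sent Expect: 100-continue can cope with
    # a response arriving before it has sent the body.
    if expect.lower() == '100-continue':
        return False, False
    return True, False


def drain(fh, length, chunksize=65536):
    """Read and discard up to `length` bytes; return how many were read."""
    remaining = length
    while remaining > 0:
        chunk = fh.read(min(chunksize, remaining))
        if not chunk:
            break
        remaining -= len(chunk)
    return length - remaining


sock = io.BytesIO(b'abcdefNEXT')
if decide_drain('POST', length_known=True)[0]:
    drained = drain(sock, 6)
```

  The point of capping the read at the advertised length is that the bytes
  after the body (b'NEXT' in the example) are left untouched for the next
  request on the same connection.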

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2769

AFFECTED FILES
  mercurial/hgweb/request.py
  mercurial/wireprotoserver.py

CHANGE DETAILS

diff --git a/mercurial/wireprotoserver.py b/mercurial/wireprotoserver.py
--- a/mercurial/wireprotoserver.py
+++ b/mercurial/wireprotoserver.py
@@ -301,9 +301,6 @@
 wsgireq.respond(HTTP_OK, HGTYPE, body=rsp)
 return []
 elif isinstance(rsp, wireprototypes.pusherr):
-# This is the httplib workaround documented in _handlehttperror().
-wsgireq.drain()
-
 rsp = '0\n%s\n' % rsp.res
 wsgireq.respond(HTTP_OK, HGTYPE, body=rsp)
 return []
@@ -316,21 +313,6 @@
 def _handlehttperror(e, wsgireq, req):
 """Called when an ErrorResponse is raised during HTTP request 
processing."""
 
-# Clients using Python's httplib are stateful: the HTTP client
-# won't process an HTTP response until all request data is
-# sent to the server. The intent of this code is to ensure
-# we always read HTTP request data from the client, thus
-# ensuring httplib transitions to a state that allows it to read
-# the HTTP response. In other words, it helps prevent deadlocks
-# on clients using httplib.
-
-if (req.method == 'POST' and
-# But not if Expect: 100-continue is being used.
-(req.headers.get('Expect', '').lower() != '100-continue')):
-wsgireq.drain()
-else:
-wsgireq.headers.append((r'Connection', r'Close'))
-
 # TODO This response body assumes the failed command was
 # "unbundle." That assumption is not always valid.
 wsgireq.respond(e, HGTYPE, body='0\n%s\n' % pycompat.bytestr(e))
diff --git a/mercurial/hgweb/request.py b/mercurial/hgweb/request.py
--- a/mercurial/hgweb/request.py
+++ b/mercurial/hgweb/request.py
@@ -254,12 +254,6 @@
 self.server_write = None
 self.headers = []
 
-def drain(self):
-'''need to read all data from request, httplib is half-duplex'''
-length = int(self.env.get('CONTENT_LENGTH') or 0)
-for s in util.filechunkiter(self.inp, limit=length):
-pass
-
 def respond(self, status, type, filename=None, body=None):
 if not isinstance(type, str):
 type = pycompat.sysstr(type)
@@ -292,6 +286,49 @@
 elif isinstance(status, int):
 status = statusmessage(status)
 
+# Various HTTP clients (notably httplib) won't read the HTTP
+# response until the HTTP request has been sent in full. If servers
+# (us) send a response before the HTTP request has been fully sent,
+# the connection may deadlock because neither end is reading.
+#
+# We work around this by "draining" the request data before
+# sending any response in some conditions.
+drain = False
+close = False
+
+if self.env[r'REQUEST_METHOD'] in (r'POST', r'PUT'):
+# If we can't guarantee the end of stream, close the connection
+# because any reading may be unsafe.
+if not isinstance(self.inp, util.cappedreader):
+close = True
+else:
+# We know the length of the input. So draining is possible.
+# Should we drain?
+
+# If the client sent an Expect: 100-continue, we assume
+# the client is intelligent and can handle no draining.
+if (self.env.get(r'HTTP_EXPECT', r'').lower()
+== 

D2771: hgweb: expose input stream on parsed WSGI request object

2018-03-09 Thread indygreg (Gregory Szorc)
indygreg created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  Our next step towards moving away from wsgirequest to our newer,
  friendlier parsedrequest type is input stream access.
  
  This commit exposes the input stream on the instance. Consumers
  in the HTTP protocol server switch to it.
  
  Because there were very few consumers of the input stream, we stopped
  storing a reference to the input stream on wsgirequest directly. All
  access now goes through parsedrequest. However, wsgirequest still
  may read from this stream as part of cgi.parse(). So we still need to
  create the stream from wsgirequest.
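
  The shape of the change can be sketched as below. This is an illustration
  under assumptions: the patch uses attrs (attr.s / attr.ib), whereas the
  sketch swaps in the standard library's dataclasses so that it runs without
  third-party packages, and it keeps only three of the fields.

```python
import io
from dataclasses import dataclass
from typing import BinaryIO


@dataclass(frozen=True)
class ParsedRequest:
    """Hypothetical cut-down stand-in for parsedrequest."""
    method: str
    headers: dict
    bodyfh: BinaryIO  # request body input stream


def parse_request_from_env(env, bodyfh):
    # The caller creates (and possibly caps) the stream, then hands it in,
    # so there is a single place that decides who may read from it.
    headers = {}
    if 'CONTENT_LENGTH' in env:
        headers['Content-Length'] = env['CONTENT_LENGTH']
    return ParsedRequest(method=env['REQUEST_METHOD'],
                         headers=headers,
                         bodyfh=bodyfh)


env = {'REQUEST_METHOD': 'POST', 'CONTENT_LENGTH': '4'}
req = parse_request_from_env(env, io.BytesIO(b'args'))
body = req.bodyfh.read(int(req.headers['Content-Length']))
```

  Consumers then read via req.bodyfh rather than reaching into the WSGI
  environment themselves.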

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2771

AFFECTED FILES
  mercurial/hgweb/hgwebdir_mod.py
  mercurial/hgweb/request.py
  mercurial/wireprotoserver.py

CHANGE DETAILS

diff --git a/mercurial/wireprotoserver.py b/mercurial/wireprotoserver.py
--- a/mercurial/wireprotoserver.py
+++ b/mercurial/wireprotoserver.py
@@ -83,7 +83,7 @@
 postlen = int(self._req.headers.get(b'X-HgArgs-Post', 0))
 if postlen:
 args.update(urlreq.parseqs(
-self._wsgireq.inp.read(postlen), keep_blank_values=True))
+self._req.bodyfh.read(postlen), keep_blank_values=True))
 return args
 
 argvalue = decodevaluefromheaders(self._req, b'X-HgArg')
@@ -97,7 +97,7 @@
 # If httppostargs is used, we need to read Content-Length
 # minus the amount that was consumed by args.
 length -= int(self._req.headers.get(b'X-HgArgs-Post', 0))
-for s in util.filechunkiter(self._wsgireq.inp, limit=length):
+for s in util.filechunkiter(self._req.bodyfh, limit=length):
 fp.write(s)
 
 @contextlib.contextmanager
diff --git a/mercurial/hgweb/request.py b/mercurial/hgweb/request.py
--- a/mercurial/hgweb/request.py
+++ b/mercurial/hgweb/request.py
@@ -61,7 +61,10 @@
 
 @attr.s(frozen=True)
 class parsedrequest(object):
-"""Represents a parsed WSGI request / static HTTP request parameters."""
+"""Represents a parsed WSGI request.
+
+Contains both parsed parameters as well as a handle on the input stream.
+"""
 
 # Request method.
 method = attr.ib()
@@ -91,8 +94,10 @@
 # wsgiref.headers.Headers instance. Operates like a dict with case
 # insensitive keys.
 headers = attr.ib()
+# Request body input stream.
+bodyfh = attr.ib()
 
-def parserequestfromenv(env):
+def parserequestfromenv(env, bodyfh):
 """Parse URL components from environment variables.
 
 WSGI defines request attributes via environment variables. This function
@@ -209,6 +214,12 @@
 if 'CONTENT_LENGTH' in env and 'HTTP_CONTENT_LENGTH' not in env:
 headers['Content-Length'] = env['CONTENT_LENGTH']
 
+# TODO do this once we remove wsgirequest.inp, otherwise we could have
+# multiple readers from the underlying input stream.
+#bodyfh = env['wsgi.input']
+#if 'Content-Length' in headers:
+#bodyfh = util.cappedreader(bodyfh, int(headers['Content-Length']))
+
 return parsedrequest(method=env['REQUEST_METHOD'],
  url=fullurl, baseurl=baseurl,
  advertisedurl=advertisedfullurl,
@@ -219,7 +230,8 @@
  querystring=querystring,
  querystringlist=querystringlist,
  querystringdict=querystringdict,
- headers=headers)
+ headers=headers,
+ bodyfh=bodyfh)
 
 class wsgirequest(object):
 """Higher-level API for a WSGI request.
@@ -233,28 +245,27 @@
 if (version < (1, 0)) or (version >= (2, 0)):
 raise RuntimeError("Unknown and unsupported WSGI version %d.%d"
% version)
-self.inp = wsgienv[r'wsgi.input']
+
+inp = wsgienv[r'wsgi.input']
 
 if r'HTTP_CONTENT_LENGTH' in wsgienv:
-self.inp = util.cappedreader(self.inp,
- int(wsgienv[r'HTTP_CONTENT_LENGTH']))
+inp = util.cappedreader(inp, int(wsgienv[r'HTTP_CONTENT_LENGTH']))
 elif r'CONTENT_LENGTH' in wsgienv:
-self.inp = util.cappedreader(self.inp,
- int(wsgienv[r'CONTENT_LENGTH']))
+inp = util.cappedreader(inp, int(wsgienv[r'CONTENT_LENGTH']))
 
 self.err = wsgienv[r'wsgi.errors']
 self.threaded = wsgienv[r'wsgi.multithread']
 self.multiprocess = wsgienv[r'wsgi.multiprocess']
 self.run_once = wsgienv[r'wsgi.run_once']
 self.env = wsgienv
-self.form = normalize(cgi.parse(self.inp,
+self.form = normalize(cgi.parse(inp,
 self.env,
 keep_blank_values=1))
 self._start_response = start_response
 

D2773: hgweb: remove support for short query string based aliases (BC)

2018-03-09 Thread indygreg (Gregory Szorc)
indygreg created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  Form data exposed by hgweb is post-processed to expand certain
  shortcuts. For example, a URL with "?cs=@" is essentially expanded to
  "?cmd=changeset&node=@". And the URL router treats this the same
  as "/changeset/@".
  
  These shortcuts were initially added in 2005 in 
https://phab.mercurial-scm.org/rHG34cb3957d875ce3341c0ec4b86f016a60aded698 and
  https://phab.mercurial-scm.org/rHG964baa35faf8218650d412581f0567eb41ae1ee9. 
They have rarely been touched in the last decade (just
  moving code around a bit).
  
  We have almost no test coverage of this feature. AFAICT no templates
  reference URLs of this form. I even looked at the initial version
  of paper and coal from ~2008 and they use the "/command/params" URL
  form and not these shortcuts.
  
  Furthermore, I couldn't even get some shortcuts to work! For example,
  "?sl=@" attempts to do a revision search instead of showing shortlog
  starting at revision @. Maybe I'm just doing it wrong?
  
  Because this is ancient, mostly untested code, there is a migration
  path to something better, and because anyone passionate enough to
  preserve URLs can install URL redirects, let's nuke the feature.
  
  .. bc::
  
Query string shortcuts in hgweb like ``?cs=@`` have been removed. Use
URLs of the form ``/:cmd`` instead.
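
  For reference, the expansion being removed can be sketched like this. It is
  a Python 3 paraphrase of the normalize() function deleted in the diff
  below, using a subset of its shortcuts table; expand_shortcuts is a
  hypothetical name and the byte-conversion step is left out.

```python
from urllib.parse import parse_qs

# Subset of the table removed from mercurial/hgweb/request.py. A value of
# None means "reuse whatever the shortcut parameter was set to".
SHORTCUTS = {
    'cs': [('cmd', ['changeset']), ('node', None)],
    'sl': [('cmd', ['shortlog']), ('rev', None)],
    'tip': [('cmd', ['changeset']), ('node', ['tip'])],
}


def expand_shortcuts(form):
    """Return a copy of `form` with shortcut keys expanded."""
    form = dict(form)
    for key, expansion in SHORTCUTS.items():
        if key in form:
            for name, value in expansion:
                form[name] = form[key] if value is None else value
            del form[key]
    return form


form = expand_shortcuts(parse_qs('cs=@', keep_blank_values=True))
```

  Here ``form`` ends up with ``cmd`` set to ``['changeset']`` and ``node``
  set to ``['@']``, which is what the path-style URL ``/changeset/@``
  already expresses.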

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2773

AFFECTED FILES
  mercurial/hgweb/request.py
  tests/test-hgweb-raw.t

CHANGE DETAILS

diff --git a/tests/test-hgweb-raw.t b/tests/test-hgweb-raw.t
--- a/tests/test-hgweb-raw.t
+++ b/tests/test-hgweb-raw.t
@@ -17,7 +17,7 @@
   $ hg serve -p $HGPORT -A access.log -E error.log -d --pid-file=hg.pid
 
   $ cat hg.pid >> $DAEMON_PIDS
-  $ (get-with-headers.py localhost:$HGPORT 
'?f=bf0ff59095c9;file=sub/some%20text%25.txt;style=raw' content-type 
content-length content-disposition) >getoutput.txt
+  $ (get-with-headers.py localhost:$HGPORT 
'raw-file/bf0ff59095c9/sub/some%20text%25.txt' content-type content-length 
content-disposition) >getoutput.txt
 
   $ killdaemons.py hg.pid
 
@@ -32,14 +32,14 @@
   It is very boring to read, but computers don't
   care about things like that.
   $ cat access.log error.log
-  $LOCALIP - - [*] "GET /?f=bf0ff59095c9;file=sub/some%20text%25.txt;style=raw 
HTTP/1.1" 200 - (glob)
+  $LOCALIP - - [$LOGDATE$] "GET /raw-file/bf0ff59095c9/sub/some%20text%25.txt 
HTTP/1.1" 200 -
 
   $ rm access.log error.log
   $ hg serve -p $HGPORT -A access.log -E error.log -d --pid-file=hg.pid \
   > --config web.guessmime=True
 
   $ cat hg.pid >> $DAEMON_PIDS
-  $ (get-with-headers.py localhost:$HGPORT 
'?f=bf0ff59095c9;file=sub/some%20text%25.txt;style=raw' content-type 
content-length content-disposition) >getoutput.txt
+  $ (get-with-headers.py localhost:$HGPORT 
'raw-file/bf0ff59095c9/sub/some%20text%25.txt' content-type content-length 
content-disposition) >getoutput.txt
   $ killdaemons.py hg.pid
 
   $ cat getoutput.txt
@@ -53,6 +53,6 @@
   It is very boring to read, but computers don't
   care about things like that.
   $ cat access.log error.log
-  $LOCALIP - - [*] "GET /?f=bf0ff59095c9;file=sub/some%20text%25.txt;style=raw 
HTTP/1.1" 200 - (glob)
+  $LOCALIP - - [$LOGDATE$] "GET /raw-file/bf0ff59095c9/sub/some%20text%25.txt 
HTTP/1.1" 200 -
 
   $ cd ..
diff --git a/mercurial/hgweb/request.py b/mercurial/hgweb/request.py
--- a/mercurial/hgweb/request.py
+++ b/mercurial/hgweb/request.py
@@ -28,37 +28,6 @@
 util,
 )
 
-shortcuts = {
-'cl': [('cmd', ['changelog']), ('rev', None)],
-'sl': [('cmd', ['shortlog']), ('rev', None)],
-'cs': [('cmd', ['changeset']), ('node', None)],
-'f': [('cmd', ['file']), ('filenode', None)],
-'fl': [('cmd', ['filelog']), ('filenode', None)],
-'fd': [('cmd', ['filediff']), ('node', None)],
-'fa': [('cmd', ['annotate']), ('filenode', None)],
-'mf': [('cmd', ['manifest']), ('manifest', None)],
-'ca': [('cmd', ['archive']), ('node', None)],
-'tags': [('cmd', ['tags'])],
-'tip': [('cmd', ['changeset']), ('node', ['tip'])],
-'static': [('cmd', ['static']), ('file', None)]
-}
-
-def normalize(form):
-# first expand the shortcuts
-for k in shortcuts:
-if k in form:
-for name, value in shortcuts[k]:
-if value is None:
-value = form[k]
-form[name] = value
-del form[k]
-# And strip the values
-bytesform = {}
-for k, v in form.iteritems():
-bytesform[pycompat.bytesurl(k)] = [
-pycompat.bytesurl(i.strip()) for i in v]
-return bytesform
-
 @attr.s(frozen=True)
 class parsedrequest(object):
 """Represents a parsed WSGI request.
@@ -311,7 +280,7 @@
 self.run_once = wsgienv[r'wsgi.run_once']
 self.env = wsgienv
 self.req = parserequestfromenv(wsgienv, inp)
-self.form = 

D2770: hgweb: make parsedrequest part of wsgirequest

2018-03-09 Thread indygreg (Gregory Szorc)
indygreg created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  This is kind of ugly. But upcoming commits will teach our parsedrequest
  instances about how to read from the input stream and how to parse
  form variables which might be specified on the input stream. Because
  the input stream is global state and can't be accessed without
  side-effects, we need to take actions to ensure that multiple
  consumers don't read from it independently. The easiest way to
  do this is for one object to hold a reference to both things so it
  can ensure (via attribute nuking) that consumers only touch one
  object or the other.
  
  So we create our parsed request instance from the wsgirequest
  constructor and hold a reference to it there. This is better than
  our new type holding a reference to wsgirequest because all the
  code for managing access will be temporary and we shouldn't pollute
  parsedrequest with this ugly history.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2770

AFFECTED FILES
  mercurial/hgweb/hgweb_mod.py
  mercurial/hgweb/hgwebdir_mod.py
  mercurial/hgweb/request.py

CHANGE DETAILS

diff --git a/mercurial/hgweb/request.py b/mercurial/hgweb/request.py
--- a/mercurial/hgweb/request.py
+++ b/mercurial/hgweb/request.py
@@ -254,6 +254,8 @@
 self.server_write = None
 self.headers = []
 
+self.req = parserequestfromenv(wsgienv)
+
 def respond(self, status, type, filename=None, body=None):
 if not isinstance(type, str):
 type = pycompat.sysstr(type)
diff --git a/mercurial/hgweb/hgwebdir_mod.py b/mercurial/hgweb/hgwebdir_mod.py
--- a/mercurial/hgweb/hgwebdir_mod.py
+++ b/mercurial/hgweb/hgwebdir_mod.py
@@ -229,7 +229,7 @@
 yield r
 
 def _runwsgi(self, wsgireq):
-req = requestmod.parserequestfromenv(wsgireq.env)
+req = wsgireq.req
 
 try:
 self.refresh()
@@ -289,6 +289,11 @@
 real = repos.get(virtualrepo)
 if real:
 wsgireq.env['REPO_NAME'] = virtualrepo
+# We have to re-parse because of updated environment
+# variable.
+# TODO this is kind of hacky and we should have a better
+# way of doing this than with REPO_NAME side-effects.
+wsgireq.req = requestmod.parserequestfromenv(wsgireq.env)
 try:
 # ensure caller gets private copy of ui
 repo = hg.repository(self.ui.copy(), real)
diff --git a/mercurial/hgweb/hgweb_mod.py b/mercurial/hgweb/hgweb_mod.py
--- a/mercurial/hgweb/hgweb_mod.py
+++ b/mercurial/hgweb/hgweb_mod.py
@@ -304,7 +304,7 @@
 yield r
 
 def _runwsgi(self, wsgireq, repo):
-req = requestmod.parserequestfromenv(wsgireq.env)
+req = wsgireq.req
 rctx = requestcontext(self, repo)
 
 # This state is global across all threads.



To: indygreg, #hg-reviewers
Cc: mercurial-devel


D2772: hgweb: parse and store POST form data

2018-03-09 Thread indygreg (Gregory Szorc)
indygreg created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  This is essentially a port of what wsgirequest.__init__ was doing
  inline via cgi.parse().
  
  Our version is better because - unlike cgi.parse() - we keep the
  query string and POST data separate. We do still expose a unified
  dict for convenience, however. But consumers can now be explicit
  about whether a parameter should be fetched from the query string
  or form data.
  
  Because we can only read from POST data once, we had to support
  passing in previously resolved values to support hgwebdir's ugly
  hack of having to construct a new parsedrequest type.
  
  The only consumer we update to use the new API is wsgirequest.
  
  Also, the normalization of values is explicitly not performed in
  the new code. This is because the intent of our request object is
  to represent the raw request with minimal transformation.
  
  With this commit, all meaningful request components are now
  reflected on our parsedrequest type!
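
  The idea of keeping the two sources separate while still offering a merged
  view can be sketched as follows. This is an illustration only: parse_form
  and to_multidict are hypothetical helpers, only urlencoded bodies are
  handled, and the merge order (POST values first, then query string values)
  is taken from the diff below.

```python
from urllib.parse import parse_qsl


def to_multidict(pairs):
    """Collapse (key, value) pairs into {key: [values...]}, keeping order."""
    result = {}
    for key, value in pairs:
        result.setdefault(key, []).append(value)
    return result


def parse_form(querystring, postbody):
    """Return (querydict, postdict, params) for urlencoded inputs."""
    querydict = to_multidict(parse_qsl(querystring, keep_blank_values=True))
    postdict = to_multidict(parse_qsl(postbody, keep_blank_values=True))
    # Unified view: copy the POST values, then append query string values.
    params = {k: list(v) for k, v in postdict.items()}
    for key, values in querydict.items():
        params.setdefault(key, []).extend(values)
    return querydict, postdict, params


querydict, postdict, params = parse_form('cmd=log&rev=1', 'rev=2&style=raw')
```

  A consumer that cares where ``rev`` came from can look at querydict or
  postdict; one that does not can use params, where both values appear.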

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2772

AFFECTED FILES
  mercurial/hgweb/hgwebdir_mod.py
  mercurial/hgweb/request.py

CHANGE DETAILS

diff --git a/mercurial/hgweb/request.py b/mercurial/hgweb/request.py
--- a/mercurial/hgweb/request.py
+++ b/mercurial/hgweb/request.py
@@ -96,8 +96,15 @@
 headers = attr.ib()
 # Request body input stream.
 bodyfh = attr.ib()
+# Like ``querystringlist`` and ``querystringdict`` but for form data
+# submitted on POST requests decoded from well-known content types.
+postformlist = attr.ib()
+postformdict = attr.ib()
 
-def parserequestfromenv(env, bodyfh):
+# All "form" parameters. A combination of query string and POST form data.
+params = attr.ib()
+
+def parserequestfromenv(env, bodyfh, previousformdata=None):
 """Parse URL components from environment variables.
 
 WSGI defines request attributes via environment variables. This function
@@ -220,6 +227,49 @@
 #if 'Content-Length' in headers:
 #bodyfh = util.cappedreader(bodyfh, int(headers['Content-Length']))
 
+# Form data is kinda wonky. It can come from request bodies, which we can
+# only read once. Since hgwebdir may construct a new parsedrequest, we
+# allow existing form data to be passed in to this function. That's kinda
+# hacky.
+if previousformdata is None:
+if env['REQUEST_METHOD'] == 'POST':
+# This is based on cgi.parse(), but without the hacky parts (like
+# merging QUERY_STRING and setting QUERY_STRING as a side-effect).
+ct, params = cgi.parse_header(env.get('CONTENT_TYPE', ''))
+if ct == 'multipart/form-data':
+# We don't have a way to preserve order. So we normalize to a
+# list for consistency with x-www-form-urlencoded.
+postformlist = []
+for k, l in cgi.parse_multipart(bodyfh, params).iteritems():
+for v in l:
+postformlist.append((k, v))
+elif ct == 'application/x-www-form-urlencoded':
+cl = int(headers['Content-Length'])
+postformlist = util.urlreq.parseqsl(bodyfh.read(cl),
+keep_blank_values=True)
+else:
+postformlist = []
+
+postformdict = {}
+for k, v in postformlist:
+if k in postformdict:
+postformdict[k].append(v)
+else:
+postformdict[k] = [v]
+else:
+postformlist = []
+postformdict = {}
+
+# Now that we have the raw post data. Merge in query string data to
+# provide a unified interface.
+formdict = {k: list(v) for k, v in postformdict.iteritems()}
+for k, l in querystringdict.iteritems():
+if k not in formdict:
+formdict[k] = []
+formdict[k].extend(l)
+else:
+postformlist, postformdict, formdict = previousformdata
+
 return parsedrequest(method=env['REQUEST_METHOD'],
  url=fullurl, baseurl=baseurl,
  advertisedurl=advertisedfullurl,
@@ -231,7 +281,9 @@
  querystringlist=querystringlist,
  querystringdict=querystringdict,
  headers=headers,
- bodyfh=bodyfh)
+ bodyfh=bodyfh,
+ postformlist=postformlist, postformdict=postformdict,
+ params=formdict)
 
 class wsgirequest(object):
 """Higher-level API for a WSGI request.
@@ -258,15 +310,12 @@
 self.multiprocess = wsgienv[r'wsgi.multiprocess']
 self.run_once = wsgienv[r'wsgi.run_once']
 self.env = wsgienv
-self.form = 

D2768: hgweb: use a capped reader for WSGI input stream

2018-03-09 Thread indygreg (Gregory Szorc)
indygreg created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  Per PEP-3333, the input stream from WSGI should respect EOF and
  prevent reads past the end of the request body. However, not all
  WSGI servers guarantee this. Notably, our BaseHTTPServer based
  built-in HTTP server doesn't. Instead, it exposes the raw socket
  and you can read() from it all you want, getting the connection in
  a bad state by doing so.
  
  We have a "cappedreader" utility class that proxies a file object
  and prevents reading past a limit.
  
  This commit converts the WSGI input stream into a capped reader when
  the input length is advertised via Content-Length headers.
  
  "cappedreader" only exposes a read() method. PEP-3333 states that
  the input stream MUST also support readline(), readlines(hint), and
  __iter__(). However, since our code only calls read and we're not
  implementing a spec conforming WSGI server (just a WSGI application
  at this point), we don't need to support these additional methods.
  So the limited functionality of "cappedreader" is sufficient for our
  WSGI application.
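
  A minimal sketch of the idea, assuming a reader that exposes only read().
  CappedReader and wrap_input are hypothetical names written for this
  example; they are not the util.cappedreader implementation.

```python
import io


class CappedReader:
    """Proxy a file object and refuse to read more than `limit` bytes."""

    def __init__(self, fh, limit):
        self._fh = fh
        self._left = limit

    def read(self, n=-1):
        if self._left <= 0:
            return b''
        # A negative or oversized request is clamped to what is left.
        if n < 0 or n > self._left:
            n = self._left
        data = self._fh.read(n)
        self._left -= len(data)
        return data


def wrap_input(environ):
    """Cap wsgi.input when a content length is advertised."""
    inp = environ['wsgi.input']
    for key in ('HTTP_CONTENT_LENGTH', 'CONTENT_LENGTH'):
        if key in environ:
            return CappedReader(inp, int(environ[key]))
    return inp


# The "socket" holds a 5-byte body followed by the start of another request.
sock = io.BytesIO(b'helloGET / HTTP/1.1')
body = wrap_input({'wsgi.input': sock, 'CONTENT_LENGTH': '5'})
payload = body.read()
```

  An unbounded read() on the wrapper stops at the advertised length, so the
  bytes belonging to the following request stay on the underlying stream.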

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2768

AFFECTED FILES
  mercurial/hgweb/request.py

CHANGE DETAILS

diff --git a/mercurial/hgweb/request.py b/mercurial/hgweb/request.py
--- a/mercurial/hgweb/request.py
+++ b/mercurial/hgweb/request.py
@@ -234,6 +234,14 @@
 raise RuntimeError("Unknown and unsupported WSGI version %d.%d"
% version)
 self.inp = wsgienv[r'wsgi.input']
+
+if r'HTTP_CONTENT_LENGTH' in wsgienv:
+self.inp = util.cappedreader(self.inp,
+ int(wsgienv[r'HTTP_CONTENT_LENGTH']))
+elif r'CONTENT_LENGTH' in wsgienv:
+self.inp = util.cappedreader(self.inp,
+ int(wsgienv[r'CONTENT_LENGTH']))
+
 self.err = wsgienv[r'wsgi.errors']
 self.threaded = wsgienv[r'wsgi.multithread']
 self.multiprocess = wsgienv[r'wsgi.multiprocess']



To: indygreg, #hg-reviewers
Cc: mercurial-devel


D2763: xdiff: remove unused flags parameter

2018-03-09 Thread quark (Jun Wu)
quark added a comment.


  I don't think the Python ".so"s should be consumed by non-Python "dlopen". So 
"version" doesn't change since Python API remains the same.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2763

To: quark, #hg-reviewers, indygreg
Cc: indygreg, mercurial-devel


D2765: xdiff: use int64 for hash table size

2018-03-09 Thread quark (Jun Wu)
This revision was automatically updated to reflect the committed changes.
Closed by commit rHG71fbceb58746: xdiff: use int64 for hash table size 
(authored by quark, committed by ).

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D2765?vs=6802&id=6809

REVISION DETAIL
  https://phab.mercurial-scm.org/D2765

AFFECTED FILES
  mercurial/thirdparty/xdiff/xprepare.c
  mercurial/thirdparty/xdiff/xutils.c
  mercurial/thirdparty/xdiff/xutils.h

CHANGE DETAILS

diff --git a/mercurial/thirdparty/xdiff/xutils.h 
b/mercurial/thirdparty/xdiff/xutils.h
--- a/mercurial/thirdparty/xdiff/xutils.h
+++ b/mercurial/thirdparty/xdiff/xutils.h
@@ -32,7 +32,7 @@
 int64_t xdl_guess_lines(mmfile_t *mf, int64_t sample);
 int xdl_recmatch(const char *l1, int64_t s1, const char *l2, int64_t s2);
 uint64_t xdl_hash_record(char const **data, char const *top);
-unsigned int xdl_hashbits(unsigned int size);
+unsigned int xdl_hashbits(int64_t size);
 
 
 
diff --git a/mercurial/thirdparty/xdiff/xutils.c 
b/mercurial/thirdparty/xdiff/xutils.c
--- a/mercurial/thirdparty/xdiff/xutils.c
+++ b/mercurial/thirdparty/xdiff/xutils.c
@@ -141,9 +141,10 @@
return ha;
 }
 
-unsigned int xdl_hashbits(unsigned int size) {
-   unsigned int val = 1, bits = 0;
+unsigned int xdl_hashbits(int64_t size) {
+   int64_t val = 1;
+   unsigned int bits = 0;
 
-   for (; val < size && bits < CHAR_BIT * sizeof(unsigned int); val <<= 1, 
bits++);
+   for (; val < size && bits < (int64_t) CHAR_BIT * sizeof(unsigned int); 
val <<= 1, bits++);
return bits ? bits: 1;
 }
diff --git a/mercurial/thirdparty/xdiff/xprepare.c 
b/mercurial/thirdparty/xdiff/xprepare.c
--- a/mercurial/thirdparty/xdiff/xprepare.c
+++ b/mercurial/thirdparty/xdiff/xprepare.c
@@ -70,7 +70,7 @@
 static int xdl_init_classifier(xdlclassifier_t *cf, int64_t size, int64_t 
flags) {
cf->flags = flags;
 
-   cf->hbits = xdl_hashbits((unsigned int) size);
+   cf->hbits = xdl_hashbits(size);
cf->hsize = 1 << cf->hbits;
 
if (xdl_cha_init(>ncha, sizeof(xdlclass_t), size / 4 + 1) < 0) {
@@ -262,7 +262,7 @@
goto abort;
 
{
-   hbits = xdl_hashbits((unsigned int) narec);
+   hbits = xdl_hashbits(narec);
hsize = 1 << hbits;
if (!(rhash = (xrecord_t **) xdl_malloc(hsize * 
sizeof(xrecord_t *
goto abort;



To: quark, #hg-reviewers, indygreg
Cc: mercurial-devel


D2762: xdiff: replace {unsigned ,}long with {u,}int64_t

2018-03-09 Thread quark (Jun Wu)
This revision was automatically updated to reflect the committed changes.
Closed by commit rHGe882437cc082: xdiff: replace {unsigned ,}long with 
{u,}int64_t (authored by quark, committed by ).

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D2762?vs=6799=6806

REVISION DETAIL
  https://phab.mercurial-scm.org/D2762

AFFECTED FILES
  mercurial/thirdparty/xdiff/xdiff.h
  mercurial/thirdparty/xdiff/xdiffi.c
  mercurial/thirdparty/xdiff/xdiffi.h
  mercurial/thirdparty/xdiff/xinclude.h
  mercurial/thirdparty/xdiff/xprepare.c
  mercurial/thirdparty/xdiff/xtypes.h
  mercurial/thirdparty/xdiff/xutils.c
  mercurial/thirdparty/xdiff/xutils.h

CHANGE DETAILS

diff --git a/mercurial/thirdparty/xdiff/xutils.h 
b/mercurial/thirdparty/xdiff/xutils.h
--- a/mercurial/thirdparty/xdiff/xutils.h
+++ b/mercurial/thirdparty/xdiff/xutils.h
@@ -25,13 +25,13 @@
 
 
 
-long xdl_bogosqrt(long n);
-int xdl_cha_init(chastore_t *cha, long isize, long icount);
+int64_t xdl_bogosqrt(int64_t n);
+int xdl_cha_init(chastore_t *cha, int64_t isize, int64_t icount);
 void xdl_cha_free(chastore_t *cha);
 void *xdl_cha_alloc(chastore_t *cha);
-long xdl_guess_lines(mmfile_t *mf, long sample);
-int xdl_recmatch(const char *l1, long s1, const char *l2, long s2, long flags);
-unsigned long xdl_hash_record(char const **data, char const *top, long flags);
+int64_t xdl_guess_lines(mmfile_t *mf, int64_t sample);
+int xdl_recmatch(const char *l1, int64_t s1, const char *l2, int64_t s2, 
int64_t flags);
+uint64_t xdl_hash_record(char const **data, char const *top, int64_t flags);
 unsigned int xdl_hashbits(unsigned int size);
 
 
diff --git a/mercurial/thirdparty/xdiff/xutils.c 
b/mercurial/thirdparty/xdiff/xutils.c
--- a/mercurial/thirdparty/xdiff/xutils.c
+++ b/mercurial/thirdparty/xdiff/xutils.c
@@ -27,8 +27,8 @@
 
 
 
-long xdl_bogosqrt(long n) {
-   long i;
+int64_t xdl_bogosqrt(int64_t n) {
+   int64_t i;
 
/*
 * Classical integer square root approximation using shifts.
@@ -40,20 +40,20 @@
 }
 
 
-void *xdl_mmfile_first(mmfile_t *mmf, long *size)
+void *xdl_mmfile_first(mmfile_t *mmf, int64_t *size)
 {
*size = mmf->size;
return mmf->ptr;
 }
 
 
-long xdl_mmfile_size(mmfile_t *mmf)
+int64_t xdl_mmfile_size(mmfile_t *mmf)
 {
return mmf->size;
 }
 
 
-int xdl_cha_init(chastore_t *cha, long isize, long icount) {
+int xdl_cha_init(chastore_t *cha, int64_t isize, int64_t icount) {
 
cha->head = cha->tail = NULL;
cha->isize = isize;
@@ -100,8 +100,8 @@
return data;
 }
 
-long xdl_guess_lines(mmfile_t *mf, long sample) {
-   long nl = 0, size, tsize = 0;
+int64_t xdl_guess_lines(mmfile_t *mf, int64_t sample) {
+   int64_t nl = 0, size, tsize = 0;
char const *data, *cur, *top;
 
if ((cur = data = xdl_mmfile_first(mf, )) != NULL) {
@@ -121,15 +121,15 @@
return nl + 1;
 }
 
-int xdl_recmatch(const char *l1, long s1, const char *l2, long s2, long flags)
+int xdl_recmatch(const char *l1, int64_t s1, const char *l2, int64_t s2, 
int64_t flags)
 {
if (s1 == s2 && !memcmp(l1, l2, s1))
return 1;
return 0;
 }
 
-unsigned long xdl_hash_record(char const **data, char const *top, long flags) {
-   unsigned long ha = 5381;
+uint64_t xdl_hash_record(char const **data, char const *top, int64_t flags) {
+   uint64_t ha = 5381;
char const *ptr = *data;
 
for (; ptr < top && *ptr != '\n'; ptr++) {
diff --git a/mercurial/thirdparty/xdiff/xtypes.h 
b/mercurial/thirdparty/xdiff/xtypes.h
--- a/mercurial/thirdparty/xdiff/xtypes.h
+++ b/mercurial/thirdparty/xdiff/xtypes.h
@@ -27,30 +27,30 @@
 
 typedef struct s_chanode {
struct s_chanode *next;
-   long icurr;
+   int64_t icurr;
 } chanode_t;
 
 typedef struct s_chastore {
chanode_t *head, *tail;
-   long isize, nsize;
+   int64_t isize, nsize;
chanode_t *ancur;
chanode_t *sncur;
-   long scurr;
+   int64_t scurr;
 } chastore_t;
 
 typedef struct s_xrecord {
struct s_xrecord *next;
char const *ptr;
-   long size;
-   unsigned long ha;
+   int64_t size;
+   uint64_t ha;
 } xrecord_t;
 
 typedef struct s_xdfile {
/* manual memory management */
chastore_t rcha;
 
/* number of records (lines) */
-   long nrec;
+   int64_t nrec;
 
/* hash table size
 * the maximum hash value in the table is (1 << hbits) */
@@ -64,7 +64,7 @@
 * [recs[i] for i in range(0, dstart)] are common prefix.
 * [recs[i] for i in range(dstart, dend + 1 - dstart)] are interesting
 * lines */
-   long dstart, dend;
+   int64_t dstart, dend;
 
/* pointer to records (lines) */
xrecord_t **recs;
@@ -82,22 +82,22 @@
 * rindex[0] is likely dstart, if not removed up by rule 2.
 * rindex[nreff - 1] is likely dend, if not removed by rule 2.
 */
-   long *rindex;
+   

D2763: xdiff: remove unused flags parameter

2018-03-09 Thread quark (Jun Wu)
This revision was automatically updated to reflect the committed changes.
Closed by commit rHG4c8ffc67bac2: xdiff: remove unused flags parameter 
(authored by quark, committed by ).

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D2763?vs=6800=6808

REVISION DETAIL
  https://phab.mercurial-scm.org/D2763

AFFECTED FILES
  mercurial/thirdparty/xdiff/xdiffi.c
  mercurial/thirdparty/xdiff/xprepare.c
  mercurial/thirdparty/xdiff/xutils.c
  mercurial/thirdparty/xdiff/xutils.h

CHANGE DETAILS

diff --git a/mercurial/thirdparty/xdiff/xutils.h 
b/mercurial/thirdparty/xdiff/xutils.h
--- a/mercurial/thirdparty/xdiff/xutils.h
+++ b/mercurial/thirdparty/xdiff/xutils.h
@@ -30,8 +30,8 @@
 void xdl_cha_free(chastore_t *cha);
 void *xdl_cha_alloc(chastore_t *cha);
 int64_t xdl_guess_lines(mmfile_t *mf, int64_t sample);
-int xdl_recmatch(const char *l1, int64_t s1, const char *l2, int64_t s2, 
int64_t flags);
-uint64_t xdl_hash_record(char const **data, char const *top, int64_t flags);
+int xdl_recmatch(const char *l1, int64_t s1, const char *l2, int64_t s2);
+uint64_t xdl_hash_record(char const **data, char const *top);
 unsigned int xdl_hashbits(unsigned int size);
 
 
diff --git a/mercurial/thirdparty/xdiff/xutils.c 
b/mercurial/thirdparty/xdiff/xutils.c
--- a/mercurial/thirdparty/xdiff/xutils.c
+++ b/mercurial/thirdparty/xdiff/xutils.c
@@ -121,14 +121,14 @@
return nl + 1;
 }
 
-int xdl_recmatch(const char *l1, int64_t s1, const char *l2, int64_t s2, 
int64_t flags)
+int xdl_recmatch(const char *l1, int64_t s1, const char *l2, int64_t s2)
 {
if (s1 == s2 && !memcmp(l1, l2, s1))
return 1;
return 0;
 }
 
-uint64_t xdl_hash_record(char const **data, char const *top, int64_t flags) {
+uint64_t xdl_hash_record(char const **data, char const *top) {
uint64_t ha = 5381;
char const *ptr = *data;
 
diff --git a/mercurial/thirdparty/xdiff/xprepare.c 
b/mercurial/thirdparty/xdiff/xprepare.c
--- a/mercurial/thirdparty/xdiff/xprepare.c
+++ b/mercurial/thirdparty/xdiff/xprepare.c
@@ -118,7 +118,7 @@
for (rcrec = cf->rchash[hi]; rcrec; rcrec = rcrec->next)
if (rcrec->ha == rec->ha &&
xdl_recmatch(rcrec->line, rcrec->size,
-   rec->ptr, rec->size, cf->flags))
+   rec->ptr, rec->size))
break;
 
if (!rcrec) {
@@ -273,7 +273,7 @@
if ((cur = blk = xdl_mmfile_first(mf, )) != NULL) {
for (top = blk + bsize; cur < top; ) {
prev = cur;
-   hav = xdl_hash_record(, top, xpp->flags);
+   hav = xdl_hash_record(, top);
if (nrec >= narec) {
narec *= 2;
if (!(rrecs = (xrecord_t **) xdl_realloc(recs, 
narec * sizeof(xrecord_t *
diff --git a/mercurial/thirdparty/xdiff/xdiffi.c 
b/mercurial/thirdparty/xdiff/xdiffi.c
--- a/mercurial/thirdparty/xdiff/xdiffi.c
+++ b/mercurial/thirdparty/xdiff/xdiffi.c
@@ -398,12 +398,11 @@
 }
 
 
-static int recs_match(xrecord_t *rec1, xrecord_t *rec2, int64_t flags)
+static int recs_match(xrecord_t *rec1, xrecord_t *rec2)
 {
return (rec1->ha == rec2->ha &&
xdl_recmatch(rec1->ptr, rec1->size,
-rec2->ptr, rec2->size,
-flags));
+rec2->ptr, rec2->size));
 }
 
 /*
@@ -762,10 +761,10 @@
  * following group, expand this group to include it. Return 0 on success or -1
  * if g cannot be slid down.
  */
-static int group_slide_down(xdfile_t *xdf, struct xdlgroup *g, int64_t flags)
+static int group_slide_down(xdfile_t *xdf, struct xdlgroup *g)
 {
if (g->end < xdf->nrec &&
-   recs_match(xdf->recs[g->start], xdf->recs[g->end], flags)) {
+   recs_match(xdf->recs[g->start], xdf->recs[g->end])) {
xdf->rchg[g->start++] = 0;
xdf->rchg[g->end++] = 1;
 
@@ -783,10 +782,10 @@
  * into a previous group, expand this group to include it. Return 0 on success
  * or -1 if g cannot be slid up.
  */
-static int group_slide_up(xdfile_t *xdf, struct xdlgroup *g, int64_t flags)
+static int group_slide_up(xdfile_t *xdf, struct xdlgroup *g)
 {
if (g->start > 0 &&
-   recs_match(xdf->recs[g->start - 1], xdf->recs[g->end - 1], flags)) {
+   recs_match(xdf->recs[g->start - 1], xdf->recs[g->end - 1])) {
xdf->rchg[--g->start] = 1;
xdf->rchg[--g->end] = 0;
 
@@ -847,7 +846,7 @@
end_matching_other = -1;
 
/* Shift the group backward as much as possible: */
-   while (!group_slide_up(xdf, , flags))
+   while (!group_slide_up(xdf, ))
if (group_previous(xdfo, ))

D2686: xdiff: add a preprocessing step that trims files

2018-03-09 Thread quark (Jun Wu)
This revision was automatically updated to reflect the committed changes.
Closed by commit rHG665958f30789: xdiff: add a preprocessing step that trims 
files (authored by quark, committed by ).

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D2686?vs=6797=6804

REVISION DETAIL
  https://phab.mercurial-scm.org/D2686

AFFECTED FILES
  mercurial/thirdparty/xdiff/xdiffi.c
  mercurial/thirdparty/xdiff/xprepare.c
  mercurial/thirdparty/xdiff/xtypes.h

CHANGE DETAILS

diff --git a/mercurial/thirdparty/xdiff/xtypes.h 
b/mercurial/thirdparty/xdiff/xtypes.h
--- a/mercurial/thirdparty/xdiff/xtypes.h
+++ b/mercurial/thirdparty/xdiff/xtypes.h
@@ -60,6 +60,10 @@
 
 typedef struct s_xdfenv {
xdfile_t xdf1, xdf2;
+
+   /* number of lines for common prefix and suffix that are removed
+* from xdf1 and xdf2 as a preprocessing step */
+   long nprefix, nsuffix;
 } xdfenv_t;
 
 
diff --git a/mercurial/thirdparty/xdiff/xprepare.c 
b/mercurial/thirdparty/xdiff/xprepare.c
--- a/mercurial/thirdparty/xdiff/xprepare.c
+++ b/mercurial/thirdparty/xdiff/xprepare.c
@@ -156,6 +156,87 @@
 }
 
 
+/*
+ * Trim common prefix from files.
+ *
+ * Note: trimming could affect hunk shifting. But the performance benefit
+ * outweighs the shift change. A diff result with suboptimal shifting is still
+ * valid.
+ */
+static void xdl_trim_files(mmfile_t *mf1, mmfile_t *mf2, long reserved,
+   xdfenv_t *xe, mmfile_t *out_mf1, mmfile_t *out_mf2) {
+   mmfile_t msmall, mlarge;
+   /* prefix lines, prefix bytes, suffix lines, suffix bytes */
+   long plines = 0, pbytes = 0, slines = 0, sbytes = 0, i;
+   /* prefix char pointer for msmall and mlarge */
+   const char *pp1, *pp2;
+   /* suffix char pointer for msmall and mlarge */
+   const char *ps1, *ps2;
+
+   /* reserved must >= 0 for the line boundary adjustment to work */
+   if (reserved < 0)
+   reserved = 0;
+
+   if (mf1->size < mf2->size) {
+   memcpy(, mf1, sizeof(mmfile_t));
+   memcpy(, mf2, sizeof(mmfile_t));
+   } else {
+   memcpy(, mf2, sizeof(mmfile_t));
+   memcpy(, mf1, sizeof(mmfile_t));
+   }
+
+   pp1 = msmall.ptr, pp2 = mlarge.ptr;
+   for (i = 0; i < msmall.size && *pp1 == *pp2; ++i) {
+   plines += (*pp1 == '\n');
+   pp1++, pp2++;
+   }
+
+   ps1 = msmall.ptr + msmall.size - 1, ps2 = mlarge.ptr + mlarge.size - 1;
+   while (ps1 > pp1 && *ps1 == *ps2) {
+   slines += (*ps1 == '\n');
+   ps1--, ps2--;
+   }
+
+   /* Retract common prefix and suffix boundaries for reserved lines */
+   if (plines <= reserved + 1) {
+   plines = 0;
+   } else {
+   i = 0;
+   while (i <= reserved) {
+   pp1--;
+   i += (*pp1 == '\n');
+   }
+   /* The new mmfile starts at the next char just after '\n' */
+   pbytes = pp1 - msmall.ptr + 1;
+   plines -= reserved;
+   }
+
+   if (slines <= reserved + 1) {
+   slines = 0;
+   } else {
+   /* Note: with compiler SIMD support (ex. -O3 -mavx2), this
+* might perform better than memchr. */
+   i = 0;
+   while (i <= reserved) {
+   ps1++;
+   i += (*ps1 == '\n');
+   }
+   /* The new mmfile includes this '\n' */
+   sbytes = msmall.ptr + msmall.size - ps1 - 1;
+   slines -= reserved;
+   if (msmall.ptr[msmall.size - 1] == '\n')
+   slines -= 1;
+   }
+
+   xe->nprefix = plines;
+   xe->nsuffix = slines;
+   out_mf1->ptr = mf1->ptr + pbytes;
+   out_mf1->size = mf1->size - pbytes - sbytes;
+   out_mf2->ptr = mf2->ptr + pbytes;
+   out_mf2->size = mf2->size - pbytes - sbytes;
+}
+
+
 static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, 
xpparam_t const *xpp,
   xdlclassifier_t *cf, xdfile_t *xdf) {
unsigned int hbits;
@@ -254,10 +335,13 @@
xdl_cha_free(>rcha);
 }
 
+/* Reserved lines for trimming, to leave room for shifting */
+#define TRIM_RESERVED_LINES 100
 
 int xdl_prepare_env(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp,
xdfenv_t *xe) {
long enl1, enl2, sample;
+   mmfile_t tmf1, tmf2;
xdlclassifier_t cf;
 
memset(, 0, sizeof(cf));
@@ -270,12 +354,14 @@
if (xdl_init_classifier(, enl1 + enl2 + 1, xpp->flags) < 0)
return -1;
 
-   if (xdl_prepare_ctx(1, mf1, enl1, xpp, , >xdf1) < 0) {
+   xdl_trim_files(mf1, mf2, TRIM_RESERVED_LINES, xe, , );
+
+   if (xdl_prepare_ctx(1, , enl1, xpp, , >xdf1) < 0) {
 
xdl_free_classifier();
return -1;
}
-   if 
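
The trimming idea in this patch can be illustrated with a much smaller
sketch. This is not the patch's code: it leaves out the reserved lines and
the line counting, and the names trim_t and trim_common are made up for the
example. It only shows how whole common lines at both ends can be measured
so that the diff only has to look at the middle.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result type: how many bytes can be cut from each end. */
typedef struct {
    size_t prefix; /* bytes of whole common lines at the start */
    size_t suffix; /* bytes of whole common lines at the end */
} trim_t;

trim_t trim_common(const char *a, size_t alen, const char *b, size_t blen) {
    size_t min = alen < blen ? alen : blen;
    size_t i = 0, j = 0, k = 0;
    trim_t out = {0, 0};

    /* Walk the common prefix; remember where the last full line ended. */
    while (i < min && a[i] == b[i]) {
        i++;
        if (a[i - 1] == '\n')
            out.prefix = i;
    }

    /* Walk the common suffix without overlapping the trimmed prefix. */
    while (j < min - out.prefix && a[alen - 1 - j] == b[blen - 1 - j])
        j++;

    /* Drop the first (possibly partial) line of the common suffix so the
     * trimmed part starts right after a '\n'. */
    while (k < j && a[alen - j + k] != '\n')
        k++;
    if (k < j)
        out.suffix = j - k - 1;

    return out;
}

/* Expected behaviour, written as assertions. */
void trim_example(void) {
    const char *a = "x\ny\nA\nz\nw\n";
    const char *b = "x\ny\nB\nz\nw\n";
    trim_t t = trim_common(a, strlen(a), b, strlen(b));

    assert(t.prefix == 4);                        /* "x\ny\n" */
    assert(t.suffix == 4);                        /* "z\nw\n" */
    assert(strlen(a) - t.prefix - t.suffix == 2); /* "A\n" is left to diff */
}
```

The sketch is deliberately conservative about the suffix: it always gives
up the first line of the common tail, which keeps the cut on a line
boundary in both files without any further checks.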

D2764: xdiff: remove unused xpp and xecfg parameters

2018-03-09 Thread quark (Jun Wu)
This revision was automatically updated to reflect the committed changes.
Closed by commit rHG2e2b48cca761: xdiff: remove unused xpp and xecfg parameters 
(authored by quark, committed by ).

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D2764?vs=6801=6807

REVISION DETAIL
  https://phab.mercurial-scm.org/D2764

AFFECTED FILES
  mercurial/thirdparty/xdiff/xdiffi.c
  mercurial/thirdparty/xdiff/xprepare.c

CHANGE DETAILS

diff --git a/mercurial/thirdparty/xdiff/xprepare.c 
b/mercurial/thirdparty/xdiff/xprepare.c
--- a/mercurial/thirdparty/xdiff/xprepare.c
+++ b/mercurial/thirdparty/xdiff/xprepare.c
@@ -56,7 +56,7 @@
 static void xdl_free_classifier(xdlclassifier_t *cf);
 static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, 
xrecord_t **rhash,
   unsigned int hbits, xrecord_t *rec);
-static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, int64_t narec, 
xpparam_t const *xpp,
+static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, int64_t narec,
   xdlclassifier_t *cf, xdfile_t *xdf);
 static void xdl_free_ctx(xdfile_t *xdf);
 static int xdl_clean_mmatch(char const *dis, int64_t i, int64_t s, int64_t e);
@@ -237,7 +237,7 @@
 }
 
 
-static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, int64_t narec, 
xpparam_t const *xpp,
+static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, int64_t narec,
   xdlclassifier_t *cf, xdfile_t *xdf) {
unsigned int hbits;
int64_t nrec, hsize, bsize;
@@ -356,12 +356,12 @@
 
xdl_trim_files(mf1, mf2, TRIM_RESERVED_LINES, xe, , );
 
-   if (xdl_prepare_ctx(1, , enl1, xpp, , >xdf1) < 0) {
+   if (xdl_prepare_ctx(1, , enl1, , >xdf1) < 0) {
 
xdl_free_classifier();
return -1;
}
-   if (xdl_prepare_ctx(2, , enl2, xpp, , >xdf2) < 0) {
+   if (xdl_prepare_ctx(2, , enl2, , >xdf2) < 0) {
 
xdl_free_ctx(>xdf1);
xdl_free_classifier();
diff --git a/mercurial/thirdparty/xdiff/xdiffi.c 
b/mercurial/thirdparty/xdiff/xdiffi.c
--- a/mercurial/thirdparty/xdiff/xdiffi.c
+++ b/mercurial/thirdparty/xdiff/xdiffi.c
@@ -1012,7 +1012,7 @@
  * inside the differential hunk according to the specified configuration.
  * Also advance xscr if the first changes must be discarded.
  */
-xdchange_t *xdl_get_hunk(xdchange_t **xscr, xdemitconf_t const *xecfg)
+xdchange_t *xdl_get_hunk(xdchange_t **xscr)
 {
xdchange_t *xch, *xchp, *lxch;
int64_t max_common = 0;
@@ -1070,7 +1070,7 @@
if ((xecfg->flags & XDL_EMIT_BDIFFHUNK) != 0) {
int64_t i1 = 0, i2 = 0, n1 = xe->xdf1.nrec, n2 = xe->xdf2.nrec;
for (xch = xscr; xch; xch = xche->next) {
-   xche = xdl_get_hunk(, xecfg);
+   xche = xdl_get_hunk();
if (!xch)
break;
if (xch != xche)
@@ -1089,7 +1089,7 @@
return -1;
} else {
for (xch = xscr; xch; xch = xche->next) {
-   xche = xdl_get_hunk(, xecfg);
+   xche = xdl_get_hunk();
if (!xch)
break;
if (xecfg->hunk_func(xch->i1 + p,



To: quark, #hg-reviewers, indygreg
Cc: mercurial-devel


D2685: xdiff: add comments for fields in xdfile_t

2018-03-09 Thread quark (Jun Wu)
This revision was automatically updated to reflect the committed changes.
Closed by commit rHG58028f6d1fb8: xdiff: add comments for fields in xdfile_t 
(authored by quark, committed by ).

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D2685?vs=6798=6805

REVISION DETAIL
  https://phab.mercurial-scm.org/D2685

AFFECTED FILES
  mercurial/thirdparty/xdiff/xtypes.h

CHANGE DETAILS

diff --git a/mercurial/thirdparty/xdiff/xtypes.h 
b/mercurial/thirdparty/xdiff/xtypes.h
--- a/mercurial/thirdparty/xdiff/xtypes.h
+++ b/mercurial/thirdparty/xdiff/xtypes.h
@@ -46,15 +46,49 @@
 } xrecord_t;
 
 typedef struct s_xdfile {
+   /* manual memory management */
chastore_t rcha;
+
+   /* number of records (lines) */
long nrec;
+
+   /* hash table size
+* the maximum hash value in the table is (1 << hbits) */
unsigned int hbits;
+
+   /* hash table, hash value => xrecord_t
+* note: xrecord_t is a linked list. */
xrecord_t **rhash;
+
+   /* range excluding common prefix and suffix
+* [recs[i] for i in range(0, dstart)] are common prefix.
+* [recs[i] for i in range(dstart, dend + 1 - dstart)] are interesting
+* lines */
long dstart, dend;
+
+   /* pointer to records (lines) */
xrecord_t **recs;
+
+   /* record changed, use original "recs" index
+* rchag[i] can be either 0 or 1. 1 means recs[i] (line i) is marked
+* "changed". */
char *rchg;
+
+   /* cleaned-up record index => original "recs" index
+* clean-up means:
+*  rule 1. remove common prefix and suffix
+*  rule 2. remove records that are only on one side, since they can
+*  not match the other side
+* rindex[0] is likely dstart, if not removed up by rule 2.
+* rindex[nreff - 1] is likely dend, if not removed by rule 2.
+*/
long *rindex;
+
+   /* rindex size */
long nreff;
+
+   /* cleaned-up record index => hash value
+* ha[i] = recs[rindex[i]]->ha */
unsigned long *ha;
 } xdfile_t;
 



To: quark, #hg-reviewers, indygreg
Cc: indygreg, mercurial-devel


D2763: xdiff: remove unused flags parameter

2018-03-09 Thread indygreg (Gregory Szorc)
indygreg accepted this revision.
indygreg added a comment.
This revision is now accepted and ready to land.


  Strictly speaking, we might want to bump the C extension version number 
because of the ABI change. But I'm not sure if these functions are exported or 
even used by our C extension. So it may not matter. It's an experimental 
feature right now, so I'm inclined to not care.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2763

To: quark, #hg-reviewers, indygreg
Cc: indygreg, mercurial-devel


D2685: xdiff: add comments for fields in xdfile_t

2018-03-09 Thread indygreg (Gregory Szorc)
indygreg accepted this revision.
indygreg added a comment.
This revision is now accepted and ready to land.


  So. Much. Better.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2685

To: quark, #hg-reviewers, indygreg
Cc: indygreg, mercurial-devel


D2686: xdiff: add a preprocessing step that trims files

2018-03-09 Thread indygreg (Gregory Szorc)
indygreg accepted this revision.
indygreg added a comment.
This revision is now accepted and ready to land.


  I almost accepted the last version and this one is mostly cosmetic changes. 
So LGTM!
  
  Your work here is very much appreciated. Thank you for doing the thorough 
performance analysis.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2686

To: quark, #hg-reviewers, indygreg
Cc: indygreg, mercurial-devel


D2743: wireprotoserver: access headers through parsed request

2018-03-09 Thread indygreg (Gregory Szorc)
indygreg added inline comments.

INLINE COMMENTS

> durin42 wrote in wireprotoserver.py:97
> Missed one?

No. This preserves the behavior since HTTP_CONTENT_LENGTH != CONTENT_LENGTH. I 
fix this in a later commit by teaching the request object about both keys.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2743

To: indygreg, #hg-reviewers
Cc: durin42, mercurial-devel


D2764: xdiff: remove unused xpp and xecfg parameters

2018-03-09 Thread quark (Jun Wu)
quark created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  They are unused. Thus removed.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2764

AFFECTED FILES
  mercurial/thirdparty/xdiff/xdiffi.c
  mercurial/thirdparty/xdiff/xprepare.c

CHANGE DETAILS

diff --git a/mercurial/thirdparty/xdiff/xprepare.c 
b/mercurial/thirdparty/xdiff/xprepare.c
--- a/mercurial/thirdparty/xdiff/xprepare.c
+++ b/mercurial/thirdparty/xdiff/xprepare.c
@@ -56,7 +56,7 @@
 static void xdl_free_classifier(xdlclassifier_t *cf);
 static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, 
xrecord_t **rhash,
   unsigned int hbits, xrecord_t *rec);
-static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, int64_t narec, 
xpparam_t const *xpp,
+static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, int64_t narec,
   xdlclassifier_t *cf, xdfile_t *xdf);
 static void xdl_free_ctx(xdfile_t *xdf);
 static int xdl_clean_mmatch(char const *dis, int64_t i, int64_t s, int64_t e);
@@ -237,7 +237,7 @@
 }
 
 
-static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, int64_t narec, 
xpparam_t const *xpp,
+static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, int64_t narec,
   xdlclassifier_t *cf, xdfile_t *xdf) {
unsigned int hbits;
int64_t nrec, hsize, bsize;
@@ -356,12 +356,12 @@
 
xdl_trim_files(mf1, mf2, TRIM_RESERVED_LINES, xe, , );
 
-   if (xdl_prepare_ctx(1, , enl1, xpp, , >xdf1) < 0) {
+   if (xdl_prepare_ctx(1, , enl1, , >xdf1) < 0) {
 
xdl_free_classifier();
return -1;
}
-   if (xdl_prepare_ctx(2, , enl2, xpp, , >xdf2) < 0) {
+   if (xdl_prepare_ctx(2, , enl2, , >xdf2) < 0) {
 
xdl_free_ctx(>xdf1);
xdl_free_classifier();
diff --git a/mercurial/thirdparty/xdiff/xdiffi.c 
b/mercurial/thirdparty/xdiff/xdiffi.c
--- a/mercurial/thirdparty/xdiff/xdiffi.c
+++ b/mercurial/thirdparty/xdiff/xdiffi.c
@@ -1012,7 +1012,7 @@
  * inside the differential hunk according to the specified configuration.
  * Also advance xscr if the first changes must be discarded.
  */
-xdchange_t *xdl_get_hunk(xdchange_t **xscr, xdemitconf_t const *xecfg)
+xdchange_t *xdl_get_hunk(xdchange_t **xscr)
 {
xdchange_t *xch, *xchp, *lxch;
int64_t max_common = 0;
@@ -1070,7 +1070,7 @@
if ((xecfg->flags & XDL_EMIT_BDIFFHUNK) != 0) {
int64_t i1 = 0, i2 = 0, n1 = xe->xdf1.nrec, n2 = xe->xdf2.nrec;
for (xch = xscr; xch; xch = xche->next) {
-   xche = xdl_get_hunk(, xecfg);
+   xche = xdl_get_hunk();
if (!xch)
break;
if (xch != xche)
@@ -1089,7 +1089,7 @@
return -1;
} else {
for (xch = xscr; xch; xch = xche->next) {
-   xche = xdl_get_hunk(, xecfg);
+   xche = xdl_get_hunk();
if (!xch)
break;
if (xecfg->hunk_func(xch->i1 + p,



To: quark, #hg-reviewers
Cc: mercurial-devel


D2765: xdiff: use int64 for hash table size

2018-03-09 Thread quark (Jun Wu)
quark created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  Follow-up of the previous "long" -> "int64" change. Now xdiff only uses int
  for return values and small integers (ex. booleans, shifting score, bits in
  hash table size, etc) so it should be able to handle large input.
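
A small sketch of why the cast at the call sites mattered. It assumes a
32-bit "unsigned int"; hashbits64 follows the patched loop, while
hashbits_after_cast is a hypothetical helper that emulates the old call
sites, which cast the size to "unsigned int" before the call.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Follows the patched loop: the smallest bit count whose power of two
 * reaches size, capped at the width of unsigned int, and never below 1. */
unsigned int hashbits64(int64_t size) {
    int64_t val = 1;
    unsigned int bits = 0;
    const unsigned int max_bits = CHAR_BIT * sizeof(unsigned int);

    while (val < size && bits < max_bits) {
        val <<= 1;
        bits++;
    }
    return bits ? bits : 1;
}

/* Emulates the old callers: the size is truncated before the call. */
unsigned int hashbits_after_cast(int64_t size) {
    return hashbits64((int64_t) (unsigned int) size);
}

/* Expected behaviour, written as assertions (32-bit unsigned int assumed). */
void hashbits_example(void) {
    int64_t big = (INT64_C(1) << 32) + 1; /* just over 4G records */

    assert(CHAR_BIT * sizeof(unsigned int) == 32);
    assert(hashbits64(1) == 1);
    assert(hashbits64(1000) == 10);       /* 2^10 = 1024 >= 1000 */
    assert(hashbits64(big) == 32);        /* capped at the int width */
    assert(hashbits_after_cast(big) == 1); /* truncated to size 1 */
}
```

With the cast, an input slightly larger than 2^32 records would get the
smallest possible table. That stays correct but makes every bucket chain
very long.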

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2765

AFFECTED FILES
  mercurial/thirdparty/xdiff/xprepare.c
  mercurial/thirdparty/xdiff/xutils.c
  mercurial/thirdparty/xdiff/xutils.h

CHANGE DETAILS

diff --git a/mercurial/thirdparty/xdiff/xutils.h 
b/mercurial/thirdparty/xdiff/xutils.h
--- a/mercurial/thirdparty/xdiff/xutils.h
+++ b/mercurial/thirdparty/xdiff/xutils.h
@@ -32,7 +32,7 @@
 int64_t xdl_guess_lines(mmfile_t *mf, int64_t sample);
 int xdl_recmatch(const char *l1, int64_t s1, const char *l2, int64_t s2);
 uint64_t xdl_hash_record(char const **data, char const *top);
-unsigned int xdl_hashbits(unsigned int size);
+unsigned int xdl_hashbits(int64_t size);
 
 
 
diff --git a/mercurial/thirdparty/xdiff/xutils.c 
b/mercurial/thirdparty/xdiff/xutils.c
--- a/mercurial/thirdparty/xdiff/xutils.c
+++ b/mercurial/thirdparty/xdiff/xutils.c
@@ -141,9 +141,10 @@
return ha;
 }
 
-unsigned int xdl_hashbits(unsigned int size) {
-   unsigned int val = 1, bits = 0;
+unsigned int xdl_hashbits(int64_t size) {
+   int64_t val = 1;
+   unsigned int bits = 0;
 
-   for (; val < size && bits < CHAR_BIT * sizeof(unsigned int); val <<= 1, 
bits++);
+   for (; val < size && bits < (int64_t) CHAR_BIT * sizeof(unsigned int); 
val <<= 1, bits++);
return bits ? bits: 1;
 }
diff --git a/mercurial/thirdparty/xdiff/xprepare.c 
b/mercurial/thirdparty/xdiff/xprepare.c
--- a/mercurial/thirdparty/xdiff/xprepare.c
+++ b/mercurial/thirdparty/xdiff/xprepare.c
@@ -70,7 +70,7 @@
 static int xdl_init_classifier(xdlclassifier_t *cf, int64_t size, int64_t 
flags) {
cf->flags = flags;
 
-   cf->hbits = xdl_hashbits((unsigned int) size);
+   cf->hbits = xdl_hashbits(size);
cf->hsize = 1 << cf->hbits;
 
if (xdl_cha_init(>ncha, sizeof(xdlclass_t), size / 4 + 1) < 0) {
@@ -262,7 +262,7 @@
goto abort;
 
{
-   hbits = xdl_hashbits((unsigned int) narec);
+   hbits = xdl_hashbits(narec);
hsize = 1 << hbits;
if (!(rhash = (xrecord_t **) xdl_malloc(hsize * 
sizeof(xrecord_t *
goto abort;



To: quark, #hg-reviewers
Cc: mercurial-devel


D2762: xdiff: replace {unsigned ,}long with {u,}int64_t

2018-03-09 Thread quark (Jun Wu)
quark created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  MSVC treats "long" as 4-byte. That could cause overflows since the xdiff
  code uses "long" in places where "size_t" or "ssize_t" should be used.
  Let's use explicit 8-byte integers to avoid those overflows.
  
  FWIW git avoids that overflow by limiting diff size to 1GB [1]. After
  examining the code, I think the remaining risk (the use of "int") is low
  since "int" is only used for return values and hash table size. A wrong
  hash table size would not affect the correctness of the code, but it
  could make the code extremely slow. The next patch will change the hash
  table size to an 8-byte integer, so the 1GB limit is unlikely to be needed.
  
  This patch was done by using `sed`.
  
  [1]: https://github.com/git/git/commit/dcd1742e56ebb944c4ff62346da4548e1e3be67
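
To make the overflow concrete, here is a minimal sketch that is not part
of the patch. It emulates a 4-byte integer with uint32_t so it behaves the
same on every platform; keep_low_32_bits is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the value a 4-byte integer would retain if a
 * 64-bit byte count were stored in it. The unsigned conversion is well
 * defined in C: it reduces the value modulo 2^32. */
uint32_t keep_low_32_bits(int64_t value) {
    return (uint32_t) value;
}

/* Expected behaviour, written as assertions. */
void long_width_example(void) {
    /* a 5 GiB input, held exactly in a 64-bit integer */
    int64_t size = (int64_t) 5 * 1024 * 1024 * 1024;

    assert(size == INT64_C(5368709120));
    /* a 4-byte integer silently sees only 1 GiB of it */
    assert(keep_low_32_bits(size) == UINT32_C(1073741824));
    /* sizes below 4 GiB survive unchanged */
    assert(keep_low_32_bits(1000) == 1000);
}
```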

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2762

AFFECTED FILES
  mercurial/thirdparty/xdiff/xdiff.h
  mercurial/thirdparty/xdiff/xdiffi.c
  mercurial/thirdparty/xdiff/xdiffi.h
  mercurial/thirdparty/xdiff/xinclude.h
  mercurial/thirdparty/xdiff/xprepare.c
  mercurial/thirdparty/xdiff/xtypes.h
  mercurial/thirdparty/xdiff/xutils.c
  mercurial/thirdparty/xdiff/xutils.h

CHANGE DETAILS

diff --git a/mercurial/thirdparty/xdiff/xutils.h 
b/mercurial/thirdparty/xdiff/xutils.h
--- a/mercurial/thirdparty/xdiff/xutils.h
+++ b/mercurial/thirdparty/xdiff/xutils.h
@@ -25,13 +25,13 @@
 
 
 
-long xdl_bogosqrt(long n);
-int xdl_cha_init(chastore_t *cha, long isize, long icount);
+int64_t xdl_bogosqrt(int64_t n);
+int xdl_cha_init(chastore_t *cha, int64_t isize, int64_t icount);
 void xdl_cha_free(chastore_t *cha);
 void *xdl_cha_alloc(chastore_t *cha);
-long xdl_guess_lines(mmfile_t *mf, long sample);
-int xdl_recmatch(const char *l1, long s1, const char *l2, long s2, long flags);
-unsigned long xdl_hash_record(char const **data, char const *top, long flags);
+int64_t xdl_guess_lines(mmfile_t *mf, int64_t sample);
+int xdl_recmatch(const char *l1, int64_t s1, const char *l2, int64_t s2, 
int64_t flags);
+uint64_t xdl_hash_record(char const **data, char const *top, int64_t flags);
 unsigned int xdl_hashbits(unsigned int size);
 
 
diff --git a/mercurial/thirdparty/xdiff/xutils.c 
b/mercurial/thirdparty/xdiff/xutils.c
--- a/mercurial/thirdparty/xdiff/xutils.c
+++ b/mercurial/thirdparty/xdiff/xutils.c
@@ -27,8 +27,8 @@
 
 
 
-long xdl_bogosqrt(long n) {
-   long i;
+int64_t xdl_bogosqrt(int64_t n) {
+   int64_t i;
 
/*
 * Classical integer square root approximation using shifts.
@@ -40,20 +40,20 @@
 }
 
 
-void *xdl_mmfile_first(mmfile_t *mmf, long *size)
+void *xdl_mmfile_first(mmfile_t *mmf, int64_t *size)
 {
*size = mmf->size;
return mmf->ptr;
 }
 
 
-long xdl_mmfile_size(mmfile_t *mmf)
+int64_t xdl_mmfile_size(mmfile_t *mmf)
 {
return mmf->size;
 }
 
 
-int xdl_cha_init(chastore_t *cha, long isize, long icount) {
+int xdl_cha_init(chastore_t *cha, int64_t isize, int64_t icount) {
 
cha->head = cha->tail = NULL;
cha->isize = isize;
@@ -100,8 +100,8 @@
return data;
 }
 
-long xdl_guess_lines(mmfile_t *mf, long sample) {
-   long nl = 0, size, tsize = 0;
+int64_t xdl_guess_lines(mmfile_t *mf, int64_t sample) {
+   int64_t nl = 0, size, tsize = 0;
char const *data, *cur, *top;
 
if ((cur = data = xdl_mmfile_first(mf, )) != NULL) {
@@ -121,15 +121,15 @@
return nl + 1;
 }
 
-int xdl_recmatch(const char *l1, long s1, const char *l2, long s2, long flags)
+int xdl_recmatch(const char *l1, int64_t s1, const char *l2, int64_t s2, 
int64_t flags)
 {
if (s1 == s2 && !memcmp(l1, l2, s1))
return 1;
return 0;
 }
 
-unsigned long xdl_hash_record(char const **data, char const *top, long flags) {
-   unsigned long ha = 5381;
+uint64_t xdl_hash_record(char const **data, char const *top, int64_t flags) {
+   uint64_t ha = 5381;
char const *ptr = *data;
 
for (; ptr < top && *ptr != '\n'; ptr++) {
diff --git a/mercurial/thirdparty/xdiff/xtypes.h b/mercurial/thirdparty/xdiff/xtypes.h
--- a/mercurial/thirdparty/xdiff/xtypes.h
+++ b/mercurial/thirdparty/xdiff/xtypes.h
@@ -27,30 +27,30 @@
 
 typedef struct s_chanode {
struct s_chanode *next;
-   long icurr;
+   int64_t icurr;
 } chanode_t;
 
 typedef struct s_chastore {
chanode_t *head, *tail;
-   long isize, nsize;
+   int64_t isize, nsize;
chanode_t *ancur;
chanode_t *sncur;
-   long scurr;
+   int64_t scurr;
 } chastore_t;
 
 typedef struct s_xrecord {
struct s_xrecord *next;
char const *ptr;
-   long size;
-   unsigned long ha;
+   int64_t size;
+   uint64_t ha;
 } xrecord_t;
 
 typedef struct s_xdfile {
/* manual memory management */
chastore_t rcha;
 
/* number of records (lines) */
-   long nrec;
+  

D2763: xdiff: remove unused flags parameter

2018-03-09 Thread quark (Jun Wu)
quark created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  After https://phab.mercurial-scm.org/D2683, the flags parameter in some
  functions is no longer needed, so remove it.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2763

AFFECTED FILES
  mercurial/thirdparty/xdiff/xdiffi.c
  mercurial/thirdparty/xdiff/xprepare.c
  mercurial/thirdparty/xdiff/xutils.c
  mercurial/thirdparty/xdiff/xutils.h

CHANGE DETAILS

diff --git a/mercurial/thirdparty/xdiff/xutils.h b/mercurial/thirdparty/xdiff/xutils.h
--- a/mercurial/thirdparty/xdiff/xutils.h
+++ b/mercurial/thirdparty/xdiff/xutils.h
@@ -30,8 +30,8 @@
 void xdl_cha_free(chastore_t *cha);
 void *xdl_cha_alloc(chastore_t *cha);
 int64_t xdl_guess_lines(mmfile_t *mf, int64_t sample);
-int xdl_recmatch(const char *l1, int64_t s1, const char *l2, int64_t s2, int64_t flags);
-uint64_t xdl_hash_record(char const **data, char const *top, int64_t flags);
+int xdl_recmatch(const char *l1, int64_t s1, const char *l2, int64_t s2);
+uint64_t xdl_hash_record(char const **data, char const *top);
 unsigned int xdl_hashbits(unsigned int size);
 
 
diff --git a/mercurial/thirdparty/xdiff/xutils.c b/mercurial/thirdparty/xdiff/xutils.c
--- a/mercurial/thirdparty/xdiff/xutils.c
+++ b/mercurial/thirdparty/xdiff/xutils.c
@@ -121,14 +121,14 @@
return nl + 1;
 }
 
-int xdl_recmatch(const char *l1, int64_t s1, const char *l2, int64_t s2, int64_t flags)
+int xdl_recmatch(const char *l1, int64_t s1, const char *l2, int64_t s2)
 {
if (s1 == s2 && !memcmp(l1, l2, s1))
return 1;
return 0;
 }
 
-uint64_t xdl_hash_record(char const **data, char const *top, int64_t flags) {
+uint64_t xdl_hash_record(char const **data, char const *top) {
uint64_t ha = 5381;
char const *ptr = *data;
 
diff --git a/mercurial/thirdparty/xdiff/xprepare.c b/mercurial/thirdparty/xdiff/xprepare.c
--- a/mercurial/thirdparty/xdiff/xprepare.c
+++ b/mercurial/thirdparty/xdiff/xprepare.c
@@ -118,7 +118,7 @@
for (rcrec = cf->rchash[hi]; rcrec; rcrec = rcrec->next)
if (rcrec->ha == rec->ha &&
xdl_recmatch(rcrec->line, rcrec->size,
-   rec->ptr, rec->size, cf->flags))
+   rec->ptr, rec->size))
break;
 
if (!rcrec) {
@@ -273,7 +273,7 @@
if ((cur = blk = xdl_mmfile_first(mf, &bsize)) != NULL) {
for (top = blk + bsize; cur < top; ) {
prev = cur;
-   hav = xdl_hash_record(&cur, top, xpp->flags);
+   hav = xdl_hash_record(&cur, top);
if (nrec >= narec) {
narec *= 2;
if (!(rrecs = (xrecord_t **) xdl_realloc(recs, 
narec * sizeof(xrecord_t *
diff --git a/mercurial/thirdparty/xdiff/xdiffi.c b/mercurial/thirdparty/xdiff/xdiffi.c
--- a/mercurial/thirdparty/xdiff/xdiffi.c
+++ b/mercurial/thirdparty/xdiff/xdiffi.c
@@ -398,12 +398,11 @@
 }
 
 
-static int recs_match(xrecord_t *rec1, xrecord_t *rec2, int64_t flags)
+static int recs_match(xrecord_t *rec1, xrecord_t *rec2)
 {
return (rec1->ha == rec2->ha &&
xdl_recmatch(rec1->ptr, rec1->size,
-rec2->ptr, rec2->size,
-flags));
+rec2->ptr, rec2->size));
 }
 
 /*
@@ -762,10 +761,10 @@
  * following group, expand this group to include it. Return 0 on success or -1
  * if g cannot be slid down.
  */
-static int group_slide_down(xdfile_t *xdf, struct xdlgroup *g, int64_t flags)
+static int group_slide_down(xdfile_t *xdf, struct xdlgroup *g)
 {
if (g->end < xdf->nrec &&
-   recs_match(xdf->recs[g->start], xdf->recs[g->end], flags)) {
+   recs_match(xdf->recs[g->start], xdf->recs[g->end])) {
xdf->rchg[g->start++] = 0;
xdf->rchg[g->end++] = 1;
 
@@ -783,10 +782,10 @@
  * into a previous group, expand this group to include it. Return 0 on success
  * or -1 if g cannot be slid up.
  */
-static int group_slide_up(xdfile_t *xdf, struct xdlgroup *g, int64_t flags)
+static int group_slide_up(xdfile_t *xdf, struct xdlgroup *g)
 {
if (g->start > 0 &&
-   recs_match(xdf->recs[g->start - 1], xdf->recs[g->end - 1], flags)) {
+   recs_match(xdf->recs[g->start - 1], xdf->recs[g->end - 1])) {
xdf->rchg[--g->start] = 1;
xdf->rchg[--g->end] = 0;
 
@@ -847,7 +846,7 @@
end_matching_other = -1;
 
/* Shift the group backward as much as possible: */
-   while (!group_slide_up(xdf, &g, flags))
+   while (!group_slide_up(xdf, &g))
if (group_previous(xdfo, &go))

D2766: xdiff: resolve signed unsigned comparison warning

2018-03-09 Thread quark (Jun Wu)
quark created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  Since the value won't be changed inside the code (because the context lines
  feature was removed by https://phab.mercurial-scm.org/D2705), let's just
  remove the variable and inline the 0 value.
  
  The code might be potentially further simplified. But I'd like to make sure
  correctness is easily verifiable in this patch.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2766

AFFECTED FILES
  mercurial/thirdparty/xdiff/xdiffi.c

CHANGE DETAILS

diff --git a/mercurial/thirdparty/xdiff/xdiffi.c b/mercurial/thirdparty/xdiff/xdiffi.c
--- a/mercurial/thirdparty/xdiff/xdiffi.c
+++ b/mercurial/thirdparty/xdiff/xdiffi.c
@@ -1015,16 +1015,14 @@
 xdchange_t *xdl_get_hunk(xdchange_t **xscr)
 {
xdchange_t *xch, *xchp, *lxch;
-   int64_t max_common = 0;
-   int64_t max_ignorable = 0;
uint64_t ignored = 0; /* number of ignored blank lines */
 
/* remove ignorable changes that are too far before other changes */
for (xchp = *xscr; xchp && xchp->ignore; xchp = xchp->next) {
xch = xchp->next;
 
if (xch == NULL ||
-   xch->i1 - (xchp->i1 + xchp->chg1) >= max_ignorable)
+   xch->i1 - (xchp->i1 + xchp->chg1) >= 0)
*xscr = xch;
}
 
@@ -1035,16 +1033,16 @@
 
for (xchp = *xscr, xch = xchp->next; xch; xchp = xch, xch = xch->next) {
int64_t distance = xch->i1 - (xchp->i1 + xchp->chg1);
-   if (distance > max_common)
+   if (distance > 0)
break;
 
-   if (distance < max_ignorable && (!xch->ignore || lxch == xchp)) {
+   if (distance < 0 && (!xch->ignore || lxch == xchp)) {
lxch = xch;
ignored = 0;
-   } else if (distance < max_ignorable && xch->ignore) {
+   } else if (distance < 0 && xch->ignore) {
ignored += xch->chg2;
} else if (lxch != xchp &&
-  xch->i1 + ignored - (lxch->i1 + lxch->chg1) > max_common) {
+  xch->i1 + ignored - (lxch->i1 + lxch->chg1) > 0) {
break;
} else if (!xch->ignore) {
lxch = xch;



To: quark, #hg-reviewers
Cc: mercurial-devel


D2686: xdiff: add a preprocessing step that trims files

2018-03-09 Thread quark (Jun Wu)
quark updated this revision to Diff 6797.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D2686?vs=6708&id=6797

REVISION DETAIL
  https://phab.mercurial-scm.org/D2686

AFFECTED FILES
  mercurial/thirdparty/xdiff/xdiffi.c
  mercurial/thirdparty/xdiff/xprepare.c
  mercurial/thirdparty/xdiff/xtypes.h

CHANGE DETAILS

diff --git a/mercurial/thirdparty/xdiff/xtypes.h b/mercurial/thirdparty/xdiff/xtypes.h
--- a/mercurial/thirdparty/xdiff/xtypes.h
+++ b/mercurial/thirdparty/xdiff/xtypes.h
@@ -60,6 +60,10 @@
 
 typedef struct s_xdfenv {
xdfile_t xdf1, xdf2;
+
+   /* number of lines for common prefix and suffix that are removed
+* from xdf1 and xdf2 as a preprocessing step */
+   long nprefix, nsuffix;
 } xdfenv_t;
 
 
diff --git a/mercurial/thirdparty/xdiff/xprepare.c b/mercurial/thirdparty/xdiff/xprepare.c
--- a/mercurial/thirdparty/xdiff/xprepare.c
+++ b/mercurial/thirdparty/xdiff/xprepare.c
@@ -156,6 +156,87 @@
 }
 
 
+/*
+ * Trim common prefix from files.
+ *
+ * Note: trimming could affect hunk shifting. But the performance benefit
+ * outweighs the shift change. A diff result with suboptimal shifting is still
+ * valid.
+ */
+static void xdl_trim_files(mmfile_t *mf1, mmfile_t *mf2, long reserved,
+   xdfenv_t *xe, mmfile_t *out_mf1, mmfile_t *out_mf2) {
+   mmfile_t msmall, mlarge;
+   /* prefix lines, prefix bytes, suffix lines, suffix bytes */
+   long plines = 0, pbytes = 0, slines = 0, sbytes = 0, i;
+   /* prefix char pointer for msmall and mlarge */
+   const char *pp1, *pp2;
+   /* suffix char pointer for msmall and mlarge */
+   const char *ps1, *ps2;
+
+   /* reserved must be >= 0 for the line boundary adjustment to work */
+   if (reserved < 0)
+   reserved = 0;
+
+   if (mf1->size < mf2->size) {
+   memcpy(&msmall, mf1, sizeof(mmfile_t));
+   memcpy(&mlarge, mf2, sizeof(mmfile_t));
+   } else {
+   memcpy(&msmall, mf2, sizeof(mmfile_t));
+   memcpy(&mlarge, mf1, sizeof(mmfile_t));
+   }
+
+   pp1 = msmall.ptr, pp2 = mlarge.ptr;
+   for (i = 0; i < msmall.size && *pp1 == *pp2; ++i) {
+   plines += (*pp1 == '\n');
+   pp1++, pp2++;
+   }
+
+   ps1 = msmall.ptr + msmall.size - 1, ps2 = mlarge.ptr + mlarge.size - 1;
+   while (ps1 > pp1 && *ps1 == *ps2) {
+   slines += (*ps1 == '\n');
+   ps1--, ps2--;
+   }
+
+   /* Retract common prefix and suffix boundaries for reserved lines */
+   if (plines <= reserved + 1) {
+   plines = 0;
+   } else {
+   i = 0;
+   while (i <= reserved) {
+   pp1--;
+   i += (*pp1 == '\n');
+   }
+   /* The new mmfile starts at the next char just after '\n' */
+   pbytes = pp1 - msmall.ptr + 1;
+   plines -= reserved;
+   }
+
+   if (slines <= reserved + 1) {
+   slines = 0;
+   } else {
+   /* Note: with compiler SIMD support (ex. -O3 -mavx2), this
+* might perform better than memchr. */
+   i = 0;
+   while (i <= reserved) {
+   ps1++;
+   i += (*ps1 == '\n');
+   }
+   /* The new mmfile includes this '\n' */
+   sbytes = msmall.ptr + msmall.size - ps1 - 1;
+   slines -= reserved;
+   if (msmall.ptr[msmall.size - 1] == '\n')
+   slines -= 1;
+   }
+
+   xe->nprefix = plines;
+   xe->nsuffix = slines;
+   out_mf1->ptr = mf1->ptr + pbytes;
+   out_mf1->size = mf1->size - pbytes - sbytes;
+   out_mf2->ptr = mf2->ptr + pbytes;
+   out_mf2->size = mf2->size - pbytes - sbytes;
+}
+
+
 static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_t const *xpp,
   xdlclassifier_t *cf, xdfile_t *xdf) {
unsigned int hbits;
@@ -254,10 +335,13 @@
xdl_cha_free(&xdf->rcha);
 }
 
+/* Reserved lines for trimming, to leave room for shifting */
+#define TRIM_RESERVED_LINES 100
 
 int xdl_prepare_env(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp,
xdfenv_t *xe) {
long enl1, enl2, sample;
+   mmfile_t tmf1, tmf2;
xdlclassifier_t cf;
 
memset(&cf, 0, sizeof(cf));
@@ -270,12 +354,14 @@
if (xdl_init_classifier(&cf, enl1 + enl2 + 1, xpp->flags) < 0)
return -1;
 
-   if (xdl_prepare_ctx(1, mf1, enl1, xpp, &cf, &xe->xdf1) < 0) {
+   xdl_trim_files(mf1, mf2, TRIM_RESERVED_LINES, xe, &tmf1, &tmf2);
+
+   if (xdl_prepare_ctx(1, &tmf1, enl1, xpp, &cf, &xe->xdf1) < 0) {
 
xdl_free_classifier(&cf);
return -1;
}
-   if (xdl_prepare_ctx(2, mf2, enl2, xpp, &cf, &xe->xdf2) < 0) {
+   if (xdl_prepare_ctx(2, &tmf2, enl2, xpp, &cf, &xe->xdf2) < 0) {
 
xdl_free_ctx(&xe->xdf1);
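
For readers following the trimming idea in xdl_trim_files above, here is a
rough Python sketch of the same preprocessing step: find the common prefix and
suffix of two byte strings, pull both boundaries back to line starts, and keep
some "reserved" lines on each side so hunk shifting still has room. This is
not a port of the C code. The names trim_common and _nth_newline_end are made
up for illustration, and the boundary arithmetic is deliberately simpler than
what the patch does.

```python
def _nth_newline_end(data, n):
    """Return the index just past the n-th b'\\n' in data, or -1 if absent.

    For n <= 0 this returns 0 (nothing is skipped).
    """
    pos = -1
    for _ in range(n):
        pos = data.find(b'\n', pos + 1)
        if pos < 0:
            return -1
    return pos + 1


def trim_common(a, b, reserved=0):
    """Return (prefix_bytes, suffix_bytes) that can be dropped from both inputs.

    Only whole lines are dropped, and `reserved` common lines are kept next to
    the differing region on each side.
    """
    limit = min(len(a), len(b))

    # Length of the common byte prefix.
    p = 0
    while p < limit and a[p] == b[p]:
        p += 1

    # Length of the common byte suffix, never overlapping the prefix.
    s = 0
    while s < limit - p and a[len(a) - 1 - s] == b[len(b) - 1 - s]:
        s += 1

    # Prefix: keep only complete lines, minus the reserved ones.
    prefix = a[:p]
    keep = prefix.count(b'\n') - reserved
    pbytes = _nth_newline_end(prefix, keep)

    # Suffix: the first newline ends a line that may differ further left,
    # so skip it, then skip `reserved` more lines.
    suffix = a[len(a) - s:]
    cut = _nth_newline_end(suffix, reserved + 1)
    sbytes = 0 if cut < 0 else len(suffix) - cut

    return pbytes, sbytes
```

With a = b"a\nb\nc\nX\nd\ne\n", b = b"a\nb\nc\nY\nd\ne\n" and reserved=1, the
sketch drops the first two lines and the last line from both inputs, leaving
"c", the differing line, and "d" to be diffed.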

D2685: xdiff: add comments for fields in xdfile_t

2018-03-09 Thread quark (Jun Wu)
quark updated this revision to Diff 6798.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D2685?vs=6643&id=6798

REVISION DETAIL
  https://phab.mercurial-scm.org/D2685

AFFECTED FILES
  mercurial/thirdparty/xdiff/xtypes.h

CHANGE DETAILS

diff --git a/mercurial/thirdparty/xdiff/xtypes.h b/mercurial/thirdparty/xdiff/xtypes.h
--- a/mercurial/thirdparty/xdiff/xtypes.h
+++ b/mercurial/thirdparty/xdiff/xtypes.h
@@ -46,15 +46,49 @@
 } xrecord_t;
 
 typedef struct s_xdfile {
+   /* manual memory management */
chastore_t rcha;
+
+   /* number of records (lines) */
long nrec;
+
+   /* hash table size
+* the maximum hash value in the table is (1 << hbits) */
unsigned int hbits;
+
+   /* hash table, hash value => xrecord_t
+* note: xrecord_t is a linked list. */
xrecord_t **rhash;
+
+   /* range excluding common prefix and suffix
+* [recs[i] for i in range(0, dstart)] are common prefix.
+* [recs[i] for i in range(dstart, dend + 1 - dstart)] are interesting
+* lines */
long dstart, dend;
+
+   /* pointer to records (lines) */
xrecord_t **recs;
+
+   /* record changed, use original "recs" index
+* rchg[i] can be either 0 or 1. 1 means recs[i] (line i) is marked
+* "changed". */
char *rchg;
+
+   /* cleaned-up record index => original "recs" index
+* clean-up means:
+*  rule 1. remove common prefix and suffix
+*  rule 2. remove records that are only on one side, since they can
+*  not match the other side
+* rindex[0] is likely dstart, if not removed by rule 2.
+* rindex[nreff - 1] is likely dend, if not removed by rule 2.
+*/
long *rindex;
+
+   /* rindex size */
long nreff;
+
+   /* cleaned-up record index => hash value
+* ha[i] = recs[rindex[i]]->ha */
unsigned long *ha;
 } xdfile_t;
 



To: quark, #hg-reviewers
Cc: mercurial-devel


D2720: debugcommands: introduce actions to perform deterministic reads

2018-03-09 Thread mharbison72 (Matt Harbison)
mharbison72 added a comment.


  
  
  > IDK if this means anything, but when it is stuck and I hit Ctrl+C, instead
  > of terminating the test, the test simply continues with the script, and
  > then hangs on the next 'remote output:' line.  Typically, Ctrl+C ends the
  > test runner.
  
  A couple more tidbits on this:
  
  1. When I hit Ctrl+C, it *does* print the shell prompt, but the process
     keeps printing test output after it, so I missed it.  When it jams, I can
     hit Ctrl+C again to get it going again.
  2. I did that all the way through the test, and eventually it got to the
     last one, which prints:
  
i> write(2) -> 2:
i> 0\n
i> flush() -> None
o> readline() -> 2:
o> 0\n
o> readline() -> 2:
o> 1\n
o> read(1) -> 1: 1
result: 1
remote output:
e> read(152) -> 152:
e> adding changesets\n
e> adding manifests\n
e> adding file changes\n
e> added 1 changesets with 1 changes to 1 files\n
e> ui.write 1\n
e> ui.write_err 1\n
e> ui.write 2\n
e> ui.write_err 2\n
+ '[' -n 1 ']'
+ hg --config extensions.strip= strip --no-backup -r 'all() - 
::'
+ echo SALT1520635194 2034 0
SALT1520635194 2034 0
  
  Note the exit code is 1 here.  So is it possible that the debug command is
  having trouble reading bytes from an empty stderr in the initial cases?

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2720

To: indygreg, #hg-reviewers
Cc: mharbison72, mercurial-devel


D2758: transaction: add a name and a __repr__ implementation (API)

2018-03-09 Thread martinvonz (Martin von Zweigbergk)
This revision was automatically updated to reflect the committed changes.
Closed by commit rHG49e2ca27c10e: transaction: add a name and a __str__ 
implementation (API) (authored by martinvonz, committed by ).

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D2758?vs=6791&id=6796

REVISION DETAIL
  https://phab.mercurial-scm.org/D2758

AFFECTED FILES
  mercurial/localrepo.py
  mercurial/transaction.py

CHANGE DETAILS

diff --git a/mercurial/transaction.py b/mercurial/transaction.py
--- a/mercurial/transaction.py
+++ b/mercurial/transaction.py
@@ -105,7 +105,7 @@
 class transaction(util.transactional):
 def __init__(self, report, opener, vfsmap, journalname, undoname=None,
  after=None, createmode=None, validator=None, releasefn=None,
- checkambigfiles=None):
+ checkambigfiles=None, name=r''):
 """Begin a new transaction
 
 Begins a new transaction that allows rolling back writes in the event of
@@ -149,6 +149,8 @@
 if checkambigfiles:
 self.checkambigfiles.update(checkambigfiles)
 
+self.names = [name]
+
 # A dict dedicated to precisely tracking the changes introduced in the
 # transaction.
 self.changes = {}
@@ -186,6 +188,11 @@
 # holds callbacks to call during abort
 self._abortcallback = {}
 
+def __repr__(self):
+name = r'/'.join(self.names)
+return (r'<transaction name=%s, count=%d, usages=%d>' %
+(name, self.count, self.usages))
+
 def __del__(self):
 if self.journal:
 self._abort()
@@ -365,14 +372,17 @@
 self.file.flush()
 
 @active
-def nest(self):
+def nest(self, name=r''):
 self.count += 1
 self.usages += 1
+self.names.append(name)
 return self
 
 def release(self):
 if self.count > 0:
 self.usages -= 1
+if self.names:
+self.names.pop()
 # if the transaction scopes are left without being closed, fail
 if self.count > 0 and self.usages == 0:
 self._abort()
diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
--- a/mercurial/localrepo.py
+++ b/mercurial/localrepo.py
@@ -1177,7 +1177,7 @@
 raise error.ProgrammingError('transaction requires locking')
 tr = self.currenttransaction()
 if tr is not None:
-return tr.nest()
+return tr.nest(name=desc)
 
 # abort here if the journal already exists
 if self.svfs.exists("journal"):
@@ -1316,7 +1316,8 @@
  self.store.createmode,
  validator=validate,
  releasefn=releasefn,
- checkambigfiles=_cachedfiles)
+ checkambigfiles=_cachedfiles,
+ name=desc)
 tr.changes['revs'] = xrange(0, 0)
 tr.changes['obsmarkers'] = set()
 tr.changes['phases'] = {}
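
The name bookkeeping this patch adds (push a name in nest(), pop it in
release(), join the names for display) can be sketched with a toy class. This
is only an illustration: NamedScope is a hypothetical stand-in and not
Mercurial's transaction class, the starting values count = usages = 1 are an
assumption, and the repr format string is invented for the example.

```python
class NamedScope(object):
    """Toy model of a nestable scope that remembers a name per nesting level."""

    def __init__(self, name=''):
        self.count = 1       # assumed initial value
        self.usages = 1      # assumed initial value
        self.names = [name]  # one entry per active nesting level

    def nest(self, name=''):
        # Entering a nested scope pushes its name.
        self.count += 1
        self.usages += 1
        self.names.append(name)
        return self

    def release(self):
        # Leaving a scope pops the most recently pushed name.
        if self.count > 0:
            self.usages -= 1
        if self.names:
            self.names.pop()

    def __repr__(self):
        name = '/'.join(self.names)
        return '<scope name=%s, count=%d, usages=%d>' % (
            name, self.count, self.usages)
```

After NamedScope('rebase').nest('commit'), the joined name reads
"rebase/commit"; a release() brings it back to "rebase", which is what makes
the repr useful when debugging nested scopes.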



To: martinvonz, #hg-reviewers, durin42
Cc: durin42, mercurial-devel


D2757: tests: add a few tests involving --collapse and rebase.singletransaction=1

2018-03-09 Thread martinvonz (Martin von Zweigbergk)
This revision was automatically updated to reflect the committed changes.
Closed by commit rHG5b84bdd511eb: tests: add a few tests involving --collapse 
and rebase.singletransaction=1 (authored by martinvonz, committed by ).

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D2757?vs=6767&id=6793

REVISION DETAIL
  https://phab.mercurial-scm.org/D2757

AFFECTED FILES
  tests/test-rebase-transaction.t

CHANGE DETAILS

diff --git a/tests/test-rebase-transaction.t b/tests/test-rebase-transaction.t
--- a/tests/test-rebase-transaction.t
+++ b/tests/test-rebase-transaction.t
@@ -48,3 +48,149 @@
   o  0: A
   
   $ cd ..
+
+Check that --collapse works
+
+  $ hg init collapse && cd collapse
+  $ hg debugdrawdag <<'EOF'
+  >   Z
+  >   |
+  >   | D
+  >   | |
+  >   | C
+  >   | |
+  >   Y B
+  >   |/
+  >   A
+  > EOF
+- We should only see two status stored messages. One from the start, one from
+- the end.
+  $ hg rebase --collapse --debug -b D -d Z | grep 'status stored'
+  rebase status stored
+  rebase status stored
+  $ hg tglog
+  o  3: Collapsed revision
+  |  * B
+  |  * C
+  |  * D
+  o  2: Z
+  |
+  o  1: Y
+  |
+  o  0: A
+  
+  $ cd ..
+
+With --collapse, check that conflicts can be resolved and rebase can then be
+continued
+
+  $ hg init collapse-conflict && cd collapse-conflict
+  $ hg debugdrawdag <<'EOF'
+  >   Z   # Z/conflict=Z
+  >   |
+  >   | D
+  >   | |
+  >   | C # C/conflict=C
+  >   | |
+  >   Y B
+  >   |/
+  >   A
+  > EOF
+  $ hg rebase --collapse -b D -d Z
+  rebasing 1:112478962961 "B" (B)
+  rebasing 3:c26739dbe603 "C" (C)
+  merging conflict
+  warning: conflicts while merging conflict! (edit, then use 'hg resolve --mark')
+  unresolved conflicts (see hg resolve, then hg rebase --continue)
+  [1]
+  $ hg tglog
+  o  5: D
+  |
+  | @  4: Z
+  | |
+  @ |  3: C
+  | |
+  | o  2: Y
+  | |
+  o |  1: B
+  |/
+  o  0: A
+  
+  $ hg st
+  M C
+  M conflict
+  A B
+  ? conflict.orig
+  $ echo resolved > conflict
+  $ hg resolve -m
+  (no more unresolved files)
+  continue: hg rebase --continue
+  $ hg rebase --continue
+  already rebased 1:112478962961 "B" (B) as 79bc8f4973ce
+  rebasing 3:c26739dbe603 "C" (C)
+  rebasing 5:d24bb333861c "D" (D tip)
+  saved backup bundle to $TESTTMP/collapse-conflict/.hg/strip-backup/112478962961-b5b34645-rebase.hg
+  $ hg tglog
+  o  3: Collapsed revision
+  |  * B
+  |  * C
+  |  * D
+  o  2: Z
+  |
+  o  1: Y
+  |
+  o  0: A
+  
+  $ cd ..
+
+With --collapse, check that the commit message editing can be canceled and
+rebase can then be continued
+
+  $ hg init collapse-cancel-editor && cd collapse-cancel-editor
+  $ hg debugdrawdag <<'EOF'
+  >   Z
+  >   |
+  >   | D
+  >   | |
+  >   | C
+  >   | |
+  >   Y B
+  >   |/
+  >   A
+  > EOF
+  $ HGEDITOR=false hg --config ui.interactive=1 rebase --collapse -b D -d Z
+  rebasing 1:112478962961 "B" (B)
+  rebasing 3:26805aba1e60 "C" (C)
+  rebasing 5:f585351a92f8 "D" (D tip)
+  abort: edit failed: false exited with status 1
+  [255]
+  $ hg tglog
+  o  5: D
+  |
+  | @  4: Z
+  | |
+  o |  3: C
+  | |
+  | o  2: Y
+  | |
+  o |  1: B
+  |/
+  o  0: A
+  
+  $ hg rebase --continue
+  already rebased 1:112478962961 "B" (B) as e9b22a392ce0
+  already rebased 3:26805aba1e60 "C" (C) as e9b22a392ce0
+  already rebased 5:f585351a92f8 "D" (D tip) as e9b22a392ce0
+  saved backup bundle to $TESTTMP/collapse-cancel-editor/.hg/strip-backup/112478962961-cb2a9b47-rebase.hg
+  $ hg tglog
+  o  3: Collapsed revision
+  |  * B
+  |  * C
+  |  * D
+  o  2: Z
+  |
+  o  1: Y
+  |
+  o  0: A
+  
+  $ cd ..



To: martinvonz, #hg-reviewers, durin42
Cc: mercurial-devel


D2756: tests: simplify test-rebase-transaction.t

2018-03-09 Thread martinvonz (Martin von Zweigbergk)
This revision was automatically updated to reflect the committed changes.
Closed by commit rHG28ea00cd817e: tests: simplify test-rebase-transaction.t 
(authored by martinvonz, committed by ).

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D2756?vs=6766&id=6792

REVISION DETAIL
  https://phab.mercurial-scm.org/D2756

AFFECTED FILES
  tests/test-rebase-transaction.t

CHANGE DETAILS

diff --git a/tests/test-rebase-transaction.t b/tests/test-rebase-transaction.t
--- a/tests/test-rebase-transaction.t
+++ b/tests/test-rebase-transaction.t
@@ -1,22 +1,23 @@
+Rebasing using a single transaction
+
   $ cat >> $HGRCPATH <<EOF
   > [extensions]
   > rebase=
   > drawdag=$TESTDIR/drawdag.py
   > 
+  > [rebase]
+  > singletransaction=True
+  > 
   > [phases]
   > publish=False
   > 
   > [alias]
   > tglog = log -G --template "{rev}: {desc}"
   > EOF
 
-Rebasing using a single transaction
+Check that a simple rebase works
 
-  $ hg init singletr && cd singletr
-  $ cat >> .hg/hgrc <<EOF
-  > [rebase]
-  > singletransaction=True
-  > EOF
+  $ hg init simple && cd simple
   $ hg debugdrawdag <<'EOF'
   >   Z
   >   |



To: martinvonz, #hg-reviewers, durin42
Cc: mercurial-devel


D2758: transaction: add a name and a __str__ implementation (API)

2018-03-09 Thread martinvonz (Martin von Zweigbergk)
martinvonz updated this revision to Diff 6791.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D2758?vs=6790&id=6791

REVISION DETAIL
  https://phab.mercurial-scm.org/D2758

AFFECTED FILES
  mercurial/localrepo.py
  mercurial/transaction.py

CHANGE DETAILS

diff --git a/mercurial/transaction.py b/mercurial/transaction.py
--- a/mercurial/transaction.py
+++ b/mercurial/transaction.py
@@ -105,7 +105,7 @@
 class transaction(util.transactional):
 def __init__(self, report, opener, vfsmap, journalname, undoname=None,
  after=None, createmode=None, validator=None, releasefn=None,
- checkambigfiles=None):
+ checkambigfiles=None, name=r''):
 """Begin a new transaction
 
 Begins a new transaction that allows rolling back writes in the event of
@@ -149,6 +149,8 @@
 if checkambigfiles:
 self.checkambigfiles.update(checkambigfiles)
 
+self.names = [name]
+
 # A dict dedicated to precisely tracking the changes introduced in the
 # transaction.
 self.changes = {}
@@ -186,6 +188,11 @@
 # holds callbacks to call during abort
 self._abortcallback = {}
 
+def __repr__(self):
+name = r'/'.join(self.names)
+return (r'<transaction name=%s, count=%d, usages=%d>' %
+(name, self.count, self.usages))
+
 def __del__(self):
 if self.journal:
 self._abort()
@@ -365,14 +372,17 @@
 self.file.flush()
 
 @active
-def nest(self):
+def nest(self, name=r''):
 self.count += 1
 self.usages += 1
+self.names.append(name)
 return self
 
 def release(self):
 if self.count > 0:
 self.usages -= 1
+if self.names:
+self.names.pop()
 # if the transaction scopes are left without being closed, fail
 if self.count > 0 and self.usages == 0:
 self._abort()
diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
--- a/mercurial/localrepo.py
+++ b/mercurial/localrepo.py
@@ -1177,7 +1177,7 @@
 raise error.ProgrammingError('transaction requires locking')
 tr = self.currenttransaction()
 if tr is not None:
-return tr.nest()
+return tr.nest(name=desc)
 
 # abort here if the journal already exists
 if self.svfs.exists("journal"):
@@ -1316,7 +1316,8 @@
  self.store.createmode,
  validator=validate,
  releasefn=releasefn,
- checkambigfiles=_cachedfiles)
+ checkambigfiles=_cachedfiles,
+ name=desc)
 tr.changes['revs'] = xrange(0, 0)
 tr.changes['obsmarkers'] = set()
 tr.changes['phases'] = {}



To: martinvonz, #hg-reviewers
Cc: durin42, mercurial-devel


D2755: phabricator: update doc string for deprecated token argument

2018-03-09 Thread joerg.sonnenberger (Joerg Sonnenberger)
This revision was automatically updated to reflect the committed changes.
Closed by commit rHG0bebd4608ce3: phabricator: update doc string for deprecated 
token argument (authored by joerg.sonnenberger, committed by ).

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D2755?vs=6765&id=6795

REVISION DETAIL
  https://phab.mercurial-scm.org/D2755

AFFECTED FILES
  contrib/phabricator.py

CHANGE DETAILS

diff --git a/contrib/phabricator.py b/contrib/phabricator.py
--- a/contrib/phabricator.py
+++ b/contrib/phabricator.py
@@ -22,7 +22,8 @@
 url = https://phab.example.com/
 
 # API token. Get it from https://$HOST/conduit/login/
-token = cli-
+# Deprecated: see [phabricator.auth] below
+#token = cli-
 
 # Repo callsign. If a repo has a URL https://$HOST/diffusion/FOO, then its
 # callsign is "FOO".



To: joerg.sonnenberger, #hg-reviewers, durin42
Cc: mercurial-devel


D2754: phabricator: print deprecation warning only once

2018-03-09 Thread joerg.sonnenberger (Joerg Sonnenberger)
This revision was automatically updated to reflect the committed changes.
Closed by commit rHG6490b0915881: phabricator: print deprecation warning only 
once (authored by joerg.sonnenberger, committed by ).

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D2754?vs=6764&id=6794

REVISION DETAIL
  https://phab.mercurial-scm.org/D2754

AFFECTED FILES
  contrib/phabricator.py

CHANGE DETAILS

diff --git a/contrib/phabricator.py b/contrib/phabricator.py
--- a/contrib/phabricator.py
+++ b/contrib/phabricator.py
@@ -99,13 +99,17 @@
 process('', params)
 return util.urlreq.urlencode(flatparams)
 
+printed_token_warning = False
+
 def readlegacytoken(repo):
 """Transitional support for old phabricator tokens.
 
 Remove before the 4.6 release.
 """
+global printed_token_warning
 token = repo.ui.config('phabricator', 'token')
-if token:
+if token and not printed_token_warning:
+printed_token_warning = True
 repo.ui.warn(_('phabricator.token is deprecated - please '
'migrate to the phabricator.auth section.\n'))
 return token
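
The "warn only once" pattern used in this patch, a module-level flag checked
and set inside the function, can be shown in isolation. This is a minimal
sketch with hypothetical names (read_legacy_token, a plain dict for the
config, a callable for the warning sink); it does not use Mercurial's ui or
repo objects.

```python
_warned = False  # module-level flag: has the deprecation warning been shown?


def read_legacy_token(config, warn):
    """Return the legacy token, emitting the warning at most once per process.

    config is any mapping with an optional 'token' key; warn is a callable
    that receives the warning text.
    """
    global _warned
    token = config.get('token')
    if token and not _warned:
        _warned = True
        warn('token is deprecated - please migrate to the auth section.\n')
    # The token is still returned on every call, warned or not.
    return token
```

Calling it twice with the same config returns the token both times but
passes only one message to warn. The flag lives for the lifetime of the
process, so a long-running process warns once overall rather than once per
repository.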



To: joerg.sonnenberger, #hg-reviewers, durin42
Cc: mercurial-devel


D2758: transaction: add a name and a __str__ implementation (API)

2018-03-09 Thread martinvonz (Martin von Zweigbergk)
martinvonz added a comment.


  Done

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2758

To: martinvonz, #hg-reviewers
Cc: durin42, mercurial-devel


D2758: transaction: add a name and a __str__ implementation (API)

2018-03-09 Thread martinvonz (Martin von Zweigbergk)
martinvonz updated this revision to Diff 6790.
martinvonz marked an inline comment as done.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D2758?vs=6769&id=6790

REVISION DETAIL
  https://phab.mercurial-scm.org/D2758

AFFECTED FILES
  mercurial/localrepo.py
  mercurial/transaction.py

CHANGE DETAILS

diff --git a/mercurial/transaction.py b/mercurial/transaction.py
--- a/mercurial/transaction.py
+++ b/mercurial/transaction.py
@@ -105,7 +105,7 @@
 class transaction(util.transactional):
 def __init__(self, report, opener, vfsmap, journalname, undoname=None,
  after=None, createmode=None, validator=None, releasefn=None,
- checkambigfiles=None):
+ checkambigfiles=None, name=''):
 """Begin a new transaction
 
 Begins a new transaction that allows rolling back writes in the event of
@@ -149,6 +149,8 @@
 if checkambigfiles:
 self.checkambigfiles.update(checkambigfiles)
 
+self.names = [name]
+
 # A dict dedicated to precisely tracking the changes introduced in the
 # transaction.
 self.changes = {}
@@ -186,6 +188,11 @@
 # holds callbacks to call during abort
 self._abortcallback = {}
 
+def __repr__(self):
+name = r'/'.join(self.names)
+return (r'<transaction name=%s, count=%d, usages=%d>' %
+(name, self.count, self.usages))
+
 def __del__(self):
 if self.journal:
 self._abort()
@@ -365,14 +372,17 @@
 self.file.flush()
 
 @active
-def nest(self):
+def nest(self, name=''):
 self.count += 1
 self.usages += 1
+self.names.append(name)
 return self
 
 def release(self):
 if self.count > 0:
 self.usages -= 1
+if self.names:
+self.names.pop()
 # if the transaction scopes are left without being closed, fail
 if self.count > 0 and self.usages == 0:
 self._abort()
diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
--- a/mercurial/localrepo.py
+++ b/mercurial/localrepo.py
@@ -1177,7 +1177,7 @@
 raise error.ProgrammingError('transaction requires locking')
 tr = self.currenttransaction()
 if tr is not None:
-return tr.nest()
+return tr.nest(name=desc)
 
 # abort here if the journal already exists
 if self.svfs.exists("journal"):
@@ -1316,7 +1316,8 @@
  self.store.createmode,
  validator=validate,
  releasefn=releasefn,
- checkambigfiles=_cachedfiles)
+ checkambigfiles=_cachedfiles,
+ name=desc)
 tr.changes['revs'] = xrange(0, 0)
 tr.changes['obsmarkers'] = set()
 tr.changes['phases'] = {}



To: martinvonz, #hg-reviewers
Cc: durin42, mercurial-devel


D2758: transaction: add a name and a __str__ implementation (API)

2018-03-09 Thread durin42 (Augie Fackler)
durin42 added a comment.


  Either way, you should make sure this always returns a sysstr; otherwise
  you'll cause pretty terrible Python 3 breakage. Look around for other
  __repr__ instances for examples.

INLINE COMMENTS

> transaction.py:191
>  
> +def __str__(self):
> +name = '/'.join(self.names)

nit: I think this would make more sense as a __repr__ than a __str__.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2758

To: martinvonz, #hg-reviewers
Cc: durin42, mercurial-devel


[Bug 5811] New: fsmonitor returns bogus stat tuples, breaking after cleanup in ffa3026d4196

2018-03-09 Thread mercurial-bugs
https://bz.mercurial-scm.org/show_bug.cgi?id=5811

Bug ID: 5811
   Summary: fsmonitor returns bogus stat tuples, breaking after
cleanup in ffa3026d4196
   Product: Mercurial
   Version: default branch
  Hardware: All
OS: All
Status: UNCONFIRMED
  Severity: bug
  Priority: wish
 Component: fsmonitor
  Assignee: bugzi...@mercurial-scm.org
  Reporter: duri...@gmail.com
CC: mercurial-devel@mercurial-scm.org

Change ffa3026d4196 moved us from using thing.st_mtime to always using
thing[ST_MTIME] because the latter always gets us an integer. We need to teach
the code in bser.c to handle __getitem__(ST_MTIME), as it currently stacktraces
like this:

** Unknown exception encountered with possibly-broken third-party extension
perf
** which supports versions unknown of Mercurial.
** Please disable perf and try your action again.
** If that fixes the bug please report it to the extension author.
** Python 2.7.13 (default, Jan 23 2017, 15:28:18) [GCC 6.2.0 20161005]
** Mercurial Distributed SCM (version 4.5.2+939-09b58af83d44)
** Extensions loaded: blackbox, convert, fsmonitor, rebase, histedit,
patchbomb, purge, record, share, shelve, show, strip, mq, qbackout, qimportbz,
reviewboard, bzexport, firefoxtree, mozext, push-to-try, perf, phabricator,
showstack
Traceback (most recent call last):
  File "/home/gps/lib/python/mercurial/commandserver.py", line 368, in
_serverequest
sv.serve()
  File "/home/gps/lib/python/mercurial/commandserver.py", line 292, in serve
while self.serveone():
  File "/home/gps/lib/python/mercurial/commandserver.py", line 267, in serveone
handler(self)
  File "/home/gps/lib/python/mercurial/chgserver.py", line 453, in runcommand
return super(chgcmdserver, self).runcommand()
  File "/home/gps/lib/python/mercurial/commandserver.py", line 251, in
runcommand
ret = (dispatch.dispatch(req) or 0) & 255 # might return None
  File "/home/gps/lib/python/mercurial/dispatch.py", line 208, in dispatch
ret = _runcatch(req)
  File "/home/gps/lib/python/mercurial/dispatch.py", line 349, in _runcatch
return _callcatch(ui, _runcatchfunc)
  File "/home/gps/lib/python/mercurial/dispatch.py", line 357, in _callcatch
return scmutil.callcatch(ui, func)
  File "/home/gps/lib/python/mercurial/scmutil.py", line 154, in callcatch
return func()
  File "/home/gps/lib/python/mercurial/dispatch.py", line 339, in _runcatchfunc
return _dispatch(req)
  File "/home/gps/lib/python/mercurial/dispatch.py", line 943, in _dispatch
cmdpats, cmdoptions)
  File "/home/gps/lib/python/mercurial/dispatch.py", line 700, in runcommand
ret = _runcommand(ui, options, cmd, d)
  File "/home/gps/lib/python/mercurial/dispatch.py", line 951, in _runcommand
return cmdfunc()
  File "/home/gps/lib/python/mercurial/dispatch.py", line 940, in <lambda>
d = lambda: util.checksignature(func)(ui, *args, **strcmdopt)
  File "/home/gps/lib/python/mercurial/util.py", line 1494, in check
return func(*args, **kwargs)
  File "/home/gps/lib/python/mercurial/util.py", line 1494, in check
return func(*args, **kwargs)
  File "/home/gps/lib/python/hgext/mq.py", line 3588, in mqcommand
return orig(ui, repo, *args, **kwargs)
  File "/home/gps/lib/python/mercurial/util.py", line 1494, in check
return func(*args, **kwargs)
  File "/home/gps/lib/python/mercurial/commands.py", line 4898, in status
'unknown' in show, opts.get('subrepos'))
  File "/home/gps/lib/python/hgext/fsmonitor/__init__.py", line 792, in status
return overridestatus(orig, self, *args, **kwargs)
  File "/home/gps/lib/python/hgext/fsmonitor/__init__.py", line 539, in
overridestatus
listsubrepos)
  File "/home/gps/lib/python/mercurial/localrepo.py", line 2093, in status
listsubrepos)
  File "/home/gps/lib/python/mercurial/context.py", line 360, in status
listunknown)
  File "/home/gps/lib/python/mercurial/context.py", line 1790, in _buildstatus
s = self._dirstatestatus(match, listignored, listclean, listunknown)
  File "/home/gps/lib/python/mercurial/context.py", line 1723, in
_dirstatestatus
clean=clean, unknown=unknown)
  File "/home/gps/lib/python/mercurial/dirstate.py", line 1071, in status
elif (time != st[stat.ST_MTIME]
IndexError: tuple index out of range
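
The failure mode in the traceback can be reproduced in isolation. The sketch below assumes nothing about what fsmonitor actually returns; the two-element `bogus` tuple is made up purely to show why indexing with `stat.ST_MTIME` raises `IndexError` on anything shorter than a full stat result.

```python
import os
import stat
import tempfile

# A real stat result supports both attribute and index access.
with tempfile.NamedTemporaryFile() as f:
    st = os.stat(f.name)
    by_index = st[stat.ST_MTIME]   # integer seconds
    by_attr = st.st_mtime          # may be a float

# Hypothetical short tuple standing in for a broken stat provider.
bogus = (0o100644, 12)             # (mode, size) only; an assumption

try:
    bogus[stat.ST_MTIME]
    failed = False
except IndexError:
    failed = True
```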



D2744: hgweb: handle CONTENT_LENGTH

2018-03-09 Thread durin42 (Augie Fackler)
durin42 accepted this revision.
durin42 added inline comments.
This revision is now accepted and ready to land.

INLINE COMMENTS

> wireprotoserver.py:94
>  def forwardpayload(self, fp):
> -if b'Content-Length' in self._req.headers:
> -length = int(self._req.headers[b'Content-Length'])
> -else:
> -length = int(self._wsgireq.env[r'CONTENT_LENGTH'])
> +# TODO Content-Length may not always be defined.
> +length = int(self._req.headers[b'Content-Length'])

For our clients it always will, because we precompute the bundle to a file. 
It's gross.
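
For readers wondering what handling the TODO could look like, here is a minimal sketch. It assumes `headers` is a plain dict with bytes keys and values, which is simpler than the real request object; `content_length` is a hypothetical helper name.

```python
def content_length(headers):
    """Return the declared body length, or None if the header is absent.

    ``headers`` is assumed to be a plain dict with bytes keys and values.
    """
    raw = headers.get(b'Content-Length')
    if raw is None:
        return None
    return int(raw)
```

Returning `None` leaves the decision about an undeclared length to the caller instead of raising `KeyError`.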

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2744

To: indygreg, #hg-reviewers, durin42
Cc: durin42, mercurial-devel


D2720: debugcommands: introduce actions to perform deterministic reads

2018-03-09 Thread mharbison72 (Matt Harbison)
mharbison72 added a comment.


  In https://phab.mercurial-scm.org/D2720#44396, @indygreg wrote:
  
  > My only explanation is this is stdout output buffering and things are 
really hanging on the next read operation.
  >
  > Maybe try sprinkling some `ui.fout.flush()` and/or `util.stdout.flush()` 
and/or `sys.stdout.flush()` calls throughout the debug command? Maybe at the 
first thing in the loop that evaluates commands to execute?
  
  
  Both `util.stdout.flush()` and `ui.fout.flush()` get a little further in the 
same command; `sys.stdout.flush()` has no effect, when added at the top of the 
loop.  But then it stalls on remote output:
  
...
o> readline() -> 2:
o> 0\n
i> write(4) -> 4:
i> 426\n
i> write(426) -> 426:
i> 
HG10UN\x00\x00\x00\x9eh\x98b\x13\xbdD\x85\xeaQS55\xe3\xfc\x9ex\x00zq\x1f\x00\x00\x00\x00\x00\x00\x00\x0

0\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\

x00\x00\x00\x00\x00h\x98b\x13\xbdD\x85\xeaQS55\xe3\xfc\x9ex\x00zq\x1f\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\
x00>cba485ca3678256e044428f70f58291196f6e9de\n
i> test\n
i> 0 0\n
i> foo\n
i> \n
i> 
initial\x00\x00\x00\x00\x00\x00\x00\x8d\xcb\xa4\x85\xca6x%n\x04D(\xf7\x0fX)\x11\x96\xf6\xe9\xde\x00\x00

\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x

00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00h\x98b\x13\xbdD\x85\xeaQS55\xe3\xfc\x9ex\x00zq\x1f\x00\x00\x00\x00\x
00\x00\x00\x00\x00\x00\x00-foo\x00362fef284ce2ca02aecc8de6d5e8a1c3af0556fe\n
i> 
\x00\x00\x00\x00\x00\x00\x00\x07foo\x00\x00\x00b6/\xef(L\xe2\xca\x02\xae\xcc\x8d\xe6\xd5\xe8\xa1\xc3\xa

f\x05V\xfe\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00

\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00h\x98b\x13\xbdD\x85\xeaQS55\xe3\xfc\x9ex\x00zq\x1f
\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x020\n
i> \x00\x00\x00\x00\x00\x00\x00\x00
i> write(2) -> 2:
i> 0\n
i> flush() -> None
o> readline() -> 2:
o> 0\n
o> readline() -> 2:
o> 1\n
o> read(1) -> 1: 0
result: 0
remote output:
  
  So I added a self.flush() inside ui._write() because I figured this is a 
problem on the remote end (and probably not localized like the debug command), 
but that didn't do anything.  (ui._write_err() already flushes, and has a 
comment saying that stderr may be buffered on Windows.)
  
  IDK if this means anything, but when it is stuck and I hit Ctrl + C, instead 
of terminating the test, the test simply continues with the script, and then 
hangs on the next 'remote output:' line.  Typically, Ctrl + C ends the test 
runner.
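
As background for the buffering theory discussed in this thread, the sketch below shows how a buffered writer holds data back until it is flushed. It uses an in-memory `io.BytesIO` as a stand-in for the pipe, so it illustrates the general mechanism only and says nothing about where the hang described above really comes from.

```python
import io

raw = io.BytesIO()                          # stands in for the pipe to the peer
out = io.BufferedWriter(raw, buffer_size=8192)

out.write(b'remote output:\n')
before_flush = raw.getvalue()               # nothing has reached the pipe yet
out.flush()
after_flush = raw.getvalue()                # now the bytes are visible
```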

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2720

To: indygreg, #hg-reviewers
Cc: mharbison72, mercurial-devel


Re: [PATCH] xdiff: fix trivial build warnings on Windows

2018-03-09 Thread Jun Wu
Excerpts from Yuya Nishihara's message of 2018-03-08 21:33:42 +0900:
> On Tue, 6 Mar 2018 19:12:26 -0800, Jun Wu wrote:
> > Yeah, xdiff needs a migration from using "long", "int"s to "size_t" etc.
> > The git community has chosen to disallow diff >1GB files because of the
> > overflow concern [1].
> > 
> > [1]: 
> > https://github.com/git/git/commit/dcd1742e56ebb944c4ff62346da4548e1e3be675
> 
> So, should we queue this now or leave warnings to denote things that should
> be cleaned up?

I think the ideal solution would be replacing all "long"s with one of
"int64_t", "ssize_t" or "size_t", instead of casting around.

I can take a look at the actual change, since I think I have some knowledge
about xdiff internals now.
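
To make the overflow concern concrete: on Windows a C "long" is 32 bits wide even in 64-bit builds, so sizes of 2 GiB and up cannot be stored in it. The sketch below only demonstrates that wraparound with `ctypes`; `as_int32` is a hypothetical helper, and nothing here claims to show where xdiff itself would overflow.

```python
import ctypes

ONE_GIB = 1 << 30


def as_int32(n):
    """Wrap n the way a 32-bit signed integer would store it."""
    return ctypes.c_int32(n).value


fits = as_int32(ONE_GIB)           # 1 GiB still fits
wrapped = as_int32(2 * ONE_GIB)    # 2 GiB wraps to a negative number
```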


D2759: rebase: fix issue 5494 also with --collapse

2018-03-09 Thread martinvonz (Martin von Zweigbergk)
martinvonz created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2759

AFFECTED FILES
  hgext/rebase.py
  tests/test-rebase-interruptions.t

CHANGE DETAILS

diff --git a/tests/test-rebase-interruptions.t 
b/tests/test-rebase-interruptions.t
--- a/tests/test-rebase-interruptions.t
+++ b/tests/test-rebase-interruptions.t
@@ -479,7 +479,6 @@
   $ hg rebase --continue
   rebasing 2:fdaca8533b86 "b" (tip)
   saved backup bundle to 
$TESTTMP/repo/.hg/strip-backup/fdaca8533b86-7fd70513-rebase.hg
-BROKEN: the merge state was not cleared
   $ hg resolve --list
-  R a
   $ test -d .hg/merge
+  [1]
diff --git a/hgext/rebase.py b/hgext/rebase.py
--- a/hgext/rebase.py
+++ b/hgext/rebase.py
@@ -579,6 +579,12 @@
 editor=editor,
 keepbranches=self.keepbranchesf,
 date=self.date)
+
+if newnode is None:
+# If it ended up being a no-op commit, then the normal
+# merge state clean-up path doesn't happen, so do it
+# here. Fix issue5494
+mergemod.mergestate.clean(repo)
 if newnode is not None:
 newrev = repo[newnode].rev()
 for oldrev in self.state:



To: martinvonz, #hg-reviewers
Cc: mercurial-devel


D2761: rebase: use configoverride context manager for ui.forcemerge

2018-03-09 Thread martinvonz (Martin von Zweigbergk)
martinvonz created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2761

AFFECTED FILES
  hgext/rebase.py

CHANGE DETAILS

diff --git a/hgext/rebase.py b/hgext/rebase.py
--- a/hgext/rebase.py
+++ b/hgext/rebase.py
@@ -481,9 +481,8 @@
 if len(repo[None].parents()) == 2:
 repo.ui.debug('resuming interrupted rebase\n')
 else:
-try:
-ui.setconfig('ui', 'forcemerge', opts.get('tool', ''),
- 'rebase')
+overrides = {('ui', 'forcemerge'): opts.get('tool', '')}
+with ui.configoverride(overrides, 'rebase'):
 stats = rebasenode(repo, rev, p1, base, self.collapsef,
dest, wctx=self.wctx)
 if stats and stats[3] > 0:
@@ -493,8 +492,6 @@
 raise error.InterventionRequired(
 _('unresolved conflicts (see hg '
   'resolve, then hg rebase --continue)'))
-finally:
-ui.setconfig('ui', 'forcemerge', '', 'rebase')
 if not self.collapsef:
 merging = p2 != nullrev
 editform = cmdutil.mergeeditform(merging, 'rebase')



To: martinvonz, #hg-reviewers
Cc: mercurial-devel


D2760: rebase: also restore "ui.allowemptycommit" value

2018-03-09 Thread martinvonz (Martin von Zweigbergk)
martinvonz created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  It looks like this was lost when the code was converted to the
  ui.configoverride() context manager in 
https://phab.mercurial-scm.org/rHGf255b1811f5e454d8d0add4a1190effc82b301be 
(rebase: get rid
  of ui.backupconfig, 2017-03-16). (And then the bad example was
  duplicated in 
https://phab.mercurial-scm.org/rHG228916ca12b54b78faccd5c4abc54cba68606637 
(rebase: add concludememorynode(), and call
  it when rebasing in-memory, 2017-12-07).)
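
The leak described in the summary can be sketched with a toy config holder. `ToyUI` is hypothetical and much simpler than Mercurial's ui class; the sketch assumes that an override context manager restores only the keys passed to it, which is what the summary implies.

```python
import contextlib


class ToyUI(object):
    """Hypothetical config holder; not Mercurial's ui class."""

    def __init__(self):
        self._config = {}

    def config(self, section, name, default=None):
        return self._config.get((section, name), default)

    def setconfig(self, section, name, value):
        self._config[(section, name)] = value

    @contextlib.contextmanager
    def configoverride(self, overrides):
        # Back up only the keys named in ``overrides``.
        missing = object()
        backup = {k: self._config.get(k, missing) for k in overrides}
        self._config.update(overrides)
        try:
            yield
        finally:
            for key, old in backup.items():
                if old is missing:
                    self._config.pop(key, None)
                else:
                    self._config[key] = old


ui = ToyUI()
# A value set inside the block but not listed in ``overrides`` leaks out.
with ui.configoverride({('phases', 'new-commit'): 'draft'}):
    ui.setconfig('ui', 'allowemptycommit', True)
leaked = ui.config('ui', 'allowemptycommit')

ui = ToyUI()
# Listing it in ``overrides`` restores the old state on exit.
with ui.configoverride({('phases', 'new-commit'): 'draft',
                        ('ui', 'allowemptycommit'): True}):
    inside = ui.config('ui', 'allowemptycommit')
restored = ui.config('ui', 'allowemptycommit')
```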

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2760

AFFECTED FILES
  hgext/rebase.py

CHANGE DETAILS

diff --git a/hgext/rebase.py b/hgext/rebase.py
--- a/hgext/rebase.py
+++ b/hgext/rebase.py
@@ -1054,9 +1054,9 @@
 
 destphase = max(ctx.phase(), phases.draft)
 overrides = {('phases', 'new-commit'): destphase}
+if keepbranch:
+overrides[('ui', 'allowemptycommit')] = True
 with repo.ui.configoverride(overrides, 'rebase'):
-if keepbranch:
-repo.ui.setconfig('ui', 'allowemptycommit', True)
 # Replicates the empty check in ``repo.commit``.
 if wctx.isempty() and not repo.ui.configbool('ui', 'allowemptycommit'):
 return None
@@ -1096,9 +1096,9 @@
 
 destphase = max(ctx.phase(), phases.draft)
 overrides = {('phases', 'new-commit'): destphase}
+if keepbranch:
+overrides[('ui', 'allowemptycommit')] = True
 with repo.ui.configoverride(overrides, 'rebase'):
-if keepbranch:
-repo.ui.setconfig('ui', 'allowemptycommit', True)
 # Commit might fail if unresolved files exist
 if date is None:
 date = ctx.date()



To: martinvonz, #hg-reviewers
Cc: mercurial-devel


D2686: xdiff: add a preprocessing step that trims files

2018-03-09 Thread quark (Jun Wu)
quark added inline comments.

INLINE COMMENTS

> indygreg wrote in xprepare.c:169
> Bonus points if you resubmit this with more expressive variable names. Just 
> because xdiff's code is almost impossible to read doesn't mean we should 
> follow suit :)

The git community's style guide recommends following the style of the 
surrounding code. I think we actually do that too, since new methods are not 
using `foo_bar` naming.

I'll add comments instead.

> indygreg wrote in xprepare.c:183-193
> I'm still showing this as a hot point in the code when compiling with default 
> settings used by Python packaging tools. I suspect we can get better results 
> on typical compiler flags by tweaking things a bit. But we can do that after 
> this lands.

Yes. It's expected.

I did try various ways to optimize it before sending the patch, including:

- Like `memchr`, test 8 bytes at once. Difficulty: memory alignment is not 
guaranteed (ex. `msmall.ptr % 8 != mlarge.ptr % 8`).
- Use various SIMD related compiler flags.

The first makes things slower, even when I told the compiler to "pretend the 
memory is aligned". The second makes no difference.

> indygreg wrote in xprepare.c:199-202
> This is clever. But `memrchr()` will be easier to read. Plus I suspect it 
> will be faster.
> 
> If you disagree, let's compromise at:
> 
>   i = 0;
>   while (i <= reserved) {
>  pp1--;
>  i += (*pp1 == '\n');
>   }
> 
> There's no sense using a `for` without the 3rd parameter IMO.

I think readability of the current code is better, since the memrchr version 
needs a "size" parameter, which is a burden to the existing logic.

I did some research before sending this patch. The glibc memchr is basically 
relying on `maybe_contain_zero_byte` that can test 8 bytes at once. But CPU 
SIMD instructions are faster than that trick.

The following code counts "\n"s in a file, using 3 ways: naive loop, testing 8 
bytes at once, and actually using memchr. See the benchmark at the end.

  #include <fcntl.h>
  #include <stdint.h>
  #include <stdio.h>
  #include <string.h>
  #include <unistd.h>
  
  char buf[6400] __attribute__ ((aligned (16)));
  int size;
  
  static int count_naive() {
    int count = 0, i = 0;
    for (int i = 0; i < size; ++i) {
      count += buf[i] == '\n';
    }
    return count;
  }
  
  static int count_memchr() {
    /* Note: the search starts at buf + 1, and the final failed search is
       counted as well. */
    int count = 0, i = 0;
    const char *p = buf;
    while (p) {
      p = memchr(p + 1, '\n', buf + size - p);
      count++;
    }
    return count;
  }
  
  static inline int maybe_contain_zero_byte(uint64_t x) {
    // See https://github.com/lattera/glibc/blob/master/string/memchr.c
    const uint64_t MAGIC_BITS = 0x7efefefefefefeff;
    return ((((x + MAGIC_BITS) ^ ~x) & ~MAGIC_BITS) != 0);
  }
  
  static int count_u64() {
    uint64_t *p = (uint64_t *) &buf;
    uint64_t x = '\n' + ('\n' << 8);
    int count = 0;
    x |= x << 16;
    x |= x << 32;
    for (int i = 0; i < size / 8; ++i, ++p) {
      uint64_t v = *p ^ x;
      if (maybe_contain_zero_byte(v)) {
        const char *c = (const char *) p;
        for (int j = 0; j < 8; ++j) {
          count += (((v >> (8 * j)) & 0xff) == 0);
        }
      }
    }
    return count;
  }
  
  int main(int argc, char const *argv[]) {
    int fd = open(argv[1], O_RDONLY);
    size = (int) read(fd, buf, sizeof buf);
    if (argv[2] && argv[2][0] == 'n') {
      printf("naive:  %d\n", count_naive());
    } else if (argv[2] && argv[2][0] == 'm') {
      printf("memchr: %d\n", count_memchr());
    } else {
      printf("u64:    %d\n", count_u64());
    }
    return 0;
  }
  
  /*
  # gcc 7.3.0
  gcc -O2 a.c -o ao2
  gcc -O3 -mavx2 a.c -o ao3
  
  # best of 50 runs, wall time
  # test case: random data
  # head -c 6400 /dev/urandom > /tmp/r
  ./ao2 naive  0.069
  ./ao2 u64    0.043
  ./ao2 memchr 0.039
  ./ao3 naive  0.038  # best
  ./ao3 u64    0.043
  ./ao3 memchr 0.039
  
  # test case: real code
  # v=read('/home/quark/hg-committed/mercurial/commands.py')
  # write('/tmp/c', v * (6400/len(v)))
  ./ao2 naive  0.069
  ./ao2 u64    0.059
  ./ao2 memchr 0.055
  ./ao3 naive  0.038  # best
  ./ao3 u64    0.055
  ./ao3 memchr 0.055  # slower
  
  # ruby script to run the tests
  path = ARGV[0]
  %w[./ao2 ./ao3].product(%w[naive u64 memchr]).each do |exe, name|
    time = 50.times.map do
      t1 = Time.now
      system exe, path, name, 1=>'/dev/null'
      Time.now - t1
    end.min
    puts "#{exe} #{name.ljust(6)} #{time.round(3)}"
  end
  */

So I'd like to keep it simple and avoid over-optimization. After all, this is 
O(100)-ish, assuming line length won't be ridiculously long. Even if memchr is 
faster by 14%, it won't be noticeable. Not to mention it's 31% slower in the -O3 
case.
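
For readers who want to experiment with the "8 bytes at once" idea without a C toolchain, the word-at-a-time test can be sketched in Python. This is an illustration of the technique only, not a benchmark: Python integers are unbounded, so the sketch masks to 64 bits by hand, and unlike the C version it also counts the trailing bytes that do not fill a whole word. `count_newlines_u64` is a hypothetical name.

```python
MASK64 = (1 << 64) - 1
MAGIC_BITS = 0x7EFEFEFEFEFEFEFF


def maybe_contain_zero_byte(x):
    """Cheap test; may report false positives, which the caller filters."""
    total = (x + MAGIC_BITS) & MASK64
    return ((total ^ (x ^ MASK64)) & (MAGIC_BITS ^ MASK64)) != 0


def count_newlines_u64(data):
    # XOR with a word of newlines turns every '\n' byte into a zero byte.
    pattern = int.from_bytes(b'\n' * 8, 'little')
    full = len(data) - len(data) % 8
    count = 0
    for off in range(0, full, 8):
        v = int.from_bytes(data[off:off + 8], 'little') ^ pattern
        if maybe_contain_zero_byte(v):
            # Confirm byte by byte, since the cheap test can misfire.
            count += sum(1 for j in range(8) if (v >> (8 * j)) & 0xFF == 0)
    # Count the trailing partial word the simple way.
    return count + data[full:].count(b'\n')
```

The result can be checked against `data.count(b'\n')` for any input.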

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2686

To: quark, #hg-reviewers, indygreg
Cc: indygreg, mercurial-devel

D2720: debugcommands: introduce actions to perform deterministic reads

2018-03-09 Thread indygreg (Gregory Szorc)
indygreg added a comment.


  My only explanation is this is stdout output buffering and things are really 
hanging on the next read operation.
  
  Maybe try sprinkling some `ui.fout.flush()` and/or `util.stdout.flush()` 
and/or `sys.stdout.flush()` calls throughout the debug command? Maybe at the 
first thing in the loop that evaluates commands to execute?

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2720

To: indygreg, #hg-reviewers
Cc: mharbison72, mercurial-devel


mercurial@36786: 8 new changesets

2018-03-09 Thread Mercurial Commits
8 new changesets in mercurial:

https://www.mercurial-scm.org/repo/hg/rev/bf9a04d78084
changeset:   36779:bf9a04d78084
user:Augie Fackler 
date:Sun Mar 04 21:14:24 2018 -0500
summary: hgweb: adapt to socket._fileobject changes in Python 3

https://www.mercurial-scm.org/repo/hg/rev/f3c314020beb
changeset:   36780:f3c314020beb
user:Augie Fackler 
date:Mon Mar 05 15:07:32 2018 -0500
summary: osutil: implement minimal __getitem__ compatibility on our custom 
listdir type

https://www.mercurial-scm.org/repo/hg/rev/ffa3026d4196
changeset:   36781:ffa3026d4196
user:Augie Fackler 
date:Mon Mar 05 12:30:20 2018 -0500
summary: cleanup: use stat_result[stat.ST_MTIME] instead of 
stat_result.st_mtime

https://www.mercurial-scm.org/repo/hg/rev/86ba6e3eba4e
changeset:   36782:86ba6e3eba4e
user:Augie Fackler 
date:Mon Mar 05 12:31:08 2018 -0500
summary: util: stop calling os.stat_float_times()

https://www.mercurial-scm.org/repo/hg/rev/1fbbb8e83392
changeset:   36783:1fbbb8e83392
user:Yuya Nishihara 
date:Sun Mar 04 18:21:16 2018 -0500
summary: py3: read/write plain lock file in binary mode

https://www.mercurial-scm.org/repo/hg/rev/e3732c3ab92d
changeset:   36784:e3732c3ab92d
user:Yuya Nishihara 
date:Sun Mar 04 18:34:46 2018 -0500
summary: py3: fix type of default username

https://www.mercurial-scm.org/repo/hg/rev/e2c0c0884b1f
changeset:   36785:e2c0c0884b1f
user:Yuya Nishihara 
date:Sun Mar 04 18:41:09 2018 -0500
summary: py3: make test-commit-multiple.t byte-safe

https://www.mercurial-scm.org/repo/hg/rev/ed46d48453e8
changeset:   36786:ed46d48453e8
bookmark:@
tag: tip
user:Yuya Nishihara 
date:Sun Mar 04 18:47:07 2018 -0500
summary: py3: drop b'' from generate-working-copy-states.py output

-- 
Repository URL: https://www.mercurial-scm.org/repo/hg


D2720: debugcommands: introduce actions to perform deterministic reads

2018-03-09 Thread mharbison72 (Matt Harbison)
mharbison72 added a comment.


  In https://phab.mercurial-scm.org/D2720#44350, @indygreg wrote:
  
  > In https://phab.mercurial-scm.org/D2720#44293, @mharbison72 wrote:
  >
  > > i> write(4) -> 4:
  > >  i> 426\n
  > >  i> write(426) -> 426:
  > >  i> 
HG10UN\x00\x00\x00\x9eh\x98b\x13\xbdD\x85\xeaQS55\xe3\xfc\x9ex\x00zq\x1f\x00\x00\x00\x00\x00\x00\x00\x0
  > >  
0\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\
  > >  
x00\x00\x00\x00\x00h\x98b\x13\xbdD\x85\xeaQS55\xe3\xfc\x9ex\x00zq\x1f\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\
  > >  x00>cba485ca3678256e044428f70f58291196f6e9de\n
  > >  i> test\n
  > >  i> 0 0\n
  > >  i> foo\n
  > >  i> \n
  > >  i> 
initial\x00\x00\x00\x00\x00\x00\x00\x8d\xcb\xa4\x85\xca6x%n\x04D(\xf7\x0fX)\x11\x96\xf6\xe9\xde\x00\x00
  > >  
\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x
  > >  
00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00h\x98b\x13\xbdD\x85\xeaQS55\xe3\xfc\x9ex\x00zq\x1f\x00\x00\x00\x00\x
  > >  
00\x00\x00\x00\x00\x00\x00-foo\x00362fef284ce2ca02aecc8de6d5e8a1c3af0556fe\n
  > >  i> 
\x00\x00\x00\x00\x00\x00\x00\x07foo\x00\x00\x00b6/\xef(L\xe2\xca\x02\xae\xcc\x8d\xe6\xd5\xe8\xa1\xc3\xa
  > >  
f\x05V\xfe\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00
  > >  \x00\x00\x00\x00\x00\x00\x00\x00\
  > >
  > >  
  >
  >
  > This is a wonky place to hang! For context, this is processing a `command 
unbundle` line. The lines that follow should be:
  >
  >   i> write(2) -> 2:
  >   i> 0\n
  >   i> flush() -> None
  >   o> readline() -> 2:
  >   o> 0\n
  >   o> readline() -> 2:
  >   o> 1\n
  >   o> read(1) -> 1: 0
  >   result: 0
  >   remote output: 
  >   e> read(115) -> 115:
  >   e> abort: incompatible Mercurial client; bundle2 required\n
  >   e> (see https://www.mercurial-scm.org/wiki/IncompatibleClient)\n
  >   
  >
  > What happens under the hood during pushes is we send out chunks containing 
the bundle followed by an empty chunk. That code is:
  >
  >   for d in iter(lambda: fp.read(4096), ''):
  >   self._writeframed(d)
  >   self._writeframed("", flush=True)
  >
  >
  > That empty chunk is apparently not getting sent. Or its logging is not 
getting written/printed.
  >
  > This patch shouldn't have changed any behavior with regard to this part of 
the I/O. So I'm scratching my head over how this caused deadlock. Are you sure 
you can bisect it to this patch?
  
  
  Yep.  I imported this with `hg phabread --stack`, and can run the 
https://phab.mercurial-scm.org/D2719 in 50 seconds or so.  The debug run of 
this I let hang out for 20 minutes before killing it.
  
  Here's the neighboring failure from https://phab.mercurial-scm.org/D2719, 
because I can't explain the AWOL output.  (But that shouldn't have stopped 
stdout above mid stream.)
  
--- e:/Projects/hg/tests/test-ssh-proto-unbundle.t
+++ e:/Projects/hg/tests/test-ssh-proto-unbundle.t.err
@@ -93,9 +93,6 @@
   o> read(1) -> 1: 0
   result: 0
   remote output:
-  e> read(-1) -> 115:
-  e> abort: incompatible Mercurial client; bundle2 required\n
-  e> (see https://www.mercurial-scm.org/wiki/IncompatibleClient)\n

   testing ssh2
   creating ssh peer from handshake results
@@ -143,9 +140,6 @@
   o> read(1) -> 1: 0
   result: 0
   remote output:
-  e> read(-1) -> 115:
-  e> abort: incompatible Mercurial client; bundle2 required\n
-  e> (see https://www.mercurial-scm.org/wiki/IncompatibleClient)\n

   $ cd ..
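
The chunk framing described in the quoted explanation (a length line, the payload, then an empty chunk as terminator) can be sketched as follows. The frame layout is inferred from the `426\n` and `0\n` writes in the log above; `write_framed` and `send_payload` are hypothetical names, not the real methods.

```python
import io


def write_framed(out, data):
    """Write one frame: decimal length, newline, then the payload."""
    out.write(b'%d\n' % len(data))
    if data:
        out.write(data)


def send_payload(fp, out, chunksize=4096):
    # iter() with a sentinel stops once read() returns an empty chunk.
    for chunk in iter(lambda: fp.read(chunksize), b''):
        write_framed(out, chunk)
    # The empty frame tells the peer that the payload is complete.
    write_framed(out, b'')
    out.flush()


wire = io.BytesIO()
send_payload(io.BytesIO(b'x' * 426), wire)
sent = wire.getvalue()
```

If the final empty frame never reaches the peer, the peer keeps waiting for more data, which is consistent with the hang being discussed.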

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2720

To: indygreg, #hg-reviewers
Cc: mharbison72, mercurial-devel


D2741: wireprotoserver: remove broken optimization for non-httplib client

2018-03-09 Thread indygreg (Gregory Szorc)
This revision was automatically updated to reflect the committed changes.
Closed by commit rHG9a6216c18ffd: wireprotoserver: remove broken optimization 
for non-httplib client (authored by indygreg, committed by ).

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D2741?vs=6747&id=6785

REVISION DETAIL
  https://phab.mercurial-scm.org/D2741

AFFECTED FILES
  mercurial/wireprotoserver.py

CHANGE DETAILS

diff --git a/mercurial/wireprotoserver.py b/mercurial/wireprotoserver.py
--- a/mercurial/wireprotoserver.py
+++ b/mercurial/wireprotoserver.py
@@ -328,10 +328,7 @@
 if (wsgireq.env[r'REQUEST_METHOD'] == r'POST' and
 # But not if Expect: 100-continue is being used.
 (wsgireq.env.get('HTTP_EXPECT',
- '').lower() != '100-continue') or
-# Or the non-httplib HTTP library is being advertised by
-# the client.
-wsgireq.env.get('X-HgHttp2', '')):
+ '').lower() != '100-continue')):
 wsgireq.drain()
 else:
 wsgireq.headers.append((r'Connection', r'Close'))
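
The condition that remains after this patch can be written as a small predicate. The sketch assumes a plain WSGI-style dict with str keys, and `should_drain` is a hypothetical name. Because Python's `and` binds tighter than `or`, the removed clause acted as an independent alternative to the POST check rather than as a refinement of it.

```python
def should_drain(environ):
    """Mirror of the post-patch condition, using a plain dict.

    ``environ`` is assumed to be a WSGI-style dict with str keys.
    """
    return (environ.get('REQUEST_METHOD') == 'POST' and
            environ.get('HTTP_EXPECT', '').lower() != '100-continue')
```

When the predicate is false, the code in the diff appends a `Connection: Close` header instead of draining the request body.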



To: indygreg, #hg-reviewers, durin42
Cc: mercurial-devel


D2738: hgweb: only recognize wire protocol commands from query string (BC)

2018-03-09 Thread indygreg (Gregory Szorc)
This revision was automatically updated to reflect the committed changes.
Closed by commit rHGad13301e6479: hgweb: only recognize wire protocol commands 
from query string (BC) (authored by indygreg, committed by ).

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D2738?vs=6744&id=6782

REVISION DETAIL
  https://phab.mercurial-scm.org/D2738

AFFECTED FILES
  mercurial/hgweb/hgweb_mod.py
  mercurial/wireprotoserver.py

CHANGE DETAILS

diff --git a/mercurial/wireprotoserver.py b/mercurial/wireprotoserver.py
--- a/mercurial/wireprotoserver.py
+++ b/mercurial/wireprotoserver.py
@@ -150,25 +150,26 @@
 def iscmd(cmd):
 return cmd in wireproto.commands
 
-def parsehttprequest(rctx, wsgireq, query, checkperm):
+def parsehttprequest(rctx, wsgireq, req, checkperm):
 """Parse the HTTP request for a wire protocol request.
 
 If the current request appears to be a wire protocol request, this
 function returns a dict with details about that request, including
 an ``abstractprotocolserver`` instance suitable for handling the
 request. Otherwise, ``None`` is returned.
 
 ``wsgireq`` is a ``wsgirequest`` instance.
+``req`` is a ``parsedrequest`` instance.
 """
 repo = rctx.repo
 
 # HTTP version 1 wire protocol requests are denoted by a "cmd" query
 # string parameter. If it isn't present, this isn't a wire protocol
 # request.
-if 'cmd' not in wsgireq.form:
+if 'cmd' not in req.querystringdict:
 return None
 
-cmd = wsgireq.form['cmd'][0]
+cmd = req.querystringdict['cmd'][0]
 
 # The "cmd" request parameter is used by both the wire protocol and hgweb.
 # While not all wire protocol commands are available for all transports,
diff --git a/mercurial/hgweb/hgweb_mod.py b/mercurial/hgweb/hgweb_mod.py
--- a/mercurial/hgweb/hgweb_mod.py
+++ b/mercurial/hgweb/hgweb_mod.py
@@ -330,7 +330,7 @@
 
 # Route it to a wire protocol handler if it looks like a wire protocol
 # request.
-protohandler = wireprotoserver.parsehttprequest(rctx, wsgireq, query,
+protohandler = wireprotoserver.parsehttprequest(rctx, wsgireq, req,
 self.check_perm)
 
 if protohandler:
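
The behaviour change can be sketched with the standard library: only the URL query string is consulted for "cmd", so a "cmd" field in a POST body no longer counts. `wire_command` is a hypothetical helper and `KNOWN_COMMANDS` is a placeholder set, not the real command table.

```python
from urllib.parse import parse_qs

KNOWN_COMMANDS = {'capabilities', 'unbundle'}   # placeholder set; an assumption


def wire_command(querystring):
    """Return the wire protocol command named in the URL, or None."""
    params = parse_qs(querystring, keep_blank_values=True)
    if 'cmd' not in params:
        return None
    cmd = params['cmd'][0]
    return cmd if cmd in KNOWN_COMMANDS else None
```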



To: indygreg, #hg-reviewers, durin42
Cc: mercurial-devel


D2740: wireprotoserver: move all wire protocol handling logic out of hgweb

2018-03-09 Thread indygreg (Gregory Szorc)
This revision was automatically updated to reflect the committed changes.
Closed by commit rHG09b9a9d4612b: wireprotoserver: move all wire protocol 
handling logic out of hgweb (authored by indygreg, committed by ).
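
The calling convention introduced by the diff below, a 2-tuple of (handled, response), can be sketched in isolation. All names and response values here are hypothetical; the point is only the control flow between the wire protocol layer and the web layer.

```python
def handle_wire_request(query, dispatchpath, commands):
    """Return (handled, response); names and responses are hypothetical."""
    cmd = query.get('cmd')
    if cmd is None or cmd not in commands:
        return False, None             # let the normal web UI take over
    if dispatchpath:
        return True, '404 Not Found'   # cmd is only valid on the repo root
    return True, commands[cmd]()


def application(query, dispatchpath, commands):
    handled, res = handle_wire_request(query, dispatchpath, commands)
    if handled:
        return res
    return 'rendered web page'
```

With this shape the caller no longer needs to know which exceptions the wire protocol layer may raise.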

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D2740?vs=6746&id=6784

REVISION DETAIL
  https://phab.mercurial-scm.org/D2740

AFFECTED FILES
  mercurial/hgweb/hgweb_mod.py
  mercurial/wireprotoserver.py

CHANGE DETAILS

diff --git a/mercurial/wireprotoserver.py b/mercurial/wireprotoserver.py
--- a/mercurial/wireprotoserver.py
+++ b/mercurial/wireprotoserver.py
@@ -150,24 +150,29 @@
 def iscmd(cmd):
 return cmd in wireproto.commands
 
-def parsehttprequest(rctx, wsgireq, req, checkperm):
-"""Parse the HTTP request for a wire protocol request.
+def handlewsgirequest(rctx, wsgireq, req, checkperm):
+"""Possibly process a wire protocol request.
 
-If the current request appears to be a wire protocol request, this
-function returns a dict with details about that request, including
-an ``abstractprotocolserver`` instance suitable for handling the
-request. Otherwise, ``None`` is returned.
+If the current request is a wire protocol request, the request is
+processed by this function.
 
 ``wsgireq`` is a ``wsgirequest`` instance.
 ``req`` is a ``parsedrequest`` instance.
+
+Returns a 2-tuple of (bool, response) where the 1st element indicates
+whether the request was handled and the 2nd element is a return
+value for a WSGI application (often a generator of bytes).
 """
+# Avoid cycle involving hg module.
+from .hgweb import common as hgwebcommon
+
 repo = rctx.repo
 
 # HTTP version 1 wire protocol requests are denoted by a "cmd" query
 # string parameter. If it isn't present, this isn't a wire protocol
 # request.
 if 'cmd' not in req.querystringdict:
-return None
+return False, None
 
 cmd = req.querystringdict['cmd'][0]
 
@@ -179,17 +184,32 @@
 # known wire protocol commands and it is less confusing for machine
 # clients.
 if not iscmd(cmd):
-return None
+return False, None
+
+# The "cmd" query string argument is only valid on the root path of the
+# repo. e.g. ``/?cmd=foo``, ``/repo?cmd=foo``. URL paths within the repo
+# like ``/blah?cmd=foo`` are not allowed. So don't recognize the request
+# in this case. We send an HTTP 404 for backwards compatibility reasons.
+if req.dispatchpath:
+res = _handlehttperror(
+hgwebcommon.ErrorResponse(hgwebcommon.HTTP_NOT_FOUND), wsgireq,
+cmd)
+
+return True, res
 
 proto = httpv1protocolhandler(wsgireq, repo.ui,
   lambda perm: checkperm(rctx, wsgireq, perm))
 
-return {
-'cmd': cmd,
-'proto': proto,
-'dispatch': lambda: _callhttp(repo, wsgireq, proto, cmd),
-'handleerror': lambda ex: _handlehttperror(ex, wsgireq, cmd),
-}
+# The permissions checker should be the only thing that can raise an
+# ErrorResponse. It is kind of a layer violation to catch an hgweb
+# exception here. So consider refactoring into an exception type that
+# is associated with the wire protocol.
+try:
+res = _callhttp(repo, wsgireq, proto, cmd)
+except hgwebcommon.ErrorResponse as e:
+res = _handlehttperror(e, wsgireq, cmd)
+
+return True, res
 
 def _httpresponsetype(ui, wsgireq, prefer_uncompressed):
 """Determine the appropriate response type and compression settings.
diff --git a/mercurial/hgweb/hgweb_mod.py b/mercurial/hgweb/hgweb_mod.py
--- a/mercurial/hgweb/hgweb_mod.py
+++ b/mercurial/hgweb/hgweb_mod.py
@@ -318,25 +318,16 @@
if h[0] != 'Content-Security-Policy']
 wsgireq.headers.append(('Content-Security-Policy', rctx.csp))
 
+handled, res = wireprotoserver.handlewsgirequest(
+rctx, wsgireq, req, self.check_perm)
+if handled:
+return res
+
 if req.havepathinfo:
 query = req.dispatchpath
 else:
 query = req.querystring.partition('&')[0].partition(';')[0]
 
-# Route it to a wire protocol handler if it looks like a wire protocol
-# request.
-protohandler = wireprotoserver.parsehttprequest(rctx, wsgireq, req,
-self.check_perm)
-
-if protohandler:
-try:
-if query:
-raise ErrorResponse(HTTP_NOT_FOUND)
-
-return protohandler['dispatch']()
-except ErrorResponse as inst:
-return protohandler['handleerror'](inst)
-
 # translate user-visible url structure to internal structure
 
 args = query.split('/', 2)



To: indygreg, #hg-reviewers, durin42
Cc: mercurial-devel
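
The (handled, response) contract described in the new docstring can be sketched roughly as follows. This is a simplified illustration rather than the code in the patch: KNOWN_COMMANDS, handle_wire_request and application are hypothetical names, and the byte strings stand in for real WSGI response bodies.

```python
# Hypothetical stand-ins; none of these names exist in Mercurial.
KNOWN_COMMANDS = {'capabilities', 'heads'}


def handle_wire_request(querystringdict, dispatchpath):
    """Return a 2-tuple of (handled, response), as in the patch's docstring."""
    # No "cmd" parameter: not a wire protocol request.
    if 'cmd' not in querystringdict:
        return False, None

    cmd = querystringdict['cmd'][0]

    # Unknown commands are left for the regular web interface.
    if cmd not in KNOWN_COMMANDS:
        return False, None

    # A known command on a non-root path is handled, but as an error.
    if dispatchpath:
        return True, [b'404 Not Found\n']

    return True, [b'ran ' + cmd.encode('ascii') + b'\n']


def application(querystringdict, dispatchpath):
    """Caller side: give the wire protocol the first chance to respond."""
    handled, res = handle_wire_request(querystringdict, dispatchpath)
    if handled:
        return res
    return [b'regular hgweb page\n']
```

Returning a tuple rather than None-or-dict lets the caller tell "not a wire protocol request" apart from "handled, here is the response" without inspecting the response itself.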

D2739: hgweb: use parsed request to construct query parameters

2018-03-09 Thread indygreg (Gregory Szorc)
This revision was automatically updated to reflect the committed changes.
Closed by commit rHG0c1de2e87c6e: hgweb: use parsed request to construct query 
parameters (authored by indygreg, committed by ).

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D2739?vs=6745&id=6783

REVISION DETAIL
  https://phab.mercurial-scm.org/D2739

AFFECTED FILES
  mercurial/hgweb/hgweb_mod.py
  mercurial/hgweb/request.py

CHANGE DETAILS

diff --git a/mercurial/hgweb/request.py b/mercurial/hgweb/request.py
--- a/mercurial/hgweb/request.py
+++ b/mercurial/hgweb/request.py
@@ -76,6 +76,9 @@
 dispatchparts = attr.ib()
 # URL path component (no query string) used for dispatch.
 dispatchpath = attr.ib()
+# Whether there is a path component to this request. This can be true
+# when ``dispatchpath`` is empty due to REPO_NAME muckery.
+havepathinfo = attr.ib()
 # Raw query string (part after "?" in URL).
 querystring = attr.ib()
 # List of 2-tuples of query string arguments.
@@ -188,6 +191,7 @@
  advertisedbaseurl=advertisedbaseurl,
  apppath=apppath,
  dispatchparts=dispatchparts, dispatchpath=dispatchpath,
+ havepathinfo='PATH_INFO' in env,
  querystring=querystring,
  querystringlist=querystringlist,
  querystringdict=querystringdict)
diff --git a/mercurial/hgweb/hgweb_mod.py b/mercurial/hgweb/hgweb_mod.py
--- a/mercurial/hgweb/hgweb_mod.py
+++ b/mercurial/hgweb/hgweb_mod.py
@@ -318,15 +318,10 @@
if h[0] != 'Content-Security-Policy']
 wsgireq.headers.append(('Content-Security-Policy', rctx.csp))
 
-if r'PATH_INFO' in wsgireq.env:
-parts = wsgireq.env[r'PATH_INFO'].strip(r'/').split(r'/')
-repo_parts = wsgireq.env.get(r'REPO_NAME', r'').split(r'/')
-if parts[:len(repo_parts)] == repo_parts:
-parts = parts[len(repo_parts):]
-query = r'/'.join(parts)
+if req.havepathinfo:
+query = req.dispatchpath
 else:
-query = wsgireq.env[r'QUERY_STRING'].partition(r'&')[0]
-query = query.partition(r';')[0]
+query = req.querystring.partition('&')[0].partition(';')[0]
 
 # Route it to a wire protocol handler if it looks like a wire protocol
 # request.
@@ -344,7 +339,7 @@
 
 # translate user-visible url structure to internal structure
 
-args = query.split(r'/', 2)
+args = query.split('/', 2)
 if 'cmd' not in wsgireq.form and args and args[0]:
 cmd = args.pop(0)
 style = cmd.rfind('-')



To: indygreg, #hg-reviewers, durin42
Cc: mercurial-devel
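
As an illustration of the simplified logic in hgweb_mod.py, the dispatch query can be derived from the parsed request along these lines. dispatch_query is a hypothetical helper written for this sketch; the patch inlines the same expressions.

```python
def dispatch_query(havepathinfo, dispatchpath, querystring):
    """Pick the string that is later split into command and arguments."""
    if havepathinfo:
        # PATH_INFO was present, so the path decides (it may be empty).
        return dispatchpath
    # Otherwise use the first query string component, cut at "&" or ";".
    return querystring.partition('&')[0].partition(';')[0]


# The result is then split into at most three parts, as in the patch.
args = dispatch_query(True, 'file/tip/dir/name.txt', '').split('/', 2)
```

Note that an empty dispatchpath still wins when PATH_INFO is present, which is why the parsed request carries havepathinfo separately.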


D2742: hgweb: parse and store HTTP request headers

2018-03-09 Thread indygreg (Gregory Szorc)
This revision was automatically updated to reflect the committed changes.
Closed by commit rHG29b477d7f334: hgweb: parse and store HTTP request headers 
(authored by indygreg, committed by ).

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D2742?vs=6748&id=6786

REVISION DETAIL
  https://phab.mercurial-scm.org/D2742

AFFECTED FILES
  mercurial/hgweb/hgweb_mod.py
  mercurial/hgweb/request.py

CHANGE DETAILS

diff --git a/mercurial/hgweb/request.py b/mercurial/hgweb/request.py
--- a/mercurial/hgweb/request.py
+++ b/mercurial/hgweb/request.py
@@ -11,6 +11,7 @@
 import cgi
 import errno
 import socket
+import wsgiref.headers as wsgiheaders
 #import wsgiref.validate
 
 from .common import (
@@ -85,6 +86,9 @@
 querystringlist = attr.ib()
 # Dict of query string arguments. Values are lists with at least 1 item.
 querystringdict = attr.ib()
+# wsgiref.headers.Headers instance. Operates like a dict with case
+# insensitive keys.
+headers = attr.ib()
 
 def parserequestfromenv(env):
 """Parse URL components from environment variables.
@@ -186,15 +190,26 @@
 else:
 querystringdict[k] = [v]
 
+# HTTP_* keys contain HTTP request headers. The Headers structure should
+# perform case normalization for us. We just rewrite underscore to dash
+# so keys match what likely went over the wire.
+headers = []
+for k, v in env.iteritems():
+if k.startswith('HTTP_'):
+headers.append((k[len('HTTP_'):].replace('_', '-'), v))
+
+headers = wsgiheaders.Headers(headers)
+
 return parsedrequest(url=fullurl, baseurl=baseurl,
  advertisedurl=advertisedfullurl,
  advertisedbaseurl=advertisedbaseurl,
  apppath=apppath,
  dispatchparts=dispatchparts, dispatchpath=dispatchpath,
  havepathinfo='PATH_INFO' in env,
  querystring=querystring,
  querystringlist=querystringlist,
- querystringdict=querystringdict)
+ querystringdict=querystringdict,
+ headers=headers)
 
 class wsgirequest(object):
 """Higher-level API for a WSGI request.
diff --git a/mercurial/hgweb/hgweb_mod.py b/mercurial/hgweb/hgweb_mod.py
--- a/mercurial/hgweb/hgweb_mod.py
+++ b/mercurial/hgweb/hgweb_mod.py
@@ -351,7 +351,7 @@
 if args:
 wsgireq.form['file'] = args
 
-ua = wsgireq.env.get('HTTP_USER_AGENT', '')
+ua = req.headers.get('User-Agent', '')
 if cmd == 'rev' and 'mercurial' in ua:
 wsgireq.form['style'] = ['raw']
 



To: indygreg, #hg-reviewers, durin42
Cc: mercurial-devel
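
For reference, the header extraction added to request.py can be tried in isolation. The sketch below targets Python 3 (so it uses items() instead of iteritems()); headers_from_environ and the sample environment values are assumptions made for illustration.

```python
from wsgiref.headers import Headers


def headers_from_environ(env):
    """Collect HTTP_* environ keys into a case-insensitive Headers object."""
    pairs = []
    for key, value in env.items():
        if key.startswith('HTTP_'):
            # HTTP_USER_AGENT -> USER-AGENT; Headers ignores case on lookup.
            pairs.append((key[len('HTTP_'):].replace('_', '-'), value))
    return Headers(pairs)


env = {
    'HTTP_USER_AGENT': 'mercurial/proto-1.0',
    'HTTP_X_HGARG_1': 'cmds=heads',
    'SERVER_NAME': 'example.invalid',
}
headers = headers_from_environ(env)
ua = headers.get('User-Agent', '')
```

Because lookups are case-insensitive, callers can write the header name the way it appears on the wire instead of the upper-cased environ spelling.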


D2736: hgweb: use the parsed application path directly

2018-03-09 Thread indygreg (Gregory Szorc)
This revision was automatically updated to reflect the committed changes.
Closed by commit rHG328d665ef23d: hgweb: use the parsed application path 
directly (authored by indygreg, committed by ).

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D2736?vs=6742&id=6781

REVISION DETAIL
  https://phab.mercurial-scm.org/D2736

AFFECTED FILES
  mercurial/hgweb/hgweb_mod.py
  mercurial/hgweb/request.py

CHANGE DETAILS

diff --git a/mercurial/hgweb/request.py b/mercurial/hgweb/request.py
--- a/mercurial/hgweb/request.py
+++ b/mercurial/hgweb/request.py
@@ -146,14 +146,13 @@
 # root. We also exclude its path components from PATH_INFO when resolving
 # the dispatch path.
 
-# TODO the use of trailing slashes in apppath is arguably wrong. We need it
-# to appease low-level parts of hgweb_mod for now.
 apppath = env['SCRIPT_NAME']
-if not apppath.endswith('/'):
-apppath += '/'
 
 if env.get('REPO_NAME'):
-apppath += env.get('REPO_NAME') + '/'
+if not apppath.endswith('/'):
+apppath += '/'
+
+apppath += env.get('REPO_NAME')
 
 if 'PATH_INFO' in env:
 dispatchparts = env['PATH_INFO'].strip('/').split('/')
diff --git a/mercurial/hgweb/hgweb_mod.py b/mercurial/hgweb/hgweb_mod.py
--- a/mercurial/hgweb/hgweb_mod.py
+++ b/mercurial/hgweb/hgweb_mod.py
@@ -148,7 +148,7 @@
 logourl = self.config('web', 'logourl')
 logoimg = self.config('web', 'logoimg')
 staticurl = (self.config('web', 'staticurl')
- or pycompat.sysbytes(wsgireq.url) + 'static/')
+ or req.apppath + '/static/')
 if not staticurl.endswith('/'):
 staticurl += '/'
 
@@ -170,24 +170,24 @@
 if not self.reponame:
 self.reponame = (self.config('web', 'name', '')
  or wsgireq.env.get('REPO_NAME')
- or wsgireq.url.strip(r'/') or self.repo.root)
+ or req.apppath or self.repo.root)
 
 def websubfilter(text):
 return templatefilters.websub(text, self.websubtable)
 
 # create the templater
 # TODO: export all keywords: defaults = templatekw.keywords.copy()
 defaults = {
-'url': pycompat.sysbytes(wsgireq.url),
+'url': req.apppath + '/',
 'logourl': logourl,
 'logoimg': logoimg,
 'staticurl': staticurl,
 'urlbase': req.advertisedbaseurl,
 'repo': self.reponame,
 'encoding': encoding.encoding,
 'motd': motd,
 'sessionvars': sessionvars,
-'pathdef': makebreadcrumb(pycompat.sysbytes(wsgireq.url)),
+'pathdef': makebreadcrumb(req.apppath),
 'style': style,
 'nonce': self.nonce,
 }
@@ -318,8 +318,6 @@
if h[0] != 'Content-Security-Policy']
 wsgireq.headers.append(('Content-Security-Policy', rctx.csp))
 
-wsgireq.url = pycompat.sysstr(req.apppath)
-
 if r'PATH_INFO' in wsgireq.env:
 parts = wsgireq.env[r'PATH_INFO'].strip(r'/').split(r'/')
 repo_parts = wsgireq.env.get(r'REPO_NAME', r'').split(r'/')



To: indygreg, #hg-reviewers, durin42
Cc: mercurial-devel
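
The effect of dropping the trailing slash from apppath can be seen in a small stand-alone sketch. compute_apppath is a hypothetical wrapper around the logic in the request.py hunk, and the environment dicts are made-up examples.

```python
def compute_apppath(env):
    """Join SCRIPT_NAME and REPO_NAME without adding a trailing slash."""
    apppath = env['SCRIPT_NAME']
    if env.get('REPO_NAME'):
        if not apppath.endswith('/'):
            apppath += '/'
        apppath += env.get('REPO_NAME')
    return apppath


root = compute_apppath({'SCRIPT_NAME': ''})
nested = compute_apppath({'SCRIPT_NAME': '/hg', 'REPO_NAME': 'project/main'})

# Callers now append the slash themselves, as the hgweb_mod.py hunk does.
staticurl = nested + '/static/'
```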


D2734: hgweb: parse WSGI request into a data structure

2018-03-09 Thread indygreg (Gregory Szorc)
This revision was automatically updated to reflect the committed changes.
Closed by commit rHG3e13a21ce6b5: hgweb: parse WSGI request into a data 
structure (authored by indygreg, committed by ).

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D2734?vs=6740&id=6778

REVISION DETAIL
  https://phab.mercurial-scm.org/D2734

AFFECTED FILES
  mercurial/hgweb/hgweb_mod.py
  mercurial/hgweb/hgwebdir_mod.py
  mercurial/hgweb/request.py

CHANGE DETAILS

diff --git a/mercurial/hgweb/request.py b/mercurial/hgweb/request.py
--- a/mercurial/hgweb/request.py
+++ b/mercurial/hgweb/request.py
@@ -11,13 +11,17 @@
 import cgi
 import errno
 import socket
+#import wsgiref.validate
 
 from .common import (
 ErrorResponse,
 HTTP_NOT_MODIFIED,
 statusmessage,
 )
 
+from ..thirdparty import (
+attr,
+)
 from .. import (
 pycompat,
 util,
@@ -54,6 +58,124 @@
 pycompat.bytesurl(i.strip()) for i in v]
 return bytesform
 
+@attr.s(frozen=True)
+class parsedrequest(object):
+"""Represents a parsed WSGI request / static HTTP request parameters."""
+
+# Full URL for this request.
+url = attr.ib()
+# URL without any path components. Just ://.
+baseurl = attr.ib()
+# Advertised URL. Like ``url`` and ``baseurl`` but uses SERVER_NAME instead
+# of HTTP: Host header for hostname. This is likely what clients used.
+advertisedurl = attr.ib()
+advertisedbaseurl = attr.ib()
+# WSGI application path.
+apppath = attr.ib()
+# List of path parts to be used for dispatch.
+dispatchparts = attr.ib()
+# URL path component (no query string) used for dispatch.
+dispatchpath = attr.ib()
+# Raw query string (part after "?" in URL).
+querystring = attr.ib()
+
+def parserequestfromenv(env):
+"""Parse URL components from environment variables.
+
+WSGI defines request attributes via environment variables. This function
+parses the environment variables into a data structure.
+"""
+# PEP-0333 defines the WSGI spec and is a useful reference for this code.
+
+# We first validate that the incoming object conforms with the WSGI spec.
+# We only want to be dealing with spec-conforming WSGI implementations.
+# TODO enable this once we fix internal violations.
+#wsgiref.validate.check_environ(env)
+
+# PEP-0333 states that environment keys and values are native strings
+# (bytes on Python 2 and str on Python 3). The code points for the Unicode
+# strings on Python 3 must be between \0-\000FF. We deal with bytes
+# in Mercurial, so mass convert string keys and values to bytes.
+if pycompat.ispy3:
+env = {k.encode('latin-1'): v for k, v in env.iteritems()}
+env = {k: v.encode('latin-1') if isinstance(v, str) else v
+   for k, v in env.iteritems()}
+
+# https://www.python.org/dev/peps/pep-0333/#environ-variables defines
+# the environment variables.
+# https://www.python.org/dev/peps/pep-0333/#url-reconstruction defines
+# how URLs are reconstructed.
+fullurl = env['wsgi.url_scheme'] + '://'
+advertisedfullurl = fullurl
+
+def addport(s):
+if env['wsgi.url_scheme'] == 'https':
+if env['SERVER_PORT'] != '443':
+s += ':' + env['SERVER_PORT']
+else:
+if env['SERVER_PORT'] != '80':
+s += ':' + env['SERVER_PORT']
+
+return s
+
+if env.get('HTTP_HOST'):
+fullurl += env['HTTP_HOST']
+else:
+fullurl += env['SERVER_NAME']
+fullurl = addport(fullurl)
+
+advertisedfullurl += env['SERVER_NAME']
+advertisedfullurl = addport(advertisedfullurl)
+
+baseurl = fullurl
+advertisedbaseurl = advertisedfullurl
+
+fullurl += util.urlreq.quote(env.get('SCRIPT_NAME', ''))
+advertisedfullurl += util.urlreq.quote(env.get('SCRIPT_NAME', ''))
+fullurl += util.urlreq.quote(env.get('PATH_INFO', ''))
+advertisedfullurl += util.urlreq.quote(env.get('PATH_INFO', ''))
+
+if env.get('QUERY_STRING'):
+fullurl += '?' + env['QUERY_STRING']
+advertisedfullurl += '?' + env['QUERY_STRING']
+
+# When dispatching requests, we look at the URL components (PATH_INFO
+# and QUERY_STRING) after the application root (SCRIPT_NAME). But hgwebdir
+# has the concept of "virtual" repositories. This is defined via REPO_NAME.
+# If REPO_NAME is defined, we append it to SCRIPT_NAME to form a new app
+# root. We also exclude its path components from PATH_INFO when resolving
+# the dispatch path.
+
+# TODO the use of trailing slashes in apppath is arguably wrong. We need it
+# to appease low-level parts of hgweb_mod for now.
+apppath = env['SCRIPT_NAME']
+if not apppath.endswith('/'):
+apppath += '/'
+
+if env.get('REPO_NAME'):
+apppath += env.get('REPO_NAME') + '/'
+
+if 'PATH_INFO' in env:
+dispatchparts = env['PATH_INFO'].strip('/').split('/')
+
+  

D2737: hgweb: teach WSGI parser about query strings

2018-03-09 Thread indygreg (Gregory Szorc)
This revision was automatically updated to reflect the committed changes.
Closed by commit rHGb48ac58ea4f6: hgweb: teach WSGI parser about query strings 
(authored by indygreg, committed by ).

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D2737?vs=6743&id=6780

REVISION DETAIL
  https://phab.mercurial-scm.org/D2737

AFFECTED FILES
  mercurial/hgweb/request.py
  mercurial/urllibcompat.py

CHANGE DETAILS

diff --git a/mercurial/urllibcompat.py b/mercurial/urllibcompat.py
--- a/mercurial/urllibcompat.py
+++ b/mercurial/urllibcompat.py
@@ -48,6 +48,7 @@
 "urlunparse",
 ))
 urlreq._registeralias(urllib.parse, "parse_qs", "parseqs")
+urlreq._registeralias(urllib.parse, "parse_qsl", "parseqsl")
 urlreq._registeralias(urllib.parse, "unquote_to_bytes", "unquote")
 import urllib.request
 urlreq._registeraliases(urllib.request, (
@@ -159,6 +160,7 @@
 "urlunparse",
 ))
 urlreq._registeralias(urlparse, "parse_qs", "parseqs")
+urlreq._registeralias(urlparse, "parse_qsl", "parseqsl")
 urlerr._registeraliases(urllib2, (
 "HTTPError",
 "URLError",
diff --git a/mercurial/hgweb/request.py b/mercurial/hgweb/request.py
--- a/mercurial/hgweb/request.py
+++ b/mercurial/hgweb/request.py
@@ -78,6 +78,10 @@
 dispatchpath = attr.ib()
 # Raw query string (part after "?" in URL).
 querystring = attr.ib()
+# List of 2-tuples of query string arguments.
+querystringlist = attr.ib()
+# Dict of query string arguments. Values are lists with at least 1 item.
+querystringdict = attr.ib()
 
 def parserequestfromenv(env):
 """Parse URL components from environment variables.
@@ -168,12 +172,25 @@
 
 querystring = env.get('QUERY_STRING', '')
 
+# We store as a list so we have ordering information. We also store as
+# a dict to facilitate fast lookup.
+querystringlist = util.urlreq.parseqsl(querystring, keep_blank_values=True)
+
+querystringdict = {}
+for k, v in querystringlist:
+if k in querystringdict:
+querystringdict[k].append(v)
+else:
+querystringdict[k] = [v]
+
 return parsedrequest(url=fullurl, baseurl=baseurl,
  advertisedurl=advertisedfullurl,
  advertisedbaseurl=advertisedbaseurl,
  apppath=apppath,
  dispatchparts=dispatchparts, dispatchpath=dispatchpath,
- querystring=querystring)
+ querystring=querystring,
+ querystringlist=querystringlist,
+ querystringdict=querystringdict)
 
 class wsgirequest(object):
 """Higher-level API for a WSGI request.



To: indygreg, #hg-reviewers, durin42
Cc: mercurial-devel
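
The list-plus-dict representation of the query string can be reproduced with the standard library directly. The sketch below uses Python 3's urllib.parse.parse_qsl rather than Mercurial's urlreq alias; parse_query is a hypothetical helper, and setdefault replaces the if/else from the patch with the same result.

```python
from urllib.parse import parse_qsl


def parse_query(querystring):
    """Return (ordered list of pairs, dict of key -> list of values)."""
    # keep_blank_values=True preserves arguments such as "style=".
    pairs = parse_qsl(querystring, keep_blank_values=True)

    lookup = {}
    for key, value in pairs:
        lookup.setdefault(key, []).append(value)

    return pairs, lookup


pairs, lookup = parse_query('cmd=batch&rev=1&rev=2&style=')
```

The list keeps the order in which arguments arrived, while the dict gives direct access by key, with repeated keys collected into one list.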


D2735: hgweb: use computed base URL from parsed request

2018-03-09 Thread indygreg (Gregory Szorc)
This revision was automatically updated to reflect the committed changes.
Closed by commit rHG56154c4626b9: hgweb: use computed base URL from parsed 
request (authored by indygreg, committed by ).

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D2735?vs=6741&id=6779

REVISION DETAIL
  https://phab.mercurial-scm.org/D2735

AFFECTED FILES
  mercurial/hgweb/hgweb_mod.py

CHANGE DETAILS

diff --git a/mercurial/hgweb/hgweb_mod.py b/mercurial/hgweb/hgweb_mod.py
--- a/mercurial/hgweb/hgweb_mod.py
+++ b/mercurial/hgweb/hgweb_mod.py
@@ -142,21 +142,9 @@
 if typ in allowed or self.configbool('web', 'allow%s' % typ):
 yield {'type': typ, 'extension': spec[2], 'node': nodeid}
 
-def templater(self, wsgireq):
+def templater(self, wsgireq, req):
 # determine scheme, port and server name
 # this is needed to create absolute urls
-
-proto = wsgireq.env.get('wsgi.url_scheme')
-if proto == 'https':
-proto = 'https'
-default_port = '443'
-else:
-proto = 'http'
-default_port = '80'
-
-port = wsgireq.env[r'SERVER_PORT']
-port = port != default_port and (r':' + port) or r''
-urlbase = r'%s://%s%s' % (proto, wsgireq.env[r'SERVER_NAME'], port)
 logourl = self.config('web', 'logourl')
 logoimg = self.config('web', 'logoimg')
 staticurl = (self.config('web', 'staticurl')
@@ -194,7 +182,7 @@
 'logourl': logourl,
 'logoimg': logoimg,
 'staticurl': staticurl,
-'urlbase': urlbase,
+'urlbase': req.advertisedbaseurl,
 'repo': self.reponame,
 'encoding': encoding.encoding,
 'motd': motd,
@@ -396,7 +384,7 @@
 # process the web interface request
 
 try:
-tmpl = rctx.templater(wsgireq)
+tmpl = rctx.templater(wsgireq, req)
 ctype = tmpl('mimetype', encoding=encoding.encoding)
 ctype = templater.stringify(ctype)
 



To: indygreg, #hg-reviewers, durin42
Cc: mercurial-devel
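
The scheme and port handling removed from templater() is equivalent to roughly the following. This is a sketch of how an advertised base URL can be built from SERVER_NAME and SERVER_PORT, not the exact code in request.py; advertised_base_url and the sample values are assumptions.

```python
def advertised_base_url(env):
    """Build scheme://host[:port], omitting the scheme's default port."""
    scheme = env['wsgi.url_scheme']
    url = scheme + '://' + env['SERVER_NAME']

    default_port = '443' if scheme == 'https' else '80'
    if env['SERVER_PORT'] != default_port:
        url += ':' + env['SERVER_PORT']

    return url


plain = advertised_base_url({
    'wsgi.url_scheme': 'http',
    'SERVER_NAME': 'hg.example.invalid',
    'SERVER_PORT': '8000',
})
secure = advertised_base_url({
    'wsgi.url_scheme': 'https',
    'SERVER_NAME': 'hg.example.invalid',
    'SERVER_PORT': '443',
})
```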


D2730: hgweb: ensure all wsgi environment values are str

2018-03-09 Thread indygreg (Gregory Szorc)
This revision was automatically updated to reflect the committed changes.
Closed by commit rHG7fc80c982656: hgweb: ensure all wsgi environment values are 
str (authored by indygreg, committed by ).

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D2730?vs=6736&id=6774

REVISION DETAIL
  https://phab.mercurial-scm.org/D2730

AFFECTED FILES
  mercurial/hgweb/hgweb_mod.py
  mercurial/hgweb/server.py

CHANGE DETAILS

diff --git a/mercurial/hgweb/server.py b/mercurial/hgweb/server.py
--- a/mercurial/hgweb/server.py
+++ b/mercurial/hgweb/server.py
@@ -124,8 +124,8 @@
 env[r'SERVER_NAME'] = self.server.server_name
 env[r'SERVER_PORT'] = str(self.server.server_port)
 env[r'REQUEST_URI'] = self.path
-env[r'SCRIPT_NAME'] = self.server.prefix
-env[r'PATH_INFO'] = path[len(self.server.prefix):]
+env[r'SCRIPT_NAME'] = pycompat.sysstr(self.server.prefix)
+env[r'PATH_INFO'] = pycompat.sysstr(path[len(self.server.prefix):])
 env[r'REMOTE_HOST'] = self.client_address[0]
 env[r'REMOTE_ADDR'] = self.client_address[0]
 if query:
@@ -154,7 +154,7 @@
 env[hkey] = hval
 env[r'SERVER_PROTOCOL'] = self.request_version
 env[r'wsgi.version'] = (1, 0)
-env[r'wsgi.url_scheme'] = self.url_scheme
+env[r'wsgi.url_scheme'] = pycompat.sysstr(self.url_scheme)
 if env.get(r'HTTP_EXPECT', '').lower() == '100-continue':
 self.rfile = common.continuereader(self.rfile, self.wfile.write)
 
diff --git a/mercurial/hgweb/hgweb_mod.py b/mercurial/hgweb/hgweb_mod.py
--- a/mercurial/hgweb/hgweb_mod.py
+++ b/mercurial/hgweb/hgweb_mod.py
@@ -159,7 +159,8 @@
 urlbase = r'%s://%s%s' % (proto, req.env[r'SERVER_NAME'], port)
 logourl = self.config('web', 'logourl')
 logoimg = self.config('web', 'logoimg')
-staticurl = self.config('web', 'staticurl') or req.url + 'static/'
+staticurl = (self.config('web', 'staticurl')
+ or pycompat.sysbytes(req.url) + 'static/')
 if not staticurl.endswith('/'):
 staticurl += '/'
 
@@ -182,24 +183,24 @@
 if not self.reponame:
 self.reponame = (self.config('web', 'name', '')
  or req.env.get('REPO_NAME')
- or req.url.strip('/') or self.repo.root)
+ or req.url.strip(r'/') or self.repo.root)
 
 def websubfilter(text):
 return templatefilters.websub(text, self.websubtable)
 
 # create the templater
 # TODO: export all keywords: defaults = templatekw.keywords.copy()
 defaults = {
-'url': req.url,
+'url': pycompat.sysbytes(req.url),
 'logourl': logourl,
 'logoimg': logoimg,
 'staticurl': staticurl,
 'urlbase': urlbase,
 'repo': self.reponame,
 'encoding': encoding.encoding,
 'motd': motd,
 'sessionvars': sessionvars,
-'pathdef': makebreadcrumb(req.url),
+'pathdef': makebreadcrumb(pycompat.sysbytes(req.url)),
 'style': style,
 'nonce': self.nonce,
 }
@@ -333,17 +334,17 @@
 # use SCRIPT_NAME, PATH_INFO and QUERY_STRING as well as our REPO_NAME
 
 req.url = req.env[r'SCRIPT_NAME']
-if not req.url.endswith('/'):
-req.url += '/'
+if not req.url.endswith(r'/'):
+req.url += r'/'
 if req.env.get('REPO_NAME'):
 req.url += req.env[r'REPO_NAME'] + r'/'
 
 if r'PATH_INFO' in req.env:
-parts = req.env[r'PATH_INFO'].strip('/').split('/')
+parts = req.env[r'PATH_INFO'].strip(r'/').split(r'/')
 repo_parts = req.env.get(r'REPO_NAME', r'').split(r'/')
 if parts[:len(repo_parts)] == repo_parts:
 parts = parts[len(repo_parts):]
-query = '/'.join(parts)
+query = r'/'.join(parts)
 else:
 query = req.env[r'QUERY_STRING'].partition(r'&')[0]
 query = query.partition(r';')[0]
@@ -364,7 +365,7 @@
 
 # translate user-visible url structure to internal structure
 
-args = query.split('/', 2)
+args = query.split(r'/', 2)
 if 'cmd' not in req.form and args and args[0]:
 cmd = args.pop(0)
 style = cmd.rfind('-')



To: indygreg, #hg-reviewers, durin42
Cc: mercurial-devel
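
The conversions in this patch follow from WSGI requiring native strings in the environment, while Mercurial works with bytes internally. A rough Python 3 illustration of the round trip is below; to_native_str and to_bytes are hypothetical helpers and are not the pycompat functions used in the patch. Latin-1 is chosen here because it maps every byte to a single code point.

```python
def to_native_str(value):
    """Decode bytes to str for storage in a WSGI environ; pass str through."""
    if isinstance(value, bytes):
        return value.decode('latin-1')
    return value


def to_bytes(value):
    """Encode a native str from the environ back to bytes for internal use."""
    if isinstance(value, str):
        return value.encode('latin-1')
    return value


# Server side: an internal bytes prefix is stored as a native str.
env = {'SCRIPT_NAME': to_native_str(b'/hg')}

# Application side: convert back to bytes before joining with bytes.
staticurl = to_bytes(env['SCRIPT_NAME']) + b'/static/'
```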


D2718: wireproto: declare permissions requirements in @wireprotocommand (API)

2018-03-09 Thread indygreg (Gregory Szorc)
This revision was automatically updated to reflect the committed changes.
Closed by commit rHG0b18604db95e: wireproto: declare permissions requirements 
in @wireprotocommand (API) (authored by indygreg, committed by ).

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D2718?vs=6712&id=6772

REVISION DETAIL
  https://phab.mercurial-scm.org/D2718

AFFECTED FILES
  hgext/largefiles/uisetup.py
  mercurial/hgweb/hgweb_mod.py
  mercurial/wireproto.py
  mercurial/wireprotoserver.py
  tests/test-http-permissions.t

CHANGE DETAILS

diff --git a/tests/test-http-permissions.t b/tests/test-http-permissions.t
--- a/tests/test-http-permissions.t
+++ b/tests/test-http-permissions.t
@@ -21,12 +21,10 @@
   > @wireproto.wireprotocommand('customwritenoperm')
   > def customwritenoperm(repo, proto):
   > return b'write command no defined permissions\n'
-  > wireproto.permissions['customreadwithperm'] = 'pull'
-  > @wireproto.wireprotocommand('customreadwithperm')
+  > @wireproto.wireprotocommand('customreadwithperm', permission='pull')
   > def customreadwithperm(repo, proto):
   > return b'read-only command w/ defined permissions\n'
-  > wireproto.permissions['customwritewithperm'] = 'push'
-  > @wireproto.wireprotocommand('customwritewithperm')
+  > @wireproto.wireprotocommand('customwritewithperm', permission='push')
   > def customwritewithperm(repo, proto):
   > return b'write command w/ defined permissions\n'
   > EOF
diff --git a/mercurial/wireprotoserver.py b/mercurial/wireprotoserver.py
--- a/mercurial/wireprotoserver.py
+++ b/mercurial/wireprotoserver.py
@@ -242,11 +242,7 @@
'over HTTP'))
 return []
 
-# Assume commands with no defined permissions are writes /
-# for pushes. This is the safest from a security perspective
-# because it doesn't allow commands with undefined semantics
-# from bypassing permissions checks.
-checkperm(wireproto.permissions.get(cmd, 'push'))
+checkperm(wireproto.commands[cmd].permission)
 
 rsp = wireproto.dispatch(repo, proto, cmd)
 
diff --git a/mercurial/wireproto.py b/mercurial/wireproto.py
--- a/mercurial/wireproto.py
+++ b/mercurial/wireproto.py
@@ -592,10 +592,12 @@
 
 class commandentry(object):
 """Represents a declared wire protocol command."""
-def __init__(self, func, args='', transports=None):
+def __init__(self, func, args='', transports=None,
+ permission='push'):
 self.func = func
 self.args = args
 self.transports = transports or set()
+self.permission = permission
 
 def _merge(self, func, args):
 """Merge this instance with an incoming 2-tuple.
@@ -605,7 +607,8 @@
 data not captured by the 2-tuple and a new instance containing
 the union of the two objects is returned.
 """
-return commandentry(func, args=args, transports=set(self.transports))
+return commandentry(func, args=args, transports=set(self.transports),
+permission=self.permission)
 
 # Old code treats instances as 2-tuples. So expose that interface.
 def __iter__(self):
@@ -643,7 +646,8 @@
 else:
 # Use default values from @wireprotocommand.
 v = commandentry(v[0], args=v[1],
- transports=set(wireprototypes.TRANSPORTS))
+ transports=set(wireprototypes.TRANSPORTS),
+ permission='push')
 else:
 raise ValueError('command entries must be commandentry instances '
  'or 2-tuples')
@@ -672,12 +676,8 @@
 
 commands = commanddict()
 
-# Maps wire protocol name to operation type. This is used for permissions
-# checking. All defined @wireiprotocommand should have an entry in this
-# dict.
-permissions = {}
-
-def wireprotocommand(name, args='', transportpolicy=POLICY_ALL):
+def wireprotocommand(name, args='', transportpolicy=POLICY_ALL,
+ permission='push'):
 """Decorator to declare a wire protocol command.
 
 ``name`` is the name of the wire protocol command being provided.
@@ -688,6 +688,12 @@
 ``transportpolicy`` is a POLICY_* constant denoting which transports
 this wire protocol command should be exposed to. By default, commands
 are exposed to all wire protocol transports.
+
+``permission`` defines the permission type needed to run this command.
+Can be ``push`` or ``pull``. These roughly map to read-write and read-only,
+respectively. Default is to assume command requires ``push`` permissions
+because otherwise commands not declaring their permissions could modify
+a repository that is supposed to be read-only.
 """
 if transportpolicy == POLICY_ALL:
 transports = set(wireprototypes.TRANSPORTS)
@@ -701,14 +707,18 @@
 raise error.Abort(_('invalid transport policy value: %s') %
   

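The idea of declaring the permission in the decorator, with ``push`` as the safe default, can be sketched independently of Mercurial as follows. Everything here (commands, CommandEntry, wirecommand, run) is a hypothetical stand-in for illustration and does not reproduce the wireproto API.

```python
commands = {}


class CommandEntry:
    """A registered command plus the permission needed to run it."""

    def __init__(self, func, permission='push'):
        if permission not in ('push', 'pull'):
            raise ValueError('invalid permission: %s' % permission)
        self.func = func
        self.permission = permission


def wirecommand(name, permission='push'):
    """Decorator registering a command; undeclared commands need 'push'."""
    def register(func):
        commands[name] = CommandEntry(func, permission=permission)
        return func
    return register


@wirecommand('customreadwithperm', permission='pull')
def customreadwithperm():
    return b'read-only command w/ defined permissions\n'


@wirecommand('customwritenoperm')
def customwritenoperm():
    return b'write command no defined permissions\n'


def run(name, allowed):
    """Run a command only if its declared permission is in ``allowed``."""
    entry = commands[name]
    if entry.permission not in allowed:
        raise PermissionError('%s requires %s' % (name, entry.permission))
    return entry.func()
```

Keeping the permission on the command entry means a command that forgets to declare one is treated as a write, so it cannot slip past a read-only check.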
D2732: hgweb: rename req to wsgireq

2018-03-09 Thread indygreg (Gregory Szorc)
This revision was automatically updated to reflect the committed changes.
Closed by commit rHGb9b968e21f78: hgweb: rename req to wsgireq (authored by 
indygreg, committed by ).

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D2732?vs=6738&id=6776

REVISION DETAIL
  https://phab.mercurial-scm.org/D2732

AFFECTED FILES
  mercurial/hgweb/hgweb_mod.py
  mercurial/hgweb/hgwebdir_mod.py
  mercurial/wireprotoserver.py

CHANGE DETAILS

diff --git a/mercurial/wireprotoserver.py b/mercurial/wireprotoserver.py
--- a/mercurial/wireprotoserver.py
+++ b/mercurial/wireprotoserver.py
@@ -36,26 +36,26 @@
 SSHV1 = wireprototypes.SSHV1
 SSHV2 = wireprototypes.SSHV2
 
-def decodevaluefromheaders(req, headerprefix):
+def decodevaluefromheaders(wsgireq, headerprefix):
 """Decode a long value from multiple HTTP request headers.
 
 Returns the value as a bytes, not a str.
 """
 chunks = []
 i = 1
 prefix = headerprefix.upper().replace(r'-', r'_')
 while True:
-v = req.env.get(r'HTTP_%s_%d' % (prefix, i))
+v = wsgireq.env.get(r'HTTP_%s_%d' % (prefix, i))
 if v is None:
 break
 chunks.append(pycompat.bytesurl(v))
 i += 1
 
 return ''.join(chunks)
 
 class httpv1protocolhandler(wireprototypes.baseprotocolhandler):
-def __init__(self, req, ui, checkperm):
-self._req = req
+def __init__(self, wsgireq, ui, checkperm):
+self._wsgireq = wsgireq
 self._ui = ui
 self._checkperm = checkperm
 
@@ -79,26 +79,26 @@
 return [data[k] for k in keys]
 
 def _args(self):
-args = util.rapply(pycompat.bytesurl, self._req.form.copy())
-postlen = int(self._req.env.get(r'HTTP_X_HGARGS_POST', 0))
+args = util.rapply(pycompat.bytesurl, self._wsgireq.form.copy())
+postlen = int(self._wsgireq.env.get(r'HTTP_X_HGARGS_POST', 0))
 if postlen:
 args.update(urlreq.parseqs(
-self._req.read(postlen), keep_blank_values=True))
+self._wsgireq.read(postlen), keep_blank_values=True))
 return args
 
-argvalue = decodevaluefromheaders(self._req, r'X-HgArg')
+argvalue = decodevaluefromheaders(self._wsgireq, r'X-HgArg')
 args.update(urlreq.parseqs(argvalue, keep_blank_values=True))
 return args
 
 def forwardpayload(self, fp):
-if r'HTTP_CONTENT_LENGTH' in self._req.env:
-length = int(self._req.env[r'HTTP_CONTENT_LENGTH'])
+if r'HTTP_CONTENT_LENGTH' in self._wsgireq.env:
+length = int(self._wsgireq.env[r'HTTP_CONTENT_LENGTH'])
 else:
-length = int(self._req.env[r'CONTENT_LENGTH'])
+length = int(self._wsgireq.env[r'CONTENT_LENGTH'])
 # If httppostargs is used, we need to read Content-Length
 # minus the amount that was consumed by args.
-length -= int(self._req.env.get(r'HTTP_X_HGARGS_POST', 0))
-for s in util.filechunkiter(self._req, limit=length):
+length -= int(self._wsgireq.env.get(r'HTTP_X_HGARGS_POST', 0))
+for s in util.filechunkiter(self._wsgireq, limit=length):
 fp.write(s)
 
 @contextlib.contextmanager
@@ -118,9 +118,9 @@
 
 def client(self):
 return 'remote:%s:%s:%s' % (
-self._req.env.get('wsgi.url_scheme') or 'http',
-urlreq.quote(self._req.env.get('REMOTE_HOST', '')),
-urlreq.quote(self._req.env.get('REMOTE_USER', '')))
+self._wsgireq.env.get('wsgi.url_scheme') or 'http',
+urlreq.quote(self._wsgireq.env.get('REMOTE_HOST', '')),
+urlreq.quote(self._wsgireq.env.get('REMOTE_USER', '')))
 
 def addcapabilities(self, repo, caps):
 caps.append('httpheader=%d' %
@@ -150,25 +150,25 @@
 def iscmd(cmd):
 return cmd in wireproto.commands
 
-def parsehttprequest(rctx, req, query, checkperm):
+def parsehttprequest(rctx, wsgireq, query, checkperm):
 """Parse the HTTP request for a wire protocol request.
 
 If the current request appears to be a wire protocol request, this
 function returns a dict with details about that request, including
 an ``abstractprotocolserver`` instance suitable for handling the
 request. Otherwise, ``None`` is returned.
 
-``req`` is a ``wsgirequest`` instance.
+``wsgireq`` is a ``wsgirequest`` instance.
 """
 repo = rctx.repo
 
 # HTTP version 1 wire protocol requests are denoted by a "cmd" query
 # string parameter. If it isn't present, this isn't a wire protocol
 # request.
-if 'cmd' not in req.form:
+if 'cmd' not in wsgireq.form:
 return None
 
-cmd = req.form['cmd'][0]
+cmd = wsgireq.form['cmd'][0]
 
 # The "cmd" request parameter is used by both the wire protocol and hgweb.
 # While not all wire protocol commands are available for all transports,
@@ -180,24 +180,24 @@
 if not iscmd(cmd):
 return None
 
-proto = 

D2719: wireproto: formalize permissions checking as part of protocol interface

2018-03-09 Thread indygreg (Gregory Szorc)
This revision was automatically updated to reflect the committed changes.
Closed by commit rHG66de4555cefd: wireproto: formalize permissions checking as 
part of protocol interface (authored by indygreg, committed by ).

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D2719?vs=6713&id=6773

REVISION DETAIL
  https://phab.mercurial-scm.org/D2719

AFFECTED FILES
  mercurial/hgweb/hgweb_mod.py
  mercurial/wireproto.py
  mercurial/wireprotoserver.py
  mercurial/wireprototypes.py
  tests/test-wireproto.py

CHANGE DETAILS

diff --git a/tests/test-wireproto.py b/tests/test-wireproto.py
--- a/tests/test-wireproto.py
+++ b/tests/test-wireproto.py
@@ -18,6 +18,9 @@
 names = spec.split()
 return [args[n] for n in names]
 
+def checkperm(self, perm):
+pass
+
 class clientpeer(wireproto.wirepeer):
 def __init__(self, serverrepo):
 self.serverrepo = serverrepo
diff --git a/mercurial/wireprototypes.py b/mercurial/wireprototypes.py
--- a/mercurial/wireprototypes.py
+++ b/mercurial/wireprototypes.py
@@ -146,3 +146,12 @@
 
 Returns a list of capabilities. The passed in argument can be returned.
 """
+
+@abc.abstractmethod
+def checkperm(self, perm):
+"""Validate that the client has permissions to perform a request.
+
+The argument is the permission required to proceed. If the client
+doesn't have that permission, the exception should raise or abort
+in a protocol specific manner.
+"""
diff --git a/mercurial/wireprotoserver.py b/mercurial/wireprotoserver.py
--- a/mercurial/wireprotoserver.py
+++ b/mercurial/wireprotoserver.py
@@ -54,9 +54,10 @@
 return ''.join(chunks)
 
 class httpv1protocolhandler(wireprototypes.baseprotocolhandler):
-def __init__(self, req, ui):
+def __init__(self, req, ui, checkperm):
 self._req = req
 self._ui = ui
+self._checkperm = checkperm
 
 @property
 def name(self):
@@ -139,14 +140,17 @@
 
 return caps
 
+def checkperm(self, perm):
+return self._checkperm(perm)
+
 # This method exists mostly so that extensions like remotefilelog can
 # disable a kludgey legacy method only over http. As of early 2018,
 # there are no other known users, so with any luck we can discard this
 # hook if remotefilelog becomes a first-party extension.
 def iscmd(cmd):
 return cmd in wireproto.commands
 
-def parsehttprequest(repo, req, query):
+def parsehttprequest(rctx, req, query, checkperm):
 """Parse the HTTP request for a wire protocol request.
 
 If the current request appears to be a wire protocol request, this
@@ -156,6 +160,8 @@
 
 ``req`` is a ``wsgirequest`` instance.
 """
+repo = rctx.repo
+
 # HTTP version 1 wire protocol requests are denoted by a "cmd" query
 # string parameter. If it isn't present, this isn't a wire protocol
 # request.
@@ -174,13 +180,13 @@
 if not iscmd(cmd):
 return None
 
-proto = httpv1protocolhandler(req, repo.ui)
+proto = httpv1protocolhandler(req, repo.ui,
+  lambda perm: checkperm(rctx, req, perm))
 
 return {
 'cmd': cmd,
 'proto': proto,
-'dispatch': lambda checkperm: _callhttp(repo, req, proto, cmd,
-checkperm),
+'dispatch': lambda: _callhttp(repo, req, proto, cmd),
 'handleerror': lambda ex: _handlehttperror(ex, req, cmd),
 }
 
@@ -224,7 +230,7 @@
 opts = {'level': ui.configint('server', 'zliblevel')}
 return HGTYPE, util.compengines['zlib'], opts
 
-def _callhttp(repo, req, proto, cmd, checkperm):
+def _callhttp(repo, req, proto, cmd):
 def genversion2(gen, engine, engineopts):
 # application/mercurial-0.2 always sends a payload header
 # identifying the compression engine.
@@ -242,7 +248,7 @@
'over HTTP'))
 return []
 
-checkperm(wireproto.commands[cmd].permission)
+proto.checkperm(wireproto.commands[cmd].permission)
 
 rsp = wireproto.dispatch(repo, proto, cmd)
 
@@ -392,6 +398,9 @@
 def addcapabilities(self, repo, caps):
 return caps
 
+def checkperm(self, perm):
+pass
+
 class sshv2protocolhandler(sshv1protocolhandler):
 """Protocol handler for version 2 of the SSH protocol."""
 
diff --git a/mercurial/wireproto.py b/mercurial/wireproto.py
--- a/mercurial/wireproto.py
+++ b/mercurial/wireproto.py
@@ -731,13 +731,10 @@
 vals[unescapearg(n)] = unescapearg(v)
 func, spec = commands[op]
 
-# If the protocol supports permissions checking, perform that
-# checking on each batched command.
-# TODO formalize permission checking as part of protocol interface.
-if util.safehasattr(proto, 'checkperm'):
-perm = commands[op].permission
-assert perm in ('push', 'pull')
-proto.checkperm(perm)
+# Validate 
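
A minimal sketch of the interface this change describes, assuming Python 3 and the standard `abc` module. The class names `BaseHandler`, `HttpHandler` and `SshHandler` and the `readonly` callback are hypothetical stand-ins, not Mercurial's actual classes; only the `checkperm(perm)` method name and the delegate-to-a-callback pattern come from the diff above.

```python
import abc


class BaseHandler(abc.ABC):
    """Hypothetical stand-in for the protocol handler interface."""

    @abc.abstractmethod
    def checkperm(self, perm):
        """Raise if the client lacks ``perm``; return None otherwise."""


class HttpHandler(BaseHandler):
    """Delegates the check to a callback supplied by the web layer."""

    def __init__(self, checkperm):
        self._checkperm = checkperm

    def checkperm(self, perm):
        return self._checkperm(perm)


class SshHandler(BaseHandler):
    """No-op check, mirroring the SSH handler in the diff."""

    def checkperm(self, perm):
        pass


def readonly(perm):
    """Example callback (an assumption): only 'pull' is permitted."""
    if perm != 'pull':
        raise PermissionError('%s not permitted' % perm)
```

Because `checkperm` is abstract, a handler that forgets to define it cannot be instantiated, which is what lets callers drop the old `safehasattr(proto, 'checkperm')` style of probing.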

D2731: hgweb: validate WSGI environment dict

2018-03-09 Thread indygreg (Gregory Szorc)
This revision was automatically updated to reflect the committed changes.
Closed by commit rHG8e1556ac01bb: hgweb: validate WSGI environment dict 
(authored by indygreg, committed by ).

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D2731?vs=6737&id=6775

REVISION DETAIL
  https://phab.mercurial-scm.org/D2731

AFFECTED FILES
  mercurial/hgweb/server.py

CHANGE DETAILS

diff --git a/mercurial/hgweb/server.py b/mercurial/hgweb/server.py
--- a/mercurial/hgweb/server.py
+++ b/mercurial/hgweb/server.py
@@ -13,6 +13,7 @@
 import socket
 import sys
 import traceback
+import wsgiref.validate
 
 from ..i18n import _
 
@@ -128,8 +129,7 @@
 env[r'PATH_INFO'] = pycompat.sysstr(path[len(self.server.prefix):])
 env[r'REMOTE_HOST'] = self.client_address[0]
 env[r'REMOTE_ADDR'] = self.client_address[0]
-if query:
-env[r'QUERY_STRING'] = query
+env[r'QUERY_STRING'] = query or r''
 
 if pycompat.ispy3:
 if self.headers.get_content_type() is None:
@@ -166,6 +166,8 @@
   socketserver.ForkingMixIn)
 env[r'wsgi.run_once'] = 0
 
+wsgiref.validate.check_environ(env)
+
 self.saved_status = None
 self.saved_headers = []
 self.length = None
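
To illustrate why the patch always sets QUERY_STRING, here is a small sketch using the Python 3 standard library. It assumes `wsgiref.validate.check_environ()` behaves as in current CPython, where a missing QUERY_STRING produces a `WSGIWarning` rather than an error; the helper name `validation_warnings` is made up for this example.

```python
import warnings
from wsgiref.util import setup_testing_defaults
from wsgiref.validate import WSGIWarning, check_environ


def validation_warnings(env):
    """Return the WSGIWarning messages check_environ() emits for env."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter('always')
        check_environ(env)
    return [str(w.message) for w in caught
            if issubclass(w.category, WSGIWarning)]


env = {}
setup_testing_defaults(env)        # fills in a valid baseline environ
env.pop('QUERY_STRING', None)      # simulate the old "only if query" code

without_qs = validation_warnings(env)

env['QUERY_STRING'] = ''           # what the patch does: query or ''
with_qs = validation_warnings(env)
```

With the key absent the validator complains; with an empty string it stays quiet, which matches the `query or r''` change in the diff.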



To: indygreg, #hg-reviewers, durin42
Cc: mercurial-devel


D2716: wireprotoserver: check if command available before calling it

2018-03-09 Thread indygreg (Gregory Szorc)
This revision was automatically updated to reflect the committed changes.
Closed by commit rHG7574c8173d5e: wireprotoserver: check if command available 
before calling it (authored by indygreg, committed by ).

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D2716?vs=6710&id=6770

REVISION DETAIL
  https://phab.mercurial-scm.org/D2716

AFFECTED FILES
  mercurial/wireprotoserver.py

CHANGE DETAILS

diff --git a/mercurial/wireprotoserver.py b/mercurial/wireprotoserver.py
--- a/mercurial/wireprotoserver.py
+++ b/mercurial/wireprotoserver.py
@@ -235,14 +235,14 @@
 for chunk in gen:
 yield chunk
 
-rsp = wireproto.dispatch(repo, proto, cmd)
-
 if not wireproto.commands.commandavailable(cmd, proto):
 req.respond(HTTP_OK, HGERRTYPE,
 body=_('requested wire protocol command is not available '
'over HTTP'))
 return []
 
+rsp = wireproto.dispatch(repo, proto, cmd)
+
 if isinstance(rsp, bytes):
 req.respond(HTTP_OK, HGTYPE, body=rsp)
 return []
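
The reordering above boils down to "validate first, run second". A hedged sketch of that ordering follows; `Command`, `call_command` and the `commands` table are hypothetical names for illustration and do not reflect the real `wireproto` API.

```python
class Command:
    """Hypothetical command entry: a callable plus allowed transports."""

    def __init__(self, func, transports):
        self.func = func
        self.transports = transports


def call_command(commands, cmd, transport):
    """Run ``cmd`` only after confirming ``transport`` may use it."""
    entry = commands[cmd]
    if transport not in entry.transports:
        # Reject before the command function runs at all.
        return ('error', 'requested wire protocol command is not '
                         'available over %s' % transport)
    return ('ok', entry.func())


ran = []  # records which command bodies actually executed
commands = {
    'heads': Command(lambda: ran.append('heads') or 'h', {'http', 'ssh'}),
    'sshonly': Command(lambda: ran.append('sshonly') or 's', {'ssh'}),
}
```

Calling `call_command(commands, 'sshonly', 'http')` returns the error tuple and leaves `ran` empty, so an unavailable command never has side effects.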



To: indygreg, #hg-reviewers, durin42
Cc: mercurial-devel


D2717: wireprotoserver: check permissions in main dispatch function

2018-03-09 Thread indygreg (Gregory Szorc)
This revision was automatically updated to reflect the committed changes.
Closed by commit rHGc638a13093cf: wireprotoserver: check permissions in main 
dispatch function (authored by indygreg, committed by ).

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D2717?vs=6711&id=6771

REVISION DETAIL
  https://phab.mercurial-scm.org/D2717

AFFECTED FILES
  mercurial/hgweb/hgweb_mod.py
  mercurial/wireprotoserver.py

CHANGE DETAILS

diff --git a/mercurial/wireprotoserver.py b/mercurial/wireprotoserver.py
--- a/mercurial/wireprotoserver.py
+++ b/mercurial/wireprotoserver.py
@@ -179,7 +179,8 @@
 return {
 'cmd': cmd,
 'proto': proto,
-'dispatch': lambda: _callhttp(repo, req, proto, cmd),
+'dispatch': lambda checkperm: _callhttp(repo, req, proto, cmd,
+checkperm),
 'handleerror': lambda ex: _handlehttperror(ex, req, cmd),
 }
 
@@ -223,7 +224,7 @@
 opts = {'level': ui.configint('server', 'zliblevel')}
 return HGTYPE, util.compengines['zlib'], opts
 
-def _callhttp(repo, req, proto, cmd):
+def _callhttp(repo, req, proto, cmd, checkperm):
 def genversion2(gen, engine, engineopts):
 # application/mercurial-0.2 always sends a payload header
 # identifying the compression engine.
@@ -241,6 +242,12 @@
'over HTTP'))
 return []
 
+# Assume commands with no defined permissions are writes /
+# for pushes. This is the safest from a security perspective
+# because it doesn't allow commands with undefined semantics
+# from bypassing permissions checks.
+checkperm(wireproto.permissions.get(cmd, 'push'))
+
 rsp = wireproto.dispatch(repo, proto, cmd)
 
 if isinstance(rsp, bytes):
diff --git a/mercurial/hgweb/hgweb_mod.py b/mercurial/hgweb/hgweb_mod.py
--- a/mercurial/hgweb/hgweb_mod.py
+++ b/mercurial/hgweb/hgweb_mod.py
@@ -357,22 +357,15 @@
 protohandler = wireprotoserver.parsehttprequest(rctx.repo, req, query)
 
 if protohandler:
-cmd = protohandler['cmd']
 try:
 if query:
 raise ErrorResponse(HTTP_NOT_FOUND)
 
 # TODO fold this into parsehttprequest
-req.checkperm = lambda op: self.check_perm(rctx, req, op)
-protohandler['proto'].checkperm = req.checkperm
+checkperm = lambda op: self.check_perm(rctx, req, op)
+protohandler['proto'].checkperm = checkperm
 
-# Assume commands with no defined permissions are writes /
-# for pushes. This is the safest from a security perspective
-# because it doesn't allow commands with undefined semantics
-# from bypassing permissions checks.
-req.checkperm(perms.get(cmd, 'push'))
-
-return protohandler['dispatch']()
+return protohandler['dispatch'](checkperm)
 except ErrorResponse as inst:
 return protohandler['handleerror'](inst)
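
The "assume push when undefined" rule from the comment in this diff can be sketched as a fail-closed lookup. The table contents and the helper names `required_permission` and `check_perm` are illustrative assumptions, not the actual contents of `wireproto.permissions`.

```python
# Illustrative permission table; the real one lives in wireproto.
PERMISSIONS = {
    'heads': 'pull',
    'getbundle': 'pull',
    'unbundle': 'push',
}


def required_permission(cmd, permissions=PERMISSIONS):
    """Commands with no declared permission are treated as writes."""
    return permissions.get(cmd, 'push')


def check_perm(granted, perm):
    """Raise if ``perm`` is not among the client's granted permissions."""
    if perm not in granted:
        raise PermissionError('%s access denied' % perm)


def authorize(cmd, granted):
    check_perm(granted, required_permission(cmd))
    return True
```

A read-only client (`granted={'pull'}`) can run `heads`, but an unknown command such as `mystery` is rejected because it falls back to requiring `push`.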
 



To: indygreg, #hg-reviewers, durin42
Cc: mercurial-devel


D2743: wireprotoserver: access headers through parsed request

2018-03-09 Thread durin42 (Augie Fackler)
durin42 added inline comments.

INLINE COMMENTS

> wireprotoserver.py:97
>  else:
>  length = int(self._wsgireq.env[r'CONTENT_LENGTH'])
>  # If httppostargs is used, we need to read Content-Length

Missed one?

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2743

To: indygreg, #hg-reviewers
Cc: durin42, mercurial-devel


D2720: debugcommands: introduce actions to perform deterministic reads

2018-03-09 Thread indygreg (Gregory Szorc)
indygreg added a comment.


  In https://phab.mercurial-scm.org/D2720#44293, @mharbison72 wrote:
  
  > i> write(4) -> 4:
  >  i> 426\n
  >  i> write(426) -> 426:
  >  i> 
HG10UN\x00\x00\x00\x9eh\x98b\x13\xbdD\x85\xeaQS55\xe3\xfc\x9ex\x00zq\x1f\x00\x00\x00\x00\x00\x00\x00\x0
  >  
0\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\
  >  
x00\x00\x00\x00\x00h\x98b\x13\xbdD\x85\xeaQS55\xe3\xfc\x9ex\x00zq\x1f\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\
  >  x00>cba485ca3678256e044428f70f58291196f6e9de\n
  >  i> test\n
  >  i> 0 0\n
  >  i> foo\n
  >  i> \n
  >  i> 
initial\x00\x00\x00\x00\x00\x00\x00\x8d\xcb\xa4\x85\xca6x%n\x04D(\xf7\x0fX)\x11\x96\xf6\xe9\xde\x00\x00
  >  
\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x
  >  
00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00h\x98b\x13\xbdD\x85\xeaQS55\xe3\xfc\x9ex\x00zq\x1f\x00\x00\x00\x00\x
  >  
00\x00\x00\x00\x00\x00\x00-foo\x00362fef284ce2ca02aecc8de6d5e8a1c3af0556fe\n
  >  i> 
\x00\x00\x00\x00\x00\x00\x00\x07foo\x00\x00\x00b6/\xef(L\xe2\xca\x02\xae\xcc\x8d\xe6\xd5\xe8\xa1\xc3\xa
  >  
f\x05V\xfe\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00
  >  \x00\x00\x00\x00\x00\x00\x00\x00\
  >
  >  
  
  
  This is a wonky place to hang! For context, this is processing a `command 
unbundle` line. The lines that follow should be:
  
i> write(2) -> 2:
i> 0\n
i> flush() -> None
o> readline() -> 2:
o> 0\n
o> readline() -> 2:
o> 1\n
o> read(1) -> 1: 0
result: 0
remote output: 
e> read(115) -> 115:
e> abort: incompatible Mercurial client; bundle2 required\n
e> (see https://www.mercurial-scm.org/wiki/IncompatibleClient)\n
  
  What happens under the hood during pushes is we send out chunks containing 
the bundle followed by an empty chunk. That code is:
  
    for d in iter(lambda: fp.read(4096), ''):
        self._writeframed(d)
    self._writeframed("", flush=True)
  
  That empty chunk is apparently not getting sent. Or its logging is not 
getting written/printed.
  
  This patch shouldn't have changed any behavior with regard to this part of 
the I/O. So I'm scratching my head over how this caused a deadlock. Are you sure 
you can bisect it to this patch?
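
  For reference, a self-contained sketch of the framing described above, written for Python 3. The frame layout (decimal length, newline, payload, with a zero-length frame as terminator) is inferred from the `write(4) -> 4: 426\n` and `write(2) -> 2: 0\n` lines in the log; `write_framed` and `send_payload` are hypothetical names, not the peer's real methods.

```python
import io


def write_framed(out, data):
    """Write one frame: the payload length on its own line, then the payload."""
    out.write(b'%d\n' % len(data))
    if data:
        out.write(data)


def send_payload(fp, out, chunksize=4096):
    """Send fp as a series of frames followed by an empty terminator frame."""
    for chunk in iter(lambda: fp.read(chunksize), b''):
        write_framed(out, chunk)
    # The empty frame ("0\n") tells the receiver the payload is complete.
    write_framed(out, b'')
    out.flush()


payload = b'x' * 426
sink = io.BytesIO()
send_payload(io.BytesIO(payload), sink)
wire = sink.getvalue()
```

  If the terminating `0\n` frame never reaches the server, the server keeps waiting for more data while the client waits for a reply, which would look like the hang reported here.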

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2720

To: indygreg, #hg-reviewers
Cc: mharbison72, mercurial-devel


D2758: transaction: add a name and a __str__ implementation (API)

2018-03-09 Thread martinvonz (Martin von Zweigbergk)
martinvonz created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  This has been useful for me for debugging.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2758

AFFECTED FILES
  mercurial/localrepo.py
  mercurial/transaction.py

CHANGE DETAILS

diff --git a/mercurial/transaction.py b/mercurial/transaction.py
--- a/mercurial/transaction.py
+++ b/mercurial/transaction.py
@@ -105,7 +105,7 @@
 class transaction(util.transactional):
 def __init__(self, report, opener, vfsmap, journalname, undoname=None,
  after=None, createmode=None, validator=None, releasefn=None,
- checkambigfiles=None):
+ checkambigfiles=None, name=''):
 """Begin a new transaction
 
 Begins a new transaction that allows rolling back writes in the event of
@@ -149,6 +149,8 @@
 if checkambigfiles:
 self.checkambigfiles.update(checkambigfiles)
 
+self.names = [name]
+
 # A dict dedicated to precisely tracking the changes introduced in the
 # transaction.
 self.changes = {}
@@ -186,6 +188,11 @@
 # holds callbacks to call during abort
 self._abortcallback = {}
 
+def __str__(self):
+name = '/'.join(self.names)
+return ('<transaction name=%s, count=%d, usages=%d>' %
+(name, self.count, self.usages))
+
 def __del__(self):
 if self.journal:
 self._abort()
@@ -365,14 +372,17 @@
 self.file.flush()
 
 @active
-def nest(self):
+def nest(self, name=''):
 self.count += 1
 self.usages += 1
+self.names.append(name)
 return self
 
 def release(self):
 if self.count > 0:
 self.usages -= 1
+if self.names:
+self.names.pop()
 # if the transaction scopes are left without being closed, fail
 if self.count > 0 and self.usages == 0:
 self._abort()
diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
--- a/mercurial/localrepo.py
+++ b/mercurial/localrepo.py
@@ -1177,7 +1177,7 @@
 raise error.ProgrammingError('transaction requires locking')
 tr = self.currenttransaction()
 if tr is not None:
-return tr.nest()
+return tr.nest(name=desc)
 
 # abort here if the journal already exists
 if self.svfs.exists("journal"):
@@ -1316,7 +1316,8 @@
  self.store.createmode,
  validator=validate,
  releasefn=releasefn,
- checkambigfiles=_cachedfiles)
+ checkambigfiles=_cachedfiles,
+ name=desc)
 tr.changes['revs'] = xrange(0, 0)
 tr.changes['obsmarkers'] = set()
 tr.changes['phases'] = {}
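
A stripped-down sketch of the naming scheme in this patch, for illustration only. `NamedTransaction` is a hypothetical class that keeps just the counters and the name stack; the initial `count`/`usages` values of 1 and the exact format string in `__str__` are assumptions.

```python
class NamedTransaction:
    """Toy model: a stack of names that grows with each nested scope."""

    def __init__(self, name=''):
        self.count = 1
        self.usages = 1
        self.names = [name]

    def nest(self, name=''):
        self.count += 1
        self.usages += 1
        self.names.append(name)
        return self

    def release(self):
        if self.count > 0:
            self.usages -= 1
        if self.names:
            self.names.pop()

    def __str__(self):
        # Assumed format; nested names are joined with '/'.
        return ('<transaction name=%s, count=%d, usages=%d>' %
                ('/'.join(self.names), self.count, self.usages))


tr = NamedTransaction('rebase')
tr.nest('commit')
nested_repr = str(tr)
tr.release()
outer_repr = str(tr)
```

Printing the transaction inside the nested scope shows the full path (`rebase/commit`), which is the debugging aid the summary refers to.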



To: martinvonz, #hg-reviewers
Cc: mercurial-devel


D2728: rebase: also include commit of collapsed commits in single transaction

2018-03-09 Thread martinvonz (Martin von Zweigbergk)
martinvonz added a comment.


  This is now ready for review again

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2728

To: martinvonz, #hg-reviewers, durham, quark
Cc: mercurial-devel


D2757: tests: add a few tests involving --collapse and rebase.singletransaction=1

2018-03-09 Thread martinvonz (Martin von Zweigbergk)
martinvonz created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  I'm about to change the rebase code quite a bit and this was poorly
  tested.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2757

AFFECTED FILES
  tests/test-rebase-transaction.t

CHANGE DETAILS

diff --git a/tests/test-rebase-transaction.t b/tests/test-rebase-transaction.t
--- a/tests/test-rebase-transaction.t
+++ b/tests/test-rebase-transaction.t
@@ -48,3 +48,149 @@
   o  0: A
   
   $ cd ..
+
+Check that --collapse works
+
+  $ hg init collapse && cd collapse
+  $ hg debugdrawdag <<'EOF'
+  >   Z
+  >   |
+  >   | D
+  >   | |
+  >   | C
+  >   | |
+  >   Y B
+  >   |/
+  >   A
+  > EOF
+- We should only see two status stored messages. One from the start, one from
+- the end.
+  $ hg rebase --collapse --debug -b D -d Z | grep 'status stored'
+  rebase status stored
+  rebase status stored
+  $ hg tglog
+  o  3: Collapsed revision
+  |  * B
+  |  * C
+  |  * D
+  o  2: Z
+  |
+  o  1: Y
+  |
+  o  0: A
+  
+  $ cd ..
+
+With --collapse, check that conflicts can be resolved and rebase can then be
+continued
+
+  $ hg init collapse-conflict && cd collapse-conflict
+  $ hg debugdrawdag <<'EOF'
+  >   Z   # Z/conflict=Z
+  >   |
+  >   | D
+  >   | |
+  >   | C # C/conflict=C
+  >   | |
+  >   Y B
+  >   |/
+  >   A
+  > EOF
+  $ hg rebase --collapse -b D -d Z
+  rebasing 1:112478962961 "B" (B)
+  rebasing 3:c26739dbe603 "C" (C)
+  merging conflict
+  warning: conflicts while merging conflict! (edit, then use 'hg resolve --mark')
+  unresolved conflicts (see hg resolve, then hg rebase --continue)
+  [1]
+  $ hg tglog
+  o  5: D
+  |
+  | @  4: Z
+  | |
+  @ |  3: C
+  | |
+  | o  2: Y
+  | |
+  o |  1: B
+  |/
+  o  0: A
+  
+  $ hg st
+  M C
+  M conflict
+  A B
+  ? conflict.orig
+  $ echo resolved > conflict
+  $ hg resolve -m
+  (no more unresolved files)
+  continue: hg rebase --continue
+  $ hg rebase --continue
+  already rebased 1:112478962961 "B" (B) as 79bc8f4973ce
+  rebasing 3:c26739dbe603 "C" (C)
+  rebasing 5:d24bb333861c "D" (D tip)
+  saved backup bundle to $TESTTMP/collapse-conflict/.hg/strip-backup/112478962961-b5b34645-rebase.hg
+  $ hg tglog
+  o  3: Collapsed revision
+  |  * B
+  |  * C
+  |  * D
+  o  2: Z
+  |
+  o  1: Y
+  |
+  o  0: A
+  
+  $ cd ..
+
+With --collapse, check that the commit message editing can be canceled and
+rebase can then be continued
+
+  $ hg init collapse-cancel-editor && cd collapse-cancel-editor
+  $ hg debugdrawdag <<'EOF'
+  >   Z
+  >   |
+  >   | D
+  >   | |
+  >   | C
+  >   | |
+  >   Y B
+  >   |/
+  >   A
+  > EOF
+  $ HGEDITOR=false hg --config ui.interactive=1 rebase --collapse -b D -d Z
+  rebasing 1:112478962961 "B" (B)
+  rebasing 3:26805aba1e60 "C" (C)
+  rebasing 5:f585351a92f8 "D" (D tip)
+  abort: edit failed: false exited with status 1
+  [255]
+  $ hg tglog
+  o  5: D
+  |
+  | @  4: Z
+  | |
+  o |  3: C
+  | |
+  | o  2: Y
+  | |
+  o |  1: B
+  |/
+  o  0: A
+  
+  $ hg rebase --continue
+  already rebased 1:112478962961 "B" (B) as e9b22a392ce0
+  already rebased 3:26805aba1e60 "C" (C) as e9b22a392ce0
+  already rebased 5:f585351a92f8 "D" (D tip) as e9b22a392ce0
+  saved backup bundle to $TESTTMP/collapse-cancel-editor/.hg/strip-backup/112478962961-cb2a9b47-rebase.hg
+  $ hg tglog
+  o  3: Collapsed revision
+  |  * B
+  |  * C
+  |  * D
+  o  2: Z
+  |
+  o  1: Y
+  |
+  o  0: A
+  
+  $ cd ..



To: martinvonz, #hg-reviewers
Cc: mercurial-devel


D2728: rebase: also include commit of collapsed commits in single transaction

2018-03-09 Thread martinvonz (Martin von Zweigbergk)
martinvonz updated this revision to Diff 6768.
martinvonz edited the summary of this revision.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D2728?vs=6734&id=6768

REVISION DETAIL
  https://phab.mercurial-scm.org/D2728

AFFECTED FILES
  hgext/rebase.py
  tests/test-rebase-transaction.t

CHANGE DETAILS

diff --git a/tests/test-rebase-transaction.t b/tests/test-rebase-transaction.t
--- a/tests/test-rebase-transaction.t
+++ b/tests/test-rebase-transaction.t
@@ -29,11 +29,9 @@
   >   |/
   >   A
   > EOF
-- We should only see two status stored messages. One from the start, one from
-- the end.
+- We should only see one status stored message. It comes from the start.
   $ hg rebase --debug -b D -d Z | grep 'status stored'
   rebase status stored
-  rebase status stored
   $ hg tglog
   o  5: D
   |
@@ -64,7 +62,7 @@
   >   A
   > EOF
 - We should only see two status stored messages. One from the start, one from
-- the end.
+- cmdutil.commitforceeditor() which forces tr.writepending()
   $ hg rebase --collapse --debug -b D -d Z | grep 'status stored'
   rebase status stored
   rebase status stored
@@ -162,12 +160,14 @@
   rebasing 1:112478962961 "B" (B)
   rebasing 3:26805aba1e60 "C" (C)
   rebasing 5:f585351a92f8 "D" (D tip)
+  transaction abort!
+  rollback completed
   abort: edit failed: false exited with status 1
   [255]
   $ hg tglog
   o  5: D
   |
-  | @  4: Z
+  | o  4: Z
   | |
   o |  3: C
   | |
@@ -178,9 +178,9 @@
   o  0: A
   
   $ hg rebase --continue
-  already rebased 1:112478962961 "B" (B) as e9b22a392ce0
-  already rebased 3:26805aba1e60 "C" (C) as e9b22a392ce0
-  already rebased 5:f585351a92f8 "D" (D tip) as e9b22a392ce0
+  rebasing 1:112478962961 "B" (B)
+  rebasing 3:26805aba1e60 "C" (C)
+  rebasing 5:f585351a92f8 "D" (D tip)
   saved backup bundle to $TESTTMP/collapse-cancel-editor/.hg/strip-backup/112478962961-cb2a9b47-rebase.hg
   $ hg tglog
   o  3: Collapsed revision
diff --git a/hgext/rebase.py b/hgext/rebase.py
--- a/hgext/rebase.py
+++ b/hgext/rebase.py
@@ -573,16 +573,12 @@
 keepbranches=self.keepbranchesf,
 date=self.date, wctx=self.wctx)
 else:
-dsguard = None
-if ui.configbool('rebase', 'singletransaction'):
-dsguard = dirstateguard.dirstateguard(repo, 'rebase')
-with util.acceptintervention(dsguard):
-newnode = concludenode(repo, revtoreuse, p1, self.external,
-commitmsg=commitmsg,
-extrafn=_makeextrafn(self.extrafns),
-editor=editor,
-keepbranches=self.keepbranchesf,
-date=self.date)
+newnode = concludenode(repo, revtoreuse, p1, self.external,
+commitmsg=commitmsg,
+extrafn=_makeextrafn(self.extrafns),
+editor=editor,
+keepbranches=self.keepbranchesf,
+date=self.date)
 if newnode is not None:
 newrev = repo[newnode].rev()
 for oldrev in self.state:
@@ -864,8 +860,7 @@
 dsguard = dirstateguard.dirstateguard(repo, 'rebase')
 with util.acceptintervention(dsguard):
 rbsrt._performrebase(tr)
-
-rbsrt._finishrebase()
+rbsrt._finishrebase()
 
 def _definedestmap(ui, repo, rbsrt, destf=None, srcf=None, basef=None,
revf=None, destspace=None):



To: martinvonz, #hg-reviewers, durham, quark
Cc: mercurial-devel


D2756: tests: simplify test-rebase-transaction.t

2018-03-09 Thread martinvonz (Martin von Zweigbergk)
martinvonz created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  The file was extracted from test-rebase-base.t in 
https://phab.mercurial-scm.org/rHG8cef8f7d51d0f1e99889779ec1320d5c9c3b91de
  (test-rebase-base: clarify it is about the "--base" flag,
  2017-10-05). This patch follows up that and clarifies the new file's
  purpose and simplifies it a bit.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2756

AFFECTED FILES
  tests/test-rebase-transaction.t

CHANGE DETAILS

diff --git a/tests/test-rebase-transaction.t b/tests/test-rebase-transaction.t
--- a/tests/test-rebase-transaction.t
+++ b/tests/test-rebase-transaction.t
@@ -1,22 +1,23 @@
+Rebasing using a single transaction
+
   $ cat >> $HGRCPATH <<EOF
   > [extensions]
   > rebase=
   > drawdag=$TESTDIR/drawdag.py
   > 
+  > [rebase]
+  > singletransaction=True
+  > 
   > [phases]
   > publish=False
   > 
   > [alias]
   > tglog = log -G --template "{rev}: {desc}"
   > EOF
 
-Rebasing using a single transaction
+Check that a simple rebase works
 
-  $ hg init singletr && cd singletr
-  $ cat >> .hg/hgrc <<EOF
-  > [rebase]
-  > singletransaction=True
-  > EOF
+  $ hg init simple && cd simple
   $ hg debugdrawdag <<'EOF'
   >   Z
   >   |



To: martinvonz, #hg-reviewers
Cc: mercurial-devel


D2057: rust implementation of hg status

2018-03-09 Thread Ivzhh (Sheng Mao)
Ivzhh added a comment.


  In https://phab.mercurial-scm.org/D2057#44269, @yuja wrote:
  
  > >> Reading that page it seems to claim that filenames should be utf8, not 
bytes. If utf8, this is what the code does, but if it is bytes that definitely 
won't work.
  > > 
  > > IIRC it's bytes everyplace except Windows, where we pretend utf8 is real?
  >
  > It's MBCS (i.e. ANSI multi-byte characters) on Windows. The plan was to 
support
  >  both MBCS and UTF-8-variant on Windows, but that isn't a thing yet.
  >
  > Perhaps we'll have to write a platform compatibility layer (or 
serialization/deserialization
  >  layer) on top of the Rust's file API, something like vfs.py we have in 
Python code.
  
  
  Thank you for confirming that. I was a bit confused when I read the Encoding 
Plan wiki page. I am looking at Mozilla's Rust winapi bindings; let me see if I 
can directly wrap winapi::um::fileapi::FindFirstFileA.
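
  To make the bytes-versus-UTF-8 concern concrete, here is a small Python 3 sketch (Python rather than Rust, to match the rest of the code base). It assumes nothing about Mercurial's own encoding layer; `decode_name`, `encode_name` and `is_valid_utf8` are hypothetical helpers showing that a filename in a legacy encoding is not valid UTF-8 but can still round-trip if undecodable bytes are preserved.

```python
def is_valid_utf8(raw):
    """Return True if raw decodes as strict UTF-8."""
    try:
        raw.decode('utf-8')
        return True
    except UnicodeDecodeError:
        return False


def decode_name(raw):
    """Decode a filename, keeping undecodable bytes as lone surrogates."""
    return raw.decode('utf-8', 'surrogateescape')


def encode_name(text):
    """Inverse of decode_name(): restores the original bytes exactly."""
    return text.encode('utf-8', 'surrogateescape')


# A name as a Latin-1 (single-byte) filesystem might store it.
legacy = 'caf\u00e9'.encode('latin-1')      # b'caf\xe9'
roundtrip = encode_name(decode_name(legacy))
```

  Any layer that insists on strict UTF-8 would reject `legacy` outright, which is why a byte-preserving compatibility layer like the one suggested above seems necessary.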


REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2057

To: Ivzhh, #hg-reviewers, kevincox
Cc: yuja, glandium, krbullock, indygreg, durin42, kevincox, mercurial-devel


D2394: histedit: make histedit's commands accept revsets (issue5746)

2018-03-09 Thread sangeet259 (Sangeet Kumar Mishra)
sangeet259 added a comment.


  @durin42  Should I make any changes?

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2394

To: sangeet259, durin42, #hg-reviewers
Cc: tom.prince, krbullock, rishabhmadan96, mercurial-devel


D2720: debugcommands: introduce actions to perform deterministic reads

2018-03-09 Thread mharbison72 (Matt Harbison)
mharbison72 added a comment.


  In https://phab.mercurial-scm.org/D2720#44288, @indygreg wrote:
  
  > In https://phab.mercurial-scm.org/D2720#44221, @mharbison72 wrote:
  >
  > > This patch seems to deadlock on Windows when running 
test-ssh-proto-unbundle.t.  There are two python processes idling:
  > >
  > > Past experience says that something needs to be flushed: stdout (and 
maybe stderr?) are fully buffered on Windows IIRC.  The following failure with 
this patch might also point to that:
  >
  >
  > Hmmm.
  >
  > We explicitly create the pipes between the processes as unbuffered 
(`bufsize=0`). That being said, I would not at all be surprised if we need to 
throw a `flush()` in here somewhere.
  >
  > Could you please run the test with `run-tests.py -d` and see where it is 
deadlocking?
  >
  > The test failure due to the `o>` block is likely due to buffering or a race 
condition when writing to stdout of the `hg debugwireproto` process. I'll look 
into that after we fix the deadlock (since fixing might change order of things 
and I don't want to debug this in a system that is changing).
  
  
  It looks like the middle of line 84.  Here's the relevant output, from the 
start of the stuck command on 43 (the off-by-one in the SALT line is odd, but 
the other lines are like that too, so I assume it's a zero-based index thing):
  
+ echo SALT1520615870 42 0
SALT1520615870 42 0
+ debugwireproto
++ cat -
+ commands='command unbundle
# This is "force" in hex.
heads 666f726365
PUSHFILE ../initial.v1.hg
eread 115'
+ echo 'testing ssh1'
testing ssh1
++ hg log -r tip -T '{node}'
+ tip=
+ echo 'command unbundle
# This is "force" in hex.
heads 666f726365
PUSHFILE ../initial.v1.hg
eread 115'
+ hg --verbose debugwireproto --localssh --noreadstderr
creating ssh peer from handshake results
i> write(104) -> 104:
i> hello\n
i> between\n
i> pairs 81\n
i> 
-
i> flush() -> None
o> readline() -> 4:
o> 384\n
o> readline() -> 384:
o> capabilities: lookup branchmap pushkey known getbundle unbundlehash 
batch changegroupsubset streamreqs=
generaldelta,revlogv1 
bundle2=HG20%0Abookmarks%0Achangegroup%3D01%2C02%0Adigests%3Dmd5%2Csha1%2Csha512%0Aerror

%3Dabort%2Cunsupportedcontent%2Cpushraced%2Cpushkey%0Ahgtagsfnodes%0Alistkeys%0Aphases%3Dheads%0Apushkey%0Arem
ote-changegroup%3Dhttp%2Chttps unbundle=HG10GZ,HG10BZ,HG10UN\n
o> readline() -> 2:
o> 1\n
o> readline() -> 1:
o> \n
sending unbundle command
i> write(9) -> 9:
i> unbundle\n
i> write(9) -> 9:
i> heads 10\n
i> write(10) -> 10: 666f726365
i> flush() -> None
o> readline() -> 2:
o> 0\n
i> write(4) -> 4:
i> 426\n
i> write(426) -> 426:
i> 
HG10UN\x00\x00\x00\x9eh\x98b\x13\xbdD\x85\xeaQS55\xe3\xfc\x9ex\x00zq\x1f\x00\x00\x00\x00\x00\x00\x00\x0

0\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\

x00\x00\x00\x00\x00h\x98b\x13\xbdD\x85\xeaQS55\xe3\xfc\x9ex\x00zq\x1f\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\
x00>cba485ca3678256e044428f70f58291196f6e9de\n
i> test\n
i> 0 0\n
i> foo\n
i> \n
i> 
initial\x00\x00\x00\x00\x00\x00\x00\x8d\xcb\xa4\x85\xca6x%n\x04D(\xf7\x0fX)\x11\x96\xf6\xe9\xde\x00\x00

\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x

00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00h\x98b\x13\xbdD\x85\xeaQS55\xe3\xfc\x9ex\x00zq\x1f\x00\x00\x00\x00\x
00\x00\x00\x00\x00\x00\x00-foo\x00362fef284ce2ca02aecc8de6d5e8a1c3af0556fe\n
i> 
\x00\x00\x00\x00\x00\x00\x00\x07foo\x00\x00\x00b6/\xef(L\xe2\xca\x02\xae\xcc\x8d\xe6\xd5\xe8\xa1\xc3\xa

f\x05V\xfe\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00
\x00\x00\x00\x00\x00\x00\x00\x00\

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2720

To: indygreg, #hg-reviewers
Cc: mharbison72, mercurial-devel


Re: [PATCH 1 of 8] hgk: stop using util.bytesinput() to read a single line from stdin

2018-03-09 Thread Gregory Szorc
On Fri, Mar 9, 2018 at 4:35 AM, Yuya Nishihara  wrote:

> # HG changeset patch
> # User Yuya Nishihara 
> # Date 1520323525 21600
> #  Tue Mar 06 02:05:25 2018 -0600
> # Node ID f7d9876d750e048b4c0e0ec0682928e86a8e8ecb
> # Parent  b434965f984eff168a7caaa239277b15729bd0b1
> hgk: stop using util.bytesinput() to read a single line from stdin
>

Queued this series, thanks.

If you have this stdio stuff paged into your brain, you may want to look at
hook.py and what it is doing with stdio. Essentially, it is doing dup() and
dup2() to temporarily redirect stdout to stderr such that the wire protocol
can intercept output and forward it to the wire protocol client (or the
CLI). See hook.redirect(True) in the wire protocol server code and follow
the trail from there.

In the case of shell hooks, I believe those processes inherit the parent's
file descriptors. After hook.py mucks with the file descriptors, what they
see as stdout is actually the stderr stream.

I question the appropriateness of the approach. I think it would be better
to e.g. send ui.ferr to shell hooks and to temporarily muck with
sys.stdout/sys.stderr when running Python hooks. But this has implications
and I haven't thought it through. I'd *really* like to see us not have to
do the dup()/dup2() dance in the hooks because that is mucking with global
state and can make it hard to debug server processes.
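
To make the dance concrete, here is a minimal sketch of the dup()/dup2()
redirection described above. `redirected_fd` is a hypothetical helper written
for illustration, not the code in hook.py; it only shows why the change is
process-global and why child processes inherit it.

```python
import contextlib
import os


@contextlib.contextmanager
def redirected_fd(fd, target_fd):
    """Temporarily make ``fd`` refer to the file behind ``target_fd``.

    Hypothetical helper for illustration only; this is not hook.py.
    """
    saved = os.dup(fd)          # keep the original destination alive
    try:
        os.dup2(target_fd, fd)  # process-wide: fd now aliases target_fd
        yield
    finally:
        os.dup2(saved, fd)      # put the original destination back
        os.close(saved)
```

Something like `with redirected_fd(1, 2): ...` would send everything written
to file descriptor 1 to stderr, including output from any child process
spawned inside the block, since children inherit the descriptor table.
Python-level buffers such as sys.stdout would still need a flush() before and
after the switch.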


>
> Appears that the stdio here is an IPC channel between hg and hgk (tk)
> processes, which shouldn't need a fancy readline support.
>
> diff --git a/hgext/hgk.py b/hgext/hgk.py
> --- a/hgext/hgk.py
> +++ b/hgext/hgk.py
> @@ -51,7 +51,6 @@ from mercurial import (
>      pycompat,
>      registrar,
>      scmutil,
> -    util,
>  )
>
>  cmdtable = {}
> @@ -105,15 +104,15 @@ def difftree(ui, repo, node1=None, node2
>
>      while True:
>          if opts[r'stdin']:
> -            try:
> -                line = util.bytesinput(ui.fin, ui.fout).split(' ')
> -                node1 = line[0]
> -                if len(line) > 1:
> -                    node2 = line[1]
> -                else:
> -                    node2 = None
> -            except EOFError:
> +            line = ui.fin.readline()
> +            if not line:
>                  break
> +            line = line.rstrip(pycompat.oslinesep).split(b' ')
> +            node1 = line[0]
> +            if len(line) > 1:
> +                node2 = line[1]
> +            else:
> +                node2 = None
>          node1 = repo.lookup(node1)
>          if node2:
>              node2 = repo.lookup(node2)
> @@ -186,12 +185,11 @@ def catfile(ui, repo, type=None, r=None,
>      #
>      prefix = ""
>      if opts[r'stdin']:
> -        try:
> -            (type, r) = util.bytesinput(ui.fin, ui.fout).split(' ')
> -            prefix = ""
> -        except EOFError:
> +        line = ui.fin.readline()
> +        if not line:
>              return
> -
> +        (type, r) = line.rstrip(pycompat.oslinesep).split(b' ')
> +        prefix = ""
>      else:
>          if not type or not r:
>              ui.warn(_("cat-file: type or revision not supplied\n"))
> @@ -204,10 +202,10 @@ def catfile(ui, repo, type=None, r=None,
>          n = repo.lookup(r)
>          catcommit(ui, repo, n, prefix)
>          if opts[r'stdin']:
> -            try:
> -                (type, r) = util.bytesinput(ui.fin, ui.fout).split(' ')
> -            except EOFError:
> +            line = ui.fin.readline()
> +            if not line:
>                  break
> +            (type, r) = line.rstrip(pycompat.oslinesep).split(b' ')
>          else:
>              break
>


D2593: state: add logic to parse the state file in old way if cbor fails

2018-03-09 Thread indygreg (Gregory Szorc)
indygreg added a comment.


  Not sure where to record this comment in this series. So I'll pick this 
commit.
  
  I think we want an explicit version header in the state files so clients know 
when they may be reading a file in an old format. For example, if I start a 
merge in one terminal window, I sometimes move to another terminal window to 
resolve parts of it. The multiple windows may be running different Mercurial 
versions. For example, sometimes one of the shells has a virtualenv activated 
and that virtualenv is running an older Mercurial. We don't want the older 
Mercurial trampling on state needed by the new Mercurial.
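
  As a rough sketch of what an explicit version header could look like: the
names below (`STATE_VERSION`, `dumpstate`, `loadstate`,
`UnsupportedStateVersion`) are hypothetical, and JSON stands in for the CBOR
payload only to keep the example self-contained. This is an illustration of
the idea, not the format used by this series.

```python
import json

STATE_VERSION = 2  # hypothetical current format version


class UnsupportedStateVersion(Exception):
    """Raised when a state file cannot be read safely by this client."""


def dumpstate(data, version=STATE_VERSION):
    """Serialize ``data`` with a one-line version header in front."""
    header = b'%d\n' % version
    return header + json.dumps(data, sort_keys=True).encode('utf-8')


def loadstate(raw):
    """Return (version, data), refusing files this client cannot read."""
    header, sep, payload = raw.partition(b'\n')
    if not sep or not header.isdigit():
        # No header at all: the file predates versioning (old format).
        raise UnsupportedStateVersion('missing version header')
    version = int(header)
    if version > STATE_VERSION:
        # Written by a newer client: bail out instead of trampling on it.
        raise UnsupportedStateVersion(
            'state file version %d is too new' % version)
    return version, json.loads(payload.decode('utf-8'))
```

  With a header like this, an older client can at least detect that the file
was written by a newer one and abort before modifying it.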

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2593

To: pulkit, #hg-reviewers
Cc: indygreg, durin42, mercurial-devel


D2752: cbor: add a __init__.py to top level cbor module

2018-03-09 Thread indygreg (Gregory Szorc)
indygreg added a comment.


  I think you should send the relative import patches to upstream. Adding `from 
__future__ import absolute_import` would also be a nice touch.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2752

To: pulkit, #hg-reviewers
Cc: indygreg, mercurial-devel


D2720: debugcommands: introduce actions to perform deterministic reads

2018-03-09 Thread indygreg (Gregory Szorc)
indygreg added a comment.


  In https://phab.mercurial-scm.org/D2720#44221, @mharbison72 wrote:
  
  > This patch seems to deadlock on Windows when running 
test-ssh-proto-unbundle.t.  There are two python processes idling:
  >
  > Past experience says that something needs to be flushed: stdout (and maybe 
stderr?) are fully buffered on Windows IIRC.  The following failure with this 
patch might also point to that:
  
  
  Hmmm.
  
  We explicitly create the pipes between the processes as unbuffered 
(`bufsize=0`). That being said, I would not at all be surprised if we need to 
throw a `flush()` in here somewhere.
  
  Could you please run the test with `run-tests.py -d` and see where it is 
deadlocking?
  
  The test failure due to the `o>` block is likely due to buffering or a race 
condition when writing to stdout of the `hg debugwireproto` process. I'll look 
into that after we fix the deadlock (since the fix might change the order of 
things and I don't want to debug this in a system that is changing).
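
  For reference, a minimal sketch of the kind of setup being discussed,
assuming a stand-in child process rather than the real `hg debugwireproto`
peer. `CHILD` and `roundtrip` are hypothetical names used only for this
illustration. Note that `bufsize=0` only controls the parent's pipe objects;
the child's own stdio buffering is separate, which is one place a missing
`flush()` could hide.

```python
import subprocess
import sys

# Stand-in child: read one line, echo it upper-cased, and flush explicitly.
CHILD = (
    "import sys\n"
    "line = sys.stdin.readline()\n"
    "sys.stdout.write(line.upper())\n"
    "sys.stdout.flush()\n"
)


def roundtrip(message):
    """Send one line to the child and return the line it answers with."""
    proc = subprocess.Popen(
        [sys.executable, '-c', CHILD],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        bufsize=0,  # parent-side pipe objects are unbuffered
    )
    try:
        proc.stdin.write(message)
        proc.stdin.flush()  # no-op when unbuffered; matters once buffering is on
        return proc.stdout.readline()
    finally:
        proc.stdin.close()
        proc.stdout.close()
        proc.wait()
```

  If the child never flushed its stdout, the parent's `readline()` could block
until the child exits, which looks a lot like two idle processes.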

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2720

To: indygreg, #hg-reviewers
Cc: mharbison72, mercurial-devel


D2755: phabricator: update doc string for deprecated token argument

2018-03-09 Thread joerg.sonnenberger (Joerg Sonnenberger)
joerg.sonnenberger created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2755

AFFECTED FILES
  contrib/phabricator.py

CHANGE DETAILS

diff --git a/contrib/phabricator.py b/contrib/phabricator.py
--- a/contrib/phabricator.py
+++ b/contrib/phabricator.py
@@ -22,7 +22,8 @@
 url = https://phab.example.com/
 
 # API token. Get it from https://$HOST/conduit/login/
-token = cli-
+# Deprecated: see [phabricator.auth] below
+#token = cli-
 
 # Repo callsign. If a repo has a URL https://$HOST/diffusion/FOO, then its
 # callsign is "FOO".



To: joerg.sonnenberger, #hg-reviewers
Cc: mercurial-devel


D2754: phabricator: print deprecation warning only once

2018-03-09 Thread joerg.sonnenberger (Joerg Sonnenberger)
joerg.sonnenberger created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2754

AFFECTED FILES
  contrib/phabricator.py

CHANGE DETAILS

diff --git a/contrib/phabricator.py b/contrib/phabricator.py
--- a/contrib/phabricator.py
+++ b/contrib/phabricator.py
@@ -99,13 +99,17 @@
     process('', params)
     return util.urlreq.urlencode(flatparams)
 
+printed_token_warning = False
+
 def readlegacytoken(repo):
     """Transitional support for old phabricator tokens.
 
     Remove before the 4.6 release.
     """
+    global printed_token_warning
     token = repo.ui.config('phabricator', 'token')
-    if token:
+    if token and not printed_token_warning:
+        printed_token_warning = True
         repo.ui.warn(_('phabricator.token is deprecated - please '
                        'migrate to the phabricator.auth section.\n'))
     return token



To: joerg.sonnenberger, #hg-reviewers
Cc: mercurial-devel


D2753: graft: check for missing revision first before scanning working copy

2018-03-09 Thread joerg.sonnenberger (Joerg Sonnenberger)
joerg.sonnenberger created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2753

AFFECTED FILES
  mercurial/commands.py

CHANGE DETAILS

diff --git a/mercurial/commands.py b/mercurial/commands.py
--- a/mercurial/commands.py
+++ b/mercurial/commands.py
@@ -2181,10 +2181,10 @@
                 raise
             cmdutil.wrongtooltocontinue(repo, _('graft'))
     else:
+        if not revs:
+            raise error.Abort(_('no revisions specified'))
         cmdutil.checkunfinished(repo)
         cmdutil.bailifchanged(repo)
-        if not revs:
-            raise error.Abort(_('no revisions specified'))
         revs = scmutil.revrange(repo, revs)
 
     skipped = set()



To: joerg.sonnenberger, #hg-reviewers
Cc: mercurial-devel


mercurial@36778: 12 new changesets

2018-03-09 Thread Mercurial Commits
12 new changesets in mercurial:

https://www.mercurial-scm.org/repo/hg/rev/a148c67d8b09
changeset:   36767:a148c67d8b09
user:        Vincent Parrett
date:        Wed Mar 07 09:07:34 2018 +1100
summary:     archival: fileit should not use atomictemp, causes performance regression

https://www.mercurial-scm.org/repo/hg/rev/658ed9c7442b
changeset:   36768:658ed9c7442b
user:        Rishabh Madan
date:        Sat Mar 03 23:47:22 2018 +0530
summary:     releasenotes: replace abort with warning while parsing (issue5775)

https://www.mercurial-scm.org/repo/hg/rev/3fff6f30bd7f
changeset:   36769:3fff6f30bd7f
user:        Rishabh Madan
date:        Sun Mar 04 00:15:35 2018 +0530
summary:     releasenotes: mention changeset with warning and abort

https://www.mercurial-scm.org/repo/hg/rev/a5891e94bfe1
changeset:   36770:a5891e94bfe1
user:        Rishabh Madan
date:        Sun Mar 04 00:25:58 2018 +0530
summary:     releasenotes: allow notes for multiple directives in a single changeset

https://www.mercurial-scm.org/repo/hg/rev/f7e3fe95b663
changeset:   36771:f7e3fe95b663
user:        Martin von Zweigbergk
date:        Wed Mar 07 10:31:01 2018 -0800
summary:     rebase: delete obsolete internal "keepopen" option

https://www.mercurial-scm.org/repo/hg/rev/0f3116c08e65
changeset:   36772:0f3116c08e65
user:        Martin von Zweigbergk
date:        Wed Mar 07 09:46:53 2018 -0800
summary:     rebase: remove unused argument "state" from rebasenode()

https://www.mercurial-scm.org/repo/hg/rev/1004fd71810f
changeset:   36773:1004fd71810f
user:        Martin von Zweigbergk
date:        Thu Mar 01 20:12:25 2018 -0800
summary:     rebase: reduce scope of "dsguard" variables a bit

https://www.mercurial-scm.org/repo/hg/rev/a835bf3fe40a
changeset:   36774:a835bf3fe40a
user:        Martin von Zweigbergk
date:        Tue Mar 06 09:39:24 2018 -0800
summary:     rebase: collapse two nested if-conditions

https://www.mercurial-scm.org/repo/hg/rev/6dab3bdb1f00
changeset:   36775:6dab3bdb1f00
user:        Martin von Zweigbergk
date:        Tue Mar 06 14:29:20 2018 -0800
summary:     rebase: only store collapse message once

https://www.mercurial-scm.org/repo/hg/rev/c164a3a282c1
changeset:   36776:c164a3a282c1
user:        Martin von Zweigbergk
date:        Wed Mar 07 10:55:57 2018 -0800
summary:     tests: .hg/merge is a directory, so use `test -d`

https://www.mercurial-scm.org/repo/hg/rev/66c569e57c70
changeset:   36777:66c569e57c70
user:        Martin von Zweigbergk
date:        Wed Mar 07 11:00:17 2018 -0800
summary:     tests: add test for issue 5494 but with --collapse

https://www.mercurial-scm.org/repo/hg/rev/7aae39d03139
changeset:   36778:7aae39d03139
bookmark:    @
tag:         tip
user:        Augie Fackler
date:        Sun Mar 04 16:20:24 2018 -0500
summary:     debugcommands: fix some %r output with bytestr() wrappers

-- 
Repository URL: https://www.mercurial-scm.org/repo/hg


D2727: bookmarks: test for exchanging long bookmark names (issue5165)

2018-03-09 Thread durin42 (Augie Fackler)
This revision was automatically updated to reflect the committed changes.
Closed by commit rHG09b58af83d44: bookmarks: test for exchanging long bookmark 
names (issue5165) (authored by durin42, committed by ).

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D2727?vs=6733&id=6762

REVISION DETAIL
  https://phab.mercurial-scm.org/D2727

AFFECTED FILES
  tests/test-bookmarks-pushpull.t

CHANGE DETAILS

diff --git a/tests/test-bookmarks-pushpull.t b/tests/test-bookmarks-pushpull.t
--- a/tests/test-bookmarks-pushpull.t
+++ b/tests/test-bookmarks-pushpull.t
@@ -1030,6 +1030,34 @@
   no changes found
   [1]
 
+Pushing a really long bookmark should work fine (issue5165)
+===
+
+#if b2-binary
+  >>> open('longname', 'w').write('wat' * 100)
+  $ hg book `cat longname`
+  $ hg push -B `cat longname` ../unchanged-b
+  pushing to ../unchanged-b
+  searching for changes
+  no changes found
+  exporting bookmark (wat){100} (re)
+  [1]
+  $ hg -R ../unchanged-b book --delete `cat longname`
+
+Test again but forcing bundle1 exchange to make sure that doesn't regress.
+
+  $ hg push -B `cat longname` ../unchanged-b --config devel.legacy.exchange=bundle1
+  pushing to ../unchanged-b
+  searching for changes
+  no changes found
+  exporting bookmark (wat){100} (re)
+  [1]
+  $ hg -R ../unchanged-b book --delete `cat longname`
+  $ hg book --delete `cat longname`
+  $ hg co @
+  0 files updated, 0 files merged, 0 files removed, 0 files unresolved
+  (activating bookmark @)
+#endif
 
 Check hook preventing push (issue4455)
 ==



To: durin42, #hg-reviewers
Cc: mercurial-devel


mercurial@36766: new changeset

2018-03-09 Thread Mercurial Commits
New changeset in mercurial:

https://www.mercurial-scm.org/repo/hg/rev/d382344c69aa
changeset:   36766:d382344c69aa
tag:         tip
user:        Gregory Szorc
date:        Sat Mar 03 18:55:43 2018 -0500
summary:     perf: teach perfbdiff to call blocks() and to use xdiff

-- 
Repository URL: https://www.mercurial-scm.org/repo/hg


[PATCH 5 of 5] templater: split template functions to new module

2018-03-09 Thread Yuya Nishihara
# HG changeset patch
# User Yuya Nishihara 
# Date 1520515382 -32400
#  Thu Mar 08 22:23:02 2018 +0900
# Node ID b3f764c8098d6de1dca102f8b5b5d05721a6341f
# Parent  c22e2d75938bc761554c8da98a49b85aa6bff0de
templater: split template functions to new module

It has grown enough to be a dedicated module.

diff --git a/Makefile b/Makefile
--- a/Makefile
+++ b/Makefile
@@ -132,8 +132,9 @@ i18n/hg.pot: $(PYFILES) $(DOCFILES) i18n
$(PYTHON) i18n/hggettext mercurial/commands.py \
  hgext/*.py hgext/*/__init__.py \
  mercurial/fileset.py mercurial/revset.py \
- mercurial/templatefilters.py mercurial/templatekw.py \
- mercurial/templater.py \
+ mercurial/templatefilters.py \
+ mercurial/templatefuncs.py \
+ mercurial/templatekw.py \
  mercurial/filemerge.py \
  mercurial/hgweb/webcommands.py \
  mercurial/util.py \
diff --git a/mercurial/extensions.py b/mercurial/extensions.py
--- a/mercurial/extensions.py
+++ b/mercurial/extensions.py
@@ -290,8 +290,8 @@ def loadall(ui, whitelist=None):
 fileset,
 revset,
 templatefilters,
+templatefuncs,
 templatekw,
-templater,
 )
 
 # list of (objname, loadermod, loadername) tuple:
@@ -307,7 +307,7 @@ def loadall(ui, whitelist=None):
 ('internalmerge', filemerge, 'loadinternalmerge'),
 ('revsetpredicate', revset, 'loadpredicate'),
 ('templatefilter', templatefilters, 'loadfilter'),
-('templatefunc', templater, 'loadfunction'),
+('templatefunc', templatefuncs, 'loadfunction'),
 ('templatekeyword', templatekw, 'loadkeyword'),
 ]
 _loadextra(ui, newindex, extraloaders)
diff --git a/mercurial/help.py b/mercurial/help.py
--- a/mercurial/help.py
+++ b/mercurial/help.py
@@ -26,8 +26,8 @@ from . import (
 pycompat,
 revset,
 templatefilters,
+templatefuncs,
 templatekw,
-templater,
 util,
 )
 from .hgweb import (
@@ -309,7 +309,7 @@ addtopicsymbols('merge-tools', '.. inter
 addtopicsymbols('revisions', '.. predicatesmarker', revset.symbols)
 addtopicsymbols('templates', '.. keywordsmarker', templatekw.keywords)
 addtopicsymbols('templates', '.. filtersmarker', templatefilters.filters)
-addtopicsymbols('templates', '.. functionsmarker', templater.funcs)
+addtopicsymbols('templates', '.. functionsmarker', templatefuncs.funcs)
 addtopicsymbols('hgweb', '.. webcommandsmarker', webcommands.commands,
 dedent=True)
 
diff --git a/mercurial/registrar.py b/mercurial/registrar.py
--- a/mercurial/registrar.py
+++ b/mercurial/registrar.py
@@ -368,7 +368,7 @@ class templatefunc(_templateregistrarbas
 extension, if an instance named as 'templatefunc' is used for
 decorating in extension.
 
-Otherwise, explicit 'templater.loadfunction()' is needed.
+Otherwise, explicit 'templatefuncs.loadfunction()' is needed.
 """
 _getname = _funcregistrarbase._parsefuncdecl
 
diff --git a/mercurial/templater.py b/mercurial/templatefuncs.py
copy from mercurial/templater.py
copy to mercurial/templatefuncs.py
--- a/mercurial/templater.py
+++ b/mercurial/templatefuncs.py
@@ -1,24 +1,21 @@
-# templater.py - template expansion for output
+# templatefuncs.py - common template functions
 #
 # Copyright 2005, 2006 Matt Mackall 
 #
 # This software may be used and distributed according to the terms of the
 # GNU General Public License version 2 or any later version.
 
-from __future__ import absolute_import, print_function
+from __future__ import absolute_import
 
-import os
 import re
 
 from .i18n import _
 from . import (
 color,
-config,
 encoding,
 error,
 minirst,
 obsutil,
-parser,
 pycompat,
 registrar,
 revset as revsetmod,
@@ -39,425 +36,8 @@ evalstring = templateutil.evalstring
 evalstringliteral = templateutil.evalstringliteral
 evalastype = templateutil.evalastype
 
-# template parsing
-
-elements = {
-# token-type: binding-strength, primary, prefix, infix, suffix
-"(": (20, None, ("group", 1, ")"), ("func", 1, ")"), None),
-".": (18, None, None, (".", 18), None),
-"%": (15, None, None, ("%", 15), None),
-"|": (15, None, None, ("|", 15), None),
-"*": (5, None, None, ("*", 5), None),
-"/": (5, None, None, ("/", 5), None),
-"+": (4, None, None, ("+", 4), None),
-"-": (4, None, ("negate", 19), ("-", 4), None),
-"=": (3, None, None, ("keyvalue", 3), None),
-",": (2, None, None, ("list", 2), None),
-")": (0, None, None, None, None),
-"integer": (0, "integer", None, None, None),
-"symbol": (0, "symbol", None, None, None),
-"string": (0, "string", None, None, None),
-"template": (0, "template", None, None, None),
-"end": (0, None, None, None, None),
-}
-
-def tokenize(program, start, end, term=None):
-"""Parse a template expression into a stream of tokens, which must end
-with term if specified"""
-pos 
