RE: [PATCH 1 of 5] extdata: add extdatasource reader
> -Original Message- > From: Mercurial-devel [mailto:mercurial-devel-bounces@mercurial- > scm.org] On Behalf Of Matt Mackall > Sent: donderdag 6 oktober 2016 23:36 > To: mathias.dem...@gmail.com > Cc: Mercurial-devel > Subject: Re: [PATCH 1 of 5] extdata: add extdatasource reader > > On Thu, 2016-10-06 at 07:52 +0200, Mathias De Maré wrote: > > On Thu, Sep 22, 2016 at 8:21 PM, Matt Mackall <m...@selenic.com> > wrote: > > > > > > > > # HG changeset patch > > > # User Matt Mackall <m...@selenic.com> # Date 1473794045 18000 > # > > > Tue Sep 13 14:14:05 2016 -0500 # Node ID > > > 19bf2776dfe39befdc479253e1e7d030b41c08f9 > > > # Parent 5271ae66615207f39cc41d78f4541bc6f8ca6ff6 > > > extdata: add extdatasource reader > > > > > > This adds basic support for extdata, a way to add external data > > > sources for revsets and templates. An extdata data source is simply > > > a list of lines of the form: > > > > > > []\n > > > > > > An extdata source is configured thusly: > > > > > > [extdata] > > > name = > > > > > > urls of the form shell: are launch shell commands to generate data. > > > > > This looks great! > > We've been using a small extension to load an external file with tags > > into one of our repositories. Do you think that could fit here as > well? > > > > Something like this would be nice: > > [extdata] > > name = > > name.type = tags > > Do you actually need it to be a 'tag' or is it good enough that it be a > named identifier in the current union of namespaces? > > Extdata as presently designed doesn't add new names for nodes to the > namespaces. > But having a way to import an arbitrary name mapping would indeed be > handy. For instance, for having a mapping to other version control > system identifiers. An identifier would be fine. We currently use tags, but I guess it's not strictly required. Our users mostly need to be able to do things like 'hg update someid' or 'hg diff -r someid -r someotherid'. > > > name.checkrev = > before loading the tags> > > Can you explain more about checkrev? One of our repositories uses about 1 identifiers (or tags, using our current extension). We only load these if a specific revision is present, to avoid the overhead on any other repositories. I guess this could be a general command (which in our case would check for revision presence) to allow alternative checks. > > -- > Mathematics is the supreme nostalgia of our time. > > ___ > Mercurial-devel mailing list > Mercurial-devel@mercurial-scm.org > https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Re: [PATCH 1 of 5] extdata: add extdatasource reader
On Thu, 2016-10-06 at 07:52 +0200, Mathias De Maré wrote: > On Thu, Sep 22, 2016 at 8:21 PM, Matt Mackallwrote: > > > > > # HG changeset patch > > # User Matt Mackall > > # Date 1473794045 18000 > > # Tue Sep 13 14:14:05 2016 -0500 > > # Node ID 19bf2776dfe39befdc479253e1e7d030b41c08f9 > > # Parent 5271ae66615207f39cc41d78f4541bc6f8ca6ff6 > > extdata: add extdatasource reader > > > > This adds basic support for extdata, a way to add external data > > sources for revsets and templates. An extdata data source is simply a > > list of lines of the form: > > > > []\n > > > > An extdata source is configured thusly: > > > > [extdata] > > name = > > > > urls of the form shell: are launch shell commands to generate data. > > > This looks great! > We've been using a small extension to load an external file with tags into > one of our repositories. Do you think that could fit here as well? > > Something like this would be nice: > [extdata] > name = > name.type = tags Do you actually need it to be a 'tag' or is it good enough that it be a named identifier in the current union of namespaces? Extdata as presently designed doesn't add new names for nodes to the namespaces. But having a way to import an arbitrary name mapping would indeed be handy. For instance, for having a mapping to other version control system identifiers. > name.checkrev = loading the tags> Can you explain more about checkrev? -- Mathematics is the supreme nostalgia of our time. ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Re: [PATCH 1 of 5] extdata: add extdatasource reader
On Thu, Sep 22, 2016 at 8:21 PM, Matt Mackallwrote: > # HG changeset patch > # User Matt Mackall > # Date 1473794045 18000 > # Tue Sep 13 14:14:05 2016 -0500 > # Node ID 19bf2776dfe39befdc479253e1e7d030b41c08f9 > # Parent 5271ae66615207f39cc41d78f4541bc6f8ca6ff6 > extdata: add extdatasource reader > > This adds basic support for extdata, a way to add external data > sources for revsets and templates. An extdata data source is simply a > list of lines of the form: > > []\n > > An extdata source is configured thusly: > > [extdata] > name = > > urls of the form shell: are launch shell commands to generate data. > This looks great! We've been using a small extension to load an external file with tags into one of our repositories. Do you think that could fit here as well? Something like this would be nice: [extdata] name = name.type = tags name.checkrev = ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Re: [PATCH 1 of 5] extdata: add extdatasource reader
On Sat, 24 Sep 2016 17:34:20 -0500, Matt Mackall wrote: > On Sat, 2016-09-24 at 10:17 +0900, Yuya Nishihara wrote: > > On Thu, 22 Sep 2016 13:21:35 -0500, Matt Mackall wrote: > > > > > > # HG changeset patch > > > # User Matt Mackall> > > # Date 1473794045 18000 > > > # Tue Sep 13 14:14:05 2016 -0500 > > > # Node ID 19bf2776dfe39befdc479253e1e7d030b41c08f9 > > > # Parent 5271ae66615207f39cc41d78f4541bc6f8ca6ff6 > > > extdata: add extdatasource reader > > > > > > +def extdatasource(repo, source): > > > +"""gather a map of rev -> value dict from the specified source > > > + > > > +A source spec is treated as a URL, with a special case shell: type > > > +for parsing the output from a shell command. > > > + > > > +The data is parsed as a series of newline-separated records where > > > +each record is a revision specifier optionally followed by a space > > > +and a freeform string value. If the revision is known locally, it > > > +is converted to a rev, otherwise the record is skipped. > > > + > > > +Note that both key and value are treated as UTF-8 and converted to > > > +the local encoding. This allows uniformity between local and > > > +remote data sources. > > > +""" > > > + > > > +spec = repo.ui.config("extdata", source) > > > +if not spec: > > > +raise util.Abourt(_("unknown extdata source '%s'") % source) > > > + > > > +try: > > > +# prepare for future expansion > > > +expand = spec % () > > > +except TypeError: > > > +raise error.Abort(_("extdata doesn't support parameters yet"), > > > + hint=_("use double % for escaping")) > > Using '%' as a parameter symbol would be a bit unfortunate since URLs have > > %HEX-escape syntax. How about using '$' or '{...}' ? > > I picked it because it's how Firefox search URLs work. Was in my RFC. It appears that Firefox uses '{}' these days. I remember '%' was used in old Firefox (or maybe Netscape) though. https://developer.mozilla.org/en/Add-ons/Creating_OpenSearch_plugins_for_Firefox > > > +os.chdir(cwd) > > Temporary chdir() may cause problems in hgweb, which can be threaded. > > True. But we've had the same problem with hooks for a decade. Hooks are run by subprocess, so only child process will chdir to repo.root. ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Re: [PATCH 1 of 5] extdata: add extdatasource reader
On Thu, Sep 22, 2016 at 06:20:03PM -0500, Kevin Bullock wrote: > > On Sep 22, 2016, at 13:21, Matt Mackallwrote: > > > > # HG changeset patch > > # User Matt Mackall > > # Date 1473794045 18000 > > # Tue Sep 13 14:14:05 2016 -0500 > > # Node ID 19bf2776dfe39befdc479253e1e7d030b41c08f9 > > # Parent 5271ae66615207f39cc41d78f4541bc6f8ca6ff6 > > extdata: add extdatasource reader > > [...] > > +""" > > + > > +spec = repo.ui.config("extdata", source) > > +if not spec: > > +raise util.Abourt(_("unknown extdata source '%s'") % source) > > Typo here. I suppose there's no great way to test against this. Actually something like pytype or mypy should let us catch stuff like this. I want to carve out some time to try and find a way to get one of those tools running on our code regularly. ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Re: [PATCH 1 of 5] extdata: add extdatasource reader
> On Sep 23, 2016, at 12:47, Matt Mackallwrote: > > On Thu, 2016-09-22 at 18:20 -0500, Kevin Bullock wrote: >>> >>> On Sep 22, 2016, at 13:21, Matt Mackall wrote: [...] >>> +try: >>> +src = util.popen(cmd) >> Erm, don't we want to use util.popen2 or one of the other variants that use >> subprocess instead? > > The universal advantages of subprocess are overstated. For the simple task of > reading stdout from a subprocess, util.popen is perfectly suited. If it > wasn't.. > we'd fix util.popen. Related to my reply below: if we use popen, does stderr not get captured? I.e. will the user see the stderr output in their terminal? >> ...and maybe handle ENOENT gracefully? > > We can't, because cmd is an arbitrary shell expression. I mean that it would be nice to inform a user somehow that their arbitrary shell expression failed and what the error was (ENOENT meaning "command not found" in this case). pacem in terris / мир / शान्ति / سَلاَم / 平和 Kevin R. Bullock ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Re: [PATCH 1 of 5] extdata: add extdatasource reader
On Thu, 2016-09-22 at 18:20 -0500, Kevin Bullock wrote: > > > > On Sep 22, 2016, at 13:21, Matt Mackallwrote: > > > > # HG changeset patch > > # User Matt Mackall > > # Date 1473794045 18000 > > # Tue Sep 13 14:14:05 2016 -0500 > > # Node ID 19bf2776dfe39befdc479253e1e7d030b41c08f9 > > # Parent 5271ae66615207f39cc41d78f4541bc6f8ca6ff6 > > extdata: add extdatasource reader > > > > This adds basic support for extdata, a way to add external data > > sources for revsets and templates. An extdata data source is simply a > > list of lines of the form: > > > > []\n > > > > An extdata source is configured thusly: > > > > [extdata] > > name = > > > > urls of the form shell: are launch shell commands to generate data. > > > > diff -r 5271ae666152 -r 19bf2776dfe3 mercurial/scmutil.py > > --- a/mercurial/scmutil.py Wed Sep 21 17:05:27 2016 -0400 > > +++ b/mercurial/scmutil.py Tue Sep 13 14:14:05 2016 -0500 > > @@ -29,6 +29,7 @@ > > phases, > > revset, > > similar, > > +url, > > util, > > ) > > > > @@ -1418,3 +1419,66 @@ > > return > > > > self._queue.put(fh, block=True, timeout=None) > > + > > +def extdatasources(repo): > > +sources = set() > > +for k, v in repo.ui.configitems("extdata"): > > +sources.add(k) > > +return sources > > + > > +def extdatasource(repo, source): > > +"""gather a map of rev -> value dict from the specified source > > + > > +A source spec is treated as a URL, with a special case shell: type > > +for parsing the output from a shell command. > > + > > +The data is parsed as a series of newline-separated records where > > +each record is a revision specifier optionally followed by a space > > +and a freeform string value. If the revision is known locally, it > > +is converted to a rev, otherwise the record is skipped. > > + > > +Note that both key and value are treated as UTF-8 and converted to > > +the local encoding. This allows uniformity between local and > > +remote data sources. > That's a bit unfortunate. If we're expecting them to be read as UTF-8, can't > we just keep them in UTF-8 all the way thru? We always work internally in the local encoding. Sane local encodings include: utf-8. We've normally got two strategies: files owned by users and stored in local encoding (hgrc) and files owned by Mercurial and stored in utf-8 (bookmarks). So this note is to point out that this is different from the usual case.. because URLs might (or might not) be remote, shared resources that need to be in utf-8 for portability. > > > > +""" > > + > > +spec = repo.ui.config("extdata", source) > > +if not spec: > > +raise util.Abourt(_("unknown extdata source '%s'") % source) > Typo here. I suppose there's no great way to test against this. Indeed, both of the current callers prevent this code from being reached. > > > > + > > +try: > > +# prepare for future expansion > > +expand = spec % () > > +except TypeError: > > +raise error.Abort(_("extdata doesn't support parameters yet"), > > + hint=_("use double % for escaping")) > > + > > +data = {} > > +if spec.startswith("shell:"): > > +# external commands should be run relative to the repo root > > +cmd = spec[6:] > > +cwd = os.getcwd() > > +os.chdir(repo.root) > > +try: > > +src = util.popen(cmd) > Erm, don't we want to use util.popen2 or one of the other variants that use > subprocess instead? The universal advantages of subprocess are overstated. For the simple task of reading stdout from a subprocess, util.popen is perfectly suited. If it wasn't.. we'd fix util.popen. > ...and maybe handle ENOENT gracefully? We can't, because cmd is an arbitrary shell expression. > pacem in terris / мир / शान्ति / سَلاَم / 平和 > Kevin R. Bullock -- Mathematics is the supreme nostalgia of our time. ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Re: [PATCH 1 of 5] extdata: add extdatasource reader
> On Sep 22, 2016, at 13:21, Matt Mackallwrote: > > # HG changeset patch > # User Matt Mackall > # Date 1473794045 18000 > # Tue Sep 13 14:14:05 2016 -0500 > # Node ID 19bf2776dfe39befdc479253e1e7d030b41c08f9 > # Parent 5271ae66615207f39cc41d78f4541bc6f8ca6ff6 > extdata: add extdatasource reader > > This adds basic support for extdata, a way to add external data > sources for revsets and templates. An extdata data source is simply a > list of lines of the form: > > []\n > > An extdata source is configured thusly: > > [extdata] > name = > > urls of the form shell: are launch shell commands to generate data. > > diff -r 5271ae666152 -r 19bf2776dfe3 mercurial/scmutil.py > --- a/mercurial/scmutil.pyWed Sep 21 17:05:27 2016 -0400 > +++ b/mercurial/scmutil.pyTue Sep 13 14:14:05 2016 -0500 > @@ -29,6 +29,7 @@ > phases, > revset, > similar, > +url, > util, > ) > > @@ -1418,3 +1419,66 @@ > return > > self._queue.put(fh, block=True, timeout=None) > + > +def extdatasources(repo): > +sources = set() > +for k, v in repo.ui.configitems("extdata"): > +sources.add(k) > +return sources > + > +def extdatasource(repo, source): > +"""gather a map of rev -> value dict from the specified source > + > +A source spec is treated as a URL, with a special case shell: type > +for parsing the output from a shell command. > + > +The data is parsed as a series of newline-separated records where > +each record is a revision specifier optionally followed by a space > +and a freeform string value. If the revision is known locally, it > +is converted to a rev, otherwise the record is skipped. > + > +Note that both key and value are treated as UTF-8 and converted to > +the local encoding. This allows uniformity between local and > +remote data sources. That's a bit unfortunate. If we're expecting them to be read as UTF-8, can't we just keep them in UTF-8 all the way thru? > +""" > + > +spec = repo.ui.config("extdata", source) > +if not spec: > +raise util.Abourt(_("unknown extdata source '%s'") % source) Typo here. I suppose there's no great way to test against this. > + > +try: > +# prepare for future expansion > +expand = spec % () > +except TypeError: > +raise error.Abort(_("extdata doesn't support parameters yet"), > + hint=_("use double % for escaping")) > + > +data = {} > +if spec.startswith("shell:"): > +# external commands should be run relative to the repo root > +cmd = spec[6:] > +cwd = os.getcwd() > +os.chdir(repo.root) > +try: > +src = util.popen(cmd) Erm, don't we want to use util.popen2 or one of the other variants that use subprocess instead? ...and maybe handle ENOENT gracefully? pacem in terris / мир / शान्ति / سَلاَم / 平和 Kevin R. Bullock ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
[PATCH 1 of 5] extdata: add extdatasource reader
# HG changeset patch # User Matt Mackall# Date 1473794045 18000 # Tue Sep 13 14:14:05 2016 -0500 # Node ID 19bf2776dfe39befdc479253e1e7d030b41c08f9 # Parent 5271ae66615207f39cc41d78f4541bc6f8ca6ff6 extdata: add extdatasource reader This adds basic support for extdata, a way to add external data sources for revsets and templates. An extdata data source is simply a list of lines of the form: []\n An extdata source is configured thusly: [extdata] name = urls of the form shell: are launch shell commands to generate data. diff -r 5271ae666152 -r 19bf2776dfe3 mercurial/scmutil.py --- a/mercurial/scmutil.py Wed Sep 21 17:05:27 2016 -0400 +++ b/mercurial/scmutil.py Tue Sep 13 14:14:05 2016 -0500 @@ -29,6 +29,7 @@ phases, revset, similar, +url, util, ) @@ -1418,3 +1419,66 @@ return self._queue.put(fh, block=True, timeout=None) + +def extdatasources(repo): +sources = set() +for k, v in repo.ui.configitems("extdata"): +sources.add(k) +return sources + +def extdatasource(repo, source): +"""gather a map of rev -> value dict from the specified source + +A source spec is treated as a URL, with a special case shell: type +for parsing the output from a shell command. + +The data is parsed as a series of newline-separated records where +each record is a revision specifier optionally followed by a space +and a freeform string value. If the revision is known locally, it +is converted to a rev, otherwise the record is skipped. + +Note that both key and value are treated as UTF-8 and converted to +the local encoding. This allows uniformity between local and +remote data sources. +""" + +spec = repo.ui.config("extdata", source) +if not spec: +raise util.Abourt(_("unknown extdata source '%s'") % source) + +try: +# prepare for future expansion +expand = spec % () +except TypeError: +raise error.Abort(_("extdata doesn't support parameters yet"), + hint=_("use double % for escaping")) + +data = {} +if spec.startswith("shell:"): +# external commands should be run relative to the repo root +cmd = spec[6:] +cwd = os.getcwd() +os.chdir(repo.root) +try: +src = util.popen(cmd) +finally: +os.chdir(cwd) +else: +# treat as a URL or file +src = url.open(repo.ui, spec) + +try: +for l in src.readlines(): +if " " in l: +k, v = l.strip().split(" ", 1) +else: +k, v = l.strip(), "" + +k = encoding.tolocal(k) +if k in repo: +# we ignore data for nodes that don't exist locally +data[repo[k].rev()] = encoding.tolocal(v) +finally: +src.close() + +return data ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel