On Fri, Feb 10, 2017 at 1:46 AM, Brett Cannon <br...@python.org> wrote:
> As of right now Senthil is (I'm assuming) generating another test repo right
> now to see the results. If we're happy with the output then we will go with
> it, else we will skip the rewrite. So you're not too late as in a final
> decision has been made, but then you could argue you're too early based on
> how this test goes. :)
>
No need to wait, I put together a script that shows the result of the
rewriting :)

The script checks against roundup, so invalid ids are not converted.
I'm currently tweaking it as I find issues, so it's still a bit messy,
but this should give you a good idea.

Drop the attached script in the root of your cpython clone and run it with:
  python convcm.py  (to see 100 random converted commit messages)
  python convcm.py csid1 csid2  (to see the specified changeset ids)

The script will create an output.html file that shows the before/after.
It will also cache the valid ids in a valid_ids.json file.

There are still a few ambiguous cases, in particular:
  * I left "SF", so "SF issue #12345" becomes "SF bpo-12345";
  * I left "patch", so "Apply patch #12345" becomes "Apply patch bpo-12345".

Senthil is also testing the conversion using this regex.

Best Regards,
Ezio Melotti

from __future__ import print_function

import re
import sys
import json
import random
import xmlrpclib

from mercurial import ui, hg

# Open the Mercurial repository in the current directory.
u = ui.ui()
repo = hg.repository(u, '.')

# Load the valid bpo ids from the local cache, or fetch them from
# roundup and cache them on the first run.
print('Retrieving bpo ids ...', end=' ')
try:
    with open('valid_ids.json') as f:
        valid_ids = json.load(f)
except IOError:
    url = 'http://bugs.python.org/xmlrpc'
    roundup = xmlrpclib.ServerProxy(url)
    valid_ids = roundup.list('issue', 'id')
    with open('valid_ids.json', 'w') as f:
        json.dump(valid_ids, f)

valid_ids = set(valid_ids)
print('[done]')

r = r'(?:(?:(?<!org/)issues?|bugs?|SF)(?:\s+id)?\s*#?|#)\s*(\d+)'
regex = re.compile(r, flags=re.MULTILINE|re.IGNORECASE)

N = 100
if len(sys.argv) == 1:
    print('Generating revs sample...', end=' ')
    revs = random.sample(repo, N)
    print('[done]')
else:
    revs = [repo[rev] for rev in sys.argv[1:]]

# uncomment to run on all the revs
#revs = [repo[rev] for rev in repo]

def re_cb(match):
    # Convert the reference only if the id exists on roundup.
    issue_id = match.group(1)
    if issue_id not in valid_ids:
        return match.group(0)
    else:
        return '<b>bpo-%s</b>' % issue_id

# Write the original and the converted message of each revision.
print('Generating output ...', end=' ')
with open('output.html', 'w') as f:
    for rev in revs:
        desc = repo[rev].description()
        newdesc = regex.sub(re_cb, desc)
        f.write('<pre>[<b>%s</b>]: %s\n' % (rev, desc))
        f.write('[<b>%s</b>]: %s</pre>\n<hr>\n' % (rev, newdesc))
print('[done]')

print(len(revs), 'revisions converted')

# Changeset ids of notable cases (not used by the code above).
unusual = """
af811172717d
76a9a5131aae
31342913fb1e
c4dd30b5d07e
3094843e7b92
e8940d4cd8ca
6d1e8162e855
27b698395d35
077d29384399
81ce9d412a4c
c6df85e1d42e
fedd6ccc5e5b
4ca32e4f7839
6db0a62b6aa6
69ac672b49b3
a6bcf4df1a85
a74b463bf76d
756c27efe193
df4943d24cb6
04f2801d9977
d9d69060f5e4
"""

ambiguous = """
0e8077cb3dd5
04f2801d9977
76a9a5131aae
3094843e7b92
e8940d4cd8ca
fedd6ccc5e5b
a4d869ecef33
bd2aa0247ada
"""

invalid = """
d2ae5affde14
329b28a85947
"""

broken = """
"""

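To try the substitution without a cpython clone or a connection to roundup, the regex can be exercised on its own. The following is only a minimal sketch: the `convert` helper and the `SAMPLE_VALID_IDS` set are hypothetical stand-ins (the real script uses the ids returned by roundup and wraps the result in `<b>` tags for the HTML output), and the sample messages are made up for illustration.

```python
import re

# Same pattern as in convcm.py.
PATTERN = r'(?:(?:(?<!org/)issues?|bugs?|SF)(?:\s+id)?\s*#?|#)\s*(\d+)'
REGEX = re.compile(PATTERN, flags=re.MULTILINE | re.IGNORECASE)

# Hypothetical stand-in for the set of ids fetched from roundup.
SAMPLE_VALID_IDS = {'12345'}


def convert(message, valid_ids=SAMPLE_VALID_IDS):
    """Rewrite issue references to bpo-NNNN when the id is in valid_ids."""
    def re_cb(match):
        issue_id = match.group(1)
        if issue_id not in valid_ids:
            return match.group(0)  # unknown id: leave the text untouched
        return 'bpo-%s' % issue_id
    return REGEX.sub(re_cb, message)


if __name__ == '__main__':
    samples = (
        'SF issue #12345',
        'Apply patch #12345',
        'Fix issue #99999',
        'See http://bugs.python.org/issue12345',
    )
    for msg in samples:
        print('%-40s -> %s' % (msg, convert(msg)))
```

With this sample set, the two ambiguous cases come out as described in the message, the unknown id is left alone, and the `(?<!org/)` lookbehind keeps the tracker URL from being rewritten.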
_______________________________________________
core-workflow mailing list
core-workflow@python.org
https://mail.python.org/mailman/listinfo/core-workflow
This list is governed by the PSF Code of Conduct:
https://www.python.org/psf/codeofconduct