Wolfgang Maier added the comment:

STINNER Victor added the comment:
> BUT when Python is started from a virtual environment (created by the
> "venv" module), the re module is imported by default.
>
> haypo@speed-python$ venv/bin/python3 -c 'import sys; print("re" in sys.modules)'
> True

Exciting! I just verified that this is true, and running python3 from a venv really seems to be the only situation in which the re module gets imported during startup (at least, only this one branch in site.py uses it).

If adding a single enum import to re causes such a big startup time difference, I wonder how much more could be gained for the venv case by not importing re at all!

It turns out that the complete code block in site.py that is used by venvs, and that was partially shown by @haypo, is:

    CONFIG_LINE = r'^(?P<key>(\w|[-_])+)\s*=\s*(?P<value>.*)\s*$'

    def venv(known_paths):
        global PREFIXES, ENABLE_USER_SITE

        env = os.environ
        if sys.platform == 'darwin' and '__PYVENV_LAUNCHER__' in env:
            executable = os.environ['__PYVENV_LAUNCHER__']
        else:
            executable = sys.executable
        exe_dir, _ = os.path.split(os.path.abspath(executable))
        site_prefix = os.path.dirname(exe_dir)
        sys._home = None
        conf_basename = 'pyvenv.cfg'
        candidate_confs = [
            conffile for conffile in (
                os.path.join(exe_dir, conf_basename),
                os.path.join(site_prefix, conf_basename)
                )
            if os.path.isfile(conffile)
            ]

        if candidate_confs:
            import re
            config_line = re.compile(CONFIG_LINE)
            virtual_conf = candidate_confs[0]
            system_site = "true"
            # Issue 25185: Use UTF-8, as that's what the venv module uses when
            # writing the file.
            with open(virtual_conf, encoding='utf-8') as f:
                for line in f:
                    line = line.strip()
                    m = config_line.match(line)
                    if m:
                        d = m.groupdict()
                        key, value = d['key'].lower(), d['value']
                        if key == 'include-system-site-packages':
                            system_site = value.lower()
                        elif key == 'home':
                            sys._home = value

            sys.prefix = sys.exec_prefix = site_prefix

            # Doing this here ensures venv takes precedence over user-site
            addsitepackages(known_paths, [sys.prefix])

            # addsitepackages will process site_prefix again if its in PREFIXES,
            # but that's ok; known_paths will prevent anything being added twice
            if system_site == "true":
                PREFIXES.insert(0, sys.prefix)
            else:
                PREFIXES = [sys.prefix]
                ENABLE_USER_SITE = False

        return known_paths

So all the re module is good for here is parsing simple config file records with key/value pairs separated by '='. Shouldn't it be straightforward to implement that logic right inside that block, without requiring a giant import? This should still be easily doable for 3.6, seems as if it would solve the whole issue, and would probably speed up the performance tests much more than any reverted changesets could.

What do you think?

----------
nosy: +wolma

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue28637>
_______________________________________
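
To make the suggestion concrete, here is a minimal sketch of how the key/value parsing could be done with plain string methods instead of re. The helper name parse_config_line is hypothetical and not part of site.py; it is only meant to approximate what CONFIG_LINE accepts for an already-stripped line, and it has not been checked against every edge case of the regex.

```python
def parse_config_line(line):
    """Split a 'key = value' record without importing re.

    Hypothetical helper (not in site.py). Returns (key, value), or None
    if the line does not look like a config record.
    """
    line = line.strip()
    # The key pattern cannot contain '=', so the first '=' is the separator.
    key, sep, value = line.partition('=')
    if not sep:
        return None
    key = key.rstrip()
    # Approximates the key pattern (\w|[-_])+ : alphanumerics, '_' or '-'.
    if not key or not all(c.isalnum() or c in '-_' for c in key):
        return None
    return key, value.strip()


if __name__ == '__main__':
    sample = [
        'home = /usr/bin',
        'include-system-site-packages = false',
        '# not a record',
    ]
    for record in sample:
        print(repr(record), '->', parse_config_line(record))
```

Inside venv(), the loop body would then call this helper in place of config_line.match(line) and lowercase the returned key exactly as the current code does with d['key'], so neither the import nor the compiled pattern would be needed.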