[jupyter] Rethinking kernelspecs

Thomas Kluyver Thu, 26 Jan 2017 05:58:22 -0800

This is prompted by a couple of problems I've come across with kernelspecs:

a) In nbval, when you want to check notebooks against multiple Python
versions, the obvious approach is to create an environment (e.g. a Travis
job) for each, and run the tests inside it. But the notebook always runs
with the kernel in its metadata (e.g. if it's saved with Python 3, testing
on Python 2 will still run a Python 3 kernel). We worked around this by
adding a --current-env flag.

b) Anaconda installs a notebook server extension which exposes conda
environments as kernelspecs. But this doesn't affect other code using
Jupyter, causing problems in e.g. nbconvert (
https://github.com/jupyter/nbconvert/issues/515 ). More generally,
identifying kernels with an environment name only makes sense within one
computer.

I've been turning this over in my head for a while. I think there are three
kinds of information relevant to starting a kernel for a notebook:

1. In what programming language does the code make sense? This is mostly
captured by our language_info metadata, and the notebook application's
fallback behaviour when it can't find a named kernel. But there's still a
bit of ambiguity with different versions of a language (e.g. do we treat
Python 3 and Python 2 as one language?).

2. How do we set up an environment with the dependencies for the notebook?
There's some excellent work going on for this at
https://github.com/jupyter/nbformat/pull/60 , but it's not what I want to
discuss here.

3. Which available kernel for this notebook's language should we start to
run it? At present, we use the name of the kernel when the notebook was
saved - this is convenient for some use cases, but leads to problems (a)
and (b) described above.
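
For concreteness, here is a small sketch of the two pieces of notebook metadata discussed in points 1 and 3. The notebook below is a made-up, minimal example (the version string and kernel name are illustrative assumptions), and it is read with plain json rather than any Jupyter API:

```python
import json

# A made-up, minimal nbformat 4 notebook; the values are illustrative only.
notebook_json = """
{
  "metadata": {
    "kernelspec": {
      "name": "python3",
      "display_name": "Python 3",
      "language": "python"
    },
    "language_info": {"name": "python", "version": "3.5.2"}
  },
  "nbformat": 4,
  "nbformat_minor": 2,
  "cells": []
}
"""

nb = json.loads(notebook_json)

# Point 3: the kernel name recorded when the notebook was saved.
saved_kernel = nb["metadata"]["kernelspec"]["name"]

# Point 1: the language the code is written in.
language = nb["metadata"]["language_info"]["name"]

print(saved_kernel, language)
```

The saved kernel name only means something on the machine where the notebook was saved, whereas the language name should make sense anywhere.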

I propose that we change how we pick a kernel, by deprecating the
kernelspec metadata in notebooks and adding a pluggable KernelPicker class.
The default KernelPicker would follow these rules:

i) If the calling code explicitly specifies a kernel, start that one.
ii) If there is only one kernel available for the notebook's language,
start that one.
iii) If the notebook is in Python and ipykernel is installed in the current
environment, start ipykernel in this environment. This is a bit specific,
but it's often what you want for tools like nbval and nbconvert.
iv) Otherwise, there are either no kernels or multiple kernels installed for
the language in question. Error out, indicating to the user that they should
specify a kernel to be used (see (i)).
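
Rules (i) to (iv) could be sketched roughly as below. This is only an illustration of the proposal, not an existing API: the class name DefaultKernelPicker, the pick() method, the KernelPickerError exception, the CURRENT_ENV sentinel and the plain dict of available kernels are all assumptions made for the example.

```python
import importlib.util

# Hypothetical sentinel meaning "start ipykernel in the current environment".
CURRENT_ENV = "<ipykernel in current environment>"


class KernelPickerError(Exception):
    """Raised when no kernel can be chosen automatically (rule iv)."""


class DefaultKernelPicker:
    """Sketch of the default rules; every name here is an assumption."""

    def __init__(self, available, ipykernel_in_env=None):
        # available: dict mapping kernel name -> language name,
        # e.g. {"python3": "python", "ir": "r"}
        self.available = dict(available)
        if ipykernel_in_env is None:
            # Detect whether ipykernel is importable in this environment.
            ipykernel_in_env = importlib.util.find_spec("ipykernel") is not None
        self.ipykernel_in_env = ipykernel_in_env

    def pick(self, language, explicit=None):
        # (i) The calling code explicitly specified a kernel.
        if explicit is not None:
            return explicit

        matches = sorted(
            name for name, lang in self.available.items() if lang == language
        )

        # (ii) Exactly one kernel is available for the language.
        if len(matches) == 1:
            return matches[0]

        # (iii) Python notebook, and ipykernel is in the current environment.
        if language == "python" and self.ipykernel_in_env:
            return CURRENT_ENV

        # (iv) No kernels or several kernels: ask the user to choose.
        raise KernelPickerError(
            "Found %d kernels for language %r (%s); please specify one."
            % (len(matches), language, ", ".join(matches) or "none")
        )
```

With two Python kernels installed and no ipykernel in the current environment, pick("python") raises KernelPickerError, while pick("python", explicit="python2") returns "python2" straight away.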

For the notebook application, we may plug in a different KernelPicker which
records which kernels have been used for which notebooks, similar to the
present behaviour. Even if we don't, Continuum or other people may
implement something like this. But we wouldn't use this in tools like
nbconvert and nbval.
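
A picker for the notebook application might look something like the following sketch. Again, this is hypothetical: the class name RememberingKernelPicker, the in-memory history dict and the fallback callable are assumptions for illustration, and a real implementation would presumably persist the history somewhere.

```python
class RememberingKernelPicker:
    """Sketch of a picker that records which kernel each notebook used.

    All names here are hypothetical; this is not an existing Jupyter class.
    """

    def __init__(self, fallback):
        # fallback: callable taking a language name and returning a kernel
        # name, used when nothing is recorded for a notebook.
        self.fallback = fallback
        # Maps notebook path -> name of the kernel last used for it.
        self.history = {}

    def pick(self, notebook_path, language, explicit=None):
        if explicit is not None:
            # An explicit choice always wins.
            kernel = explicit
        elif notebook_path in self.history:
            # Reuse whatever was used for this notebook last time.
            kernel = self.history[notebook_path]
        else:
            kernel = self.fallback(language)
        # Record the choice locally instead of in the notebook file.
        self.history[notebook_path] = kernel
        return kernel
```

The point of the design is that the association between a notebook and a local kernel name lives on the machine where it makes sense, rather than in the notebook metadata.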

Once there is a way to store environment descriptions in notebook metadata,
and to create an environment for a notebook, another KernelPicker class may
be involved in associating notebooks with the environment created for them.

This proposal is still rough, but I think that we need to move away from
storing local kernel names in notebook metadata, now that we're getting
more insight into how kernelspecs are used.

Thomas
