Hi Gino,
Thanks for the image.
I was trying to install jupyter_kernel_gateway and run it.
However, when I run the command 'jupyter kernelgateway' I get the following
error:
File "/home/biadmin/.local/bin/jupyter-kernelgateway", line 7, in <module>
from kernel_gateway import launch_instance
File
"/home/biadmin/.local/lib/python2.6/site-packages/kernel_gateway/__init__.py",
line 4, in <module>
from .gatewayapp import launch_instance
File
"/home/biadmin/.local/lib/python2.6/site-packages/kernel_gateway/gatewayapp.py",
line 9, in <module>
import nbformat
File
"/home/biadmin/.local/lib/python2.6/site-packages/nbformat/__init__.py",
line 11, in <module>
from traitlets.log import get_logger
File
"/home/biadmin/.local/lib/python2.6/site-packages/traitlets/__init__.py",
line 1, in <module>
from .traitlets import *
File
"/home/biadmin/.local/lib/python2.6/site-packages/traitlets/traitlets.py",
line 1331
return {n: t for (n, t) in cls.class_traits(**metadata).items()
Is there any issue with the current build ?
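
For what it's worth, the line it stops on is a dict comprehension, which as
far as I know needs Python 2.7 or later, while the paths above point at a
Python 2.6 install. A quick check I plan to run (just my guess at the cause):

    import sys
    # Dict comprehensions like {n: t for (n, t) in ...} only exist in Python 2.7+,
    # so a 2.6 interpreter cannot even parse traitlets/traitlets.py.
    print(sys.version_info)  # expecting (2, 6, ...) given the site-packages path above
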
Regards,
Sourav
On Fri, May 6, 2016 at 11:00 AM, Gino Bustelo <[email protected]> wrote:
> http://s32.postimg.org/47f8a1qo5/toree_provisioned.png
>
> On Fri, May 6, 2016 at 9:54 AM, Gino Bustelo <[email protected]> wrote:
>
> > Ok... I'll try to hang it somewhere on the interwebs and send a URL.
> >
> > On Fri, May 6, 2016 at 9:29 AM, Luciano Resende <[email protected]>
> > wrote:
> >
> >> I believe the list will remove the image.
> >>
> >> On Thursday, May 5, 2016, Sourav Mazumder <[email protected]>
> >> wrote:
> >>
> >> > Hi Gino,
> >> >
> >> > Thanks for the details.
> >> >
> >> > But I'm not able to see the image - it is coming through as an inline
> >> > image.
> >> >
> >> > Could you please send the image once more?
> >> >
> >> > Regards,
> >> > Sourav
> >> >
> >> > On Thu, May 5, 2016 at 12:44 PM, Gino Bustelo <[email protected]> wrote:
> >> >
> >> > > Sourav,
> >> > >
> >> > > The solution will look something like this picture
> >> > >
> >> > > [image: Inline image 1]
> >> > >
> >> > > There is no need for a separate Toree client if you are using Jupyter.
> >> > > Jupyter already knows how to talk to Toree. Now... there are other
> >> > > solutions that can sit on top of Toree and expose REST or web sockets,
> >> > > but those are currently meant for custom client solutions. See
> >> > > https://github.com/jupyter/kernel_gateway.
> >> > >
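> >> > > For example, once a gateway like that is running in front of Toree, a
> >> > > custom client can create and drive kernels over plain HTTP and
> >> > > websockets. A rough sketch in Python (host, port and the Toree
> >> > > kernelspec name are assumptions on my part):
> >> > >
> >> > > import requests
> >> > >
> >> > > GATEWAY = "http://localhost:8888"  # wherever the kernel gateway listens
> >> > >
> >> > > # Start a kernel (assuming Toree is registered under this kernelspec name)
> >> > > kernel = requests.post(GATEWAY + "/api/kernels",
> >> > >                        json={"name": "apache_toree_scala"}).json()
> >> > > print(kernel["id"])
> >> > >
> >> > > # Code execution then happens over a websocket at
> >> > > #   GATEWAY + "/api/kernels/<kernel id>/channels"
> >> > >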
> >> > > Thanks,
> >> > > Gino
> >> > >
> >> > > On Thu, May 5, 2016 at 11:46 AM, Sourav Mazumder <[email protected]> wrote:
> >> > >
> >> > >> Hi Gino,
> >> > >>
> >> > >> Thanks for explaining the scope of Toree.
> >> > >>
> >> > >> What I was looking for is a solution where Toree can play the role of a
> >> > >> facade between the client application (in this case the notebook) and
> >> > >> the underlying Spark cluster. So if the client application submits a
> >> > >> command, Toree can accept it, execute it using the underlying Spark
> >> > >> infrastructure (standalone, on Mesos, or on YARN), and return the
> >> > >> result.
> >> > >>
> >> > >> I somewhat like option 2 as well, as I think it is along the same lines
> >> > >> as my requirement. However, I'm not sure whether I have understood it
> >> > >> fully.
> >> > >>
> >> > >> What essentially I'm looking for is a setup where Jupyter runs on
> >> > >> individual data scientists' laptops. Jupyter would issue a command from
> >> > >> the laptop, a Toree client would accept it and send it to the Toree
> >> > >> server running on the Spark cluster, and the Toree server would run it
> >> > >> on Spark and return the results.
> >> > >>
> >> > >> To achieve this requirement using option 2, can one potentially change
> >> > >> Jupyter (or add an extension) so that it sends the request to Toree
> >> > >> running on the provisioner layer over ZeroMQ (or another protocol like
> >> > >> REST)?
> >> > >> Regards,
> >> > >> Sourav
> >> > >>
> >> > >> On Thu, May 5, 2016 at 6:47 AM, Gino Bustelo <[email protected]> wrote:
> >> > >>
> >> > >> > >>>>>>>>>>>>>>>>>>>
> >> > >> > Hi Gino,
> >> > >> >
> >> > >> > It does not solve the problem of running a Spark job (on YARN)
> >> > >> > remotely from a Jupyter notebook which is running on, say, a laptop or
> >> > >> > some other machine.
> >> > >> >
> >> > >> > The issue is that in yarn-client mode the laptop needs access to all
> >> > >> > the slave nodes where the executors would be running. In a typical
> >> > >> > organizational security scenario the slave nodes are behind a firewall
> >> > >> > and cannot be accessed from any random machine outside.
> >> > >> >
> >> > >> > Regards,
> >> > >> > Sourav
> >> > >> > >>>>>>>>>>>>>>>>>>>
> >> > >> >
> >> > >> >
> >> > >> > Sourav, I'm very much aware of the network implications of Spark (not
> >> > >> > exclusive to YARN). The typical ways I've seen this problem solved
> >> > >> > are:
> >> > >> >
> >> > >> > 1. You manage/host Jupyter in a privileged network space that has
> >> > >> > access to the Spark cluster. This involves no code changes to either
> >> > >> > Jupyter or Toree, but has the added cost for the service provider of
> >> > >> > managing this frontend tool.
> >> > >> >
> >> > >> > 2. You create a provisioner layer in a privileged network space to
> >> > >> > manage kernels (Toree) and modify Jupyter through extensions so that
> >> > >> > it understands how to communicate with that provisioner layer. The pro
> >> > >> > of this is that you don't have to manage the notebooks, but the
> >> > >> > service provider still needs to build that provisioning layer and
> >> > >> > proxy the kernels' communication channels.
> >> > >> >
> >> > >> > My preference is for #2. I think that frontend tools do not need to
> >> > >> > live close to Spark, but processes like Toree should be as close to
> >> > >> > the compute cluster as possible.
> >> > >> >
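> >> > >> > To make #2 a bit more concrete: whatever that provisioner layer looks
> >> > >> > like, it ultimately has to hand Jupyter the standard kernel connection
> >> > >> > info for the remote Toree process, with the ZeroMQ ports proxied
> >> > >> > through the privileged network space. A rough sketch of that payload
> >> > >> > (the IP, ports and key are made up):
> >> > >> >
> >> > >> > # Hypothetical response from a provisioner after it starts Toree remotely;
> >> > >> > # this mirrors a standard Jupyter kernel connection file.
> >> > >> > connection_info = {
> >> > >> >     "transport": "tcp",
> >> > >> >     "ip": "10.0.0.5",  # the proxy endpoint, not the laptop's localhost
> >> > >> >     "shell_port": 50001,
> >> > >> >     "iopub_port": 50002,
> >> > >> >     "stdin_port": 50003,
> >> > >> >     "control_port": 50004,
> >> > >> >     "hb_port": 50005,
> >> > >> >     "key": "<hmac-key>",
> >> > >> >     "signature_scheme": "hmac-sha256",
> >> > >> > }
> >> > >> >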
> >> > >> > Toree's scope is to be a Spark driver program that allows "interactive
> >> > >> > computing". It is not its scope to provide a full-fledged
> >> > >> > provisioning/hosting solution to access Spark. That is left to the
> >> > >> > implementers of Spark offerings, who can select the best way to manage
> >> > >> > Toree kernels (e.g. YARN, Mesos, Docker, etc.).
> >> > >> >
> >> > >> > Thanks,
> >> > >> > Gino
> >> > >> >
> >> > >> > On Sat, Apr 30, 2016 at 9:53 PM, Gino Bustelo <[email protected]> wrote:
> >> > >> >
> >> > >> > > This is not possible without extending Jupyter. By default, Jupyter
> >> > >> > > starts kernels as local processes. To be able to launch remote
> >> > >> > > kernels you need to provide an extension to the KernelManager and
> >> > >> > > have some sort of kernel provisioner to then manage the remote
> >> > >> > > kernels. It is not something hard to do, but there is really nothing
> >> > >> > > out there that I know of that you can use out of the box.
> >> > >> > >
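> >> > >> > > If you wanted to experiment, the hook point is the notebook server's
> >> > >> > > kernel manager class. A very rough sketch (the provisioner service,
> >> > >> > > its URL and the kernel name are made up for illustration):
> >> > >> > >
> >> > >> > > import requests
> >> > >> > > from notebook.services.kernels.kernelmanager import MappingKernelManager
> >> > >> > >
> >> > >> > > class RemoteKernelManager(MappingKernelManager):
> >> > >> > >     """Sketch: ask a hypothetical provisioner REST service to start Toree
> >> > >> > >     near the Spark cluster instead of spawning a local kernel process."""
> >> > >> > >
> >> > >> > >     provisioner_url = "http://provisioner.example.com"
> >> > >> > >
> >> > >> > >     def start_kernel(self, kernel_name=None, **kwargs):
> >> > >> > >         resp = requests.post(self.provisioner_url + "/kernels",
> >> > >> > >                              json={"name": kernel_name or "apache_toree_scala"})
> >> > >> > >         remote = resp.json()
> >> > >> > >         # A real implementation would register a kernel whose ZeroMQ channels
> >> > >> > >         # point at (or are proxied to) the remote Toree process, rather than
> >> > >> > >         # calling the local-process path in the parent class.
> >> > >> > >         return remote["id"]
> >> > >> > >
> >> > >> > > # Wired in via jupyter_notebook_config.py, e.g.:
> >> > >> > > #   c.NotebookApp.kernel_manager_class = "mypackage.RemoteKernelManager"
> >> > >> > >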
> >> > >> > > Gino B.
> >> > >> > >
> >> > >> > > > On Apr 30, 2016, at 6:25 PM, Sourav Mazumder <[email protected]> wrote:
> >> > >> > > >
> >> > >> > > > Hi,
> >> > >> > > >
> >> > >> > > >
> >> > >> > > > Is there any documentation that can be used to configure a local
> >> > >> > > > Jupyter process to talk to a remote Apache Toree server?
> >> > >> > > >
> >> > >> > > > Regards,
> >> > >> > > > Sourav
> >> > >> > >
> >> > >> >
> >> > >>
> >> > >
> >> > >
> >> >
> >>
> >>
> >> --
> >> Sent from my Mobile device
> >>
> >
> >
>