Ok... I'll try to hang it somewhere on the interwebs and send a URL.

On Fri, May 6, 2016 at 9:29 AM, Luciano Resende <[email protected]> wrote:
> I believe the list will remove the image.
>
> On Thursday, May 5, 2016, Sourav Mazumder <[email protected]> wrote:
>
>> Hi Gino,
>>
>> Thanks for the details.
>>
>> But I'm not able to see the image: it is coming through as an inline
>> image. Could you please send the image once more?
>>
>> Regards,
>> Sourav
>>
>> On Thu, May 5, 2016 at 12:44 PM, Gino Bustelo <[email protected]> wrote:
>>
>>> Sourav,
>>>
>>> The solution will look something like this picture:
>>>
>>> [image: Inline image 1]
>>>
>>> There is no need for a separate Toree client if you are using Jupyter.
>>> Jupyter already knows how to talk to Toree. Now... there are other
>>> solutions that can sit on top of Toree and expose REST or web sockets,
>>> but those are currently meant for custom client solutions. See
>>> https://github.com/jupyter/kernel_gateway.
>>>
>>> Thanks,
>>> Gino
>>>
>>> On Thu, May 5, 2016 at 11:46 AM, Sourav Mazumder <[email protected]> wrote:
>>>
>>>> Hi Gino,
>>>>
>>>> Thanks for explaining the scope of Toree.
>>>>
>>>> What I was looking for is a solution where Toree can play the role of
>>>> a facade between the client application (in this case the notebook)
>>>> and the underlying Spark cluster. So if the client application submits
>>>> a command, Toree can accept it, execute it on the underlying Spark
>>>> infrastructure (standalone, on Mesos, or on YARN), and return the
>>>> result.
>>>>
>>>> I somewhat like option 2 too, as I think it is along similar lines to
>>>> my requirement. However, I'm not sure I have understood it fully.
>>>>
>>>> What I'm essentially looking for is a solution where Jupyter would be
>>>> running on individual data scientists' laptops.
>>>> Jupyter would issue the command from the laptop, the Toree client
>>>> would accept it and send it to the Toree server running on the Spark
>>>> cluster, and the Toree server would run it on Spark and return the
>>>> results.
>>>>
>>>> To achieve this requirement using option 2, could one potentially
>>>> change Jupyter (or add an extension) so that it sends the request to
>>>> Toree running on the provisioning layer over ZeroMQ (or any other
>>>> protocol, like REST)?
>>>>
>>>> Regards,
>>>> Sourav
>>>>
>>>> On Thu, May 5, 2016 at 6:47 AM, Gino Bustelo <[email protected]> wrote:
>>>>
>>>>> > Hi Gino,
>>>>> >
>>>>> > It does not solve the problem of running a Spark job (on YARN)
>>>>> > remotely from a Jupyter notebook which is running on, say, a
>>>>> > laptop or some other machine.
>>>>> >
>>>>> > The issue is that in yarn-client mode the laptop needs access to
>>>>> > all the slave nodes where the executors would be running. In a
>>>>> > typical organizational security scenario the slave nodes are
>>>>> > behind a firewall and cannot be accessed from an arbitrary machine
>>>>> > outside.
>>>>> >
>>>>> > Regards,
>>>>> > Sourav
>>>>>
>>>>> Sourav, I'm very much aware of the network implications of Spark (not
>>>>> exclusive to YARN). The typical ways I've seen this problem solved
>>>>> are:
>>>>>
>>>>> 1. You manage/host Jupyter in a privileged network space that has
>>>>> access to the Spark cluster. This involves no code changes in either
>>>>> Jupyter or Toree, but has the added cost for the service provider of
>>>>> managing this frontend tool.
>>>>>
>>>>> 2.
>>>>> You create a provisioner layer in a privileged network space to
>>>>> manage kernels (Toree), and modify Jupyter through extensions so it
>>>>> understands how to communicate with that provisioner layer. The pro
>>>>> of this is that you don't have to manage the notebooks, but the
>>>>> service provider still needs to build that provisioning layer and
>>>>> proxy the kernels' communication channels.
>>>>>
>>>>> My preference is for #2. I think that frontend tools do not need to
>>>>> live close to Spark, but processes like Toree should be as close to
>>>>> the compute cluster as possible.
>>>>>
>>>>> Toree's scope is to be a Spark driver program that allows
>>>>> "interactive computing". It is not in its scope to provide a
>>>>> full-fledged provisioning/hosting solution for accessing Spark. That
>>>>> is left to the implementers of Spark offerings, who can select the
>>>>> best way to manage Toree kernels (i.e. YARN, Mesos, Docker, etc.).
>>>>>
>>>>> Thanks,
>>>>> Gino
>>>>>
>>>>> On Sat, Apr 30, 2016 at 9:53 PM, Gino Bustelo <[email protected]> wrote:
>>>>>
>>>>>> This is not possible without extending Jupyter. By default, Jupyter
>>>>>> starts kernels as local processes. To be able to launch remote
>>>>>> kernels you need to provide an extension to the KernelManager and
>>>>>> have some sort of kernel provisioner to then manage the remote
>>>>>> kernels. It is not something hard to do, but there is really nothing
>>>>>> out there that I know of that you can use out of the box.
>>>>>>
>>>>>> Gino B.
>>>>>> On Apr 30, 2016, at 6:25 PM, Sourav Mazumder <[email protected]> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> Is there any documentation which can be used to configure a local
>>>>>>> Jupyter process to talk to a remote Apache Toree server?
>>>>>>>
>>>>>>> Regards,
>>>>>>> Sourav

>
> --
> Sent from my Mobile device
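For context on Gino's point that "Jupyter already knows how to talk to Toree": both sides speak the Jupyter kernel message protocol, over ZeroMQ directly or relayed over a WebSocket by the kernel gateway linked above. Below is a minimal sketch of the `execute_request` message Jupyter sends when a notebook cell runs; the username and the Scala snippet are purely illustrative.

```python
import json
import uuid
from datetime import datetime, timezone

def execute_request(code, session):
    """Build a Jupyter-protocol execute_request message, the kind of
    message Jupyter sends to a Toree kernel (or that a kernel gateway
    relays over its /api/kernels/<id>/channels WebSocket)."""
    return {
        "header": {
            "msg_id": uuid.uuid4().hex,
            "msg_type": "execute_request",
            "session": session,
            "username": "demo",  # illustrative
            "date": datetime.now(timezone.utc).isoformat(),
            "version": "5.0",
        },
        "parent_header": {},
        "metadata": {},
        "content": {
            "code": code,
            "silent": False,
            "store_history": True,
            "user_expressions": {},
            "allow_stdin": False,
        },
    }

# Illustrative Spark snippet a user might run against Toree.
msg = execute_request("sc.parallelize(1 to 10).sum()", session=uuid.uuid4().hex)
print(json.dumps(msg["content"], indent=2))
```

The message shape is the same regardless of transport, which is why a gateway or provisioner can sit between the notebook and the kernel without changing the protocol itself.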

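The "extension to the KernelManager" plus "kernel provisioner" approach Gino describes could look roughly like the sketch below. Everything here is hypothetical: the provisioner's `/kernels` endpoint, its response shape, and the `RemoteKernelProvisionerClient` name are invented for illustration. A real integration would subclass `jupyter_client.KernelManager` and feed the returned dict to its `load_connection_info()`.

```python
import json
from urllib import request

class RemoteKernelProvisionerClient:
    """Hypothetical client for a provisioner service that starts Toree
    kernels inside the cluster's network and hands back the connection
    info Jupyter needs to reach their ZeroMQ channels."""

    def __init__(self, base_url):
        self.base_url = base_url.rstrip("/")

    def start_kernel_request(self, kernel_name):
        # Build the (assumed) REST call asking the provisioner to launch
        # a kernel close to the Spark cluster.
        url = self.base_url + "/kernels"
        payload = {"name": kernel_name}
        return url, payload

    def to_connection_info(self, provisioner_response):
        # Translate the provisioner's (assumed) response into the dict a
        # custom KernelManager would consume; the five ports correspond
        # to the kernel's five ZeroMQ channels.
        ports = provisioner_response["ports"]
        return {
            "ip": provisioner_response["host"],
            "transport": "tcp",
            "shell_port": ports["shell"],
            "iopub_port": ports["iopub"],
            "stdin_port": ports["stdin"],
            "control_port": ports["control"],
            "hb_port": ports["hb"],
        }

    def start_kernel(self, kernel_name):
        # Actual HTTP call; requires a running provisioner service.
        url, payload = self.start_kernel_request(kernel_name)
        req = request.Request(url, data=json.dumps(payload).encode(),
                              headers={"Content-Type": "application/json"})
        with request.urlopen(req) as resp:
            return self.to_connection_info(json.load(resp))
```

Note that the proxying Gino mentions still matters: the returned ports are only useful if Jupyter can reach them, so the provisioner would typically also tunnel or proxy the kernel channels through the firewall.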