http://s32.postimg.org/47f8a1qo5/toree_provisioned.png
On Fri, May 6, 2016 at 9:54 AM, Gino Bustelo <[email protected]> wrote:

> Ok... I'll try to hang it somewhere on the interwebs and send a URL
>
> On Fri, May 6, 2016 at 9:29 AM, Luciano Resende <[email protected]>
> wrote:
>
>> I believe the list will remove the image.
>>
>> On Thursday, May 5, 2016, Sourav Mazumder <[email protected]>
>> wrote:
>>
>> > Hi Gino,
>> >
>> > Thanks for the details.
>> >
>> > But I'm not able to see the image - it is coming through as an inline
>> > image.
>> >
>> > Could you please send the image once more?
>> >
>> > Regards,
>> > Sourav
>> >
>> > On Thu, May 5, 2016 at 12:44 PM, Gino Bustelo <[email protected]>
>> > wrote:
>> >
>> > > Sourav,
>> > >
>> > > The solution will look something like this picture
>> > >
>> > > [image: Inline image 1]
>> > >
>> > > There is no need for a separate Toree client if you are using Jupyter.
>> > > Jupyter already knows how to talk to Toree. Now... there are other
>> > > solutions that can sit on top of Toree and expose REST or web socket
>> > > interfaces, but they are currently meant for custom client solutions.
>> > > See https://github.com/jupyter/kernel_gateway.
>> > >
>> > > Thanks,
>> > > Gino
>> > >
>> > > On Thu, May 5, 2016 at 11:46 AM, Sourav Mazumder
>> > > <[email protected]> wrote:
>> > >
>> > >> Hi Gino,
>> > >>
>> > >> Thanks for explaining the scope of Toree.
>> > >>
>> > >> What I was looking for is a solution where Toree can play the role
>> > >> of a facade between the client application (in this case the
>> > >> notebook) and the underlying Spark cluster. So if the client
>> > >> application submits a command, Toree can accept it, execute it using
>> > >> the underlying Spark infrastructure (maybe standalone, on Mesos, or
>> > >> on YARN) and return the result.
>> > >>
>> > >> I somewhat like option 2 too, as I think it is along similar lines
>> > >> to my requirement. However, I am not sure whether I have understood
>> > >> it fully.
>> > >>
>> > >> What essentially I'm looking for is a solution where Jupyter would
>> > >> be running on individual data scientists' laptops. Jupyter will
>> > >> issue the command from the laptop and the Toree client will accept
>> > >> it and send it to the Toree server running on the Spark cluster.
>> > >> The Toree server will run that on Spark and return the results.
>> > >>
>> > >> To achieve this requirement using option 2, can one potentially
>> > >> change Jupyter (or add an extension) so that it can send the request
>> > >> to Toree running on the provisioning layer over ZeroMQ (or any other
>> > >> protocol like REST)?
>> > >>
>> > >> Regards,
>> > >> Sourav
>> > >>
>> > >> On Thu, May 5, 2016 at 6:47 AM, Gino Bustelo <[email protected]>
>> > >> wrote:
>> > >>
>> > >> > >>>>>>>>>>>>>>>>>>>
>> > >> > Hi Gino,
>> > >> >
>> > >> > It does not solve the problem of running a Spark job (on Yarn)
>> > >> > remotely from a Jupyter notebook which is running on, say, a
>> > >> > laptop or some other machine.
>> > >> >
>> > >> > The issue is that in yarn-client mode the laptop needs to get
>> > >> > access to all the slave nodes where the executors would be
>> > >> > running. In a typical security scenario of an organization the
>> > >> > slave nodes are behind a firewall and cannot be accessed from any
>> > >> > random machine outside.
>> > >> >
>> > >> > Regards,
>> > >> > Sourav
>> > >> > >>>>>>>>>>>>>>>>>>>
>> > >> >
>> > >> > Sourav, I'm very much aware of the network implications of Spark
>> > >> > (not exclusive to YARN). The typical ways that I've seen this
>> > >> > problem solved are:
>> > >> >
>> > >> > 1. You manage/host Jupyter in a privileged network space that can
>> > >> > have access to the Spark cluster. This involves no code changes on
>> > >> > either Jupyter or Toree, but has the added cost for the service
>> > >> > provider of managing this frontend tool.
>> > >> >
>> > >> > 2. You create a provisioner layer in a privileged network space to
>> > >> > manage Kernels (Toree) and modify Jupyter through extensions to
>> > >> > understand how to communicate with that provisioner layer. The pro
>> > >> > of this is that you don't have to manage the Notebooks, but the
>> > >> > service provider still needs to build that provisioning layer and
>> > >> > proxy the Kernels' communication channels.
>> > >> >
>> > >> > My preference is for #2. I think that frontend tools do not need
>> > >> > to live close to Spark, but processes like Toree should be as
>> > >> > close to the compute cluster as possible.
>> > >> >
>> > >> > Toree's scope is to be a Spark Driver program that allows
>> > >> > "interactive computing". It is not its scope to provide a
>> > >> > full-fledged provisioning/hosting solution to access Spark. That
>> > >> > is left to the implementers of Spark offerings to select the best
>> > >> > way to manage Toree kernels (i.e. Yarn, Mesos, Docker, etc...).
>> > >> >
>> > >> > Thanks,
>> > >> > Gino
>> > >> >
>> > >> > On Sat, Apr 30, 2016 at 9:53 PM, Gino Bustelo
>> > >> > <[email protected]> wrote:
>> > >> >
>> > >> > > This is not possible without extending Jupyter. By default,
>> > >> > > Jupyter starts kernels as local processes. To be able to launch
>> > >> > > remote kernels you need to provide an extension to the
>> > >> > > KernelManager and have some sort of kernel provisioner to then
>> > >> > > manage the remote kernels. It is not something hard to do, but
>> > >> > > there is really nothing out there that I know of that you can
>> > >> > > use out of the box.
>> > >> > >
>> > >> > > Gino B.
>> > >> > >
>> > >> > > > On Apr 30, 2016, at 6:25 PM, Sourav Mazumder
>> > >> > > > <[email protected]> wrote:
>> > >> > > >
>> > >> > > > Hi,
>> > >> > > >
>> > >> > > > Is there any documentation which can be used to configure a
>> > >> > > > local Jupyter process to talk to a remote Apache Toree
>> > >> > > > server?
>> > >> > > >
>> > >> > > > Regards,
>> > >> > > > Sourav
>>
>> --
>> Sent from my Mobile device
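For readers following option 2 in the thread: whatever provisioning layer starts Toree near the cluster would, at minimum, have to hand the notebook side the kernel's connection details. The snippet below is only a sketch of that hand-off, not something Jupyter or Toree ships. The function name, host name, port numbers, key, and kernel name are all hypothetical; the field names follow the standard Jupyter kernel connection-file layout.

```python
import json


def remote_kernel_connection_info(host, base_port, key):
    """Build connection info for a kernel that runs on another host.

    Hypothetical helper: a real provisioner would allocate free ports and
    generate the key itself. Only the field names come from Jupyter's
    connection-file format.
    """
    channels = ["shell_port", "iopub_port", "stdin_port", "control_port", "hb_port"]
    info = {
        "transport": "tcp",
        "ip": host,                            # address reachable from the notebook side
        "key": key,                            # shared secret used to sign messages
        "signature_scheme": "hmac-sha256",
        "kernel_name": "apache_toree_scala",   # assumed kernelspec name
    }
    # Assign one consecutive port per ZeroMQ channel (assumption for the sketch).
    for offset, name in enumerate(channels):
        info[name] = base_port + offset
    return info


info = remote_kernel_connection_info(
    "provisioner.example.com", 50000, "replace-with-shared-secret"
)
print(json.dumps(info, indent=2, sort_keys=True))
```

The point of the sketch is that all five channels must be reachable (or proxied) from wherever Jupyter runs, which is why the provisioning layer in option 2 also has to proxy the kernel's communication channels.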
