I have one more question/requirement in this context. Say I have two interpreters (spark1 and spark2) created from the same base interpreter (spark). spark1 connects to a local Spark environment, whereas spark2 connects to a remote standalone Spark cluster. Both of them use the same Hive on a Hadoop cluster.
What I want to do is, using spark1, read a local file and save it to Hive as a table. Then, using spark2, I want to process that data together with other data in Hive. I want to use spark2 because of its bigger infrastructure for computation-intensive processing. Now, I can do this using two different notebooks by specifying spark1 and spark2 respectively as the interpreter in each of them. But I cannot do the same within a single notebook, because both spark1 and spark2 use the same set of tags, like %sql, %dep, etc. Any idea whether this is still doable with some different configuration/workaround? Regards, Sourav On Thu, Sep 24, 2015 at 10:53 PM, tog <guillaume.all...@gmail.com> wrote: > Hi Alex > > Yes, I think the multi-tenancy set-up has raised numerous questions > recently. It might be interesting to dedicate a web page in the docs to your > container approach. > > Thanks > Guillaume > > On Friday, 25 September 2015, Alex <abezzu...@nflabs.com> wrote: > >> Hi, >> >> A Spark context is bound to a Spark interpreter instance, each running in a >> separate process. >> >> All notes that share the same interpreter are sharing the context too >> (among other things). >> >> You can achieve the desired behaviour in a multi-user environment right now, >> i.e. by creating a separate Spark interpreter for each user, in case all >> users share access to the same Zeppelin instance. >> >> Another approach that we use for our customers is to host a separate >> Zeppelin instance in a container, one per user, and have a balancing >> reverse proxy in front of them. >> >> I can share more details on this multi-tenancy setup if enough people >> from the community are interested in it. >> >> Hope this helps! >> >> -- >> Kind regards, >> Alexander >> >> On 25 Sep 2015, at 00:54, Yian Shang <yian.sh...@gmail.com> wrote: >> >> Are there any plans to change this so that there will be a separate Spark >> context per Notebook?
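One possible shape for the workflow Sourav describes, sketched as Zeppelin paragraphs. This assumes the two Spark interpreter instances can be selected by name with %spark1 and %spark2 paragraph prefixes; those prefixes, the paths, and the table names are hypothetical, and whether per-paragraph interpreter selection by name is available depends on the Zeppelin version in use:

```
%spark1
// Read a local file with the local interpreter and persist it to the
// shared Hive metastore (Spark 1.x DataFrame API).
val local = sqlContext.read.json("file:///path/to/local/data.json")
local.write.saveAsTable("staging_table")
```

```
%spark2
// Pick the staged table up on the larger standalone cluster and join it
// with other Hive data for the computation-intensive step.
val staged = sqlContext.table("staging_table")
val joined = staged.join(sqlContext.table("other_hive_table"), "key")
joined.write.saveAsTable("result_table")
```

Because the two interpreters share one Hive metastore, the table acts as the hand-off point between the two contexts; if named interpreter selection is not available, the same hand-off works across two notebooks, as Sourav notes.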
In a multi-user environment, it is hard to deal with >> the accidental overwriting of user variables. >> >> On Thu, Sep 24, 2015 at 7:19 AM, Rick Moritz <rah...@gmail.com> wrote: >> >>> Different instances of Zeppelin (even under the same user) are indeed >>> separate, which is (currently) the only way to get any kind of independence >>> into notebooks. In comparison, spark-notebook spawns one Spark context per >>> notebook, which is a somewhat better design, since concurrent users of the >>> same application aren't overwriting each other's variables accidentally, >>> and each notebook is indeed "repeatable" and "stand-alone", which is a >>> current deficit of Zeppelin, especially in a multi-user environment. >>> So yes, closing one context in one instance of Zeppelin will not >>> interfere with the other Spark context in the other instance of Zeppelin. >>> >>> On Thu, Sep 24, 2015 at 4:02 PM, Hammad <ham...@flexilogix.com> wrote: >>> >>>> Very useful indeed, Rick! >>>> >>>> If I have two Zeppelin instances running as two different users with the >>>> same Spark master, I see them as two different applications in the Spark Web >>>> UI. >>>> >>>> 1. Will they have their own 'context' of execution in this case? If I >>>> understand correctly, this would mean that closing a Spark context in one user's >>>> Zeppelin will have no impact on another user's Zeppelin environment, or is that >>>> not true? >>>> >>>> On Thu, Sep 24, 2015 at 4:47 PM, Rick Moritz <rah...@gmail.com> wrote: >>>> >>>>> 1) >>>>> Zeppelin uses the spark-shell REPL API. Therefore it behaves similarly >>>>> to the Scala shell. >>>>> You do not write applications in the shell, in the technical sense, >>>>> but instead evaluate individual expressions with the goal of interacting >>>>> with a dataset. >>>>> You can (manually) export some of the code that you find useful in >>>>> Zeppelin to applications, for example to provide batch pre-processing.
>>>>> I recommend you look at demos/descriptions of the interactive shell >>>>> functionality to get an idea of what Zeppelin offers over an application. >>>>> Also: you still have to manage most of your imports ;) >>>>> >>>>> 2) >>>>> There are two benefits: >>>>> - You can import and export/share notebooks. This means it makes sense >>>>> to split content. >>>>> - You also reduce the load on the browser by splitting heavy >>>>> visualizations into multiple notebooks. Once you start rendering tens of >>>>> thousands of points, you start reaching the limits of a browser's >>>>> capability. >>>>> >>>>> Hopefully this helps you get started. >>>>> >>>>> On Thu, Sep 24, 2015 at 1:04 PM, Hammad <ham...@flexilogix.com> wrote: >>>>> >>>>>> Hi mates, >>>>>> >>>>>> I was struggling with the anatomy of Zeppelin in the context of Spark and >>>>>> could not find anything that answered the questions I had in mind, as below: >>>>>> >>>>>> 1. Usually a Scala application's structure is: >>>>>> >>>>>> import org.apache.<whatever> >>>>>> >>>>>> object MyApp { >>>>>> def main(args: Array[String]) { >>>>>> //something >>>>>> } >>>>>> } >>>>>> >>>>>> whereas on Zeppelin we only write //something. Does it mean that one >>>>>> Zeppelin daemon is one application? What if I want to write multiple >>>>>> applications on one Zeppelin daemon instance? >>>>>> >>>>>> 2. Related to (1), if the same Spark context is shared across all >>>>>> notebooks, what's the benefit of having multiple notebooks? >>>>>> >>>>>> I would really appreciate it if someone could help me understand the above two. >>>>>> >>>>>> Thanks, >>>>>> Hammad >>>>>> >>>>> >>>>> >>>> >>>> >>>> -- >>>> Flexilogix >>>> Ph: +92 618090374 >>>> Fax: +92 612011810 >>>> http://www.flexilogix.com >>>> i...@flexilogix.com >>>> >>>> Disclaimer: This transmission (including any attachments) may contain >>>> confidential information, privileged material or constitute non-public >>>> information.
Any use of this information by anyone other than the intended >>>> recipient is prohibited. If you have received this transmission in error, >>>> please immediately reply to the sender and delete this information from >>>> your system. >>>> >>> >>> >> > > -- > PGP KeyID: 2048R/EA31CFC9 subkeys.pgp.net >
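To make Rick's REPL point from earlier in the thread concrete: the same logic written as a standalone Spark application versus typed into a Zeppelin paragraph might look like this. This is a sketch against the Spark 1.x API; the app name, paths, and the word-count example itself are made up for illustration:

```scala
// Standalone application: you create and stop the SparkContext yourself.
import org.apache.spark.{SparkConf, SparkContext}

object MyApp {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("MyApp"))
    val counts = sc.textFile("hdfs:///data/input.txt")
      .flatMap(_.split("\\s+"))
      .map(word => (word, 1))
      .reduceByKey(_ + _)
    counts.saveAsTextFile("hdfs:///data/output")
    sc.stop()
  }
}
```

In a Zeppelin paragraph you would write only the body of main: the interpreter already provides the context as sc, so the same pipeline can be evaluated interactively, e.g. ending in .take(10) to inspect a sample instead of writing the result out.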