I was thinking that, for an enterprise using one "instance" per user
(using Docker etc.), an nginx frontend could allow any user to connect
to johndoe-notebook.zeppelin.marathon.mesos. If they present a client SSL
cert, that would "authenticate" them to the nginx server. From there, nginx
would keep a user-to-"container" access list that it would update, and
basically if johndoe (based on the cert) has access to johndoe-notebook, it
would proxy the web and websocket ports to the proper container.
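
A rough sketch of that lookup, in Python rather than real nginx config, just
to show the decision I have in mind. The access table, the function names,
and the assumption that the frontend hands over the cert's common name are
all made up for illustration:

```python
# Hypothetical access list: cert common name -> notebook apps that user may reach.
ACCESS = {
    "johndoe": {"johndoe-notebook"},
    "janesmith": {"janesmith-notebook", "johndoe-notebook"},
}

# Assumed Mesos-DNS style suffix, taken from the example hostname above.
DOMAIN_SUFFIX = ".zeppelin.marathon.mesos"


def notebook_for_host(host):
    """Return the app name for a host like johndoe-notebook.zeppelin.marathon.mesos.

    Returns None when the host is not under the expected suffix.
    """
    host = host.lower().rstrip(".")
    if not host.endswith(DOMAIN_SUFFIX):
        return None
    app = host[: -len(DOMAIN_SUFFIX)]
    # Reject empty names and anything with extra sub-domain levels.
    if not app or "." in app:
        return None
    return app


def may_proxy(cert_cn, host, access=ACCESS):
    """True if the user named in the client cert may be proxied to this host."""
    app = notebook_for_host(host)
    return app is not None and app in access.get(cert_cn, set())
```

Here janesmith has been mapped onto johndoe's container as well, which is the
"owner maps other users" case; anyone not in the table is simply refused.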

In some perfect world where I am a hot Java developer, I would set up a
widget in Zeppelin that would be able to show the user's "containers" and
the provisioned SSL certs, and then if I am an owner of container x, I could
map other users to access it. Perhaps.  Seems complicated but doable if well
thought out.



On Sun, Jun 14, 2015 at 4:01 AM, Ophir Cohen <[email protected]> wrote:

> Personally I found this discussion very interesting, as those are exactly
> the issues we encountered (as many others have) in our company.
>
> Actually, my main issues were:
> 1. Notebooks persistence and sharing across clusters.
> 2. User management, and especially sharing the notebooks that we want (and
> not sharing those we don't want...).
>
> I would love to have tree-like notebook management with user management on
> top of that. Also, currently I find the notebook.json file a bit too
> sensitive to format changes. Actually, I encountered a few times where a
> corrupted notebook format prevented Zeppelin from starting.
>
> My two cents regarding persistence: I'm using a GitHub repo to store my
> notebooks. Each new cluster fetches the base branch and then creates a new
> branch for itself. It pushes updates and tags its branch daily.
>
> In my company our next challenge is the user management and the way to
> provide data sharing between users.
>
>
> On Sun, Jun 14, 2015 at 3:38 AM, Corneau Damien <[email protected]>
> wrote:
>
>> So far, I know a lot of people using multiple instances of Zeppelin to
>> restrict notebook access. (For teams or people)
>>
>> It's a great way to not mess with each other's notebooks and resources.
>>
>> For the filesystem file structure, I think it will be a natural evolution
>> from the current flat structure. There have already been some discussions
>> about it, although there would probably be a lot of work related to that
>> feature on the UI side (renaming, creating folders, moving notebooks, etc.).
>> On Jun 14, 2015 1:53 AM, "John Omernik" <[email protected]> wrote:
>>
>>> Moon -
>>>
>>> Thank you, those seem to be on the right track. I am not as worried
>>> about a notebook persistence option as about a way that we can specify
>>> the root folder and then use a tree-like navigation that knows the
>>> difference between notebook folders and regular folders. I think as
>>> people use it, they would want to logically group certain notebooks
>>> together.  This could be the initial "system" to manage notebooks, but I
>>> could also see a way to add fields to notebooks, including a description
>>> field and an "indexable" option on either whole notebooks or items in the
>>> notebooks; that way, down the line we could add a search in addition to
>>> the tree view.
>>>
>>> Those are all "future" items, but in the short term, a way to get away
>>> from a "flat" structure for folder naming I think would help with
>>> organization for many people.  Consider someone who has multiple projects, or
>>> multiple users using them, that list could get long and chaotic very
>>> quickly.
>>>
>>> On the subject of authentication, one thing I'd ask the group and devs
>>> is: what is the long-term goal of Zeppelin? Do we want to make a notebook
>>> server that can support 1 user? 10 users? 100? 1000?  How do we scale that?
>>> If we add
>>> Authentication we should consider the usage in an enterprise...
>>> authentication is nice, don't get me wrong, it's needed, but I am curious
>>> on the roadmap/strategy on that subject.  I was looking into individual
>>> docker containers per user. That way each Zeppelin instance can be granted
>>> more resource depending on the user's requirements.  But I am not familiar
>>> with the Zeppelin structures to understand if this method has pitfalls.
>>>
>>> My eventual goal would be to set up scripts for provisioning in a way
>>> that takes a "skeleton" Docker image and fills in certain items (each user
>>> gets a pair of ports, each user has defaults for memory, each user has
>>> their own data environments set up). Those could all be auto-provisioned
>>> and scripted. Then the Docker container is run on an Apache Mesos cluster
>>> in a way that the username is actually in the Marathon app name. This would
>>> allow me to, after auto provisioning, provide a user with a username and
>>> port that, using Mesos DNS, allows them to connect up regardless of where
>>> the container is run on the cluster.
>>>
>>> I know not everyone who uses Zeppelin would use that approach, so I
>>> guess the reason for putting this all here is to see what the strategy is
>>> for Zeppelin. Can or should it support such methods? Are there huge problems
>>> with the approach I am laying out? Can I contribute some of the ideas (if
>>> people who know the project don't have any huge reasons for not having many
>>> Zeppelin instances running)?
>>>
>>> This is a great conversation, and I think speaks to the usefulness of
>>> this project.
>>>
>>> John
>>>
>>>
>>> On Sat, Jun 13, 2015 at 11:31 AM, moon soo Lee <[email protected]> wrote:
>>>
>>>> Hi,
>>>>
>>>> Here are some related pull requests you might be interested in.
>>>>
>>>> notebook storage options
>>>> https://github.com/apache/incubator-zeppelin/pull/44
>>>>
>>>> authentication
>>>> https://github.com/apache/incubator-zeppelin/pull/53
>>>>
>>>> Thanks,
>>>> moon
>>>>
>>>> On Fri, Jun 12, 2015 at 11:08 PM Corneau Damien <[email protected]>
>>>> wrote:
>>>>
>>>>> So, except for the notebook naming, what you would like is to have a
>>>>> folder tree structure for notebooks instead of a flat structure. That way
>>>>> you could navigate in those folders just like a normal filesystem.
>>>>>
>>>>> One problem with the ACL restriction you would like to do, though, is
>>>>> the 'user'. The Zeppelin web interface is just the Zeppelin instance and
>>>>> doesn't have knowledge of which user is using it.
>>>>> On Jun 12, 2015 11:32 PM, "John Omernik" <[email protected]> wrote:
>>>>>
>>>>>> Hey all, are there any notebook storage options that are
>>>>>> configurable?  Let me explain what I have observed and go from there with
>>>>>> my specific questions.
>>>>>>
>>>>>> I set an NFS share location to be my notebook location:
>>>>>>
>>>>>> export ZEPPELIN_NOTEBOOK_DIR=/mnt/zeppelin_notebooks
>>>>>>
>>>>>> My idea was that I could have a directory per user in that folder (with
>>>>>> permissions set to only that user) and then a shared directory which
>>>>>> would be usable by a group of users.  (This is me not knowing anything
>>>>>> about how notebooks are stored).
>>>>>>
>>>>>> When I implemented it, it APPEARS that Zeppelin uses the base
>>>>>> NOTEBOOK_DIR and just creates a folder with a random name per notebook.
>>>>>> In that folder there is a file named note.json.  It appears that in the
>>>>>> file, there is a "Name" JSON item that is the value you can rename
>>>>>> notebooks to.
>>>>>>
>>>>>> That is how it "appears" to work. What I am asking by "can we change
>>>>>> this, or is it configurable" is: can we set a root directory that we can
>>>>>> navigate through as a tree, and then click through that tree? This would
>>>>>> allow better organization for individual users and groups of users. It
>>>>>> would also allow some sense of security as users navigate the tree.
>>>>>>
>>>>>> This then comes back to the "directory" per notebook. Is that
>>>>>> required? Are there, at times, files other than note.json stored in
>>>>>> these directories?  If so, perhaps we could do a prefix that is ignored
>>>>>> by the tree in the GUI.  For example, if the user "johndoe" has a folder
>>>>>> johndoe, it would show up as a folder, but a folder that starts with ZNB-
>>>>>> like ZNB-2ATDB8F8R, in the gui would show up as a notebook (and it would
>>>>>> check the note.json file for the name of the notebook). This would allow
>>>>>> much more intuitive storage and management for a team of users.
>>>>>>
>>>>>> I would "prefer" that the name actually be the directory name, rather
>>>>>> than the identifier that Zeppelin creates (it would allow easier
>>>>>> management of the notebooks outside of Zeppelin); however, I don't know
>>>>>> the reasoning behind it, therefore it's open to discussion for me.
>>>>>>
>>>>>> So for example
>>>>>>
>>>>>> /mnt/zeppelin_notebooks
>>>>>>
>>>>>> In here I may have these folders:
>>>>>> johndoe
>>>>>> janesmith
>>>>>> shared
>>>>>> ZNB-2ATDB8F8R * -> note.json "name" field is "How to use company xyz
>>>>>> notebooks"
>>>>>>
>>>>>> In the gui, it would start at the /mnt/zeppelin_notebooks
>>>>>>
>>>>>> it would list with folder icons:
>>>>>> johndoe
>>>>>> janesmith
>>>>>> shared
>>>>>>
>>>>>> it would list with a notebook icon:
>>>>>> "How to use company xyz notebooks"
>>>>>>
>>>>>> If user johndoe clicked on his folder, it would show the notebooks and
>>>>>> other directories as well as a parent (..) link that pulls the user back
>>>>>> up a directory.
>>>>>>
>>>>>> If johndoe tries to click on janesmith, it would give an access
>>>>>> denied (because the Zeppelin binary would try to cwd into that directory,
>>>>>> but get a filesystem access denied because it's running as johndoe)
>>>>>>
>>>>>> I am just curious on any other sort of discussion we can have here
>>>>>> that would make this easier for groups of users to use?
>>>>>>
>>>>>>
>>>>>> John
>>>>>>
>>>>>>
>>>>>>
>>>
>
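
For what it's worth, the "ZNB-" prefix idea from the first message in this
thread could be sketched roughly like this in Python. This is not how
Zeppelin works today; the prefix, the function name, and the fallback to the
folder name are assumptions for illustration only, and the "name" field in
note.json is just what was observed above:

```python
import json
import os

# Hypothetical marker for notebook folders, as proposed in the thread.
NOTEBOOK_PREFIX = "ZNB-"


def list_entries(root):
    """Split the sub-directories of root into plain folders and notebooks.

    Returns (folders, notebooks): folder names as-is, and notebook titles
    read from the "name" field of each notebook's note.json.
    """
    folders, notebooks = [], []
    for name in sorted(os.listdir(root)):
        path = os.path.join(root, name)
        if not os.path.isdir(path):
            continue  # only directories matter for the tree view
        if not name.startswith(NOTEBOOK_PREFIX):
            folders.append(name)
            continue
        title = name  # fall back to the folder name if note.json is unusable
        try:
            with open(os.path.join(path, "note.json"), encoding="utf-8") as fh:
                data = json.load(fh)
            if isinstance(data, dict):
                title = data.get("name", name)
        except (OSError, ValueError):
            pass  # missing, unreadable, or corrupted note.json
        notebooks.append(title)
    return folders, notebooks
```

Run against the /mnt/zeppelin_notebooks layout above, this would put johndoe,
janesmith and shared in the folder list and show the ZNB- directory under
its note.json name. A corrupted note.json only costs that notebook its title
rather than breaking the whole listing.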
