I don't see the Knox shell integration as a replacement for the other
interpreters / languages supported in Zeppelin. It is merely a usability
enhancement. If I have a cluster behind Knox, I can quickly interact with its
services via the DSL. It might also require less configuration: the HBase and
Hive interpreters, for example, need 'client'-like configuration/jars. This is
particularly interesting if the notebook server is running outside the cluster.
So right now I don't have any other itch to scratch :) beyond that.

I am not sure how integration with visualization works in Zeppelin. It might be
specific to the interpreter (e.g. %jdbc) and to what data is returned, but it
could potentially work the same way, or we could simply leverage the built-in
display system, such as table …
http://zeppelin.apache.org/docs/0.6.2/displaysystem/basicdisplaysystem.html#table
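
As a rough illustration of the table display, here is a minimal Python sketch
(Python is used only to keep it self-contained; it is not Knox DSL code) that
turns JSON-style rows into the text Zeppelin's %table display expects: a header
line and one line per row, with tab-separated columns. The function name
to_zeppelin_table and the sample rows are made up for illustration; how a Knox
interpreter would actually emit its output is still an open question.

```python
import json


def to_zeppelin_table(rows, columns):
    """Render a list of dicts as Zeppelin %table text.

    Columns are separated by tabs and rows by newlines, so any tab or
    newline inside a cell value is replaced with a space.
    """
    def clean(value):
        return str(value).replace("\t", " ").replace("\n", " ")

    lines = ["\t".join(columns)]
    for row in rows:
        lines.append("\t".join(clean(row.get(col, "")) for col in columns))
    return "%table " + "\n".join(lines)


# Hypothetical JSON result, shaped like a small directory listing.
sample = json.loads('[{"pathSuffix": "a.csv", "length": 24},'
                    ' {"pathSuffix": "b.csv", "length": 7}]')
table_text = to_zeppelin_table(sample, ["pathSuffix", "length"])
print(table_text)
```

Printing text in this form from a paragraph is what lets Zeppelin offer its
built-in table and chart views for the result.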

As for use cases, we could show some interaction with the supported Hadoop
services through Knox, something like:

%knox
move files to HDFS
run HiveQL to build/transform data via WebHCat (assuming I did not, or could
not, set up the Hive interpreter)

%spark
query hive data via sparkSQL
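
To make the %knox steps a bit more concrete, here is a hedged Python sketch of
the kind of gateway REST URLs such calls would go through (WebHDFS for the file
upload, WebHCat for the Hive job). The host knox.example.com, port 8443,
topology name sandbox, the file paths, the query and the helper gateway_url are
all assumptions for illustration; the /gateway prefix is the default gateway
path and may differ per deployment. The sketch only builds the URLs and does
not send anything.

```python
from urllib.parse import quote, urlencode


def gateway_url(host, port, topology, service, path="", params=None):
    """Build a URL for a service exposed through a Knox topology."""
    url = f"https://{host}:{port}/gateway/{topology}/{service}{quote(path)}"
    if params:
        url += "?" + urlencode(params)
    return url


# Step 1: upload a file through WebHDFS (op=CREATE).
upload_url = gateway_url(
    "knox.example.com", 8443, "sandbox", "webhdfs/v1",
    "/tmp/demo/data.csv", {"op": "CREATE", "overwrite": "true"},
)

# Step 2: submit a Hive query through WebHCat (templeton); the query and
# status directory would be sent as form fields in a POST body.
hive_url = gateway_url("knox.example.com", 8443, "sandbox", "templeton/v1", "/hive")
hive_form = urlencode({
    "execute": "SELECT COUNT(*) FROM demo",
    "statusdir": "/tmp/demo/status",
})

print(upload_url)
print(hive_url)
```

With the DSL, the notebook user would not see these URLs at all; only the Knox
host, port and credentials need to be configured once.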


— Pierre


> On Jan 30, 2017, at 8:40 PM, larry mccay <[email protected]> wrote:
> 
> Hi Pierre -
> 
> Yes, Sumit has already implemented an interpreter to do that.
> 
> I am more interested in what you see as the value-add or the reason that 
> someone would want to use the Knox DSL instead of existing interpreters for 
> largely the same access.
> 
> From a language perspective the DSL does have closures for async operations 
> which are handy but I think they are available in things like scala as well.
> 
> If you don't have a particular itch in mind that the DSL scratches, then
> maybe it is fine to say that we have the ability to use the same DSL within a
> notebook as we do from the desktop.
> 
> We would likely want to be able to provide some sort of visualization as is 
> generally wanted in notebooks.
> 
> Can we combine visualizations that are already available in Zeppelin for
> other interpreters with our own?
> I've seen that there is an MD interpreter, for instance. Can that be used to
> format or render JSON results as a table?
> 
> What would we want to use as a showcase script in Zeppelin?
> 
> thanks!
> 
> --larry
> 
> On Mon, Jan 30, 2017 at 7:57 PM, Pierre Regazzoni <[email protected]> wrote:
> Hi Larry,
> 
> The basic premise would be to be able to open a Knox shell within the
> notebook as follows:
> 
> %knox
> Hdfs.rm(session).file("/path/to/file").now()
> 
> knox host, port and credentials would need to be set in the plug-in 
> configuration.
> 
> This would allow direct client interaction with the cluster and leverage
> the shell API within the notebook.
> 
> —Pierre
> 
> 
>> On Jan 30, 2017, at 9:29 AM, larry mccay <[email protected]> wrote:
>> 
>> Hi Sumit -
>> 
>> Thanks for the check point summary and DISCUSS thread.
>> The summary actually sounds like we are really making some good progress - 
>> which I knew but hadn't seen it put all together like this!
>> 
>> 1. What are the use cases driving the Zeppelin interpreter? How is
>> that expected to be used and how can we make it easy to use out of
>> the box?
>> 
>> <ljm>
>> Excellent question. Much of what we can do with the DSL in the interpreter 
>> is available in other interpreters.
>> The DSL has async operations, which are handy, and a similar programming
>> model across all the APIs, thanks to the fluent code style of the DSL.
>> One of the advantages of using the DSL over other CLI type approaches is the 
>> ability to source control the scripts - zeppelin would also provide a 
>> similar way to do this with notebooks.
>> 
>> @Pierre - this was a suggestion that you made to me a while ago. 
>> Can you articulate the value add that you envision for it?
>> </ljm>
>> 
>> 2. Do we need a release module for KnoxShell? How do we want to
>> provide the download to users?
>> 
>> <ljm>
>> I believe that we do at some point and probably before we go to a 1.0 
>> release for Knox.
>> If we could add this for 0.12.0 as an early attempt that would be great and 
>> shouldn't be that difficult.
>> </ljm>
>> 
>> 3. Do we need all of KIP-4 in to call this complete or is what we have
>> so far in the works good enough for 0.12.0?
>> 
>> <ljm>
>> I don't believe that 0.12.0 has to be blocked by KIP-4.
>> As with KIP-1 (LDAP Improvements), the KIPs are used as the driving use
>> cases for the releases but can and will continue to need work and completion
>> beyond the initial target release. Focusing this way seems to be a great
>> way to bootstrap progress in specific areas that can continue to be
>> completed and improved from release to release.
>> 
>> I do want to get the #2 improvement from KIP-4 (Token service and credential 
>> collector) feature branch merged for 0.12.0.
>> I think this opens up lots of possibilities and will be great to get some 
>> early adopters.
>> </ljm>
>> 
>> I'm sure there are more questions to be had. I am excited by the
>> uptake of the client DSL library and its usefulness to end users. I
>> hope we can make it more useful and easier to consume in 0.12.0.
>> 
>> <ljm>
>> I am also really excited about these improvements and uptake.
>> As we move to more and more cloud deployment scenarios, this aspect of Knox 
>> is going to be more and more important.
>> 
>> One thing that I would really love to have articulated is some use cases
>> that currently require SSH access by data workers but could be done through
>> the KnoxShell, eliminating the need for SSH. Without some of these use cases
>> we will likely fall short by a task or two and it will be difficult to cut
>> off SSH.
>> </ljm>
>> 
>> On Mon, Jan 30, 2017 at 11:38 AM, sumit gupta <[email protected]> wrote:
>> Hey everyone,
>> 
>> The list of JIRAs for 0.12.0 has steadily increased over the last few
>> weeks. We have also had a lot of great activity and contributions
>> related to KIP-4 and KnoxShell improvements. I wanted to start a
>> discuss thread to tie things up a little bit for a reasonable
>> deliverable in this area in the 0.12.0 release.
>> 
>> Just to reiterate where we are:
>> 
>> We have had a lot of contributions that can be mapped to KIP-4 goals,
>> especially improvement number 4 in the list of improvements on KIP-4.
>> 
>> I believe Larry Mccay has a feature branch going for improvement number 2.
>> 
>> I have taken a stab at a Zeppelin interpreter (improvement number 3)
>> in a forked zeppelin repo that can be found here (the branch is
>> 'knoxshell-interpreter'):
>> 
>> https://github.com/sumitg/zeppelin/tree/knoxshell-interpreter
>> 
>> and we have added some tests as part of KNOX-845 (improvement number 5).
>> 
>> Some open questions I have:
>> 
>> 1. What are the use cases driving the Zeppelin interpreter? How is
>> that expected to be used and how can we make it easy to use out of
>> the box?
>> 
>> 2. Do we need a release module for KnoxShell? How do we want to
>> provide the download to users?
>> 
>> 3. Do we need all of KIP-4 in to call this complete or is what we have
>> so far in the works good enough for 0.12.0?
>> 
>> I'm sure there are more questions to be had. I am excited by the
>> uptake of the client DSL library and its usefulness to end users. I
>> hope we can make it more useful and easier to consume in 0.12.0.
>> 
>> Thanks,
>> Sumit
>> 
> 
> 
