Ahyoung created ZEPPELIN-457:
--------------------------------

             Summary: Add documentation about Spark on EMR using Zeppelin Sandbox
                 Key: ZEPPELIN-457
                 URL: https://issues.apache.org/jira/browse/ZEPPELIN-457
             Project: Zeppelin
          Issue Type: Improvement
            Reporter: Ahyoung
            Assignee: Ahyoung
            Priority: Minor


Many people now run Spark on AWS EMR clusters, so it would be helpful for 
users if Zeppelin provided a step-by-step guide. 

The documentation could include the following content:
 - How to create clusters and install "Zeppelin-Sandbox".
 - Establishing a connection to the master node using SSH.
 - How to browse the web interfaces hosted on the cluster (how to set up an 
SSH tunnel to the master node using local or dynamic port forwarding).
 - Information about the predefined Zeppelin-Sandbox environment variables 
(such as the locations of Zeppelin itself and of its log and notebook 
directories on the master node), and the Hadoop, Spark, and Zeppelin service 
port numbers.
 - Tutorials for beginners, like the attached images:
!https://github.com/AhyoungRyu/Platform-Documentation/blob/master/img/SparkDataframe.png?raw=true!
!https://github.com/AhyoungRyu/Platform-Documentation/blob/master/img/Result.png?raw=true!
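
For the SSH tunnel item, the guide might show something like the rough sketch 
below. It only builds the two ssh command lines (local forwarding with -L, 
dynamic forwarding with -D) so they can be printed and compared. The key path, 
the master node hostname, the "hadoop" login user, and port 8890 for Zeppelin 
are all placeholders or assumptions for illustration, not values confirmed in 
this issue.

```python
from typing import List

# Assumption: "hadoop" is the login user on the EMR master node.
EMR_USER = "hadoop"


def local_forward_cmd(key_path: str, master_dns: str,
                      local_port: int, remote_port: int) -> List[str]:
    """Build an ssh command that forwards one local port to one port on the master node."""
    return [
        "ssh", "-i", key_path,
        "-N",  # do not run a remote command, only forward ports
        "-L", f"{local_port}:localhost:{remote_port}",
        f"{EMR_USER}@{master_dns}",
    ]


def dynamic_forward_cmd(key_path: str, master_dns: str,
                        socks_port: int) -> List[str]:
    """Build an ssh command that opens a local SOCKS proxy through the master node."""
    return [
        "ssh", "-i", key_path,
        "-N",  # do not run a remote command, only forward ports
        "-D", str(socks_port),
        f"{EMR_USER}@{master_dns}",
    ]


if __name__ == "__main__":
    # Hypothetical values; replace with a real key pair and master public DNS name.
    key = "~/mykey.pem"
    host = "ec2-xx-xx-xx-xx.compute-1.amazonaws.com"
    # 8890 is assumed here to be the Zeppelin service port.
    print(" ".join(local_forward_cmd(key, host, 8890, 8890)))
    print(" ".join(dynamic_forward_cmd(key, host, 8157)))
```

With local forwarding, a single web interface is reachable at a localhost port 
in the browser. With dynamic forwarding, the browser also needs to be 
configured to use the SOCKS proxy, but every web interface on the cluster 
becomes reachable through one tunnel.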

Any ideas are welcome!




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)