Ahyoung created ZEPPELIN-457:
--------------------------------
Summary: Add documentation about Spark on EMR using Zeppelin
Sandbox
Key: ZEPPELIN-457
URL: https://issues.apache.org/jira/browse/ZEPPELIN-457
Project: Zeppelin
Issue Type: Improvement
Reporter: Ahyoung
Assignee: Ahyoung
Priority: Minor
Many people now use Spark on AWS EMR clusters, so it would help users if
Zeppelin provided a step-by-step guide.
This documentation could include the following:
- How to create a cluster and install "Zeppelin-Sandbox".
- How to connect to the master node using SSH.
- How to browse the web interfaces hosted on the cluster (that is, how to set
up an SSH tunnel to the master node using local or dynamic port forwarding).
- Information about the predefined Zeppelin-Sandbox environment variables
(such as the locations of Zeppelin itself and of the log and notebook
directories on the master node), and the Hadoop, Spark, and Zeppelin service
port numbers.
- Tutorials for beginners, like the attached images.
!https://github.com/AhyoungRyu/Platform-Documentation/blob/master/img/SparkDataframe.png?raw=true!
!https://github.com/AhyoungRyu/Platform-Documentation/blob/master/img/Result.png?raw=true!
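As a rough illustration of what the SSH tunnel item in the list could cover, here is a minimal Python sketch that only builds the ssh command line for local or dynamic port forwarding. The function name tunnel_command, the default user "hadoop", the placeholder host and key file, and the port 8890 are all assumptions made for illustration; the real values depend on the cluster and are not stated in this issue.

```python
import shlex


def tunnel_command(master_dns, key_file, port, dynamic=False, user="hadoop"):
    """Build the argv for an SSH tunnel to the master node (hypothetical helper)."""
    # -i selects the private key; -N forwards ports without running a remote command.
    argv = ["ssh", "-i", key_file, "-N"]
    if dynamic:
        # Dynamic forwarding: open a local SOCKS proxy on `port`.
        argv += ["-D", str(port)]
    else:
        # Local forwarding: local `port` maps to the same port on the master node.
        argv += ["-L", f"{port}:localhost:{port}"]
    argv.append(f"{user}@{master_dns}")
    return argv


if __name__ == "__main__":
    # Placeholder host, key file, and port (assumptions); replace with real values.
    cmd = tunnel_command("master-public-dns.example.com", "mykey.pem", 8890)
    print(shlex.join(cmd))
```

With local forwarding, the web interface would then be reachable in a browser at localhost on the forwarded port; with dynamic forwarding, the browser would need to be configured to use the local SOCKS proxy instead.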
Any ideas are welcome!
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)