HyukjinKwon commented on code in PR #34:
URL: https://github.com/apache/spark-docker/pull/34#discussion_r1178156904


##########
OVERVIEW.md:
##########
@@ -0,0 +1,60 @@
+# What is Apache Spark™?
+
+Apache Spark™ is a multi-language engine for executing data engineering, data science, and machine learning on single-node machines or clusters. It provides high-level APIs in Scala, Java, Python, and R, and an optimized engine that supports general computation graphs for data analysis. It also supports a rich set of higher-level tools including Spark SQL for SQL and DataFrames, pandas API on Spark for pandas workloads, MLlib for machine learning, GraphX for graph processing, and Structured Streaming for stream processing.
+
+https://spark.apache.org/
+
+## Online Documentation
+
+You can find the latest Spark documentation, including a programming guide, on the [project web page](https://spark.apache.org/documentation.html). This README file only contains basic setup instructions.
+
+## Interactive Scala Shell
+
+The easiest way to start using Spark is through the Scala shell:
+
+```
+docker run -it apache/spark /opt/spark/bin/spark-shell
+```
+
+Try the following command, which should return 1,000,000,000:
+
+```
+scala> spark.range(1000 * 1000 * 1000).count()
+```
+
+## Interactive Python Shell
+
+The easiest way to start using PySpark is through the Python shell:
+
+```
+docker run -it apache/spark /opt/spark/bin/pyspark
+```
+
+And run the following command, which should also return 1,000,000,000:
+
+```
+>>> spark.range(1000 * 1000 * 1000).count()
+```
+
+## Interactive R Shell
+
+The easiest way to start using R on Spark is through the R shell:
+
+```
+docker run -it apache/spark:r /opt/spark/bin/sparkR
+```
+
+## Running Spark on Kubernetes
+
+https://spark.apache.org/docs/latest/running-on-kubernetes.html
+
+## Supported tags and respective Dockerfile links
+
+Currently, the `apache/spark` docker image supports 4 types for each version:
+
+Such as for v3.4.0:
+-  [3.4.0-scala2.12-java11-python3-ubuntu, 3.4.0-python3, 3.4.0, python3, latest](https://github.com/apache/spark-docker/tree/fe05e38f0ffad271edccd6ae40a77d5f14f3eef7/3.4.0/scala2.12-java11-python3-ubuntu)

Review Comment:
   ```suggestion
   - [3.4.0-scala2.12-java11-python3-ubuntu, 3.4.0-python3, 3.4.0, python3, latest](https://github.com/apache/spark-docker/tree/fe05e38f0ffad271edccd6ae40a77d5f14f3eef7/3.4.0/scala2.12-java11-python3-ubuntu)
   ```
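
The only change in the suggestion above is the list-marker spacing: `-  [` becomes `- [`. As a rough sketch (not part of the PR), a check like the one below could catch the same pattern elsewhere in a Markdown file. The helper name `normalize_bullets` is hypothetical, and the approach is deliberately naive: it does not skip fenced or indented code blocks.

```python
import re

# A bullet marker ("-", "*", or "+"), optionally indented, followed by
# two or more spaces and then some non-space content.
EXTRA_SPACE_BULLET = re.compile(r"^(\s*[-*+]) {2,}(?=\S)")


def normalize_bullets(markdown: str) -> str:
    """Collapse extra spaces after a list marker to a single space.

    Hypothetical helper for illustration only; it treats every line
    independently and knows nothing about code fences.
    """
    return "\n".join(
        EXTRA_SPACE_BULLET.sub(r"\1 ", line) for line in markdown.split("\n")
    )


if __name__ == "__main__":
    before = "-  [3.4.0, python3, latest](https://example.invalid/placeholder)"
    print(normalize_bullets(before))
```

Lines that already use a single space after the marker, and horizontal rules such as `---`, are left unchanged.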



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

