Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/22746#discussion_r225780740
--- Diff: docs/sql-getting-started.md ---
@@ -0,0 +1,369 @@
+---
+layout: global
+title: Getting Started
+displayTitle: Getting Started
+---
+
+* Table of contents
+{:toc}
+
+## Starting Point: SparkSession
+
+<div class="codetabs">
+<div data-lang="scala" markdown="1">
+
+The entry point into all functionality in Spark is the
[`SparkSession`](api/scala/index.html#org.apache.spark.sql.SparkSession) class.
To create a basic `SparkSession`, just use `SparkSession.builder()`:
+
+{% include_example init_session scala/org/apache/spark/examples/sql/SparkSQLExample.scala %}
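
Roughly, the example referenced above boils down to the following sketch (the application name and config option are illustrative):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession
  .builder()
  .appName("Spark SQL basic example")                 // illustrative name
  .config("spark.some.config.option", "some-value")   // illustrative config
  .getOrCreate()

// For implicit conversions like converting RDDs and Seqs to DataFrames
import spark.implicits._
```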
+</div>
+
+<div data-lang="java" markdown="1">
+
+The entry point into all functionality in Spark is the
[`SparkSession`](api/java/index.html#org.apache.spark.sql.SparkSession) class.
To create a basic `SparkSession`, just use `SparkSession.builder()`:
+
+{% include_example init_session java/org/apache/spark/examples/sql/JavaSparkSQLExample.java %}
+</div>
+
+<div data-lang="python" markdown="1">
+
+The entry point into all functionality in Spark is the
[`SparkSession`](api/python/pyspark.sql.html#pyspark.sql.SparkSession) class.
To create a basic `SparkSession`, just use `SparkSession.builder`:
+
+{% include_example init_session python/sql/basic.py %}
+</div>
+
+<div data-lang="r" markdown="1">
+
+The entry point into all functionality in Spark is the
[`SparkSession`](api/R/sparkR.session.html) class. To initialize a basic
`SparkSession`, just call `sparkR.session()`:
+
+{% include_example init_session r/RSparkSQLExample.R %}
+
+Note that when invoked for the first time, `sparkR.session()` initializes a global `SparkSession` singleton instance, and always returns a reference to this instance for successive invocations. In this way, users only need to initialize the `SparkSession` once, then SparkR functions like `read.df` will be able to access this global instance implicitly, and users don't need to pass the `SparkSession` instance around.
+</div>
+</div>
+
+`SparkSession` in Spark 2.0 provides built-in support for Hive features, including the ability to write queries using HiveQL, access to Hive UDFs, and the ability to read data from Hive tables.
+To use these features, you do not need to have an existing Hive setup.
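
As a rough sketch, enabling these features only requires calling `enableHiveSupport()` on the builder (the application name and table name below are hypothetical):

```scala
import org.apache.spark.sql.SparkSession

// Hive support is enabled on the session itself; no existing Hive
// installation is required (a local metastore and warehouse are created).
val spark = SparkSession
  .builder()
  .appName("Hive support sketch")  // illustrative name
  .enableHiveSupport()
  .getOrCreate()

// HiveQL can then be issued through spark.sql; "src" is a hypothetical table.
spark.sql("CREATE TABLE IF NOT EXISTS src (key INT, value STRING)")
spark.sql("SELECT COUNT(*) FROM src").show()
```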
+
+## Creating DataFrames
+
+<div class="codetabs">
+<div data-lang="scala" markdown="1">
+With a `SparkSession`, applications can create DataFrames from an [existing `RDD`](#interoperating-with-rdds),
+from a Hive table, or from [Spark data sources](#data-sources).
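
For instance, assuming the `spark` session created above and the sample JSON file that ships with Spark, reading a DataFrame from a data source is roughly:

```scala
// Path is illustrative; other sources such as Parquet or CSV work the same way.
val df = spark.read.json("examples/src/main/resources/people.json")

// Displays the content of the DataFrame to stdout
df.show()
```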
--- End diff ---

The link `[Spark data sources](#data-sources)` does not work after this change. Could you fix all the similar cases? Thanks!
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]