spark git commit: [SPARK-15863][SQL][DOC][SPARKR] sql programming guide updates to include sparkSession in R
Repository: spark
Updated Branches:
  refs/heads/branch-2.0 4fc4eb943 -> dbf7f48b6

[SPARK-15863][SQL][DOC][SPARKR] sql programming guide updates to include sparkSession in R

## What changes were proposed in this pull request?

Update doc as per discussion in PR #13592

## How was this patch tested?

manual

shivaram liancheng

Author: Felix Cheung

Closes #13799 from felixcheung/rsqlprogrammingguide.

(cherry picked from commit 58f6e27dd70f476f99ac8204e6b405bced4d6de1)
Signed-off-by: Cheng Lian

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/dbf7f48b
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/dbf7f48b
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/dbf7f48b

Branch: refs/heads/branch-2.0
Commit: dbf7f48b6e73f3500b0abe9055ac204a3f756418
Parents: 4fc4eb9
Author: Felix Cheung
Authored: Tue Jun 21 13:56:37 2016 +0800
Committer: Cheng Lian
Committed: Tue Jun 21 13:57:03 2016 +0800

----------------------------------------------------------------------
 docs/sparkr.md                |  2 +-
 docs/sql-programming-guide.md | 34 ++++++++++++++++------------------
 2 files changed, 17 insertions(+), 19 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/spark/blob/dbf7f48b/docs/sparkr.md

diff --git a/docs/sparkr.md b/docs/sparkr.md
index 023bbcd..f018901 100644
--- a/docs/sparkr.md
+++ b/docs/sparkr.md
@@ -152,7 +152,7 @@ write.df(people, path="people.parquet", source="parquet", mode="overwrite")

 ### From Hive tables

-You can also create SparkDataFrames from Hive tables. To do this we will need to create a SparkSession with Hive support which can access tables in the Hive MetaStore. Note that Spark should have been built with [Hive support](building-spark.html#building-with-hive-and-jdbc-support) and more details can be found in the [SQL programming guide](sql-programming-guide.html#starting-point-sqlcontext). In SparkR, by default it will attempt to create a SparkSession with Hive support enabled (`enableHiveSupport = TRUE`).
+You can also create SparkDataFrames from Hive tables. To do this we will need to create a SparkSession with Hive support which can access tables in the Hive MetaStore. Note that Spark should have been built with [Hive support](building-spark.html#building-with-hive-and-jdbc-support) and more details can be found in the [SQL programming guide](sql-programming-guide.html#starting-point-sparksession). In SparkR, by default it will attempt to create a SparkSession with Hive support enabled (`enableHiveSupport = TRUE`).

 {% highlight r %}

http://git-wip-us.apache.org/repos/asf/spark/blob/dbf7f48b/docs/sql-programming-guide.md

diff --git a/docs/sql-programming-guide.md b/docs/sql-programming-guide.md
index d93f30b..4206f73 100644
--- a/docs/sql-programming-guide.md
+++ b/docs/sql-programming-guide.md
@@ -107,19 +107,17 @@ spark = SparkSession.build \

-Unlike Scala, Java, and Python API, we haven't finished migrating `SQLContext` to `SparkSession` for SparkR yet, so
-the entry point into all relational functionality in SparkR is still the
-`SQLContext` class in Spark 2.0. To create a basic `SQLContext`, all you need is a `SparkContext`.
+The entry point into all functionality in Spark is the [`SparkSession`](api/R/sparkR.session.html) class. To initialize a basic `SparkSession`, just call `sparkR.session()`:

 {% highlight r %}
-spark <- sparkRSQL.init(sc)
+sparkR.session()
 {% endhighlight %}

-Note that when invoked for the first time, `sparkRSQL.init()` initializes a global `SQLContext` singleton instance, and always returns a reference to this instance for successive invocations. In this way, users only need to initialize the `SQLContext` once, then SparkR functions like `read.df` will be able to access this global instance implicitly, and users don't need to pass the `SQLContext` instance around.
+Note that when invoked for the first time, `sparkR.session()` initializes a global `SparkSession` singleton instance, and always returns a reference to this instance for successive invocations. In this way, users only need to initialize the `SparkSession` once, then SparkR functions like `read.df` will be able to access this global instance implicitly, and users don't need to pass the `SparkSession` instance around.

-`SparkSession` (or `SQLContext` for SparkR) in Spark 2.0 provides builtin support for Hive features including the ability to
+`SparkSession` in Spark 2.0 provides builtin support for Hive features including the ability to
 write queries using HiveQL, access to Hive UDFs, and the ability to read data from Hive tables. To use
 these features, you do not need to have an existing Hive setup.

@@ -175,7 +173,7 @@ df.show()

-With a `SQLContext`,
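For readers following the change, the new SparkR flow this patch documents looks roughly like the sketch below. This is a minimal illustration, assuming a Spark 2.0 R session with the SparkR package on the library path; the JSON file is the sample data shipped under the Spark distribution's examples directory.

{% highlight r %}
library(SparkR)

# First invocation creates the global SparkSession singleton;
# subsequent calls return a reference to the same instance.
sparkR.session()

# read.df picks up the global session implicitly -- there is no longer
# a SQLContext/SparkSession handle to pass around.
df <- read.df("examples/src/main/resources/people.json", "json")
head(df)
{% endhighlight %}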
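Similarly, the Hive path described in the docs/sparkr.md hunk can be exercised as follows. This is a sketch under two assumptions not in the patch itself: Spark was built with Hive support, and a Hive table named `src` with `key`/`value` columns already exists in the metastore (the table name is illustrative).

{% highlight r %}
library(SparkR)

# enableHiveSupport = TRUE is the SparkR default; spelled out here for clarity.
sparkR.session(enableHiveSupport = TRUE)

# Queries against Hive metastore tables go through the same global session.
results <- sql("SELECT key, value FROM src")
head(results)
{% endhighlight %}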