[
https://issues.apache.org/jira/browse/SPARK-15031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dongjoon Hyun updated SPARK-15031:
----------------------------------
Description:
This PR aims to update Scala/Python/Java examples by replacing `SQLContext`
with newly added `SparkSession`.
- Use *SparkSession Builder Pattern* in 154 files (Scala 55, Java 52, Python 47).
- Add `getConf` in Python `SparkContext` class: python/pyspark/context.py
- Replace *SQLContext Singleton Pattern* with *SparkSession Singleton Pattern*:
- `SqlNetworkWordCount.scala`
- `JavaSqlNetworkWordCount.java`
- `sql_network_wordcount.py`
Now, `SQLContext` is used only in R examples and the following two Python
examples. These Python examples are untouched in this PR since they already
fail due to some unknown issue.
- `simple_params_example.py`
- `aft_survival_regression.py`
was:
This PR aims to update Scala/Python/Java examples by replacing `SQLContext`
with the newly added `SparkSession`. For this, two new `SparkSession`
constructors are added, and the following examples are also fixed.
**sql.py**
{code}
- people = sqlContext.jsonFile(path)
+ people = sqlContext.read.json(path)
- people.registerAsTable("people")
+ people.registerTempTable("people")
{code}
**dataframe_example.py**
{code}
- features = df.select("features").map(lambda r: r.features)
+ features = df.select("features").rdd.map(lambda r: r.features)
{code}
Note that the following examples are untouched in this PR since they fail due
to some unknown issue.
- `simple_params_example.py`
- `aft_survival_regression.py`
> Use SparkSession in Scala/Python/Java examples.
> ----------------------------------------------
>
> Key: SPARK-15031
> URL: https://issues.apache.org/jira/browse/SPARK-15031
> Project: Spark
> Issue Type: Sub-task
> Components: Examples
> Reporter: Dongjoon Hyun
> Assignee: Dongjoon Hyun
>
> This PR aims to update Scala/Python/Java examples by replacing `SQLContext`
> with newly added `SparkSession`.
> - Use *SparkSession Builder Pattern* in 154 files (Scala 55, Java 52, Python
> 47).
> - Add `getConf` in Python `SparkContext` class: python/pyspark/context.py
> - Replace *SQLContext Singleton Pattern* with *SparkSession Singleton
> Pattern*:
> - `SqlNetworkWordCount.scala`
> - `JavaSqlNetworkWordCount.java`
> - `sql_network_wordcount.py`
> Now, `SQLContext` is used only in R examples and the following two Python
> examples. These Python examples are untouched in this PR since they already
> fail due to some unknown issue.
> - `simple_params_example.py`
> - `aft_survival_regression.py`
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]