[jira] [Updated] (SPARK-15031) Use SparkSession in Scala/Python example.

Dongjoon Hyun (JIRA) Sat, 30 Apr 2016 04:11:55 -0700

     [ https://issues.apache.org/jira/browse/SPARK-15031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Dongjoon Hyun updated SPARK-15031:
----------------------------------
    Description: 
This PR updates the Scala/Python examples by replacing SQLContext with the 
newly added SparkSession. It also fixes the following examples.

**sql.py**
{code}
-    people = sqlContext.jsonFile(path)
+    people = sqlContext.read.json(path)
-    people.registerAsTable("people")
+    people.registerTempTable("people")
{code}

**dataframe_example.py**
{code}
- features = df.select("features").map(lambda r: r.features)
+ features = df.select("features").rdd.map(lambda r: r.features)
{code}
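
As a minimal sketch of the `dataframe_example.py` change above: the row-wise 
`map` is applied to the DataFrame's underlying RDD rather than to the DataFrame 
itself. The helper name `extract_features` is hypothetical and not part of the 
example file; `df` is assumed to be a DataFrame with a `features` column.

```python
def extract_features(df):
    """Return an RDD holding the value of the `features` column of each row.

    `extract_features` is a hypothetical helper used only for illustration.
    """
    # Go through `.rdd` first, as in the diff above, and map over the rows there.
    return df.select("features").rdd.map(lambda row: row.features)
```

With a real DataFrame, `extract_features(df).collect()` would bring the feature 
values back to the driver.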

Note that the following examples are left untouched in this PR because they 
fail for a still-unknown reason.

- `simple_params_example.py`
- `aft_survival_regression.py`
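
The SQLContext-to-SparkSession replacement described above could look roughly 
like the following sketch. It is not the actual patch: the function names, the 
query, and the application name are hypothetical, and 
`createOrReplaceTempView` is assumed to be available (Spark 2.0 or later) as 
the successor of `registerTempTable`.

```python
def build_session(app_name="PythonSQL"):
    """Create (or reuse) a SparkSession. `app_name` is an illustrative default."""
    # Imported lazily so this module can be loaded without pyspark installed.
    from pyspark.sql import SparkSession
    return SparkSession.builder.appName(app_name).getOrCreate()


def load_teenagers(spark, path):
    """Read a JSON file through the session and query it with SQL.

    `spark` is a SparkSession (or any object exposing the same `read` and
    `sql` members); `path` points to a JSON file with `name` and `age` fields.
    """
    # The session's reader replaces sqlContext.read.json / sqlContext.jsonFile.
    people = spark.read.json(path)
    # Assumption: Spark 2.0+, where this call supersedes registerTempTable.
    people.createOrReplaceTempView("people")
    # Hypothetical query, chosen only to show that the view is usable from SQL.
    return spark.sql("SELECT name FROM people WHERE age >= 13 AND age <= 19")
```

A caller would then write something like 
`load_teenagers(build_session(), path).show()` and stop the session afterwards 
with `spark.stop()`.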

  was:
Currently, the Python SQL example, `sql.py`, fails.

{code}
bin/spark-submit examples/src/main/python/sql.py
Traceback (most recent call last):
  File "/Users/dongjoon/spark-release/spark-2.0/examples/src/main/python/sql.py", line 60, in <module>
    people = sqlContext.jsonFile(path)
AttributeError: 'SQLContext' object has no attribute 'jsonFile'
{code}

{code}
Traceback (most recent call last):
  File "/Users/dongjoon/spark-release/spark-2.0/examples/src/main/python/sql.py", line 72, in <module>
    people.registerAsTable("people")
  File "/Users/dongjoon/spark/python/lib/pyspark.zip/pyspark/sql/dataframe.py", line 795, in __getattr__
AttributeError: 'DataFrame' object has no attribute 'registerAsTable'
{code}

This issue fixes both errors with the following changes.
{code}
-    people = sqlContext.jsonFile(path)
+    people = sqlContext.read.json(path)
...
-    people.registerAsTable("people")
+    people.registerTempTable("people")
{code}


> Use SparkSession in Scala/Python example.
> -----------------------------------------
>
>                 Key: SPARK-15031
>                 URL: https://issues.apache.org/jira/browse/SPARK-15031
>             Project: Spark
>          Issue Type: Improvement
>          Components: Examples
>            Reporter: Dongjoon Hyun
>            Priority: Trivial


