[ 
https://issues.apache.org/jira/browse/SPARK-3500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14131752#comment-14131752
 ] 

Nicholas Chammas commented on SPARK-3500:
-----------------------------------------

Hmm, you _could_ perhaps consider this a missing feature, but since all base 
RDD operations should also be valid SchemaRDD operations (right?), it 
definitely feels like a bug. It's also not limited to SchemaRDDs created by 
jsonRDD, as the title suggests: the example below hits it via inferSchema.

It looks like {{repartition}} is missing, too.

{code}
from pyspark.sql import SQLContext
from pyspark.sql import Row
sqlContext = SQLContext(sc)

a = sc.parallelize([Row(field1=1, field2="row1")])
sqlContext.inferSchema(a).coalesce(1)  # Method coalesce does not exist
sqlContext.inferSchema(a).repartition(1)  # Method repartition does not exist
{code}
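
A possible stopgap, sketched rather than verified against 1.1.0: drop to a plain RDD before coalescing. This assumes that {{map}} on a SchemaRDD returns an ordinary RDD whose {{coalesce}} works. The helper name {{coalesce_via_plain_rdd}} is hypothetical, and the result is a plain RDD, so the schema is not carried over.

```python
def coalesce_via_plain_rdd(schema_rdd, num_partitions):
    """Hypothetical helper: coalesce by first dropping to a plain RDD.

    Assumption: map() on a SchemaRDD yields an ordinary RDD, so the
    coalesce() call below never reaches the failing SchemaRDD method.
    The returned object is a plain RDD; the schema is not preserved.
    """
    # Identity map: leaves every row unchanged, only changes the RDD type.
    plain_rdd = schema_rdd.map(lambda row: row)
    return plain_rdd.coalesce(num_partitions)
```

With the names from the snippet above, this would be called as {{coalesce_via_plain_rdd(sqlContext.inferSchema(a), 1)}}. It trades away the schema, so it is only useful when a plain RDD of rows is acceptable downstream.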

> SchemaRDD from jsonRDD() has not coalesce() method
> --------------------------------------------------
>
>                 Key: SPARK-3500
>                 URL: https://issues.apache.org/jira/browse/SPARK-3500
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark, SQL
>    Affects Versions: 1.1.0
>            Reporter: Davies Liu
>            Assignee: Davies Liu
>            Priority: Critical
>
> {code}
> >>> sqlCtx.jsonRDD(sc.parallelize(['{"foo":"bar"}', '{"foo":"baz"}'])).coalesce(1)
> Py4JError: An error occurred while calling o94.coalesce. Trace:
> py4j.Py4JException: Method coalesce([class java.lang.Integer, class java.lang.Boolean]) does not exist
> {code}



