[ https://issues.apache.org/jira/browse/SPARK-2797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Nicholas Chammas updated SPARK-2797:
------------------------------------
    Description:
Looks like something simple got missed in the Java layer?

{code}
>>> from pyspark.sql import SQLContext
>>> sqlContext = SQLContext(sc)
>>> raw = sc.parallelize(['{"a": 5}'])
>>> events = sqlContext.jsonRDD(raw)
>>> events.printSchema()
root
 |-- a: IntegerType
>>> events.cache()
PythonRDD[45] at RDD at PythonRDD.scala:37
>>> events.unpersist()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/root/spark/python/pyspark/sql.py", line 440, in unpersist
    self._jschema_rdd.unpersist()
  File "/root/spark/python/lib/py4j-0.8.1-src.zip/py4j/java_gateway.py", line 537, in __call__
  File "/root/spark/python/lib/py4j-0.8.1-src.zip/py4j/protocol.py", line 304, in get_return_value
py4j.protocol.Py4JError: An error occurred while calling o108.unpersist. Trace:
py4j.Py4JException: Method unpersist([]) does not exist
    at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:333)
    at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:342)
    at py4j.Gateway.invoke(Gateway.java:251)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.GatewayConnection.run(GatewayConnection.java:207)
    at java.lang.Thread.run(Thread.java:745)
>>> events.unpersist
<bound method SchemaRDD.unpersist of PythonRDD[45] at RDD at PythonRDD.scala:37>
{code}

Note that the {{unpersist}} method exists on the Python object but cannot be called without raising the error shown above.

This is on {{1.0.2-rc1}}.

  was:
Looks like something simple got missed in the Java layer?

{code}
>>> from pyspark.sql import SQLContext
>>> sqlContext = SQLContext(sc)
>>> raw = sc.parallelize(['{"a": 5}'])
>>> events = sqlContext.jsonRDD(raw)
>>> events.printSchema()
root
 |-- a: IntegerType
>>> events.cache()
PythonRDD[45] at RDD at PythonRDD.scala:37
>>> events.unpersist()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/root/spark/python/pyspark/sql.py", line 440, in unpersist
    self._jschema_rdd.unpersist()
  File "/root/spark/python/lib/py4j-0.8.1-src.zip/py4j/java_gateway.py", line 537, in __call__
  File "/root/spark/python/lib/py4j-0.8.1-src.zip/py4j/protocol.py", line 304, in get_return_value
py4j.protocol.Py4JError: An error occurred while calling o108.unpersist. Trace:
py4j.Py4JException: Method unpersist([]) does not exist
    at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:333)
    at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:342)
    at py4j.Gateway.invoke(Gateway.java:251)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.GatewayConnection.run(GatewayConnection.java:207)
    at java.lang.Thread.run(Thread.java:745)
>>> events.unpersist
<bound method SchemaRDD.unpersist of PythonRDD[45] at RDD at PythonRDD.scala:37>
{code}

This is on {{1.0.2-rc1}}.


> SchemaRDDs don't support unpersist()
> ------------------------------------
>
>                 Key: SPARK-2797
>                 URL: https://issues.apache.org/jira/browse/SPARK-2797
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark, SQL
>    Affects Versions: 1.0.2
>            Reporter: Nicholas Chammas
>
> Looks like something simple got missed in the Java layer?
> {code}
> >>> from pyspark.sql import SQLContext
> >>> sqlContext = SQLContext(sc)
> >>> raw = sc.parallelize(['{"a": 5}'])
> >>> events = sqlContext.jsonRDD(raw)
> >>> events.printSchema()
> root
>  |-- a: IntegerType
> >>> events.cache()
> PythonRDD[45] at RDD at PythonRDD.scala:37
> >>> events.unpersist()
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/root/spark/python/pyspark/sql.py", line 440, in unpersist
>     self._jschema_rdd.unpersist()
>   File "/root/spark/python/lib/py4j-0.8.1-src.zip/py4j/java_gateway.py", line 537, in __call__
>   File "/root/spark/python/lib/py4j-0.8.1-src.zip/py4j/protocol.py", line 304, in get_return_value
> py4j.protocol.Py4JError: An error occurred while calling o108.unpersist. Trace:
> py4j.Py4JException: Method unpersist([]) does not exist
>     at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:333)
>     at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:342)
>     at py4j.Gateway.invoke(Gateway.java:251)
>     at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
>     at py4j.commands.CallCommand.execute(CallCommand.java:79)
>     at py4j.GatewayConnection.run(GatewayConnection.java:207)
>     at java.lang.Thread.run(Thread.java:745)
> >>> events.unpersist
> <bound method SchemaRDD.unpersist of PythonRDD[45] at RDD at PythonRDD.scala:37>
> {code}
> Note that the {{unpersist}} method exists on the Python object but cannot be called without raising the error shown above.
> This is on {{1.0.2-rc1}}.
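
One plausible reading of the trace, which the report itself does not confirm: the Python wrapper calls {{unpersist()}} with no arguments, while the JVM-side method only exists with a parameter (for example a Scala parameter with a default value, which reflective callers cannot omit). The sketch below imitates that kind of lookup by name and argument list. It does not use Spark or Py4J; {{FakeJavaObject}}, its {{invoke}} method, and the signature table are hypothetical names made up for illustration.

```python
class FakeJavaObject:
    """Hypothetical stand-in for a JVM object reached through a reflective gateway."""

    # Method name -> parameter types, as a reflective lookup would see them.
    # Assumption for illustration: unpersist requires one boolean argument.
    _methods = {"unpersist": (bool,), "cache": ()}

    def invoke(self, name, *args):
        """Dispatch by name and exact argument list, as reflection would."""
        params = self._methods.get(name)
        matches = (
            params is not None
            and len(params) == len(args)
            and all(isinstance(a, t) for a, t in zip(args, params))
        )
        if not matches:
            # No method with this name and argument list is visible.
            raise LookupError("Method %s(%s) does not exist" % (name, list(args)))
        return "%s ok" % name


obj = FakeJavaObject()

# Calling with no arguments fails, even though a method named unpersist exists.
try:
    obj.invoke("unpersist")
except LookupError as err:
    print(err)

# Supplying the argument explicitly lets the lookup succeed.
print(obj.invoke("unpersist", True))
```

If this reading is right, the Python attribute lookup succeeds (hence the bound method in the session) while the reflective call fails, and passing the argument explicitly from the Python wrapper would be one way to resolve it.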