[ https://issues.apache.org/jira/browse/SPARK-15981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Shixiong Zhu resolved SPARK-15981.
----------------------------------
Resolution: Fixed
Fix Version/s: 2.0.0
> Fix bug in python DataStreamReader
> ----------------------------------
>
> Key: SPARK-15981
> URL: https://issues.apache.org/jira/browse/SPARK-15981
> Project: Spark
> Issue Type: Sub-task
> Components: SQL, Streaming
> Reporter: Tathagata Das
> Assignee: Tathagata Das
> Priority: Blocker
> Fix For: 2.0.0
>
>
> A bug in the Python DataStreamReader API made it unusable. Because a single path
> was converted to an array before calling the Java DataStreamReader method
> (which takes only a string), the call failed with the following error.
> {code}
> File "/Users/tdas/Projects/Spark/spark/python/pyspark/sql/readwriter.py", line 947, in pyspark.sql.readwriter.DataStreamReader.json
> Failed example:
>     json_sdf = spark.readStream.json(os.path.join(tempfile.mkdtemp(), 'data'), schema = sdf_schema)
> Exception raised:
>     Traceback (most recent call last):
>       File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/doctest.py", line 1253, in __run
>         compileflags, 1) in test.globs
>       File "<doctest pyspark.sql.readwriter.DataStreamReader.json[0]>", line 1, in <module>
>         json_sdf = spark.readStream.json(os.path.join(tempfile.mkdtemp(), 'data'), schema = sdf_schema)
>       File "/Users/tdas/Projects/Spark/spark/python/pyspark/sql/readwriter.py", line 963, in json
>         return self._df(self._jreader.json(path))
>       File "/Users/tdas/Projects/Spark/spark/python/lib/py4j-0.10.1-src.zip/py4j/java_gateway.py", line 933, in __call__
>         answer, self.gateway_client, self.target_id, self.name)
>       File "/Users/tdas/Projects/Spark/spark/python/pyspark/sql/utils.py", line 63, in deco
>         return f(*a, **kw)
>       File "/Users/tdas/Projects/Spark/spark/python/lib/py4j-0.10.1-src.zip/py4j/protocol.py", line 316, in get_return_value
>         format(target_id, ".", name, value))
>     Py4JError: An error occurred while calling o121.json. Trace:
>     py4j.Py4JException: Method json([class java.util.ArrayList]) does not exist
>         at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:318)
>         at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:326)
>         at py4j.Gateway.invoke(Gateway.java:272)
>         at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:128)
>         at py4j.commands.CallCommand.execute(CallCommand.java:79)
>         at py4j.GatewayConnection.run(GatewayConnection.java:211)
>         at java.lang.Thread.run(Thread.java:744)
> {code}
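
For readers following the description above, here is a minimal sketch of the
failure mode. It is not the actual patch and does not use pyspark or Py4J.
FakeJavaReader, json_buggy and json_fixed are hypothetical names, and the stub
only assumes what the report states: the Java-side json method accepts a single
string, so a Python list (which arrives as java.util.ArrayList) matches no
method.

```python
class FakeJavaReader:
    """Hypothetical stand-in for the JVM-side reader: json() takes one string."""

    def json(self, path):
        if not isinstance(path, str):
            # Loosely imitates the "Method json([...]) does not exist" failure.
            raise TypeError("Method json([%s]) does not exist" % type(path).__name__)
        return "stream:" + path


def json_buggy(jreader, path):
    # Mirrors the reported behavior: a single path is wrapped in a list
    # before the call, so the string-only method is never matched.
    if isinstance(path, str):
        path = [path]
    return jreader.json(path)


def json_fixed(jreader, path):
    # One possible fix (an assumption, not the committed change): pass the
    # single string through unchanged and reject anything else up front.
    if not isinstance(path, str):
        raise TypeError("path can be only a single string")
    return jreader.json(path)
```

With this stub, json_fixed(FakeJavaReader(), "/tmp/data") returns normally,
while json_buggy raises because the argument reaches the reader as a list.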
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)