[ https://issues.apache.org/jira/browse/SPARK-9032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Josh Rosen resolved SPARK-9032.
-------------------------------
       Resolution: Fixed
    Fix Version/s: 1.4.1

Just confirmed that this is fixed in 1.4.1.

> scala.MatchError in DataFrameReader.json(String path)
> -----------------------------------------------------
>
>                 Key: SPARK-9032
>                 URL: https://issues.apache.org/jira/browse/SPARK-9032
>             Project: Spark
>          Issue Type: Bug
>          Components: Java API, SQL
>    Affects Versions: 1.4.0
>         Environment: Ubuntu 15.04
>            Reporter: Philipp Poetter
>             Fix For: 1.4.1
>
>
> Calling read().json() on a SQLContext (i.e. DataFrameReader.json) raises a
> scala.MatchError while reading JSON data. The stack trace is as follows:
> {code}
> 15/07/14 11:25:26 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
> 15/07/14 11:25:26 INFO DAGScheduler: Job 0 finished: json at Example.java:23, took 6.981330 s
> Exception in thread "main" scala.MatchError: StringType (of class org.apache.spark.sql.types.StringType$)
>     at org.apache.spark.sql.json.InferSchema$.apply(InferSchema.scala:58)
>     at org.apache.spark.sql.json.JSONRelation$$anonfun$schema$1.apply(JSONRelation.scala:139)
>     at org.apache.spark.sql.json.JSONRelation$$anonfun$schema$1.apply(JSONRelation.scala:138)
>     at scala.Option.getOrElse(Option.scala:120)
>     at org.apache.spark.sql.json.JSONRelation.schema$lzycompute(JSONRelation.scala:137)
>     at org.apache.spark.sql.json.JSONRelation.schema(JSONRelation.scala:137)
>     at org.apache.spark.sql.sources.LogicalRelation.<init>(LogicalRelation.scala:30)
>     at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:120)
>     at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:104)
>     at org.apache.spark.sql.DataFrameReader.json(DataFrameReader.scala:213)
>     at com.hp.sparkdemo.Example.main(Example.java:23)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:497)
>     at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:664)
>     at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:169)
>     at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:192)
>     at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:111)
>     at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> 15/07/14 11:25:26 INFO SparkContext: Invoking stop() from shutdown hook
> 15/07/14 11:25:26 INFO SparkUI: Stopped Spark web UI at http://10.0.2.15:4040
> 15/07/14 11:25:26 INFO DAGScheduler: Stopping DAGScheduler
> 15/07/14 11:25:26 INFO SparkDeploySchedulerBackend: Shutting down all executors
> 15/07/14 11:25:26 INFO SparkDeploySchedulerBackend: Asking each executor to shut down
> 15/07/14 11:25:26 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
> {code}
> Offending code snippet (around line 23):
> {code}
>     JavaSparkContext sctx = new JavaSparkContext(sparkConf);
>     SQLContext ctx = new SQLContext(sctx);
>     DataFrame frame = ctx.read().json(facebookJSON);
>     frame.printSchema();
> {code}
> The exception is reproducible using the following JSON:
> {code}
> {
>   "data": [
>     {
>       "id": "X999_Y999",
>       "from": {
>         "name": "Tom Brady", "id": "X12"
>       },
>       "message": "Looking forward to 2010!",
>       "actions": [
>         {
>           "name": "Comment",
>           "link": "http://www.facebook.com/X999/posts/Y999"
>         },
>         {
>           "name": "Like",
>           "link": "http://www.facebook.com/X999/posts/Y999"
>         }
>       ],
>       "type": "status",
>       "created_time": "2010-08-02T21:27:44+0000",
>       "updated_time": "2010-08-02T21:27:44+0000"
>     },
>     {
>       "id": "X998_Y998",
>       "from": {
>         "name": "Peyton Manning", "id": "X18"
>       },
>       "message": "Where's my contract?",
>       "actions": [
>         {
>           "name": "Comment",
>           "link": "http://www.facebook.com/X998/posts/Y998"
>         },
>         {
>           "name": "Like",
>           "link": "http://www.facebook.com/X998/posts/Y998"
>         }
>       ],
>       "type": "status",
>       "created_time": "2010-08-02T21:27:44+0000",
>       "updated_time": "2010-08-02T21:27:44+0000"
>     }
>   ]
> }
> {code}
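
The report does not say why schema inference failed, so the following is an assumption and not something the issue confirms. One thing worth ruling out when reproducing it is the file layout: Spark's JSON data source expects one complete JSON object per line, while the sample JSON in the report is pretty-printed across many lines. Below is a minimal sketch in plain Java (the class and method names are hypothetical; no Spark or JSON library is involved) that collapses such a document onto a single line by dropping whitespace outside string literals:

```java
// Hypothetical helper, not part of Spark: rewrites a pretty-printed JSON
// document as a single line so that the line holds one complete object.
final class JsonLineCollapser {

    private JsonLineCollapser() {}

    /** Removes all whitespace that appears outside JSON string literals. */
    static String collapse(String json) {
        StringBuilder out = new StringBuilder(json.length());
        boolean inString = false; // currently inside a "..." literal
        boolean escaped = false;  // previous char was a backslash inside a literal
        for (int i = 0; i < json.length(); i++) {
            char c = json.charAt(i);
            if (inString) {
                out.append(c); // keep every character of a string literal
                if (escaped) {
                    escaped = false;
                } else if (c == '\\') {
                    escaped = true;
                } else if (c == '"') {
                    inString = false;
                }
            } else if (c == '"') {
                inString = true;
                out.append(c);
            } else if (!Character.isWhitespace(c)) {
                out.append(c); // structural characters, numbers, true/false/null
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String pretty = "{\n  \"data\": [\n    { \"id\": \"X999_Y999\","
                + " \"message\": \"Looking forward to 2010!\" }\n  ]\n}";
        System.out.println(collapse(pretty));
    }
}
```

Writing the collapsed string to a file and passing that path to ctx.read().json(...) would give the reader one object per line. The sketch only strips whitespace; it does not validate the JSON, and it is not the fix that shipped in 1.4.1.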