I'm using Spark 1.0.0-SNAPSHOT (downloaded and compiled on 2014/06/23). I'm trying to execute the following code:
import org.apache.spark.SparkContext._

val sqlContext = new org.apache.spark.sql.SQLContext(sc)
val table = sqlContext.jsonFile("hdfs://host:9100/user/myuser/data.json")
table.printSchema()

data.json looks like this (three shortened lines shown here):

{"field1":"content","id":12312213,"read":false,"user":{"id":121212,"name":"E. Stark","num_heads":0},"place":"Winterfell","entities":{"weapons":[],"friends":[{"name":"R. Baratheon","id":23234,"indices":[0,16]}]},"lang":"en"}
{"field1":"content","id":56756765,"read":false,"user":{"id":121212,"name":"E. Stark","num_heads":0},"place":"Winterfell","entities":{"weapons":[],"friends":[{"name":"R. Baratheon","id":23234,"indices":[0,16]}]},"lang":"en"}
{"field1":"content","id":56765765,"read":false,"user":{"id":121212,"name":"E. Stark","num_heads":0},"place":"Winterfell","entities":{"weapons":[],"friends":[{"name":"R. Baratheon","id":23234,"indices":[0,16]}]},"lang":"en"}

The JSON object on each line is valid according to the JSON validator I use, and since jsonFile is documented as

def jsonFile(path: String): SchemaRDD
  Loads a JSON file (one object per line), returning the result as a SchemaRDD.

I would assume this should work. However, executing the code above produces this error:

14/06/25 10:05:09 WARN scheduler.TaskSetManager: Lost TID 11 (task 0.0:11)
14/06/25 10:05:09 WARN scheduler.TaskSetManager: Loss was due to com.fasterxml.jackson.databind.JsonMappingException
com.fasterxml.jackson.databind.JsonMappingException: No content to map due to end-of-input
 at [Source: java.io.StringReader@238df2e4; line: 1, column: 1]
 at com.fasterxml.jackson.databind.JsonMappingException.from(JsonMappingException.java:164)
...

Does anyone know where the problem lies?
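
In case it helps narrow things down: Jackson's "No content to map due to end-of-input" reads as if the parser was handed an empty string, so my next step is to check whether the file contains blank lines (for example a trailing newline or an empty record). This is only a guess on my part, and it assumes SQLContext.jsonRDD is available in my build (it appears in the API alongside jsonFile):

// Count blank lines; the "end-of-input" error suggests an empty record
// was passed to the JSON parser.
val raw = sc.textFile("hdfs://host:9100/user/myuser/data.json")
val blanks = raw.filter(_.trim.isEmpty).count()
println("blank lines: " + blanks)

// Possible workaround: drop the blank lines first, then parse the
// remaining lines with jsonRDD instead of jsonFile.
val cleaned = raw.filter(_.trim.nonEmpty)
val table2 = sqlContext.jsonRDD(cleaned)
table2.printSchema()

If the blank-line count comes back non-zero, that would at least explain the StringReader failing at line 1, column 1.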