I have a very large JSON file, and I want to avoid having Spark scan the data to infer the schema. Since I already know the data, I would prefer to provide the schema myself with:
sqlContext.read().schema(mySchema).json(jsonFilePath)

However, the problem is that the JSON data has a somewhat unusual format:

[
  {
    "apiTypeName": "someApi",
    "allFieldsAndValues": {
      "Field_1": "Value",
      "Field_2": "Value",
      "Field_3": 779.0,
      "Field_4": "Value",
      "Field_5": true
    }
  },
  {
    "apiTypeName": "someApi",
    "allFieldsAndValues": {
      "Field_1": "Value",
      "Field_2": "Value",
      "Field_3": 779.0,
      "Field_4": "Value",
      "Field_5": true
    }
  }
]

I can't seem to construct a schema for this kind of data that Spark will accept in place of inferring one on its own. Every way I have tried to build the schema from StructType, StructField, or ArrayType combinations, Spark wouldn't pick it up as I intend.

Any help is appreciated.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-SQL-reading-json-with-pre-defined-schema-tp25353.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
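A sketch of one way the schema could be built, assuming the field names and types in the sample above are representative (the `sqlContext` and `jsonFilePath` names are taken from the question). The key point is that the schema passed to the reader describes a single record, not the enclosing array, and the nested "allFieldsAndValues" object maps to a nested StructType:

```scala
import org.apache.spark.sql.types._

// Schema for the nested "allFieldsAndValues" object.
val allFieldsSchema = StructType(Seq(
  StructField("Field_1", StringType, nullable = true),
  StructField("Field_2", StringType, nullable = true),
  StructField("Field_3", DoubleType, nullable = true),
  StructField("Field_4", StringType, nullable = true),
  StructField("Field_5", BooleanType, nullable = true)
))

// Top-level schema: one record of the array, with the nested struct inline.
val mySchema = StructType(Seq(
  StructField("apiTypeName", StringType, nullable = true),
  StructField("allFieldsAndValues", allFieldsSchema, nullable = true)
))

val df = sqlContext.read.schema(mySchema).json(jsonFilePath)
```

One caveat worth checking: in Spark versions of this era (pre-2.0), the JSON data source expects one complete JSON object per line (JSON Lines format), so a pretty-printed top-level array like the sample may fail to parse regardless of the schema supplied; records that do not match come back as rows of nulls rather than an error.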