Jingwei Lu created SPARK-13752:
----------------------------------
Summary: JSON array type parsing error
Key: SPARK-13752
URL: https://issues.apache.org/jira/browse/SPARK-13752
Project: Spark
Issue Type: Bug
Components: Spark Core
Affects Versions: 1.5.2
Reporter: Jingwei Lu
Due to SPARK-3308, the SQL JSON parser is not able to handle an invalid payload
field in air_events. This is how the payload schema is defined:
}, {
"name" : "payload",
"type" : {
"type" : "array",
"elementType" : {
"type" : "struct",
"fields" : [ {
"name" : "type",
"type" : "string",
"nullable" : true,
}, {
"name" : "name",
"type" : "string",
"nullable" : true,
}, {
"name" : "duration",
"type" : "string",
"nullable" : true,
} ]
},
"containsNull" : false
},
"nullable" : true,
} ]
For some invalid payloads, for example
"payload":[[],[],[],[],[]] or "payload":[[[js, ...], []], the record passes
schema validation and rows are generated. However, these rows are not
compatible with Spark SQL when it tries to access them in a filter, and Spark
throws an internal ClassCastException.
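
As a possible workaround until the parser is fixed, records could be checked
before they reach Spark SQL. The following is only a minimal sketch in plain
Python using the standard json module; it is not Spark's parser. The helper
names payload_matches_schema and is_valid_record are hypothetical, and the
field list is copied from the schema fragment above. It rejects any payload
whose elements are not JSON objects, which covers the nested-array case
reported here.

```python
import json

# Field names copied from the struct in the reported schema fragment.
STRUCT_FIELDS = ("type", "name", "duration")


def payload_matches_schema(payload):
    """Check a decoded payload value against array<struct<string fields>>."""
    if payload is None:
        # The payload field itself is declared nullable.
        return True
    if not isinstance(payload, list):
        return False
    for element in payload:
        # containsNull is false and every element must be a struct
        # (a JSON object), so null and nested arrays like [] are rejected.
        if not isinstance(element, dict):
            return False
        for field in STRUCT_FIELDS:
            value = element.get(field)
            # Each struct field is a nullable string.
            if value is not None and not isinstance(value, str):
                return False
    return True


def is_valid_record(line):
    """Return True if one JSON text line has a schema-compatible payload."""
    try:
        record = json.loads(line)
    except ValueError:
        # Malformed JSON is treated as invalid (assumption of this sketch).
        return False
    return isinstance(record, dict) and payload_matches_schema(record.get("payload"))
```

Assuming the input is line-delimited JSON, a filter such as
is_valid_record could be applied to the raw text lines before they are
handed to the JSON reader, so that malformed payloads never produce rows.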
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)