[
https://issues.apache.org/jira/browse/SPARK-13752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jingwei Lu updated SPARK-13752:
-------------------------------
Attachment: sparkissue.scala
This is a repro case.
> JSON array type parsing error
> -----------------------------
>
> Key: SPARK-13752
> URL: https://issues.apache.org/jira/browse/SPARK-13752
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 1.5.2
> Reporter: Jingwei Lu
> Attachments: sparkissue.scala
>
>
> Due to SPARK-3308, the SQL JSON parser is not able to handle an invalid payload
> field in air_events. This is how the payload schema is defined:
> }, {
>   "name" : "payload",
>   "type" : {
>     "type" : "array",
>     "elementType" : {
>       "type" : "struct",
>       "fields" : [ {
>         "name" : "type",
>         "type" : "string",
>         "nullable" : true
>       }, {
>         "name" : "name",
>         "type" : "string",
>         "nullable" : true
>       }, {
>         "name" : "duration",
>         "type" : "string",
>         "nullable" : true
>       } ]
>     },
>     "containsNull" : false
>   },
>   "nullable" : true
> } ]
> Some invalid payloads, for example
> "payload":[[],[],[],[],[]], or "payload":[[[js, ...], []], pass
> schema validation and generate rows. However, these rows are not compatible
> with Spark SQL: when it tries to access them in a filter, Spark throws an
> internal ClassCastException.
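To illustrate the mismatch described above, here is a minimal sketch of a shape check that could be run on raw records before they reach Spark. It is not part of Spark or of the attached repro: the function names are hypothetical, the field names are taken from the schema in the report, and it assumes each record is a single JSON object on one line.

```python
import json

# Field names from the payload schema in the report; all are nullable strings.
PAYLOAD_FIELDS = ("type", "name", "duration")


def payload_matches_schema(payload):
    """Return True if payload is null or an array of structs with string-or-null fields.

    Hypothetical helper for illustration only; not a Spark API.
    """
    if payload is None:
        # The payload column itself is declared nullable.
        return True
    if not isinstance(payload, list):
        return False
    for element in payload:
        # containsNull is false and elementType is a struct, so every element
        # must be a JSON object. Nested arrays such as [] are rejected here.
        if not isinstance(element, dict):
            return False
        for field in PAYLOAD_FIELDS:
            value = element.get(field)
            if value is not None and not isinstance(value, str):
                return False
    return True


def is_valid_record(line):
    """Parse one JSON line and check its payload field (assumed record layout)."""
    try:
        record = json.loads(line)
    except ValueError:
        return False
    if not isinstance(record, dict):
        return False
    return payload_matches_schema(record.get("payload"))
```

With this check, a record such as {"payload": [[], [], [], [], []]} is rejected because its elements are arrays rather than structs, while a record with a missing or null payload is accepted.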