[ 
https://issues.apache.org/jira/browse/SPARK-13752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingwei Lu updated SPARK-13752:
-------------------------------
    Attachment: sparkissue.scala

This is a repro case. 

> JSON array type parsing error
> -----------------------------
>
>                 Key: SPARK-13752
>                 URL: https://issues.apache.org/jira/browse/SPARK-13752
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.5.2
>            Reporter: Jingwei Lu
>         Attachments: sparkissue.scala
>
>
> Due to SPARK-3308, the SQL JSON parser is not able to handle an invalid payload 
> field in air_events. The payload schema is defined as follows:
>                         }, {
>                             "name" : "payload",
>                             "type" : {
>                                 "type" : "array",
>                                 "elementType" : {
>                                     "type" : "struct",
>                                     "fields" : [ {
>                                         "name" : "type",
>                                         "type" : "string",
>                                         "nullable" : true
>                                     }, {
>                                         "name" : "name",
>                                         "type" : "string",
>                                         "nullable" : true
>                                     }, {
>                                         "name" : "duration",
>                                         "type" : "string",
>                                         "nullable" : true
>                                     } ]
>                                 },
>                                 "containsNull" : false
>                             },
>                             "nullable" : true
>                         } ]
> Some invalid payloads, for example
> "payload":[[],[],[],[],[]] or "payload":[[[js, ...], []], pass the 
> schema validation and generate rows. However, these rows are not compatible 
> with Spark SQL: when it tries to access them in a filter, Spark throws an 
> internal ClassCastException. 
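
To make the shape mismatch concrete, here is a minimal plain-Python sketch of the check the quoted schema implies: payload must be null or an array whose elements are all objects with string-or-null "type", "name" and "duration" fields. This is not Spark's parser and not the attached repro; the function names and the sample records are hypothetical, chosen only for illustration.

```python
import json

# Field names taken from the payload schema quoted in the issue.
EXPECTED_FIELDS = ("type", "name", "duration")


def payload_matches_schema(payload):
    """Return True if payload is None or a list of struct-like objects.

    Mirrors the quoted schema: array<struct<type, name, duration>>,
    containsNull = false, every struct field a nullable string.
    """
    if payload is None:  # the payload column itself is nullable
        return True
    if not isinstance(payload, list):
        return False
    for element in payload:
        # elementType is struct and containsNull is false, so every element
        # must be a JSON object; nested arrays such as [] are rejected here.
        if not isinstance(element, dict):
            return False
        for field in EXPECTED_FIELDS:
            value = element.get(field)
            if value is not None and not isinstance(value, str):
                return False
    return True


def is_valid_record(line):
    """Parse one JSON line and check its payload field (hypothetical helper)."""
    record = json.loads(line)
    return payload_matches_schema(record.get("payload"))


# Made-up sample records, not taken from the attached repro.
good = '{"payload": [{"type": "click", "name": "play", "duration": "12"}]}'
bad = '{"payload": [[], [], [], [], []]}'
```

Under these assumptions, is_valid_record(good) holds while is_valid_record(bad) does not, because the elements of the second payload are arrays rather than structs. Filtering records this way before handing them to Spark is one possible workaround, not a fix for the parser itself.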



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

