[jira] [Commented] (SPARK-10840) SparkSQL doesn't work well with JSON

Yin Huai (JIRA) Wed, 18 Nov 2015 15:48:40 -0800

    [ 
https://issues.apache.org/jira/browse/SPARK-10840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15012388#comment-15012388
 ]


Yin Huai commented on SPARK-10840:
----------------------------------

The main cause is a limitation of Hadoop's TextInputFormat, which reads input 
one line at a time, so each line has to hold a complete JSON record. We will 
look into how to resolve this issue, but it is not clear when it will be fixed.
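
The line-at-a-time behaviour can be illustrated without Spark. The following is 
a minimal sketch in plain Python, not Spark's actual implementation: 
parse_per_line is a hypothetical helper that treats every line as an 
independent JSON value, which is roughly what a line-oriented input format 
forces on the reader.

```python
import json

def parse_per_line(text):
    """Parse each non-empty line as its own JSON value.

    Hypothetical helper for illustration only. Returns a pair
    (records, bad_lines): the values that parsed, and the lines that did not.
    """
    records, bad_lines = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            records.append(json.loads(line))
        except ValueError:  # json.JSONDecodeError is a subclass of ValueError
            bad_lines.append(line)
    return records, bad_lines

record = {"name": "Mia", "surname": "Radison"}
pretty = json.dumps(record, indent=2)   # one record spread over several lines
compact = json.dumps(record)            # the same record on a single line

pretty_records, pretty_bad = parse_per_line(pretty)
compact_records, compact_bad = parse_per_line(compact)
```

With the pretty-printed input, no line is a complete JSON value, so nothing 
parses; the single-line form yields exactly one record.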

> SparkSQL doesn't work well with JSON
> ------------------------------------
>
>                 Key: SPARK-10840
>                 URL: https://issues.apache.org/jira/browse/SPARK-10840
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>            Reporter: Jordan Sarraf
>            Priority: Minor
>              Labels: JSON, Scala, SparkSQL
>
> Well-formed JSON doesn't work with version 1.5.1 when using 
> sqlContext.read.json("<json-file>"):
> {
>   "employees": {
>     "employee": [
>       {
>         "name": "Mia",<newline>
>         "surname": "Radison",<newline>
>         "mobile": "7295913821",<newline>
>         "email": "[email protected]"
>       },
>       {
>         "name": "Thor",<newline>
>         "surname": "Kovaskz",<newline>
>         "mobile": "8829177193",<newline>
>         "email": "[email protected]"
>       },
>       {
>         "name": "Bindy",<newline>
>         "surname": "Kvuls",<newline>
>         "mobile": "5033828845",<newline>
>         "email": "[email protected]"
>       }
>     ]
>   }
> }
> For the above, the following error is obtained:
> ERROR Executor: Exception in task 0.0 in stage 1.0 (TID 2)
> scala.MatchError: (VALUE_STRING,StructType()) (of class scala.Tuple2)
> Whereas this works fine, because all components of each record are on the same line:
>     [
>       {"name": "Mia","surname": "Radison","mobile": "7295913821","email": "[email protected]"},
>       {"name": "Thor","surname": "Kovaskz","mobile": "8829177193","email": "[email protected]"},
>       {"name": "Bindy","surname": "Kvuls","mobile": "5033828845","email": "[email protected]"}
>     ]
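
One possible workaround, sketched here under assumptions, is to rewrite the 
nested document so that every record sits on its own line before loading it. 
The helper to_json_lines and the trimmed sample (names only, no contact 
fields) are hypothetical and for illustration; the key path 
employees/employee comes from the example above.

```python
import json

def to_json_lines(document_text, path):
    """Return the array found at `path` as one JSON record per line.

    Hypothetical helper: `path` is the list of keys leading from the
    document root to the array of records.
    """
    node = json.loads(document_text)  # parse the whole document at once
    for key in path:
        node = node[key]
    # json.dumps without `indent` never emits newlines, so each record
    # ends up on exactly one line.
    return "\n".join(json.dumps(record) for record in node)

# Trimmed version of the multi-line document from the report.
sample = """
{
  "employees": {
    "employee": [
      {
        "name": "Mia",
        "surname": "Radison"
      },
      {
        "name": "Thor",
        "surname": "Kovaskz"
      }
    ]
  }
}
"""

json_lines = to_json_lines(sample, ["employees", "employee"])
print(json_lines)
```

The resulting text has the one-record-per-line shape that a line-oriented 
reader can consume. This sketch parses the whole document in memory, so it 
only suits files that fit on a single machine.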



