Is there a workaround? My dataset contains billions of rows, and it would be
nice to ignore/exclude the few lines that are badly formatted.
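
One workaround I can think of (a sketch only, not something I have verified at
that scale; sc, sqlContext, schema, and the input path below are placeholders
for your own objects): read the file as plain text first, drop the lines whose
"user" field is not a JSON object, and only then apply the schema. Spark ships
with Jackson, so the pre-filter can reuse it:

import com.fasterxml.jackson.databind.ObjectMapper

// Read the raw line-delimited JSON before Spark applies any schema.
val raw = sc.textFile("hdfs:///path/to/dataset.json")

// Drop lines whose request.user is not a JSON object (e.g. an array),
// as well as lines that do not parse at all.
val wellFormed = raw.mapPartitions { lines =>
  val mapper = new ObjectMapper() // one parser per partition, not per line
  lines.filter { line =>
    try mapper.readTree(line).path("request").path("user").isObject
    catch { case _: Exception => false }
  }
}

// The schema now only ever sees records it can represent.
val df = sqlContext.read.schema(schema).json(wellFormed)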
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/org-apache-spark-sql-types-GenericArrayData-cannot-be-cast-to-org-apa
I have found why the exception is raised.
I have defined a JSON schema, using org.apache.spark.sql.types.StructType,
that expects this kind of record:
{
  "request": {
    "user": {
      "id": 123
    }
  }
}
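
For reference, a schema matching that record could be declared along these
lines (my sketch; LongType for "id" and the default nullability are
assumptions):

import org.apache.spark.sql.types._

val schema = StructType(Seq(
  StructField("request", StructType(Seq(
    StructField("user", StructType(Seq(
      StructField("id", LongType)
    )))
  )))
))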
There's a bad record in my dataset that defines the field "user" as an array
instead of an object.
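
For illustration, the offending record presumably looks something like this
(the array contents here are made up):

{
  "request": {
    "user": [
      { "id": 123 }
    ]
  }
}

Internally, the array would be parsed into a GenericArrayData, and the cast
to the struct the schema expects then fails.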
Hi,
the following error is raised using Spark 1.5.2 or 1.6.0, in standalone
mode, on my computer.
Has anyone had the same problem, and do you know what might cause this
exception? Thanks in advance.
16/03/02 15:12:27 WARN TaskSetManager: Lost task 9.0 in stage 0.0 (TID 9,
192.168.1.36): java.lang.ClassCastException:
org.apache.spark.sql.types.GenericArrayData cannot be cast to org.apa