Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/20705

@gatorsmile and @HyukjinKwon. The two failures are due to limitations of the current JSON data source implementation, so the test suite is correctly exercising the target data source.

1. `resolveRelation for a FileFormat DataSource without userSchema scan filesystem only once`

   For the JSON source, the statistics count becomes 2.

2. `Pre insert nullability check (MapType)`

   Since the JSON source saves the data as strings, it raises a `ClassCastException` when the given user or table schema differs.

```scala
scala> (Tuple1(Map(1 -> (null: Integer))) :: Nil).toDF("a").write.mode("overwrite").save("/tmp/json")

scala> spark.read.json("/tmp/json").printSchema
root
 |-- a: struct (nullable = true)
 |    |-- 1: string (nullable = true)

scala> (Tuple1(Map(1 -> (null: Integer))) :: Nil).toDF("a").write.mode("overwrite").saveAsTable("map")
18/03/02 21:13:49 WARN HiveExternalCatalog: Couldn't find corresponding Hive SerDe for data source provider json. Persisting data source table `default`.`map` into Hive metastore in Spark SQL specific format, which is NOT compatible with Hive.

scala> spark.read.json("/tmp/json").printSchema
root
 |-- a: struct (nullable = true)
 |    |-- 1: string (nullable = true)

scala> spark.table("map").printSchema
root
 |-- a: map (nullable = true)
 |    |-- key: integer
 |    |-- value: integer (valueContainsNull = true)

scala> spark.table("map").show
18/03/02 21:14:12 ERROR Executor: Exception in task 0.0 in stage 10.0 (TID 10)
java.lang.ClassCastException: org.apache.spark.unsafe.types.UTF8String cannot be cast to java.lang.Integer
	at scala.runtime.BoxesRunTime.unboxToInt(BoxesRunTime.java:101)
```

For the JSON format, could you confirm this, @HyukjinKwon?
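To illustrate the root cause of case 2 without Spark: JSON object keys are always strings, so an integer-keyed map cannot survive a JSON round trip with its key type intact. The sketch below (plain Scala, hand-rolled serialization for illustration only, not Spark's actual code path) shows how the integer key `1` becomes the string field `"1"` on write; reading it back under a `map<int, int>` schema is what ultimately triggers the `ClassCastException` above.

```scala
// Hedged illustration (plain Scala, no Spark): a Map[Int, Option[Int]] like
// the one in the repro, serialized by hand the way any JSON writer must.
val m: Map[Int, Option[Int]] = Map(1 -> None)

// JSON object keys must be strings, so the Int key is stringified here.
val json = m
  .map { case (k, v) => s""""$k": ${v.getOrElse("null")}""" }
  .mkString("{", ", ", "}")

println(json)  // {"1": null}
// A JSON parser can only recover the key as the string "1"; forcing it back
// into an Integer (as a map<int, int> table schema requires) cannot succeed.
```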