waterlx edited a comment on issue #457: DataFrame generated by Seq() might have schema conflict with Iceberg
URL: https://github.com/apache/incubator-iceberg/issues/457#issuecomment-528725863

A String column in Spark is nullable by default (nullable = "true"), and Iceberg uses the following code to decide between required and optional based on nullable():

``` java
// public Type struct(StructType struct, List<Type> types) in SparkTypeToType.java
if (field.nullable()) {
  newFields.add(Types.NestedField.optional(id, field.name(), type, doc));
} else {
  newFields.add(Types.NestedField.required(id, field.name(), type, doc));
}
```

So when the Iceberg schema declares a StringType field as "required" while the Spark field is nullable, the converted field comes out as optional and the error message is reported.

This issue can be resolved by explicitly specifying "nullable" in the Spark schema, like:

``` scala
val schema = StructType(List(
  StructField("string_column", StringType, nullable = false),
  ...
))
```

But I have no idea how to resolve it when the DataFrame is generated by Seq() or loaded from a file.
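One possible direction for the Seq() case, offered only as a sketch and not verified against Iceberg: take the schema Spark inferred, copy each top-level field with nullable = false, and re-create the DataFrame over the same rows with `createDataFrame(rdd, schema)`. The session setup, the column name and the names `strictSchema` and `strictDf` are illustrative assumptions. Nested fields are not touched, and Spark does not check the rows against the new schema, so the data must really contain no nulls.

``` scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.StructType

// Local session for illustration only (spark-shell already provides `spark`).
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("nullable-sketch")
  .getOrCreate()
import spark.implicits._

// toDF on a Seq[String] infers a string column with nullable = true.
val df = Seq("a", "b").toDF("string_column")

// Copy the inferred schema, marking every top-level field as non-nullable.
val strictSchema = StructType(df.schema.fields.map(_.copy(nullable = false)))

// Re-create the DataFrame over the same rows with the stricter schema.
// Spark does not validate the rows here, so they must contain no nulls.
val strictDf = spark.createDataFrame(df.rdd, strictSchema)

strictDf.printSchema()
```

The same pattern should apply to a DataFrame loaded from a file, since it only depends on `df.schema` and `df.rdd`; whether the resulting schema then matches the Iceberg table still needs to be confirmed.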