Bram Boogaarts created SPARK-43341:
--------------------------------------

             Summary: StructType.toDDL does not pick up on non-nullability of column in nested struct
                 Key: SPARK-43341
                 URL: https://issues.apache.org/jira/browse/SPARK-43341
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 3.3.2, 3.3.1, 3.3.0
            Reporter: Bram Boogaarts
h2. The problem

When a {{StructType}} instance contains a nested {{StructType}} column that in turn contains a column with {{nullable = false}}, converting it to a DDL string with {{.toDDL}} does not include this non-nullability in the result. For example:

{code:java}
val testschema = StructType(List(
  StructField("key", IntegerType, false),
  StructField("value", StringType, true),
  StructField("nestedCols", StructType(List(
    StructField("nestedKey", IntegerType, false),
    StructField("nestedValue", StringType, true)
  )), false)
))

println(testschema.toDDL)
println(StructType.fromDDL(testschema.toDDL))
{code}

gives:

{code:java}
key INT NOT NULL,value STRING,nestedCols STRUCT<nestedKey: INT, nestedValue: STRING> NOT NULL

StructType(
  StructField(key,IntegerType,false),
  StructField(value,StringType,true),
  StructField(nestedCols,StructType(
    StructField(nestedKey,IntegerType,true),
    StructField(nestedValue,StringType,true)
  ),false)
)
{code}

Note that {{nestedKey}} has become nullable after the round trip. This happens because {{StructType.toDDL}} calls {{StructField.toDDL}} for each of its fields, which in turn calls {{.sql}} on its {{dataType}}. If {{dataType}} is a {{StructType}}, that call to {{.sql}} recursively calls {{.sql}} on all the nested fields, and this last method does not include the nullability of a field in its output.

h2. Proposed solution

{{StructField.toDDL}} should call {{dataType.toDDL}} when {{dataType}} is a {{StructType}}, since that method includes information about the nullability of nested columns.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
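To make the proposal concrete, here is a self-contained sketch that models the relevant pieces of {{StructType}}/{{StructField}}. It is a simplified stand-in, not Spark's actual source (real Spark quotes field names and handles comments, metadata, and many more data types), but it shows why {{.sql}} loses nested nullability and how recursing through a {{toDDL}}-style method preserves the {{NOT NULL}} markers:

```scala
// Simplified model of the Spark classes involved (a sketch, not Spark source).
sealed trait DataType { def sql: String }
case object IntegerType extends DataType { val sql = "INT" }
case object StringType extends DataType { val sql = "STRING" }

case class StructField(name: String, dataType: DataType, nullable: Boolean = true) {
  private def nullSuffix: String = if (nullable) "" else " NOT NULL"

  // Current behaviour: delegates to dataType.sql, which for a nested
  // StructType never emits the nullability of the nested fields.
  def toDDL: String = s"$name ${dataType.sql}$nullSuffix"

  // Proposed behaviour: recurse through toDDL-style logic for nested
  // structs so their NOT NULL markers survive.
  def toDDLFixed: String = {
    val typeSql = dataType match {
      case s: StructType => s"STRUCT<${s.fields.map(_.toDDLFixed).mkString(", ")}>"
      case other         => other.sql
    }
    s"$name $typeSql$nullSuffix"
  }
}

case class StructType(fields: List[StructField]) extends DataType {
  // Mirrors StructType.sql: nested field nullability is dropped here.
  def sql: String =
    s"STRUCT<${fields.map(f => s"${f.name}: ${f.dataType.sql}").mkString(", ")}>"

  def toDDL: String = fields.map(_.toDDL).mkString(",")
  def toDDLFixed: String = fields.map(_.toDDLFixed).mkString(",")
}

object Demo {
  // Same schema as the example above.
  val testschema = StructType(List(
    StructField("key", IntegerType, nullable = false),
    StructField("value", StringType, nullable = true),
    StructField("nestedCols", StructType(List(
      StructField("nestedKey", IntegerType, nullable = false),
      StructField("nestedValue", StringType, nullable = true)
    )), nullable = false)
  ))

  val currentDDL: String = testschema.toDDL      // nested NOT NULL is lost
  val fixedDDL: String = testschema.toDDLFixed   // nested NOT NULL is kept

  def main(args: Array[String]): Unit = {
    println(currentDDL)
    println(fixedDDL)
  }
}
```

Running this prints the lossy DDL from the report, followed by a fixed variant in which {{nestedKey}} keeps its {{NOT NULL}} (the sketch renders nested fields as {{name TYPE}} rather than {{name: TYPE}}; a real patch would follow whatever the DDL parser in {{StructType.fromDDL}} accepts).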