Bram Boogaarts created SPARK-43341:
--------------------------------------

             Summary: StructType.toDDL does not pick up on non-nullability of column in nested struct
                 Key: SPARK-43341
                 URL: https://issues.apache.org/jira/browse/SPARK-43341
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 3.3.2, 3.3.1, 3.3.0
            Reporter: Bram Boogaarts


h2. The problem

When a {{StructType}} contains a nested {{StructType}} column, and that nested 
struct has a field with {{nullable = false}}, converting the outer schema to a 
DDL string with {{.toDDL}} drops the nested field's non-nullability. For 
example:
{code:java}
import org.apache.spark.sql.types._

// nestedKey is declared non-nullable inside the nested struct.
val testschema = StructType(List(
  StructField("key", IntegerType, false),
  StructField("value", StringType, true),
  StructField("nestedCols", StructType(List(
    StructField("nestedKey", IntegerType, false),
    StructField("nestedValue", StringType, true)
  )), false)
))

// Print the DDL string, then the schema parsed back from that string.
println(testschema.toDDL)
println(StructType.fromDDL(testschema.toDDL)){code}
gives:
{code:java}
key INT NOT NULL,value STRING,nestedCols STRUCT<nestedKey: INT, nestedValue: STRING> NOT NULL

StructType(
  StructField(key,IntegerType,false),
  StructField(value,StringType,true),
  StructField(nestedCols,StructType(
    StructField(nestedKey,IntegerType,true),
    StructField(nestedValue,StringType,true)
  ),false)
){code}
 

Note that {{nestedKey}} comes back as nullable after the round trip. This 
happens because {{StructType.toDDL}} calls {{StructField.toDDL}} for each of 
its fields, which in turn calls {{.sql}} on the field's {{dataType}}. If 
{{dataType}} is a {{StructType}}, that {{.sql}} call in turn calls {{.sql}} 
for all the nested fields, and this last method does not include the 
nullability of the field in its output.
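
The call chain described above can be sketched with a small stand-in model. This is not Spark's source: {{ToyType}}, {{ToyField}}, {{ToyStruct}} and {{ToyDdlModel}} are hypothetical names made up for illustration, and the model only mirrors the delegation described in this report (top-level fields go through {{toDDL}}, nested fields go through {{sql}}).

```scala
// Toy model of the call chain from this report. These are NOT Spark's classes;
// every name here is hypothetical and exists only for illustration.
object ToyDdlModel {
  sealed trait ToyType { def sql: String }
  case object ToyInt extends ToyType { val sql = "INT" }
  case object ToyString extends ToyType { val sql = "STRING" }

  final case class ToyField(name: String, dataType: ToyType, nullable: Boolean = true) {
    // Nested rendering: name and type only, so nullability is lost here.
    def sql: String = s"$name: ${dataType.sql}"

    // Top-level rendering: appends NOT NULL for this field,
    // but delegates to dataType.sql for everything below it.
    def toDDL: String = {
      val notNull = if (nullable) "" else " NOT NULL"
      s"$name ${dataType.sql}$notNull"
    }
  }

  final case class ToyStruct(fields: List[ToyField]) extends ToyType {
    def sql: String = fields.map(_.sql).mkString("STRUCT<", ", ", ">")
    def toDDL: String = fields.map(_.toDDL).mkString(",")
  }

  // Same shape as the schema in the example above.
  val schema: ToyStruct = ToyStruct(List(
    ToyField("key", ToyInt, nullable = false),
    ToyField("value", ToyString),
    ToyField("nestedCols", ToyStruct(List(
      ToyField("nestedKey", ToyInt, nullable = false),
      ToyField("nestedValue", ToyString)
    )), nullable = false)
  ))

  def main(args: Array[String]): Unit = {
    // prints: key INT NOT NULL,value STRING,nestedCols STRUCT<nestedKey: INT, nestedValue: STRING> NOT NULL
    println(schema.toDDL)
  }
}
```

In this model {{nestedKey}} loses its {{NOT NULL}} marker because the only path that reaches it is {{ToyField.sql}}, which never looks at {{nullable}}.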
h2. Proposed solution

{{StructField.toDDL}} should call {{dataType.toDDL}} when its {{dataType}} is 
a {{StructType}}, since that output includes the nullability of nested columns.
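
One possible shape of such a change, again on a hypothetical stand-in model rather than Spark's code: the field's DDL rendering special-cases struct types and uses a nullability-aware nested rendering instead of {{.sql}}. The names ({{ToyDdlFix}}, {{nestedDDL}}, and so on) are invented for this sketch, and it assumes the DDL parser accepts {{NOT NULL}} after a field type inside {{STRUCT<...>}}. It also keeps the {{STRUCT<...>}} wrapper, which a plain call to the struct's top-level {{toDDL}} would not produce.

```scala
// Sketch of a nullability-aware rendering on a toy model.
// NOT Spark's implementation; all names are hypothetical.
object ToyDdlFix {
  sealed trait ToyType { def sql: String }
  case object ToyInt extends ToyType { val sql = "INT" }
  case object ToyString extends ToyType { val sql = "STRING" }

  final case class ToyField(name: String, dataType: ToyType, nullable: Boolean = true) {
    private def notNull: String = if (nullable) "" else " NOT NULL"

    // For struct types, recurse through the nullability-aware rendering
    // instead of falling back to the nullability-blind sql.
    private def typeDDL: String = dataType match {
      case s: ToyStruct => s.nestedDDL
      case other        => other.sql
    }

    def nestedDDL: String = s"$name: $typeDDL$notNull" // used inside STRUCT<...>
    def toDDL: String = s"$name $typeDDL$notNull"      // used at the top level
  }

  final case class ToyStruct(fields: List[ToyField]) extends ToyType {
    // Unchanged type rendering: still omits nullability.
    def sql: String =
      fields.map(f => s"${f.name}: ${f.dataType.sql}").mkString("STRUCT<", ", ", ">")
    def nestedDDL: String = fields.map(_.nestedDDL).mkString("STRUCT<", ", ", ">")
    def toDDL: String = fields.map(_.toDDL).mkString(",")
  }

  val schema: ToyStruct = ToyStruct(List(
    ToyField("key", ToyInt, nullable = false),
    ToyField("value", ToyString),
    ToyField("nestedCols", ToyStruct(List(
      ToyField("nestedKey", ToyInt, nullable = false),
      ToyField("nestedValue", ToyString)
    )), nullable = false)
  ))

  def main(args: Array[String]): Unit = {
    // prints: key INT NOT NULL,value STRING,nestedCols STRUCT<nestedKey: INT NOT NULL, nestedValue: STRING> NOT NULL
    println(schema.toDDL)
  }
}
```

Leaving {{sql}} untouched in the sketch keeps the type-name rendering as it was and confines the change to the DDL path. Whether that separation is appropriate in Spark itself is a design question for the actual patch.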


