[
https://issues.apache.org/jira/browse/SPARK-43341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17718856#comment-17718856
]
ASF GitHub Bot commented on SPARK-43341:
----------------------------------------
User 'BramBoog' has created a pull request for this issue:
https://github.com/apache/spark/pull/41016
> StructType.toDDL does not pick up on non-nullability of column in nested
> struct
> -------------------------------------------------------------------------------
>
> Key: SPARK-43341
> URL: https://issues.apache.org/jira/browse/SPARK-43341
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.3.0, 3.3.1, 3.3.2
> Reporter: Bram Boogaarts
> Priority: Major
>
> h2. The problem
> When a StructType contains a nested StructType column, and that nested struct
> contains a column with {{nullable = false}}, converting the outer StructType
> to a DDL string with {{.toDDL}} drops this non-nullability from the resulting
> string. For example:
> {code:java}
> import org.apache.spark.sql.types._
>
> val testschema = StructType(List(
>   StructField("key", IntegerType, false),
>   StructField("value", StringType, true),
>   StructField("nestedCols", StructType(List(
>     StructField("nestedKey", IntegerType, false), // non-nullable nested column
>     StructField("nestedValue", StringType, true)
>   )), false)
> ))
> // Print the DDL string, then the schema parsed back from that string.
> println(testschema.toDDL)
> println(StructType.fromDDL(testschema.toDDL)){code}
> gives:
> {code:java}
> key INT NOT NULL,value STRING,nestedCols STRUCT<nestedKey: INT, nestedValue: STRING> NOT NULL
> StructType(
>   StructField(key,IntegerType,false),
>   StructField(value,StringType,true),
>   StructField(nestedCols,StructType(
>     StructField(nestedKey,IntegerType,true),
>     StructField(nestedValue,StringType,true)
>   ),false)
> ){code}
>
> This happens because {{StructType.toDDL}} calls {{StructField.toDDL}} for each
> of its fields, which in turn calls {{.sql}} on the field's {{dataType}}. If
> {{dataType}} is a {{StructType}}, that {{.sql}} call in turn calls {{.sql}} on
> each nested field, and this last method does not include the field's
> nullability in its output.
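The call chain described above can be reproduced without Spark. The following is a minimal sketch, assuming a simplified stand-in for the type classes: the object name NullabilityLossSketch and the reduced class definitions are hypothetical and are not Spark's actual source. Only the split between a nullability-aware toDDL and a nullability-blind sql is taken from the description.

```scala
// Simplified model of the reported call chain; NOT Spark's actual source.
object NullabilityLossSketch {
  sealed trait DataType { def sql: String }
  case object IntegerType extends DataType { val sql: String = "INT" }
  case object StringType extends DataType { val sql: String = "STRING" }

  final case class StructField(name: String, dataType: DataType, nullable: Boolean) {
    // Rendering used inside STRUCT<...>: nullability is not emitted.
    def sql: String = s"$name: ${dataType.sql}"
    // Rendering used for top-level columns: nullability is emitted,
    // but the data type itself is still rendered through .sql.
    def toDDL: String = s"$name ${dataType.sql}${if (nullable) "" else " NOT NULL"}"
  }

  final case class StructType(fields: List[StructField]) extends DataType {
    def sql: String = fields.map(_.sql).mkString("STRUCT<", ", ", ">")
    def toDDL: String = fields.map(_.toDDL).mkString(",")
  }

  val testSchema: StructType = StructType(List(
    StructField("key", IntegerType, nullable = false),
    StructField("value", StringType, nullable = true),
    StructField("nestedCols", StructType(List(
      StructField("nestedKey", IntegerType, nullable = false),
      StructField("nestedValue", StringType, nullable = true)
    )), nullable = false)
  ))

  def main(args: Array[String]): Unit = {
    // prints: key INT NOT NULL,value STRING,nestedCols STRUCT<nestedKey: INT, nestedValue: STRING> NOT NULL
    println(testSchema.toDDL)
  }
}
```

In this model the top-level key keeps its NOT NULL, while nestedKey loses it because it is only ever rendered through sql.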
> h2. Proposed solution
> {{StructField.toDDL}} should call {{dataType.toDDL}} when {{dataType}} is a
> {{StructType}}, since this will include the nullability of nested columns.
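One way to picture the proposed change is sketched below, again on a simplified stand-in model rather than Spark's real classes. The ddl method, the object name NestedNullabilitySketch, and the exact nested output format (nested fields rendered with toDDL inside a STRUCT<...> wrapper) are assumptions made for illustration. The change in the linked pull request may differ.

```scala
// Simplified model of the proposed behaviour; NOT Spark's actual source.
object NestedNullabilitySketch {
  sealed trait DataType {
    def sql: String
    // Hypothetical DDL rendering; defaults to .sql for non-struct types.
    def ddl: String = sql
  }
  case object IntegerType extends DataType { val sql: String = "INT" }
  case object StringType extends DataType { val sql: String = "STRING" }

  final case class StructField(name: String, dataType: DataType, nullable: Boolean) {
    def sql: String = s"$name: ${dataType.sql}"
    // Render the data type through the nullability-aware path instead of .sql.
    def toDDL: String = s"$name ${dataType.ddl}${if (nullable) "" else " NOT NULL"}"
  }

  final case class StructType(fields: List[StructField]) extends DataType {
    def sql: String = fields.map(_.sql).mkString("STRUCT<", ", ", ">")
    // Nested fields go through toDDL, so their nullability is kept.
    override def ddl: String = fields.map(_.toDDL).mkString("STRUCT<", ", ", ">")
    def toDDL: String = fields.map(_.toDDL).mkString(",")
  }

  val testSchema: StructType = StructType(List(
    StructField("key", IntegerType, nullable = false),
    StructField("value", StringType, nullable = true),
    StructField("nestedCols", StructType(List(
      StructField("nestedKey", IntegerType, nullable = false),
      StructField("nestedValue", StringType, nullable = true)
    )), nullable = false)
  ))

  def main(args: Array[String]): Unit = {
    // prints: key INT NOT NULL,value STRING,nestedCols STRUCT<nestedKey INT NOT NULL, nestedValue STRING> NOT NULL
    println(testSchema.toDDL)
  }
}
```

The sketch leaves sql untouched and routes only DDL generation through the new path, so other users of sql in this model see no change.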