richardc-db commented on code in PR #46312: URL: https://github.com/apache/spark/pull/46312#discussion_r1586697875
########## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ResolveDefaultColumnsUtil.scala: ########## @@ -84,9 +84,16 @@ object ResolveDefaultColumns extends QueryErrorsBase if (SQLConf.get.enableDefaultColumns) { val newFields: Seq[StructField] = tableSchema.fields.map { field => if (field.metadata.contains(CURRENT_DEFAULT_COLUMN_METADATA_KEY)) { - val analyzed: Expression = analyze(field, statementType) + val defaultSql: String = if (field.dataType.isInstanceOf[VariantType]) { + // A variant's SQL/string representation is its JSON string which cannot be directly Review Comment: Hmm, i believe all default values are already stored as string sql expressions (usually literal expressions). I.e. the existing values (as a string sql expression) is retrieved [here](https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ResolveDefaultColumnsUtil.scala#L389) and is analyzed using `analyze` and eventually coerced into the target type [here](https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ResolveDefaultColumnsUtil.scala#L302). We could "coerce" the variant string by wrapping it with `parse_json` as mentioned in the PR description. However, this feels inefficient. We would evaluate the default expression to produce a variant, transform it to its string representation (by `expr.sql`) and then transform it back to its variant value later. Instead, we could just leave the expression as is and lazily evaluate it to avoid the extra variant-> string and string -> variant conversions. Regarding your other comment, I believe this doesn't work even if `ParseJson` can be constant folded (I quickly tried it) because `analyze` attempts to analyze the expression as a variant `{"k": "v"}`, which isn't valid sql syntax (there aren't even quotations around it). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org