richardc-db commented on code in PR #46312:
URL: https://github.com/apache/spark/pull/46312#discussion_r1586697875
##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ResolveDefaultColumnsUtil.scala:
##########
@@ -84,9 +84,16 @@ object ResolveDefaultColumns extends QueryErrorsBase
if (SQLConf.get.enableDefaultColumns) {
val newFields: Seq[StructField] = tableSchema.fields.map { field =>
if (field.metadata.contains(CURRENT_DEFAULT_COLUMN_METADATA_KEY)) {
- val analyzed: Expression = analyze(field, statementType)
+ val defaultSql: String = if
(field.dataType.isInstanceOf[VariantType]) {
+ // A variant's SQL/string representation is its JSON string which
cannot be directly
Review Comment:
Hmm, i believe all default values are already stored as string sql
expressions (usually literal expressions). I.e. the existing values (as a
string sql expression) is retrieved
[here](https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ResolveDefaultColumnsUtil.scala#L389)
and is analyzed using `analyze` and eventually coerced into the target type
[here](https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ResolveDefaultColumnsUtil.scala#L302).
We could "coerce" the variant string by wrapping it with `parse_json` as
mentioned in the PR description. However, this feels inefficient. We would
evaluate the default expression to produce a variant, transform it to its
string representation (by `expr.sql`) and then transform it back to its variant
value later. Instead, we could just leave the expression as is and lazily
evaluate it to avoid the extra variant-> string and string ->
variant conversions.
Regarding your other comment, I believe this doesn't work even if
`ParseJson` can be constant folded (I quickly tried it) because `analyze`
attempts to analyze the expression as a variant `{"k": "v"}`, which isn't valid
sql syntax (there aren't even quotations around it).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]