richardc-db commented on code in PR #46312:
URL: https://github.com/apache/spark/pull/46312#discussion_r1586798259
##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ResolveDefaultColumnsUtil.scala:
##########
@@ -84,9 +84,16 @@ object ResolveDefaultColumns extends QueryErrorsBase
     if (SQLConf.get.enableDefaultColumns) {
       val newFields: Seq[StructField] = tableSchema.fields.map { field =>
         if (field.metadata.contains(CURRENT_DEFAULT_COLUMN_METADATA_KEY)) {
-          val analyzed: Expression = analyze(field, statementType)
+          val defaultSql: String = if (field.dataType.isInstanceOf[VariantType]) {
+            // A variant's SQL/string representation is its JSON string, which cannot be directly
+            // cast to a variant type. Thus, we lazily evaluate the default expression to avoid
Review Comment:
Yep, `parse_json('1')` works (even before this PR, though unintentionally).
This is because the current code inserts a cast to variant, which happens to be
fine in this case because `CAST(1 AS VARIANT)` succeeds.
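
For concreteness, a minimal sketch of that simple case (assumes a Spark
session `spark` with variant support; the table name and format are made up
for illustration):

```scala
// Hypothetical example: the stored default text is parse_json('1'), whose
// variant value serializes to the JSON string "1". The pre-PR code path then
// effectively evaluates CAST(1 AS VARIANT), which happens to succeed.
spark.sql("CREATE TABLE t_simple (v VARIANT DEFAULT parse_json('1')) USING parquet")
spark.sql("INSERT INTO t_simple VALUES (DEFAULT)")
spark.sql("SELECT v FROM t_simple").show() // expected: 1
```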
In the more complex case, such as a default of `parse_json('{'k': 'v'}')`,
the analyzer actually fails because it tries to analyze the SQL text `{'k':
'v'}` (note the lack of outer quotations). An alternative to the approach
taken in this PR is to create a string literal from `{'k': 'v'}` and wrap it
in `parse_json` (effectively coercing it back to the correct variant type);
see the sketch below. That feels inefficient, however, because it incurs a
variant -> string -> variant conversion rather than lazily evaluating the
expression once.
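
For illustration, a rough sketch of what that rejected alternative would
amount to at the SQL-text level; `wrapAsParseJson` is a hypothetical helper,
not code from this PR:

```scala
// Hypothetical helper sketching the alternative: take the variant default's
// serialized JSON text and re-wrap it so the analyzer sees a plain string
// literal fed back through parse_json.
def wrapAsParseJson(variantJsonText: String): String = {
  // Escape embedded single quotes so the text survives as one SQL string literal.
  val escaped = variantJsonText.replace("'", "''")
  s"parse_json('$escaped')"
}

// wrapAsParseJson("""{"k": "v"}""") yields parse_json('{"k": "v"}'), which
// analyzes fine but costs a variant -> string -> variant round trip on every
// use of the default, instead of evaluating the expression lazily once.
```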