[
https://issues.apache.org/jira/browse/FLINK-39049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18085932#comment-18085932
]
Robert Yokota edited comment on FLINK-39049 at 6/3/26 10:49 PM:
----------------------------------------------------------------
One thing worth considering before we mirror the ANSI JSON functions: I don't
think we actually need both `VARIANT_QUERY` and `VARIANT_VALUE` once we have a
native `VARIANT` type.
The reason ANSI SQL splits `JSON_VALUE` and `JSON_QUERY` is really just a side
effect of static typing — a JSON path can land on either a scalar or a
structured value, and since SQL:2016 had no JSON type (it modeled JSON as a
plain string), the two functions exist to give those two outcomes two distinct,
compile-time-known return types.
A `VARIANT` type makes that distinction unnecessary: a single accessor can
always return `VARIANT`, and the "do I want a concrete scalar?" decision just
moves to an explicit cast at the point of use. That's exactly what Snowflake
and Databricks/Spark do — both expose a single navigation operator (`GET_PATH`
for Snowflake and `variant_get` for Databricks/Spark) returning `VARIANT`, with
no query/value split.
So my suggestion would be to go with a single function here (maybe just called
`VARIANT_GET`) rather than two.
was (Author: rayokota):
One thing worth considering before we mirror the ANSI JSON functions: I don't
think we actually need both `VARIANT_QUERY` and `VARIANT_VALUE` once we have a
native `VARIANT` type.
The reason ANSI SQL splits `JSON_VALUE` and `JSON_QUERY` is really just a side
effect of static typing — a JSON path can land on either a scalar or a
structured value, and since SQL:2016 had no JSON type (it modeled JSON as a
plain string), the two functions exist to give those two outcomes two distinct,
compile-time-known return types.
A `VARIANT` type makes that distinction unnecessary: a single accessor can
always return `VARIANT`, and the "do I want a concrete scalar?" decision just
moves to an explicit cast at the point of use. That's exactly what Snowflake
and Databricks/Spark do — both expose a single navigation operator (`GET_PATH`
for Snowflake and `variant_get` for Databricks/Spark) returning `VARIANT`, with
no query/value split.
So my suggestion would be to go with a single function here (maybe just called
`VARIANT_GET` or `VARIANT_PATH`) rather than two.
> Introduce built-in functions `VARIANT_VALUE` and `VARIANT_QUERY` to extract
> values from a variant
> -------------------------------------------------------------------------------------------------
>
> Key: FLINK-39049
> URL: https://issues.apache.org/jira/browse/FLINK-39049
> Project: Flink
> Issue Type: Sub-task
> Components: Table SQL / API
> Reporter: Jinkun Liu
> Priority: Major
>
> |SQL Function|Table Function|Description|
> |VARIANT_VALUE(variant, path RETURNING type [ \{ NULL \\| ERROR \\| DEFAULT }
> ON EMPTY ] [ \{ NULL \| ERROR \| DEFAULT }ON ERROR
> ])|VARIANT.variantValue(STRING path [, type, onEmpty, defaultOnEmpty,
> onError, defaultOnError])|Extracts a scalar as the given type from a variant.
>
> This method queries the variant for the given path expression and returns the
> value if the value at that path is scalar. Non-scalar values cannot be
> returned.
>
> For empty path expressions or errors a behavior can be defined to either
> return null, raise an error, or return a defined default value instead. When
> omitted, the default is NULL ON EMPTY or NULL ON ERROR, respectively. The
> default value may be a literal or an expression. If the default value itself
> raises an error, it falls through to the error behavior for ON EMPTY, and
> raises an error for ON ERROR.
>
>
> For path contains special characters such as spaces, you can use ['property']
> or ["property"] to select the specified property in a parent object. Be sure
> to put single or double quotes around the property name. When using
> VARIANT_VALUE in SQL, the path is a character parameter which is already
> single quoted, so you have to escape the single quotes around property name,
> such as JSON_VALUE(' \{"a b": "true"}
> ', '$.[''a b'']').|
> |VARIANT_QUERY(variant, path RETURNING type [ \{ WITHOUT \\| WITH CONDITIONAL
> \\| WITH UNCONDITIONAL }[ ARRAY ] WRAPPER ] [{ NULL \\| EMPTY ARRAY \\| EMPTY
> OBJECT \\| ERROR } ON EMPTY ] [ \{ NULL \| EMPTY ARRAY \| EMPTY OBJECT \|
> ERROR }
> ON ERROR ])|VARIANT.variantQuery(path, returnType [, variantQueryWrapper,
> variantQueryOnEmptyOrError, variantQueryOnEmptyOrError ])|Extracts values as
> type from a variant.
>
> The wrappingBehavior determines whether the extracted value should be wrapped
> into an array, and whether to do so unconditionally or only if the value
> itself isn’t an array already.
>
> onEmpty and onError determine the behavior in case the path expression is
> empty, or in case an error was raised, respectively. By default, in both
> cases null is returned. Other choices are to use an empty array, an empty
> object, or to raise an error.|
--
This message was sent by Atlassian Jira
(v8.20.10#820010)