[
https://issues.apache.org/jira/browse/SPARK-57626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Chao Sun resolved SPARK-57626.
------------------------------
Fix Version/s: 5.0.0
Resolution: Fixed
Issue resolved by pull request 56685
[https://github.com/apache/spark/pull/56685]
> Extend shared get_json_object parsing to nested named paths
> -----------------------------------------------------------
>
> Key: SPARK-57626
> URL: https://issues.apache.org/jira/browse/SPARK-57626
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 4.3.0
> Reporter: Chao Sun
> Assignee: Chao Sun
> Priority: Major
> Labels: pull-request-available
> Fix For: 5.0.0
>
>
> SPARK-47670 introduced opt-in shared parsing for repeated get_json_object
> calls over the same JSON input. Its implementation intentionally shares only
> simple top-level fields, so repeated literal nested object paths still parse
> the input independently.
> For example:
> {code:sql}
> SELECT
> get_json_object(json, '$.payload.user.id') AS user_id,
> get_json_object(json, '$.payload.user.name') AS user_name,
> get_json_object(json, '$.payload.request_id') AS request_id
> FROM events
> {code}
> With spark.sql.optimizer.getJsonObjectSharedParsing.enabled=true, these
> prefix-free named paths (no requested path is a prefix of another) should be
> extracted in one streaming scan without requiring any query changes.
> The follow-up should preserve the existing get_json_object behavior for
> malformed input, duplicate keys, nulls, and rendering failures. An ancestor
> and its descendant must not share the same parse, because each requested path
> needs independent legacy semantics. Dynamic paths, wildcards, array
> subscripts, and excessively deep paths should continue using the existing
> evaluation.
> This is distinct from SPARK-53764, which collapses nested get_json_object
> function calls rather than sharing sibling paths over the same input.
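
The eligibility rule in the description above (literal named paths only, no ancestor/descendant pairs, a cap on depth) can be illustrated with a small standalone sketch. This is not the code from pull request 56685. The helper names, the depth limit of 8, the identifier-style field-name pattern, and the choice to send both members of an ancestor/descendant pair to the existing evaluation are all assumptions made for illustration.

```python
import re
from typing import List, Optional, Tuple

# Hypothetical cap: the issue only says "excessively deep paths" are excluded.
MAX_DEPTH = 8

# Assumption: only simple identifier-like field names count as "named" segments.
_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def parse_named_path(path: str) -> Optional[Tuple[str, ...]]:
    """Return the field names of a literal path such as '$.a.b', or None if
    the path uses wildcards, array subscripts, or is too deep."""
    if not path.startswith("$."):
        return None
    parts = path[2:].split(".")
    if len(parts) > MAX_DEPTH:
        return None
    if not all(_NAME.fullmatch(p) for p in parts):
        return None
    return tuple(parts)


def _is_prefix(a: Tuple[str, ...], b: Tuple[str, ...]) -> bool:
    """True when path a is an ancestor of (or equal to) path b."""
    return len(a) <= len(b) and b[: len(a)] == a


def split_shareable(paths: List[str]) -> Tuple[List[str], List[str]]:
    """Split requested paths into (shared, fallback).

    shared:   paths that could be extracted together in one scan.
    fallback: paths left to the existing per-call evaluation.
    """
    parsed = {p: parse_named_path(p) for p in paths}
    shared: List[str] = []
    fallback: List[str] = []
    for p in paths:
        fields = parsed[p]
        if fields is None:
            fallback.append(p)
            continue
        # Conservative choice: any ancestor/descendant relationship with
        # another eligible path keeps this path out of the shared group.
        conflict = any(
            other is not None
            and other != fields
            and (_is_prefix(other, fields) or _is_prefix(fields, other))
            for other in parsed.values()
        )
        (fallback if conflict else shared).append(p)
    return shared, fallback
```

Under these assumptions, the three paths from the SQL example all land in the shared group, while a request set that mixes `$.payload` with `$.payload.user.id` sends both of those paths to the fallback group.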