Zheng Shao created SPARK-47670:
----------------------------------

             Summary: Multiple calls to GET_JSON_OBJECT with the same JSON str should parse it just one time
                 Key: SPARK-47670
                 URL: https://issues.apache.org/jira/browse/SPARK-47670
             Project: Spark
          Issue Type: Improvement
          Components: Spark Core
    Affects Versions: 3.5.2
            Reporter: Zheng Shao

For a query like the following:

{{SELECT}}
{{  GET_JSON_OBJECT(json_col, '$.a.b'),}}
{{  GET_JSON_OBJECT(json_col, '$.a.c')}}
{{FROM t}}

Spark SQL generates a plan that parses {{json_col}} twice, once for each GET_JSON_OBJECT call.

Ideally, Spark SQL should parse {{json_col}} only once. The optimizer should detect the common JSON parsing and rewrite the plan to parse the JSON once, extract the requested values, and flatten them back into the output columns.

An alternative way to support this is the ":" notation (JSON path) found in other systems, where the query optimizer automatically shares a single JSON parse across the extractions.
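
To make the proposed rewrite concrete, here is a minimal Python sketch of the idea only. It is not Spark's implementation and does not use any Spark API. The names stats, parse, walk, get_json_object and get_json_objects are hypothetical helpers invented for this illustration, and the path handling covers only plain object keys such as '$.a.b' (no array indexes or wildcards). Unlike the real GET_JSON_OBJECT, which returns strings, the sketch returns Python values.

```python
import json
from typing import Any, List, Optional

# Hypothetical counter, used only to show how often the JSON text is parsed.
stats = {"parse_calls": 0}


def parse(text: str) -> Any:
    """Parse JSON text and count the call; return None for malformed input."""
    stats["parse_calls"] += 1
    try:
        return json.loads(text)
    except (TypeError, ValueError):
        return None


def walk(doc: Any, path: str) -> Optional[Any]:
    """Follow a simple '$.a.b' path of object keys; None if a step is missing."""
    if not path.startswith("$"):
        return None
    node = doc
    for key in path[1:].split("."):
        if key == "":
            continue  # skip the empty piece before the first '.'
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def get_json_object(text: str, path: str) -> Optional[Any]:
    """Unshared form: every call parses the JSON text again."""
    return walk(parse(text), path)


def get_json_objects(text: str, paths: List[str]) -> List[Optional[Any]]:
    """Shared form: parse once, then evaluate every path on the parsed tree."""
    doc = parse(text)
    return [walk(doc, p) for p in paths]
```

Calling get_json_object twice on the same string parses it twice, while get_json_objects with both paths parses it once and returns the values in path order. That is the "parse once, get the results out, flatten back" shape described above, with the list of results standing in for the output columns.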