[GitHub] [spark] monkeyboy123 commented on pull request #42376: [SPARK-44700][SQL] Rule OptimizeCsvJsonExprs should not be applied to expression like from_json(regexp_replace)

via GitHub Tue, 08 Aug 2023 05:01:56 -0700


monkeyboy123 commented on PR #42376:
URL: https://github.com/apache/spark/pull/42376#issuecomment-1669480999


> cc @wangyum do you have any ideas? It seems any optimization that changes
> the expression shape may break common subexpression elimination (CSE). It's
> hard to come up with a good cost model to fix it. I think a better idea is to
> make CSE a plan-level optimization, so that we can find all common
> subexpressions before optimizing expressions. But it's hard to do.
>
> @monkeyboy123 is it possible to rewrite your query and use subquery alias
> or CTE to hold the expression result, to avoid repeated execution? or you can
> disable this optimization by setting `spark.sql.optimizer.excludedRules` to
> include this rule.
   
I can disable this optimization by setting
`spark.sql.optimizer.enableJsonExpressionOptimization` to false, but I think
this is a common case that others will run into. Could we add more cases, such
as RegExpReplace or RegExpExtract, to handle it?
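
For reference, the two workarounds mentioned above might look roughly like this in a Spark SQL session. This is only a sketch: the fully qualified rule name is an assumption about the package the rule lives in, so please check it against your Spark version before relying on it. Only one of the two settings should be needed.

```sql
-- Option 1: turn off the JSON expression optimization via its dedicated flag
-- (the setting named in this comment).
SET spark.sql.optimizer.enableJsonExpressionOptimization=false;

-- Option 2: exclude the optimizer rule by name, as suggested in the quoted reply.
-- ASSUMPTION: the fully qualified class name below is a guess at where
-- OptimizeCsvJsonExprs is defined; verify it for your Spark version.
SET spark.sql.optimizer.excludedRules=org.apache.spark.sql.catalyst.optimizer.OptimizeCsvJsonExprs;
```

Both are session-level workarounds; neither addresses the underlying interaction between this rule and CSE.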


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

