waynexia commented on code in PR #9700:
URL: https://github.com/apache/arrow-datafusion/pull/9700#discussion_r1533433201


##########
datafusion/optimizer/src/common_subexpr_eliminate.rs:
##########
@@ -53,10 +53,32 @@ type ExprSet = HashMap<Identifier, (Expr, usize, DataType)>;
 /// here is not such a good choose.
 type Identifier = String;
 
-/// Perform Common Sub-expression Elimination optimization.
+/// Performs Common Sub-expression Elimination optimization.
 ///
-/// Currently only common sub-expressions within one logical plan will
+/// This optimization improves query performance by computing expressions that
+/// appear more than once and reusing those results rather than re-computing the
+/// same value
+///
+/// Currently only common sub-expressions within a single `LogicalPlan` are
 /// be eliminated.
+///
+/// # Example
+///
+/// Given a projection that computes the same expensive expression
+/// multiple times such as parsing as string as a date with `to_date` twice:
+///
+/// ```text
+/// ProjectionExec(expr=[extract (day from to_date(c1)), extract (year from to_date(c1))])
+/// ```
+///
+/// This optimization will rewrite the plan to compute the common expression once
+/// using a new `ProjectionExec` and then rewrite the original expressions to
+/// refer to that new column.
+///
+/// ```text
+/// ProjectionExec(exprs=[extract (day from new_col), extract (year from new_col)]) <-- reuse here

Review Comment:
   It is possible. The optimized plan will have an obvious intermediate 
`Projection` plan that does some computation, and its result will be referred 
to later by the plans that follow (though this pattern might also occur for 
other reasons).
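
To illustrate the pattern described above, here is a minimal, self-contained sketch of the idea. It is not the DataFusion implementation: the toy `Expr` enum and the helpers `col`, `call`, `common_subexprs`, and `rewrite` are all hypothetical names invented for this example. Only the use of a `String` identifier per expression mirrors the `Identifier = String` alias visible in the diff.

```rust
use std::collections::HashMap;

// Toy expression type; a hypothetical stand-in for a real `Expr`.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Call(String, Vec<Expr>),
}

// Hypothetical constructors to keep the usage short.
fn col(name: &str) -> Expr {
    Expr::Column(name.to_string())
}

fn call(func: &str, args: Vec<Expr>) -> Expr {
    Expr::Call(func.to_string(), args)
}

impl Expr {
    // String identifier for an expression, e.g. "to_date(c1)".
    fn id(&self) -> String {
        match self {
            Expr::Column(name) => name.clone(),
            Expr::Call(func, args) => {
                let args: Vec<String> = args.iter().map(|a| a.id()).collect();
                format!("{}({})", func, args.join(", "))
            }
        }
    }
}

// Count how often each function-call sub-expression occurs.
fn count(expr: &Expr, seen: &mut HashMap<String, usize>) {
    if let Expr::Call(_, args) = expr {
        *seen.entry(expr.id()).or_insert(0) += 1;
        for arg in args {
            count(arg, seen);
        }
    }
}

// Identifiers of sub-expressions that appear more than once, sorted.
fn common_subexprs(exprs: &[Expr]) -> Vec<String> {
    let mut seen = HashMap::new();
    for expr in exprs {
        count(expr, &mut seen);
    }
    let mut common: Vec<String> = seen
        .into_iter()
        .filter(|(_, n)| *n > 1)
        .map(|(id, _)| id)
        .collect();
    common.sort();
    common
}

// Replace every occurrence of `target` with a reference to column `alias`.
fn rewrite(expr: &Expr, target: &str, alias: &str) -> Expr {
    if expr.id() == target {
        return col(alias);
    }
    match expr {
        Expr::Column(_) => expr.clone(),
        Expr::Call(func, args) => Expr::Call(
            func.clone(),
            args.iter().map(|a| rewrite(a, target, alias)).collect(),
        ),
    }
}
```

For the two expressions in the doc-comment example, modelled here as `call("extract_day", vec![call("to_date", vec![col("c1")])])` and the matching `extract_year` call, `common_subexprs` reports `to_date(c1)` as the shared sub-expression. An intermediate projection would compute it once as `new_col`, and `rewrite` turns the outer expressions into references to that column, which is the recognizable shape mentioned in the comment.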



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
