GitHub user ptkool commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17708#discussion_r112549462
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala ---
    @@ -387,6 +387,13 @@ case class BroadcastHint(child: LogicalPlan) extends UnaryNode {
     }
     
     /**
    + * A hint for the optimizer that we should not merge two projections.
    + */
    +case class NoCollapseHint(child: LogicalPlan) extends UnaryNode {
    --- End diff --
    
    I originally thought about putting it at the expression level, but
    ultimately decided it made more sense at the LogicalPlan node level, since
    the purpose is in fact to `disrupt` the optimizer. In some respects, it's
    meant to have the same effect on the plan as `df.cache()`, but without the
    caching. There may, in fact, be situations where predicate pushdown is not
    desired either, because the resulting condition would become complex and
    expensive to evaluate.
    
    In Spark SQL, I think it also makes more sense to specify the hint at the 
derived table level, as opposed to a single expression. For instance,
    
    SELECT SNO, PNO, C1 + 1, C1 + 2
    FROM ( SELECT /*+ NO_COLLAPSE */ SNO, PNO, QTY * 10 AS C1 FROM T ) T
    
    This is similar to the NO_MERGE query hint in Oracle, which prevents an
    inline view from being merged (flattened) into the enclosing query.
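
    To illustrate the intended effect, here is a minimal sketch using toy types.
    Everything in it (`Plan`, `Expr`, `NoCollapse`, `collapse`, and the output
    names `E1` and `E2`) is hypothetical and only stands in for Catalyst's real
    classes and its `CollapseProject` rule; it is not the code in this PR. It
    models the query above and shows that a plan-level marker node stops two
    adjacent projections from being merged.

```scala
// Toy model only: these types are hypothetical stand-ins, not Spark's Catalyst classes.
sealed trait Expr
case class Col(name: String) extends Expr
case class Lit(value: Int) extends Expr
case class Add(left: Expr, right: Expr) extends Expr
case class Mul(left: Expr, right: Expr) extends Expr

sealed trait Plan
case class Table(name: String) extends Plan
// A projection: each output column name paired with its defining expression.
case class Project(cols: List[(String, Expr)], child: Plan) extends Plan
// Marker node: asks the optimizer not to merge the projections around it.
case class NoCollapse(child: Plan) extends Plan

object CollapseDemo {
  // Replace column references with the expressions that define them.
  def substitute(e: Expr, env: Map[String, Expr]): Expr = e match {
    case Col(n)    => env.getOrElse(n, e)
    case Lit(_)    => e
    case Add(l, r) => Add(substitute(l, env), substitute(r, env))
    case Mul(l, r) => Mul(substitute(l, env), substitute(r, env))
  }

  // Merge directly adjacent projections; a NoCollapse node in between blocks the merge.
  def collapse(plan: Plan): Plan = plan match {
    case Project(outer, Project(inner, child)) =>
      val env = inner.toMap
      collapse(Project(outer.map { case (n, e) => (n, substitute(e, env)) }, child))
    case Project(cols, child) => Project(cols, collapse(child))
    case NoCollapse(child)    => NoCollapse(collapse(child))
    case t: Table             => t
  }

  // SELECT SNO, PNO, QTY * 10 AS C1 FROM T
  val inner: Plan = Project(
    List("SNO" -> Col("SNO"), "PNO" -> Col("PNO"), "C1" -> Mul(Col("QTY"), Lit(10))),
    Table("T"))

  // SELECT SNO, PNO, C1 + 1, C1 + 2 FROM (...); E1 and E2 are made-up output names.
  val outerCols: List[(String, Expr)] = List(
    "SNO" -> Col("SNO"), "PNO" -> Col("PNO"),
    "E1" -> Add(Col("C1"), Lit(1)), "E2" -> Add(Col("C1"), Lit(2)))

  val plain: Plan  = Project(outerCols, inner)             // no hint
  val hinted: Plan = Project(outerCols, NoCollapse(inner)) // hint on the derived table

  def main(args: Array[String]): Unit = {
    println(collapse(plain))  // one Project; QTY * 10 now appears in both E1 and E2
    println(collapse(hinted)) // unchanged; C1 is still defined once in the inner Project
  }
}
```

    Without the marker, the two projections merge and `QTY * 10` is duplicated
    into both outer expressions; with it, the plan is left as written. Placing
    the marker on the plan node rather than on one expression is what lets it
    cover the whole derived table.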
    


