[ 
https://issues.apache.org/jira/browse/SPARK-12957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15866239#comment-15866239
 ] 

Nick Dimiduk commented on SPARK-12957:
--------------------------------------

[~sameerag] thanks for the comment. From a naive scan of the tickets, I believe 
I am seeing the benefits of SPARK-13871 in that an {{IsNotNull}} constraint is 
applied based on the names of the join columns. However, I don't see the benefit 
of SPARK-13789, specifically the {{a = 5, a = b}} case mentioned in the 
description. My query is a join between a very small relation (hundreds of rows) 
and a very large one (tens of billions). I've hinted the planner to broadcast 
the smaller table, which it honors. After SPARK-13789, I expected to see the 
join column values pushed down as well. This is not the case.
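To make the expectation concrete, here is a toy, Spark-free sketch (names and types are my own, not Catalyst's) of the kind of inference SPARK-13789 describes: given {{a = 5}} and {{a = b}}, equality transitivity yields {{b = 5}}, which could then be pushed to the other side of the join.

```scala
// Hypothetical, simplified model of the inference described in SPARK-13789.
// These case classes are illustrative only; they are not Catalyst expressions.
object ConstraintSketch {
  sealed trait Pred
  case class EqConst(attr: String, value: Int) extends Pred
  case class EqAttr(left: String, right: String) extends Pred

  // Derive new constant constraints by substituting constants through
  // attribute equalities: a = 5 and a = b together imply b = 5.
  def inferConstants(preds: Set[Pred]): Set[Pred] = {
    val consts = preds.collect { case EqConst(a, v) => a -> v }.toMap
    val inferred = preds.collect {
      case EqAttr(l, r) if consts.contains(l) => EqConst(r, consts(l))
      case EqAttr(l, r) if consts.contains(r) => EqConst(l, consts(r))
    }
    preds ++ inferred
  }
}
```

With this model, {{inferConstants(Set(EqConst("a", 5), EqAttr("a", "b")))}} additionally contains {{EqConst("b", 5)}} — the constraint I expected to see arrive at the data source.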

Any tips on debugging this further? I've set breakpoints in the 
{{RelationProvider}} implementation and can see that it receives only the 
{{IsNotNull}} filters — nothing further from the planner.

Thanks a lot!

> Derive and propagate data constraints in logical plan 
> ------------------------------------------------------
>
>                 Key: SPARK-12957
>                 URL: https://issues.apache.org/jira/browse/SPARK-12957
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Yin Huai
>            Assignee: Sameer Agarwal
>         Attachments: ConstraintPropagationinSparkSQL.pdf
>
>
> Based on the semantics of a query plan, we can derive data constraints (e.g. 
> if a filter defines {{a > 10}}, we know that the output data of this filter 
> satisfies the constraints {{a > 10}} and {{a is not null}}). We should build a 
> framework to derive and propagate constraints in the logical plan, which can 
> help us build more advanced optimizations.
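The derivation the ticket description mentions can be sketched in the same toy style (again, my own illustrative types, not Catalyst's): a filter condition such as {{a > 10}} yields both the comparison itself and {{a is not null}}, since a null-valued {{a}} cannot satisfy the comparison under SQL semantics.

```scala
// Hypothetical sketch of deriving constraints from a filter condition:
// a predicate "a > 10" implies both "a > 10" and "a IS NOT NULL" for every
// row the filter emits (comparisons against NULL evaluate to unknown in SQL).
object DeriveConstraints {
  sealed trait Expr
  case class GreaterThan(attr: String, value: Int) extends Expr
  case class IsNotNull(attr: String) extends Expr

  // A comparison constraint on an attribute implies the attribute is non-null.
  def constraintsFromFilter(condition: Expr): Set[Expr] = condition match {
    case gt @ GreaterThan(attr, _) => Set(gt, IsNotNull(attr))
    case other                     => Set(other)
  }
}
```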



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
