Github user ankurdave commented on the pull request:

    https://github.com/apache/spark/pull/2249#issuecomment-54371860
  
    Thanks for this PR - it addresses a very good point. I updated 
[SPARK-3373](https://issues.apache.org/jira/browse/SPARK-3373) with details on 
the problem that I think you're addressing.
    
    A few minor comments:
    
    1. For binary compatibility, rather than adding an optional flag to each 
filtering operation, it would be better to add an overload that requires the 
flag and have the original version call this version with the flag's default 
value.
    2. The name `trim` doesn't give much guidance on when to use it, and it 
sounds like it's a cheap operation when in fact it involves a shuffle to 
rebuild the routing table. How about something like `highlySelective` (since I 
think we want to do this when the filtering is highly selective) or simply 
`rebuildRoutingTables`?
    3. It would be better to factor out the `if (trim) ...` conditional into a 
GraphImpl constructor, maybe by adding a `rebuildRoutingTables` flag to 
`GraphImpl.fromExistingRDDs`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to