GitHub user pwendell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/469#discussion_r11879393
  
    --- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala ---
    @@ -236,6 +237,24 @@ abstract class RDD[T: ClassTag](
       }
     
       /**
    +   * Return the ancestors of the given RDD that are related to it only through a sequence of
    +   * narrow dependencies. This traverses the given RDD's dependency tree using DFS, but maintains
    +   * no ordering on the RDDs returned.
    +   */
    +  private[spark] def getNarrowAncestors(
    --- End diff --
    
    Nit: it might be nicer if this function didn't expose `ancestors` to the
    outside world, since there is no real reason to have it in the function's
    public contract. Could you write an inner function and recurse on that,
    instead of recursing on the top-level function? Then you wouldn't need to
    pass `ancestors` around at all: define the set in the outer function, and
    the inner function can reference it directly.
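
A minimal sketch of the suggested shape, not the code from the pull request. So that it compiles without Spark on the classpath, `Node` and `Dep` are hypothetical stand-ins for an RDD and its dependencies, and the `isNarrow` flag is an assumption made for illustration. The point is only that the accumulator set is defined in the outer function and the inner `visit` helper closes over it, so it never appears in the signature:

```scala
import scala.collection.mutable

// Hypothetical stand-ins for an RDD and one of its dependencies (not Spark types).
final case class Dep(parent: Node, isNarrow: Boolean)
final class Node(val name: String, val deps: Seq[Dep])

object NarrowAncestors {
  /** Ancestors of `root` reachable only through narrow dependencies, in no particular order. */
  def getNarrowAncestors(root: Node): Seq[Node] = {
    // The accumulator lives in the outer function, so callers never see it.
    val ancestors = mutable.HashSet.empty[Node]

    // Inner helper: recurses depth-first and refers to `ancestors` directly.
    def visit(node: Node): Unit = {
      node.deps.filter(_.isNarrow).map(_.parent).foreach { parent =>
        // `add` returns true only on first insertion, so shared parents are visited once.
        if (ancestors.add(parent)) visit(parent)
      }
    }

    visit(root)
    ancestors.toSeq
  }
}
```

With this layout the public signature takes only the starting node, and the recursion detail stays private to the function body.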


