GitHub user mateiz commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1362#discussion_r15018547
  
    --- Diff: core/src/main/scala/org/apache/spark/Dependency.scala ---
    @@ -32,8 +32,6 @@ abstract class Dependency[T](val rdd: RDD[T]) extends Serializable
     
     /**
      * :: DeveloperApi ::
    - * Base class for dependencies where each partition of the parent RDD is used by at most one
    - * partition of the child RDD.  Narrow dependencies allow for pipelined execution.
    --- End diff --
    
    It's true that this doesn't cover CartesianRDD, but at the same time I
    think we shouldn't remove this comment. Maybe change it to "where each
    partition of the child RDD depends on a small number of partitions of the
    parent RDD". That will also cover Cartesian. And leave the part about
    pipelining.
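
To illustrate why the old wording excludes Cartesian while the proposed wording covers it, here is a minimal standalone sketch. It does not use Spark; the object and method names are hypothetical, and the index mapping (child id divided by, and modulo, the second parent's partition count) is an assumption about how a Cartesian product lays out its partitions.

```scala
// Hypothetical standalone model of Cartesian partition dependencies.
// No Spark classes are used; all names here are illustrative only.
object CartesianParents {
  // For a child partition `childId` of a Cartesian product whose second
  // parent has `n2` partitions, return the single partition read from the
  // first parent and the single partition read from the second parent.
  def parents(childId: Int, n2: Int): (Int, Int) =
    (childId / n2, childId % n2)

  def main(args: Array[String]): Unit = {
    val (n1, n2) = (2, 3) // partition counts of the two parents (assumed)
    for (child <- 0 until n1 * n2) {
      val (p1, p2) = parents(child, n2)
      println(s"child $child <- first parent $p1, second parent $p2")
    }
  }
}
```

Under these assumptions, every child partition reads exactly one partition from each parent, so it depends on "a small number of partitions of the parent RDD". At the same time, partition 0 of the first parent is read by children 0, 1 and 2, so "used by at most one partition of the child RDD" does not hold.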


