[GitHub] spark pull request: [SPARK-2315] Implement drop, dropRight and dro...

erikerlandson Sat, 28 Jun 2014 00:00:07 -0700

Github user erikerlandson commented on the pull request:

    https://github.com/apache/spark/pull/1254#issuecomment-47420288
  
    My reasoning is that most use cases (or at least the ones I had in mind) 
are something like rdd.drop(n), where n is much smaller than rdd.count(), 
typically 1 or some other small number. FWIW, I implemented it via an 
implicit object, so it's not directly on the RDD class per se. Another way to 
look at it: these functions are no worse than rdd.take(), since they use 
similar logic.
    
    However, it's true that if n is a large fraction of the size of the RDD, 
then it will trigger computation of a large fraction of the partitions.
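
    A rough illustration of the trade-off described above, as a minimal sketch rather than the code in this pull request: partitions are modeled as plain Python lists, and the `drop` helper and its `scanned` counter are hypothetical names invented for this example, not part of Spark's API. It walks partitions in order, much as a take()-style scan would, and counts how many partitions had to be visited before n elements were skipped.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def drop(partitions: Sequence[Sequence[T]], n: int) -> Tuple[List[List[T]], int]:
    """Drop the first n elements across ordered partitions.

    Returns the remaining partitions and the number of partitions that
    had to be visited to locate the cut-off point.
    """
    remaining = max(n, 0)  # elements still to be dropped
    scanned = 0            # partitions visited while dropping
    result: List[List[T]] = []
    for part in partitions:
        if remaining > 0:
            # This partition must be examined to know how much of it to skip.
            scanned += 1
            skip = min(remaining, len(part))
            remaining -= skip
            result.append(list(part[skip:]))
        else:
            # Partitions past the cut-off pass through untouched.
            result.append(list(part))
    return result, scanned


parts = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
small, scanned_small = drop(parts, 1)    # n much smaller than the total count
large, scanned_large = drop(parts, 10)   # n a large fraction of the total count
```

    With n = 1 only the first partition is visited; with n = 10 all four are, which is the cost noted above when n approaches the size of the data.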



