GitHub user rxin commented on the pull request:
https://github.com/apache/spark/pull/1254#issuecomment-47419953
Thanks - I can see why this might be useful, but the bar for adding new APIs
to the RDD interface is pretty high now, and we need to be very careful about
APIs that might have very bad performance behavior (dropping a large number
of elements can be very slow, in particular if the drop crosses many
partitions).
For this reason, it might make more sense for this to be an example program
or a blog post that's easily indexable so people can find it.