[ https://issues.apache.org/jira/browse/SPARK-2871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14108055#comment-14108055 ]
Josh Rosen commented on SPARK-2871:
-----------------------------------

Thanks a bunch for splitting this into a series of smaller PRs; it really helps when reviewing.

If you don't mind, could you split this JIRA into sub-issues and make this into an umbrella issue? This will help in case different missing APIs are added in different releases or if we notice more missing methods. When you do this, can you also update the titles of your PRs to reference those sub-issues instead of this one?

> Missing API in PySpark
> ----------------------
>
> Key: SPARK-2871
> URL: https://issues.apache.org/jira/browse/SPARK-2871
> Project: Spark
> Issue Type: Improvement
> Reporter: Davies Liu
> Assignee: Davies Liu
>
> There are several APIs missing in PySpark:
> RDD.collectPartitions()
> RDD.histogram()
> RDD.zipWithIndex()
> RDD.zipWithUniqueId()
> RDD.min(comp)
> RDD.max(comp)
> A bunch of API related to approximate jobs.

--
This message was sent by Atlassian JIRA
(v6.2#6252)