[jira] [Commented] (SPARK-2871) Missing API in PySpark
[ https://issues.apache.org/jira/browse/SPARK-2871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1436#comment-1436 ]

Apache Spark commented on SPARK-2871:
-------------------------------------

User 'davies' has created a pull request for this issue:
https://github.com/apache/spark/pull/2142

Missing API in PySpark
----------------------

Key: SPARK-2871
URL: https://issues.apache.org/jira/browse/SPARK-2871
Project: Spark
Issue Type: Improvement
Reporter: Davies Liu
Assignee: Davies Liu

There are several APIs missing in PySpark:
RDD.collectPartitions()
RDD.histogram()
RDD.zipWithIndex()
RDD.zipWithUniqueId()
RDD.min(comp)
RDD.max(comp)
A bunch of API related to approximate jobs.

--
This message was sent by Atlassian JIRA
(v6.2#6252)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-2871) Missing API in PySpark
[ https://issues.apache.org/jira/browse/SPARK-2871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14108055#comment-14108055 ]

Josh Rosen commented on SPARK-2871:
-----------------------------------

Thanks a bunch for splitting this into a series of smaller PRs; it really helps when reviewing.

If you don't mind, could you split this JIRA into sub-issues and make this into an umbrella issue? This will help in case different missing APIs are added in different releases or if we notice more missing methods. When you do this, can you also update the titles of your PRs to reference those sub-issues instead of this one?
[jira] [Commented] (SPARK-2871) Missing API in PySpark
[ https://issues.apache.org/jira/browse/SPARK-2871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14108175#comment-14108175 ]

Davies Liu commented on SPARK-2871:
-----------------------------------

The fact is that these issues would be just a useless cover; the real discussion and comments will happen in the PRs. So I think each PR could be used to track a single API (or a group of APIs), and other things could be tracked by this `issue`.
[jira] [Commented] (SPARK-2871) Missing API in PySpark
[ https://issues.apache.org/jira/browse/SPARK-2871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106446#comment-14106446 ]

Apache Spark commented on SPARK-2871:
-------------------------------------

User 'davies' has created a pull request for this issue:
https://github.com/apache/spark/pull/2091
[jira] [Commented] (SPARK-2871) Missing API in PySpark
[ https://issues.apache.org/jira/browse/SPARK-2871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106449#comment-14106449 ]

Apache Spark commented on SPARK-2871:
-------------------------------------

User 'davies' has created a pull request for this issue:
https://github.com/apache/spark/pull/2092
[jira] [Commented] (SPARK-2871) Missing API in PySpark
[ https://issues.apache.org/jira/browse/SPARK-2871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106456#comment-14106456 ]

Apache Spark commented on SPARK-2871:
-------------------------------------

User 'davies' has created a pull request for this issue:
https://github.com/apache/spark/pull/2093
[jira] [Commented] (SPARK-2871) Missing API in PySpark
[ https://issues.apache.org/jira/browse/SPARK-2871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106462#comment-14106462 ]

Apache Spark commented on SPARK-2871:
-------------------------------------

User 'davies' has created a pull request for this issue:
https://github.com/apache/spark/pull/2094
[jira] [Commented] (SPARK-2871) Missing API in PySpark
[ https://issues.apache.org/jira/browse/SPARK-2871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106481#comment-14106481 ]

Apache Spark commented on SPARK-2871:
-------------------------------------

User 'davies' has created a pull request for this issue:
https://github.com/apache/spark/pull/2095
[jira] [Commented] (SPARK-2871) Missing API in PySpark
[ https://issues.apache.org/jira/browse/SPARK-2871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14091986#comment-14091986 ]

Josh Rosen commented on SPARK-2871:
-----------------------------------

There's actually an open PR for this that's currently being reviewed (odd that it wasn't automatically linked): https://github.com/apache/spark/pull/1791
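
For readers unfamiliar with two of the methods named in the issue description, RDD.zipWithIndex() and RDD.zipWithUniqueId(), the following is a rough pure-Python sketch of the semantics they have in the Scala API. It does not use Spark and is not the code from any of the pull requests above. An RDD is modelled as a plain list of partitions, and the function names `zip_with_index` and `zip_with_unique_id` are hypothetical, chosen only for this illustration.

```python
from itertools import accumulate


def zip_with_index(partitions):
    """Pair each element with a contiguous index, ordered by partition
    and then by position within the partition."""
    # A partition's starting index is the number of elements in all
    # earlier partitions, so every partition size must be known first.
    sizes = [len(part) for part in partitions]
    offsets = [0] + list(accumulate(sizes))[:-1]
    return [
        [(item, start + i) for i, item in enumerate(part)]
        for part, start in zip(partitions, offsets)
    ]


def zip_with_unique_id(partitions):
    """Pair each element with a unique (but not contiguous) id.

    With n partitions, the elements of partition k receive the ids
    k, n + k, 2n + k, ... so no partition sizes are needed up front.
    """
    n = len(partitions)
    return [
        [(item, i * n + k) for i, item in enumerate(part)]
        for k, part in enumerate(partitions)
    ]


# Illustrative data: three partitions of unequal size.
parts = [["a", "b"], ["c"], ["d", "e"]]
indexed = zip_with_index(parts)
unique = zip_with_unique_id(parts)
```

The difference in the sketch mirrors the usual trade-off between the two: contiguous indices require counting the elements of earlier partitions first, while the unique-id scheme can be computed independently per partition at the cost of gaps in the id range.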