[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37638763 One or more automated tests failed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13178/

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37638767 Merged build finished. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have t

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37638762 Merged build finished.

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37638768 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13179/

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37634990 Merged build started.

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37634989 Merged build triggered.

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37634895 Merged build triggered.

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37634896 Merged build started.

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-14 Thread ScrapCodes
Github user ScrapCodes commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37625562 They were added in 2.7.4 onwards though.

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37623108 One or more automated tests failed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13177/

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37623106 Merged build finished.

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37623099 Merged build finished.

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37623100 One or more automated tests failed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13176/

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37621013 Merged build started.

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37621012 Merged build triggered.

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37620616 Merged build triggered.

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37620617 Merged build started.

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-13 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37619558 Got it. I think this is okay for now then, but please add some comments in the code to explain that this is an internal API and that it didn't seem to change across Python versions.

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-13 Thread ScrapCodes
Github user ScrapCodes commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37618967 Hey Matei, I got rid of the copy of `heapq.py` and all the license stuff, but resorted to using the internal API of heapq instead. It should be simpler. I ju

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37509935 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13154/

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37509934 Merged build finished.

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37506805 Merged build started.

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37506804 Merged build triggered.

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-13 Thread ScrapCodes
Github user ScrapCodes commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37505304 PriorityQueue is, in a way, just a wrapper over heapq that allows blocking on put and get (AFAIU). We would need maxheapq to retain the N smallest elements. One oth
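
The point about needing a max heap to retain the N smallest elements can be sketched as follows. This is only an illustration under stated assumptions, not the code from the pull request: the names `_Reversed` and `n_smallest_streaming` are hypothetical, and the sketch uses only the public `heapq` functions with a wrapper that inverts the comparison.

```python
import heapq


class _Reversed:
    """Hypothetical wrapper: inverts `<` so heapq's min-heap acts as a max-heap."""

    __slots__ = ("value",)

    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        return other.value < self.value


def n_smallest_streaming(iterable, n):
    """Keep at most n items; the heap root is the largest item retained so far."""
    if n <= 0:
        return []
    heap = []
    for item in iterable:
        if len(heap) < n:
            heapq.heappush(heap, _Reversed(item))
        elif item < heap[0].value:
            # The new item is smaller than the largest retained one: swap it in.
            heapq.heapreplace(heap, _Reversed(item))
    return sorted(w.value for w in heap)
```

Because the root of the heap is always the largest retained element, each incoming item needs only one comparison to decide whether it belongs among the N smallest.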

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-12 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37505082 BTW, as mentioned above, please use PriorityQueue here instead of copying their heap. It's just a lot of work to copy the heap; we can take the performance hit instead.

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-12 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37479507 takeOrdered should always return the smallest elements according to the ordering, so it's not the same as top. For example, takeOrdered(2) on [1, 2, 3, 4] should return [1, 2]. B
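
The semantics described in this comment can be illustrated on a plain Python list (not an RDD). This is a minimal sketch, assuming the standard library's `heapq.nsmallest`; the helper name `take_ordered_local` and its `key` parameter are hypothetical and are not the PySpark implementation.

```python
import heapq


def take_ordered_local(items, num, key=None):
    """Return the `num` smallest items according to `key`, in ascending key order."""
    return heapq.nsmallest(num, items, key=key)


print(take_ordered_local([1, 2, 3, 4], 2))                    # [1, 2]
print(take_ordered_local([1, 2, 3, 4], 2, key=lambda x: -x))  # [4, 3]
```

With no key, the result is the smallest elements, which is what distinguishes it from top; a key that reverses the ordering yields the largest elements instead.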

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-11 Thread ScrapCodes
Github user ScrapCodes commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37379097 Hi Matei, does this mean that when key is None it would do the same thing as top? If not, then we would need a max heap, since a min heap will only keep N

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-11 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37329386 Hi Prashant, For this feature I think it would be better to use a "key" function instead of a boolean flag for the order. So make the API like this: ``` def

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37165063 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13096/

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37165062 Merged build finished.

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37161749 Merged build triggered.

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37161750 Merged build started.

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-10 Thread ScrapCodes
Github user ScrapCodes commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37161692 Jenkins, test this please.

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37158461 Merged build finished.

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37158464 One or more automated tests failed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13090/

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37156426 Merged build triggered.

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37156427 Merged build started.

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37088856 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13061/

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37088855 Merged build finished.

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37087805 Merged build triggered.

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37087804 One or more automated tests failed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13058/

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37087802 Merged build finished.

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37087806 Merged build started.

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37086582 Merged build started.

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37086581 Merged build triggered.

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-07 Thread ScrapCodes
Github user ScrapCodes commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37086086 Hey Matei, the PSF license is included now. I was not sure whether the entire history of the license should be included.

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-07 Thread ScrapCodes
Github user ScrapCodes commented on a diff in the pull request: https://github.com/apache/spark/pull/97#discussion_r10407050 --- Diff: python/pyspark/maxheapq.py --- @@ -0,0 +1,115 @@ +# -*- coding: latin-1 -*- + +"""Heap queue algorithm (a.k.a. priority queue). +

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-07 Thread mateiz
Github user mateiz commented on a diff in the pull request: https://github.com/apache/spark/pull/97#discussion_r10406957 --- Diff: python/pyspark/maxheapq.py --- @@ -0,0 +1,115 @@ +# -*- coding: latin-1 -*- + +"""Heap queue algorithm (a.k.a. priority queue). + +#

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37046661 Merged build finished.

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37046662 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13048/

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37041118 Merged build triggered.

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37041120 Merged build started.

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37040295 Merged build finished.

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37040297 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13046/

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37034710 Merged build started.

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37034708 Merged build triggered.

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37016129 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13045/

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37016128 Merged build finished.

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37013191 Merged build started.

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-07 Thread ScrapCodes
GitHub user ScrapCodes opened a pull request: https://github.com/apache/spark/pull/97 Spark 1162 Implemented takeOrdered in pyspark. Since Python does not have a library for a max heap, and the usual tricks like inverting values etc. do not work for all cases, the best thing I could thin
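
One plausible reading of "inverting values does not work for all cases" is that negation only makes sense for numeric elements. The sketch below is an illustration of that reading, not something stated in the pull request; the variable names are hypothetical.

```python
import heapq

numbers = [3, 1, 4, 2]

# For numbers, negating the values turns "N smallest" into "N largest".
largest_two = [-x for x in heapq.nsmallest(2, [-x for x in numbers])]

# For other comparable types, such as strings, unary minus is undefined.
try:
    [-s for s in ["b", "a"]]
    negation_failed = False
except TypeError:
    negation_failed = True
```

Strings are still orderable, so a heap that reverses the comparison itself would handle them, whereas the negation trick cannot.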

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37013190 Merged build triggered.