[GitHub] spark issue #15018: [SPARK-17455][MLlib] Improve PAVA implementation in Isot...

2017-01-16 Thread neggert
Github user neggert commented on the issue: https://github.com/apache/spark/pull/15018 Looks like the build completed successfully. Somehow Jenkins didn't post the message. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well

[GitHub] spark issue #15018: [SPARK-17455][MLlib] Improve PAVA implementation in Isot...

2017-01-13 Thread neggert
Github user neggert commented on the issue: https://github.com/apache/spark/pull/15018 My thought was that there's nothing in the computation that prevents it (unlike 0 weights, which cause NaNs). Why not err on the side of maximum flexibility to the end user? Just because there's

[GitHub] spark issue #15018: [SPARK-17455][MLlib] Improve PAVA implementation in Isot...

2017-01-09 Thread neggert
Github user neggert commented on the issue: https://github.com/apache/spark/pull/15018 Your changes sound fine. I don't have strong feelings one way or another, although I think we should at least throw a warning if we're discarding points. I do want to point out that disallowing

[GitHub] spark pull request #15018: [SPARK-17455][MLlib] Improve PAVA implementation ...

2017-01-04 Thread neggert
Github user neggert commented on a diff in the pull request: https://github.com/apache/spark/pull/15018#discussion_r94641912 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/regression/IsotonicRegression.scala --- @@ -328,74 +336,80 @@ class IsotonicRegression private

[GitHub] spark issue #15018: [SPARK-17455][MLlib] Improve PAVA implementation in Isot...

2017-01-04 Thread neggert
Github user neggert commented on the issue: https://github.com/apache/spark/pull/15018 Done. It was surprisingly tricky to test, since the exception is thrown in the executor. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub

[GitHub] spark issue #15018: [SPARK-17455][MLlib] Improve PAVA implementation in Isot...

2017-01-03 Thread neggert
Github user neggert commented on the issue: https://github.com/apache/spark/pull/15018 Sorry, I've been on vacation. I'll make the requested changes in the next day or two. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well

[GitHub] spark issue #15018: [SPARK-17455][MLlib] Improve PAVA implementation in Isot...

2016-12-20 Thread neggert
Github user neggert commented on the issue: https://github.com/apache/spark/pull/15018 @viirya Better to remove them, or throw an error? Personally, I'd rather be alerted that I'm passing invalid input, rather than have it "fixed" for me. --- If your project is set up f

[GitHub] spark pull request #15018: [SPARK-17455][MLlib] Improve PAVA implementation ...

2016-12-20 Thread neggert
Github user neggert commented on a diff in the pull request: https://github.com/apache/spark/pull/15018#discussion_r9328 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/regression/IsotonicRegression.scala --- @@ -328,74 +336,69 @@ class IsotonicRegression private

[GitHub] spark issue #15018: [SPARK-17455][MLlib] Improve PAVA implementation in Isot...

2016-12-19 Thread neggert
Github user neggert commented on the issue: https://github.com/apache/spark/pull/15018 @srowen Addressed your comments and fixed some style issues. Some updated timings: Alternating decreasing input val x = (1 to length).toArray.map(_.toDouble

[GitHub] spark issue #15018: [SPARK-17455][MLlib] Improve PAVA implementation in Isot...

2016-12-15 Thread neggert
Github user neggert commented on the issue: https://github.com/apache/spark/pull/15018 Alright, the PAV algorithm has been completely re-written to follow what's outlined in "Minimizing Separable Convex Functions Subject to Simple Chain Constraints". I've tested it wi

[GitHub] spark issue #15018: [SPARK-17455][MLlib] Improve PAVA implementation in Isot...

2016-12-15 Thread neggert
Github user neggert commented on the issue: https://github.com/apache/spark/pull/15018 Found another input that triggers non-polynomial time with the code in this PR. I'm again borrowing from scikit-learn. I think this is the case they found that led them to re-write

[GitHub] spark issue #15018: [SPARK-17455][MLlib] Improve PAVA implementation in Isot...

2016-12-14 Thread neggert
Github user neggert commented on the issue: https://github.com/apache/spark/pull/15018 Looks like the new scikit-learn implementation suffers from a similar problem to the one that the original Spark implementation had. I left them a note pointing it out. Right now I'm

[GitHub] spark issue #15018: [SPARK-17455][MLlib] Improve PAVA implementation in Isot...

2016-12-14 Thread neggert
Github user neggert commented on the issue: https://github.com/apache/spark/pull/15018 Given that the scikit-learn implementation I based this on has changed (https://github.com/scikit-learn/scikit-learn/pull/7444) since I submitted the PR, and @jkbradley has pointed out a reference

[GitHub] spark pull request #15018: [SPARK-17455][MLlib] Improve PAVA implementation ...

2016-12-14 Thread neggert
Github user neggert commented on a diff in the pull request: https://github.com/apache/spark/pull/15018#discussion_r92432636 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/regression/IsotonicRegression.scala --- @@ -344,27 +344,30 @@ class IsotonicRegression private

[GitHub] spark pull request #15018: [SPARK-17455][MLlib] Improve PAVA implementation ...

2016-12-14 Thread neggert
Github user neggert commented on a diff in the pull request: https://github.com/apache/spark/pull/15018#discussion_r92428442 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/regression/IsotonicRegression.scala --- @@ -344,27 +344,30 @@ class IsotonicRegression private

[GitHub] spark issue #15396: [SPARK-14804][Spark][Graphx] Fix checkpointing of Vertex...

2016-11-11 Thread neggert
Github user neggert commented on the issue: https://github.com/apache/spark/pull/15396 Any update on this? Would love to see it in an upcoming release. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does

[GitHub] spark issue #15018: [SPARK-17455][MLlib] Improve PAVA implementation in Isot...

2016-10-20 Thread neggert
Github user neggert commented on the issue: https://github.com/apache/spark/pull/15018 Anyone? @srowen? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so

[GitHub] spark issue #12576: [SPARK-14804][Spark][Graphx] Fix Graph vertexRDD/EdgeRDD...

2016-10-11 Thread neggert
Github user neggert commented on the issue: https://github.com/apache/spark/pull/12576 Not sure what's going on regarding the multiple PRs for this issue, but I cherry-picked this PR on top of 1.6.2 and it fixed the problem for me. --- If your project is set up for it, you can reply

[GitHub] spark issue #15018: [SPARK-17455][MLlib] Improve PAVA implementation in Isot...

2016-10-04 Thread neggert
Github user neggert commented on the issue: https://github.com/apache/spark/pull/15018 Could someone please take a look at this? @mengxr maybe? Looks like you merged the initial implementation. --- If your project is set up for it, you can reply to this email and have your reply

[GitHub] spark pull request #15018: [SPARK-17455][MLlib] Improve PAVA implementation ...

2016-09-08 Thread neggert
GitHub user neggert opened a pull request: https://github.com/apache/spark/pull/15018 [SPARK-17455][MLlib] Improve PAVA implementation in IsotonicRegression ## What changes were proposed in this pull request? New implementation of the Pool Adjacent Violators Algorithm (PAVA

[GitHub] spark pull request #14140: [SPARK-16426][MLlib] Fix bug that caused NaNs in ...

2016-07-14 Thread neggert
Github user neggert commented on a diff in the pull request: https://github.com/apache/spark/pull/14140#discussion_r70818220 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/regression/IsotonicRegression.scala --- @@ -408,9 +409,11 @@ class IsotonicRegression private

[GitHub] spark pull request #14140: [SPARK-16426][MLlib] Fix bug that caused NaNs in ...

2016-07-14 Thread neggert
Github user neggert commented on a diff in the pull request: https://github.com/apache/spark/pull/14140#discussion_r70816925 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/regression/IsotonicRegression.scala --- @@ -408,9 +409,11 @@ class IsotonicRegression private

[GitHub] spark pull request #14140: [SPARK-16426][MLlib] Fix bug that caused NaNs in ...

2016-07-12 Thread neggert
Github user neggert commented on a diff in the pull request: https://github.com/apache/spark/pull/14140#discussion_r70451628 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/regression/IsotonicRegression.scala --- @@ -408,8 +409,12 @@ class IsotonicRegression private

[GitHub] spark pull request #14140: [SPARK-16426][MLlib] Fix bug that caused NaNs in ...

2016-07-12 Thread neggert
Github user neggert commented on a diff in the pull request: https://github.com/apache/spark/pull/14140#discussion_r70450795 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/regression/IsotonicRegression.scala --- @@ -408,8 +409,12 @@ class IsotonicRegression private

[GitHub] spark pull request #14140: [SPARK-16426][MLlib] Fixed bug that caused NaNs i...

2016-07-11 Thread neggert
GitHub user neggert opened a pull request: https://github.com/apache/spark/pull/14140 [SPARK-16426][MLlib] Fixed bug that caused NaNs in IsotonicRegression ## What changes were proposed in this pull request? Fixed a bug that caused `NaN`s in `IsotonicRegression