[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2017-01-24 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/16355 Oh OK! Thanks @srowen --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2017-01-24 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/16355 Done, and it synced now. Merged to master/2.1

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2017-01-24 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/16355 It's an apache-github sync issue: https://github.com/apache/spark/commits/branch-2.1 is missing the latest commit from

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2017-01-23 Thread vanzin
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/16355 > but now I can't get the merge script to merge it for branch-2.1 I just had some issues with that too. But manually merging (git cherry-pick + git push) seems to still work, so maybe try

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2017-01-23 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/16355 I was able to check out this commit and test it with branch-2.1, but now I can't get the merge script to merge it for branch-2.1. @srowen would you mind trying? Thanks!

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2017-01-23 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/16355 Merging with master. Will try to backport to branch-2.1 as well. Thanks!

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2017-01-23 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16355 **[Test build #3548 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3548/testReport)** for PR 16355 at commit

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2017-01-23 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16355 **[Test build #3548 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3548/testReport)** for PR 16355 at commit

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2017-01-23 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/16355 LGTM Thanks! Will merge after fresh tests

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2017-01-23 Thread imatiach-msft
Github user imatiach-msft commented on the issue: https://github.com/apache/spark/pull/16355 ping @jkbradley would you be able to take another look at the bisecting kmeans model? Thanks!

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2017-01-20 Thread imatiach-msft
Github user imatiach-msft commented on the issue: https://github.com/apache/spark/pull/16355 ping @jkbradley would you be able to take another look at the bisecting kmeans model? Thanks!

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2017-01-18 Thread imatiach-msft
Github user imatiach-msft commented on the issue: https://github.com/apache/spark/pull/16355 ping @jkbradley would you be able to take another look at the bisecting kmeans model? I've updated with the random seed as requested, and the build succeeded. Thank you!

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2017-01-17 Thread imatiach-msft
Github user imatiach-msft commented on the issue: https://github.com/apache/spark/pull/16355 ping @jkbradley would you be able to take another look at the bisecting kmeans model? I've updated with the random seed as requested.

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2017-01-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16355 **[Test build #3538 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3538/testReport)** for PR 16355 at commit

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2017-01-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16355 Merged build finished. Test PASSed.

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2017-01-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16355 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71538/ Test PASSed.

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2017-01-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16355 **[Test build #71538 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71538/testReport)** for PR 16355 at commit

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2017-01-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16355 **[Test build #3538 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3538/testReport)** for PR 16355 at commit

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2017-01-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16355 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71533/ Test PASSed.

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2017-01-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16355 Merged build finished. Test PASSed.

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2017-01-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16355 **[Test build #71533 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71533/testReport)** for PR 16355 at commit

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2017-01-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16355 **[Test build #71538 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71538/testReport)** for PR 16355 at commit

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2017-01-17 Thread imatiach-msft
Github user imatiach-msft commented on the issue: https://github.com/apache/spark/pull/16355 @jkbradley done, added seed. Thanks!

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2017-01-17 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/16355 I was about to say this is ready, but I do think we should add the seed. Other than that, this should be ready!

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2017-01-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16355 **[Test build #71533 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71533/testReport)** for PR 16355 at commit

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2017-01-17 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/16355 test this please

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2017-01-17 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/16355 LGTM pending a fresher run of the tests Thanks @imatiach-msft !

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2017-01-17 Thread imatiach-msft
Github user imatiach-msft commented on the issue: https://github.com/apache/spark/pull/16355 ping @jkbradley would you be able to take another look at the bisecting K-Means fix?

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2017-01-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16355 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71289/ Test PASSed.

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2017-01-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16355 Merged build finished. Test PASSed.

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2017-01-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16355 **[Test build #71289 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71289/testReport)** for PR 16355 at commit

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2017-01-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16355 **[Test build #71289 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71289/testReport)** for PR 16355 at commit

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2017-01-12 Thread imatiach-msft
Github user imatiach-msft commented on the issue: https://github.com/apache/spark/pull/16355 @jkbradley thanks, I've updated the code based on your latest comments - I removed k and the verification for the setters.

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2017-01-11 Thread filousen
Github user filousen commented on the issue: https://github.com/apache/spark/pull/16355 @imatiach-msft I can confirm the fix works after copying the full source to make sure no mistake was made. I ran it with around 200 datasets, and they all worked 👍 Thank you for you

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2017-01-11 Thread carocat
Github user carocat commented on the issue: https://github.com/apache/spark/pull/16355 You are right, I didn't add all changes you had proposed for buildtree. Everything works fine with your changes :-) Many thanks

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2017-01-10 Thread imatiach-msft
Github user imatiach-msft commented on the issue: https://github.com/apache/spark/pull/16355 @jkbradley @yu-iskw @srowen can you please take another look at the bisecting k-means algorithm fix? Thank you!

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2017-01-10 Thread imatiach-msft
Github user imatiach-msft commented on the issue: https://github.com/apache/spark/pull/16355 @carocat @filousen Please look at these changes that I updated on December 28: -val height = math.sqrt(Seq(leftIndex, rightIndex).map { childIndex => +val indexes

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2017-01-10 Thread imatiach-msft
Github user imatiach-msft commented on the issue: https://github.com/apache/spark/pull/16355 It looks like you don't have all of my changes. I also updated the buildSubTree method. Please take a look at the latest commit.

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2017-01-10 Thread carocat
Github user carocat commented on the issue: https://github.com/apache/spark/pull/16355 The changes proposed in _updateAssignments_ only partially solved the problem, because the key exception is still there. There is another key exception in the buildSubTree method, if I do i_f (isInternal

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2017-01-10 Thread imatiach-msft
Github user imatiach-msft commented on the issue: https://github.com/apache/spark/pull/16355 @filousen please note this fix is still in review and hasn't been checked into spark yet. Can you send me the error you are seeing? Also, are you sure you have ported my entire fix to your

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2017-01-10 Thread imatiach-msft
Github user imatiach-msft commented on the issue: https://github.com/apache/spark/pull/16355 @filousen I must have fixed your issue, because if I undo my changes and run your code I can reproduce the error, you must be running your code without this fix: Job aborted due to

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2017-01-10 Thread imatiach-msft
Github user imatiach-msft commented on the issue: https://github.com/apache/spark/pull/16355 How did you verify that this change does not fix it? I ran the following code and it ran without errors: test("Verify issue from user") { val jsonDs =

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2017-01-10 Thread filousen
Github user filousen commented on the issue: https://github.com/apache/spark/pull/16355 @imatiach-msft thank you for checking this. I'm using a VectorAssembler to transform the dataset I read from the json I pasted earlier. VectorAssembler assembler = new

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2017-01-10 Thread imatiach-msft
Github user imatiach-msft commented on the issue: https://github.com/apache/spark/pull/16355 @filousen could you please share the code that you used to load and run the dataset and the full error message with stack trace you are seeing? I'm a bit confused since the dataset is not a

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2017-01-10 Thread filousen
Github user filousen commented on the issue: https://github.com/apache/spark/pull/16355 Hi, [here](http://pastebin.com/WecrbYQ0) is a dataset that makes it fail with K=100 and maxIter=2. I know K > distinct features, but I can reproduce the error with bigger datasets. The fix

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2017-01-09 Thread imatiach-msft
Github user imatiach-msft commented on the issue: https://github.com/apache/spark/pull/16355 @jkbradley @yu-iskw @srowen can you please take another look at the bisecting k-means algorithm fix? Thank you!

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2017-01-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16355 Merged build finished. Test PASSed.

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2017-01-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16355 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70994/ Test PASSed.

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2017-01-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16355 **[Test build #70994 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70994/testReport)** for PR 16355 at commit

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2017-01-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16355 **[Test build #70994 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70994/testReport)** for PR 16355 at commit

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2017-01-06 Thread imatiach-msft
Github user imatiach-msft commented on the issue: https://github.com/apache/spark/pull/16355 @jkbradley Thank you for taking a look! I've updated the code based on your comments.

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2016-12-29 Thread imatiach-msft
Github user imatiach-msft commented on the issue: https://github.com/apache/spark/pull/16355 the only problem I see is that with this code we generate k-1 clusters instead of k, but it states in the algorithm documentation that it is not guaranteed to generate k clusters, it could be

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2016-12-29 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/16355 @yu-iskw Pinging on this since you wrote bisecting k-means originally. Do you have time to take a look? Thanks!

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2016-12-29 Thread imatiach-msft
Github user imatiach-msft commented on the issue: https://github.com/apache/spark/pull/16355 @jkbradley @srowen any comments on the changes? Thank you!

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2016-12-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16355 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70688/ Test PASSed.

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2016-12-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16355 Merged build finished. Test PASSed.

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2016-12-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16355 **[Test build #70688 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70688/testReport)** for PR 16355 at commit

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2016-12-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16355 **[Test build #70688 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70688/testReport)** for PR 16355 at commit

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2016-12-28 Thread imatiach-msft
Github user imatiach-msft commented on the issue: https://github.com/apache/spark/pull/16355 Jenkins, retest this please

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2016-12-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16355 **[Test build #70682 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70682/testReport)** for PR 16355 at commit

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2016-12-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16355 Merged build finished. Test FAILed.

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2016-12-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16355 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70682/ Test FAILed.

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2016-12-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16355 **[Test build #70682 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70682/testReport)** for PR 16355 at commit

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2016-12-28 Thread imatiach-msft
Github user imatiach-msft commented on the issue: https://github.com/apache/spark/pull/16355 I've updated with a new commit. I was able to reproduce the issue by generating a synthetic sparse dataset similar to the one Alok sent me, in accordance with the test-style of spark test

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2016-12-28 Thread imatiach-msft
Github user imatiach-msft commented on the issue: https://github.com/apache/spark/pull/16355 Jenkins, retest this please

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2016-12-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16355 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70679/ Test PASSed.

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2016-12-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16355 Merged build finished. Test PASSed.

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2016-12-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16355 **[Test build #70679 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70679/testReport)** for PR 16355 at commit

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2016-12-28 Thread alokob
Github user alokob commented on the issue: https://github.com/apache/spark/pull/16355 Nice to know that the code fix I suggested is working. It's really nice to contribute to Spark.

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2016-12-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16355 **[Test build #70679 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70679/testReport)** for PR 16355 at commit

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2016-12-28 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/16355 ok to test

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2016-12-28 Thread imatiach-msft
Github user imatiach-msft commented on the issue: https://github.com/apache/spark/pull/16355 I have very good news :). I was not only able to repro the issue with your dataset, but I was also able to verify that with the suggested fix the algorithm does not fail (adding the val

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2016-12-28 Thread alokob
Github user alokob commented on the issue: https://github.com/apache/spark/pull/16355 That's OK, enjoy Xmas. Please keep me posted if you find that the issue is not resolved.

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2016-12-26 Thread imatiach-msft
Github user imatiach-msft commented on the issue: https://github.com/apache/spark/pull/16355 Hi Alok! Sorry I was away for holiday break. I will try to reproduce the failure. Thank you, Ilya

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2016-12-25 Thread alokob
Github user alokob commented on the issue: https://github.com/apache/spark/pull/16355 @imatiach-msft Did you find the dataset suitable? Is anything else needed from my side?

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2016-12-23 Thread alokob
Github user alokob commented on the issue: https://github.com/apache/spark/pull/16355 You can get sample vectors at this location https://github.com/alokob/SparkClusteringDataSet/SampleVectors.txt. Also, while executing bisecting K-Means, we have set the following configuration

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2016-12-22 Thread imatiach-msft
Github user imatiach-msft commented on the issue: https://github.com/apache/spark/pull/16355 Yep, there is still a TODO to verify the fix. I'm waiting for the dataset from Alok to reproduce the issue: https://issues.apache.org/jira/browse/SPARK-16473

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2016-12-21 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/16355 That makes more sense as a fix, yes. Sounds like there is still a to-do to verify the fix. If it's possible to write a simple unit test to cover it, all the better.

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2016-12-21 Thread imatiach-msft
Github user imatiach-msft commented on the issue: https://github.com/apache/spark/pull/16355 Good point. It looks like we should be checking if the map contains the child or not. However, I'm not sure if that is the correct solution either. I need a repro dataset from the bug
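
For readers following the thread, the idea of "checking if the map contains the child" can be sketched roughly as below. This is only an illustration under assumptions, not the code from the pull request: the object name `ChildLookupSketch`, the helper `reassign`, and the plain `Map[Long, Array[Double]]` of centers are hypothetical, and the `2 * i` / `2 * i + 1` child indexing is assumed here as a simple binary-tree layout.

```scala
object ChildLookupSketch {
  // Assumed binary-tree layout: children of node i are 2*i and 2*i + 1.
  def leftChildIndex(index: Long): Long = 2 * index
  def rightChildIndex(index: Long): Long = 2 * index + 1

  /** Squared Euclidean distance between two equal-length points. */
  def squaredDistance(a: Array[Double], b: Array[Double]): Double =
    a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum

  /**
   * Hypothetical helper: move a point from cluster `index` to the nearest
   * child that is actually present in `centers`. Looking up a missing child
   * directly (centers(child)) would throw NoSuchElementException, so absent
   * children are filtered out first; if none exist, the point stays put.
   */
  def reassign(
      index: Long,
      point: Array[Double],
      centers: Map[Long, Array[Double]]): Long = {
    val children =
      Seq(leftChildIndex(index), rightChildIndex(index)).filter(centers.contains)
    if (children.isEmpty) index
    else children.minBy(child => squaredDistance(centers(child), point))
  }

  def main(args: Array[String]): Unit = {
    // Only the left child (2) of node 1 has a center; the right child (3) is missing.
    val centers = Map(2L -> Array(0.0, 0.0))
    println(reassign(1L, Array(5.0, 5.0), centers)) // only child 2 exists, so it is chosen
    println(reassign(4L, Array(5.0, 5.0), centers)) // node 4 has no children in the map
  }
}
```

The design choice in the sketch is to filter before the lookup rather than catch the exception afterwards, so a cluster that could not be split simply keeps its points.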

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2016-12-21 Thread alokob
Github user alokob commented on the issue: https://github.com/apache/spark/pull/16355 @imatiach-msft, thanks for creating the pull request and committing the change I shared. I will try to share a sample dataset for this issue.

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2016-12-20 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/16355 @imatiach-msft Can you add a test case?

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2016-12-20 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/16355 Jenkins, test this please.

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2016-12-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16355 Can one of the admins verify this patch?