Github user feynmanliang commented on the pull request:
https://github.com/apache/spark/pull/7454#issuecomment-123412447
I played with it this morning. The bugs were occurring because `ids =
List()`; apparently Breeze calls `dgemv` with an invalid `LDA` parameter when
you row-index
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/7454#issuecomment-123413737
Merged build started.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/7454#issuecomment-123413707
Merged build triggered.
Github user feynmanliang commented on the pull request:
https://github.com/apache/spark/pull/7454#issuecomment-123414657
Ran some local perf tests. Before PR:
```
bin/run-example mllib.LDAExample docs/*.md --maxIterations 100 --algorithm online --vocabSize 100 --k 3
```
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/7454#issuecomment-123414568
[Test build #37968 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37968/consoleFull)
for PR 7454 at commit
Github user feynmanliang commented on a diff in the pull request:
https://github.com/apache/spark/pull/7454#discussion_r35131297
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala ---
@@ -387,39 +387,32 @@ final class OnlineLDAOptimizer extends
Github user feynmanliang commented on a diff in the pull request:
https://github.com/apache/spark/pull/7454#discussion_r35131292
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala ---
@@ -387,39 +387,32 @@ final class OnlineLDAOptimizer extends
Github user jkbradley commented on the pull request:
https://github.com/apache/spark/pull/7454#issuecomment-123413852
Oh, I see. Thanks for investigating! In my example, the number of terms
is limited to 10 (so I could print the topics), probably making some documents
empty.
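The empty-document case described here can be illustrated with a small guard. This is only a sketch under stated assumptions, not the change made in the PR: the `Doc` type alias and the `nonEmptyDocs` helper are hypothetical names, and a document is assumed to be a pair of term-id and term-count arrays. It drops documents whose id list is empty before any per-document linear algebra is attempted.

```scala
object EmptyDocGuard {
  // Hypothetical representation for this sketch: (termIds, termCounts).
  type Doc = (Array[Int], Array[Double])

  // Keep only documents that still contain at least one in-vocabulary term,
  // so later row-indexing never sees an empty id list.
  def nonEmptyDocs(docs: Seq[Doc]): Seq[Doc] =
    docs.filter { case (ids, _) => ids.nonEmpty }
}
```

Filtering up front keeps the guard in one place; whether empty documents should instead be skipped inside the optimizer is a design choice the thread does not settle.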
Github user jkbradley commented on the pull request:
https://github.com/apache/spark/pull/7454#issuecomment-123090642
I'll make a pass. Can you please make a JIRA for this and put it in the
title?
Also, can you please test this to verify the speedups? It sounds like
local
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/7454#discussion_r35070522
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala ---
@@ -387,39 +387,32 @@ final class OnlineLDAOptimizer extends
Github user jkbradley commented on the pull request:
https://github.com/apache/spark/pull/7454#issuecomment-123165337
Ohh, actually, it might be from me trying to print stats...which might be some
weird Breeze object which does not implement toString properly. Let me retry
Github user jkbradley commented on the pull request:
https://github.com/apache/spark/pull/7454#issuecomment-123165239
I think there's a bug. I tried running the LDAExample as follows, and it
failed with the following exception:
I ran:
```
bin/run-example
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/7454#discussion_r35070523
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala ---
@@ -387,39 +387,32 @@ final class OnlineLDAOptimizer extends
Github user jkbradley commented on the pull request:
https://github.com/apache/spark/pull/7454#issuecomment-123165267
I'm wondering if it's a mis-matched shape issue.
Github user jkbradley commented on the pull request:
https://github.com/apache/spark/pull/7454#issuecomment-123173789
Hm, no, I think something is wrong. Can you try running the example as I
wrote above?
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/7454#issuecomment-122212916
Merged build started.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/7454#issuecomment-122212897
Merged build triggered.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/7454#issuecomment-12260
Merged build finished. Test PASSed.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/7454#issuecomment-12204
[Test build #37610 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37610/console)
for PR 7454 at commit
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/7454#issuecomment-122213493
[Test build #37610 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37610/consoleFull)
for PR 7454 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/7454#issuecomment-122132823
Merged build started.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/7454#issuecomment-122132814
Merged build triggered.
GitHub user feynmanliang opened a pull request:
https://github.com/apache/spark/pull/7454
[MLlib]OnlineLDA Performance Improvements
Use range-slicing (coalesced memory access) and in-place updates, and reduce
the number of transposes in the OnlineLDA implementation.
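As a rough illustration of the first two ideas, the sketch below contrasts gathering by an arbitrary index list with reading a contiguous range, and shows an update that writes into an existing buffer. It uses plain Scala arrays rather than Breeze, and the object and method names (`SliceSketch`, `gather`, `rangeSum`, `axpyInPlace`) are hypothetical, so it should be read as an analogy and not as the code in this PR.

```scala
object SliceSketch {
  // Gather by an arbitrary index list: allocates a new array and
  // reads positions that may be scattered across memory.
  def gather(data: Array[Double], ids: Seq[Int]): Array[Double] =
    ids.map(i => data(i)).toArray

  // Sum over the contiguous range [from, until): sequential reads, no copy.
  def rangeSum(data: Array[Double], from: Int, until: Int): Double = {
    var s = 0.0
    var i = from
    while (i < until) { s += data(i); i += 1 }
    s
  }

  // In-place update acc(i) += alpha * x(i): no temporary array is created.
  def axpyInPlace(alpha: Double, x: Array[Double], acc: Array[Double]): Unit = {
    require(x.length == acc.length, "length mismatch")
    var i = 0
    while (i < x.length) { acc(i) += alpha * x(i); i += 1 }
  }
}
```

The range and in-place variants avoid per-call allocation, which is the kind of saving the description appears to be aiming for; actual gains would need to be measured on the real workload.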
You can merge this pull
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/7454#issuecomment-122138072
Merged build finished. Test PASSed.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/7454#issuecomment-122137146
[Test build #37553 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37553/consoleFull)
for PR 7454 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/7454#issuecomment-122137002
Merged build triggered.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/7454#issuecomment-122137013
Merged build started.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/7454#issuecomment-122138031
[Test build #37550 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37550/console)
for PR 7454 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/7454#issuecomment-122141480
Merged build finished. Test PASSed.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/7454#issuecomment-122133223
[Test build #37550 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37550/consoleFull)
for PR 7454 at commit
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/7454#issuecomment-122141399
[Test build #37553 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37553/console)
for PR 7454 at commit