Re: How to speed up MLlib LDA?

2015-09-22 Thread Marko Asplund
How optimized are the Commons Math3 methods that showed up in profiling? Are there any higher-performance alternatives to these? marko

Re: How to speed up MLlib LDA?

2015-09-22 Thread Marko Asplund
Hi, I did some profiling for my LDA prototype code that requests topic distributions from a model. According to Java Mission Control, more than 80% of execution time during the sample interval is spent in the following methods: org.apache.commons.math3.util.FastMath.log(double); count: 337; 47.07%

Re: How to speed up MLlib LDA?

2015-09-17 Thread Marko Asplund
Hi Feynman, I just tried that, but there wasn't a noticeable change in training performance. On the other hand, model loading time was reduced from ~2 minutes to ~5 seconds (now persisted as LocalLDAModel). However, query/prediction time was unchanged. Unfortunately, this is the critical

How to speed up MLlib LDA?

2015-09-15 Thread Marko Asplund
found here: https://github.com/marko-asplund/tech-protos/blob/master/mllib-lda/src/main/scala/fi/markoa/proto/mllib/LDADemo.scala#L33-L47 marko

Re: How to speed up MLlib LDA?

2015-09-15 Thread Marko Asplund
of end-user operation execution flow, so a ~4 second execution time is very problematic. Am I using the MLlib LDA API correctly or is this just reflecting the current performance characteristics of the LDA implementation? My code can be found here: https://github.com/marko-asplund/tech-protos/blob

Fwd: MLlib LDA implementation questions

2015-09-11 Thread Marko Asplund
LDA algorithm? Any caveats to be aware of with the LDA implementation? For reference, my prototype code can be found here: https://github.com/marko-asplund/tech-protos/blob/master/mllib-lda/src/main/scala/fi/markoa/proto/mllib/LDADemo.scala thanks, marko