Hi guys,

I was running LDA with 2000 topics on 6 GB of compressed data, roughly 1.2
million docs. I used 3 AWS r3.8xlarge machines as core nodes. It turned out
the Spark application crashed after 3 or 4 iterations. Ganglia indicated
that the disk space was all consumed; I believe it's the shuffle files
filling the disks.
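In case it helps: shuffle files land under `spark.local.dir`, which often defaults to a small root volume, so one thing to check is pointing it at the large ephemeral SSDs on the r3.8xlarge nodes. A minimal sketch, assuming example mount points (substitute your actual volumes; note that on some cluster managers `spark.local.dir` must be set in the cluster config rather than in code):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Sketch: put shuffle spill space on the big ephemeral SSDs
// (/mnt/spark and /mnt1/spark are example mounts, not real paths)
// and compress shuffle output to reduce disk pressure.
val conf = new SparkConf()
  .setAppName("lda-2000-topics")
  .set("spark.local.dir", "/mnt/spark,/mnt1/spark") // assumption: SSD mounts
  .set("spark.shuffle.compress", "true")
val sc = new SparkContext(conf)
```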
> first contact with ML).
>
> Ok, I am trying to write a DSL where you can run some commands.
>
> I wrote a command that trains the Spark LDA, and it produces the topics I
> want, and I saved the model using the save method provided by LDAModel.
>
> Now I want to load this LDAModel and u
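For what it's worth, loading is a static method on the concrete model class, not on `LDAModel` itself. A minimal sketch, assuming the model was trained with the default EM optimizer (the path is hypothetical; use `LocalLDAModel.load` instead if it was trained with the online optimizer):

```scala
import org.apache.spark.SparkContext
import org.apache.spark.mllib.clustering.DistributedLDAModel

// Sketch: reload a previously saved LDA model and inspect its topics.
def reload(sc: SparkContext): Unit = {
  val model = DistributedLDAModel.load(sc, "/path/to/saved/model") // hypothetical path
  // describeTopics returns (term indices, term weights) per topic
  model.describeTopics(maxTermsPerTopic = 10).take(2).foreach(println)
}
```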
Hi all - I'm running the Spark LDA algorithm on a dataset of roughly 3
million terms, with a resulting RDD of approximately 20 GB, on a 5-node
cluster with 10 executors (3 cores each) and 14 GB of memory per executor.
As the application runs, I'm seeing progressively longer execution times
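Progressively longer iterations with the EM optimizer are often RDD lineage growth; setting a checkpoint directory and a checkpoint interval usually keeps iteration times flat. A sketch under that assumption (directory, topic count, and interval are illustrative):

```scala
import org.apache.spark.SparkContext
import org.apache.spark.mllib.clustering.{EMLDAOptimizer, LDA, LDAModel}
import org.apache.spark.mllib.linalg.Vector
import org.apache.spark.rdd.RDD

// Sketch: periodically checkpoint so each iteration's lineage stays short.
def train(sc: SparkContext, corpus: RDD[(Long, Vector)]): LDAModel = {
  sc.setCheckpointDir("hdfs:///tmp/lda-checkpoints") // hypothetical path
  new LDA()
    .setK(100)                       // example topic count
    .setOptimizer(new EMLDAOptimizer)
    .setCheckpointInterval(10)       // checkpoint every 10 iterations
    .run(corpus)
}
```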